IBM Granite 3.0 is IBM’s latest suite of open-source AI language models tailored for enterprise applications. Released in October 2024, these models are designed to deliver high performance across various tasks, including code generation, time-series forecasting, and text processing.
Key Features of IBM Granite 3.0
Model Variants
Granite 3.0 offers diverse model variants to cater to various business needs:
- General Purpose Models: These include “Granite 3.0 8B Instruct,” “2B Instruct,” “8B Base,” and “2B Base.” They are designed for tasks such as Retrieval Augmented Generation (RAG), classification, summarization, entity extraction, and tool use.
- Guardrails & Safety Models: The Granite Guardian 3.0 8B and Granite Guardian 3.0 2B provide comprehensive risk and harm detection capabilities to ensure safe and trustworthy AI applications.
- Mixture-of-Experts Models: Models like “Granite 3.0 3B-A800M Instruct” and “1B-A400M Instruct” offer efficient inference and low latency, suitable for CPU-based deployments and edge computing.
Model Variant | Description | Use Cases |
---|---|---|
8B Instruct | General-purpose model optimized for core NLP tasks like summarization, entity extraction, and classification. | NLP tasks such as summarization, entity extraction, and classification. |
2B Instruct | Smaller variant of 8B Instruct, used for similar NLP tasks with lower resource requirements. | NLP tasks like summarization, entity extraction, and classification for resource-constrained environments. |
8B Base | General-purpose foundational model for NLP-related applications. | Foundational model used for pre-training and downstream NLP applications. |
2B Base | Lighter version of the 8B Base model. | Suitable for NLP model training and fine-tuning in resource-constrained environments. |
Granite Guardian 8B | Enhanced model with risk and harm detection capabilities. | Ensuring safe and compliant AI usage, trust, and risk reduction. |
Granite Guardian 2B | Lighter version of the Granite Guardian 8B model. | Risk and harm detection in environments with limited computational resources. |
3B-A800M Instruct | Mixture-of-Experts model for efficient and low-latency inference. | Ideal for CPU-based deployments and edge computing. |
1B-A400M Instruct | Compact model with efficient inference capabilities. | Suitable for CPU-based deployment and edge-based applications. |
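To make the table easier to use from code, the variants can be mapped to Hugging Face repository IDs. The sketch below is illustrative only: the `ibm-granite/...` IDs are assumptions based on the naming used on the Hugging Face Hub and should be verified there, and `resolve_model_id` is a hypothetical helper, not part of any IBM library.

```python
# Assumed Hugging Face repository IDs for the variants in the table above.
# Verify each ID on the Hugging Face Hub before relying on it.
GRANITE_MODEL_IDS = {
    "8B Instruct": "ibm-granite/granite-3.0-8b-instruct",
    "2B Instruct": "ibm-granite/granite-3.0-2b-instruct",
    "8B Base": "ibm-granite/granite-3.0-8b-base",
    "2B Base": "ibm-granite/granite-3.0-2b-base",
    "Granite Guardian 8B": "ibm-granite/granite-guardian-3.0-8b",
    "Granite Guardian 2B": "ibm-granite/granite-guardian-3.0-2b",
    "3B-A800M Instruct": "ibm-granite/granite-3.0-3b-a800m-instruct",
    "1B-A400M Instruct": "ibm-granite/granite-3.0-1b-a400m-instruct",
}


def resolve_model_id(variant: str) -> str:
    """Return the (assumed) repository ID for a variant name from the table."""
    try:
        return GRANITE_MODEL_IDS[variant]
    except KeyError:
        known = ", ".join(sorted(GRANITE_MODEL_IDS))
        raise ValueError(
            f"Unknown variant {variant!r}; expected one of: {known}"
        ) from None


print(resolve_model_id("2B Instruct"))
```

Keeping the mapping in one place means a variant can be swapped by changing a single string rather than editing every call site.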
Training and Data
Granite 3.0 was trained on more than 12 trillion tokens spanning 12 natural languages and 116 programming languages, which underpins its multilingual and multi-programming-language support. Training follows a two-stage method intended to improve data quality and selection.
Licensing
IBM Granite 3.0 is available under the permissive Apache 2.0 license. This licensing model provides businesses with the flexibility to customize and utilize the models freely, fostering innovation and adaptability in various applications.
Performance
Granite 3.0 matches or exceeds the performance of similar-sized open-source models on key academic and industry benchmarks. The model emphasizes strong accuracy, trustworthiness, and transparency, making it a reliable choice for enterprise AI solutions.
Safety and Trustworthiness
The Granite Guardian 3.0 models add guardrail capabilities that screen prompts and responses for risk and harm. Used this way, they help reduce risk, support compliance efforts, and build trust in AI outputs, which is crucial for enterprise adoption.
Access and Integration
Businesses can access Granite 3.0 through multiple platforms, including Hugging Face, IBM watsonx, and Red Hat Enterprise Linux AI. The models can also be deployed on local systems or on cloud platforms such as Hyperstack, which makes it easier to integrate them into existing workflows.
How to Set Up IBM Granite 3.0
Step 1: Choose the Model Variant
Begin by selecting a model variant that aligns with your specific use case. For example, the 8B Instruct model is ideal for NLP tasks such as summarization.
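Model size also determines how much memory is needed, which can help narrow the choice. As a rough sketch, the memory needed just to hold the weights is the parameter count times the bytes per parameter. The nominal counts below (8e9 and 2e9) are rounded assumptions taken from the model names, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param is 2 for 16-bit formats (fp16/bf16) and 4 for 32-bit floats.
    """
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts inferred from the model names (an assumption).
NOMINAL_PARAMS = {"8B": 8e9, "2B": 2e9}

for name, params in NOMINAL_PARAMS.items():
    half = estimate_weight_memory_gb(params, bytes_per_param=2)
    full = estimate_weight_memory_gb(params, bytes_per_param=4)
    print(f"{name}: ~{half:.0f} GB in 16-bit, ~{full:.0f} GB in 32-bit")
```

Under these assumptions the 8B model needs roughly 16 GB for 16-bit weights alone, so a GPU with more memory than that is needed once runtime overhead is included.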
Step 2: Prerequisites
Ensure your environment is ready by following these steps:
- Hardware: Use a high-performance GPU, such as the NVIDIA A100, especially for larger models.
- Software: Install Python 3.8 or higher and create a virtual environment.
- Dependencies: Install the necessary libraries by running:
pip install torch torchvision torchaudio accelerate transformers
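Before downloading a multi-gigabyte model, it can help to confirm the environment meets these prerequisites. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, and it only checks that packages are importable, not that their versions are compatible.

```python
import importlib.util
import sys

# Packages from the pip command above that the later steps import directly.
REQUIRED_PACKAGES = ("torch", "accelerate", "transformers")


def check_environment(min_python=(3, 8), packages=REQUIRED_PACKAGES):
    """Report whether the interpreter is new enough and which packages are missing."""
    python_ok = sys.version_info[:2] >= tuple(min_python)
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return {"python_ok": python_ok, "missing": missing}


report = check_environment()
print(report)
```

An empty `missing` list and `python_ok` set to `True` suggest the setup steps completed; GPU availability still has to be checked separately.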
Step 3: Download the Model
Use the Transformers library to download the desired model:
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_ID = "ibm-granite/granite-3.0-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
model.eval()
Step 4: Input Preparation
Instruct models expect chat-formatted input. Build a list of role-tagged messages and render it into a prompt string with the tokenizer's chat template:
chat = [
{"role": "user", "content": "List the IBM research labs in the United States."}
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
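For multi-turn conversations, the message list grows with each exchange. The sketch below shows one way to assemble it before handing it to `apply_chat_template`; `build_chat` is a hypothetical helper, and whether the Granite chat template accepts a `system` role is an assumption to check against the model card.

```python
from typing import Dict, List, Optional

VALID_ROLES = {"system", "user", "assistant"}


def build_chat(
    user_message: str,
    system_prompt: Optional[str] = None,
    history: Optional[List[Dict[str, str]]] = None,
) -> List[Dict[str, str]]:
    """Assemble a role-tagged message list ending with the newest user turn."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for turn in history or []:
        if turn.get("role") not in VALID_ROLES:
            raise ValueError(f"Unsupported role: {turn.get('role')!r}")
        messages.append({"role": turn["role"], "content": turn["content"]})
    messages.append({"role": "user", "content": user_message})
    return messages


chat_messages = build_chat(
    "List the IBM research labs in the United States.",
    system_prompt="Answer concisely.",
)
print(chat_messages)
```

The resulting list can be passed to the tokenizer's chat template in place of the hand-written `chat` list.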
Step 5: Tokenization and Model Interaction
Tokenize the input and interact with the model:
# Move the inputs to the device the model was loaded on
input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
output = model.generate(**input_tokens, max_new_tokens=100)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
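For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `output[0]` in full repeats the prompt in the response. A common remedy is to slice off the prompt before decoding. The sketch below demonstrates the slicing on plain lists with made-up token IDs; `new_token_ids` is a hypothetical helper.

```python
from typing import List, Sequence


def new_token_ids(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the tokens that follow the prompt in a generated sequence."""
    if not 0 <= prompt_length <= len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return list(output_ids[prompt_length:])


# Stand-in token IDs; real values would come from the tokenizer and model.
prompt_ids = [101, 7, 42]
generated = prompt_ids + [9, 13, 2]
print(new_token_ids(generated, len(prompt_ids)))  # [9, 13, 2]

# With the tensors from the step above, the equivalent would be:
#   prompt_length = input_tokens["input_ids"].shape[1]
#   response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
```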
Function Calling with IBM Granite 3.0
Granite 3.0 models support function calling: the model emits a structured request naming a function and its arguments, and the application executes that function and can feed the result back. This lets responses draw on live data, such as stock prices or weather information, from external tools and APIs.
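For the model to request a function, it first needs a description of what is available. A common convention is a JSON-schema style tool definition, sketched below. Recent versions of Transformers accept a `tools` argument in `apply_chat_template`, but whether the Granite template expects exactly this layout is an assumption to verify against the model card; `make_tool_schema` is a hypothetical helper.

```python
import json
from typing import Dict, List


def make_tool_schema(
    name: str, description: str, properties: Dict[str, dict], required: List[str]
) -> dict:
    """Build a JSON-schema style description of one callable tool."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


stock_tool = make_tool_schema(
    name="get_stock_price",
    description="Retrieve the lowest and highest stock price for a ticker on a date.",
    properties={
        "ticker": {"type": "string", "description": "Stock ticker symbol, e.g. IBM."},
        "date": {"type": "string", "description": "Date in YYYY-MM-DD format."},
    },
    required=["ticker", "date"],
)
print(json.dumps(stock_tool, indent=2))
```

The argument names in the schema should match the Python function's parameters so that the model's output can be passed to the function directly.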
Example Functions
Stock Price Retrieval
import requests
def get_stock_price(ticker: str, date: str) -> dict:
    """Return the daily low and high for a ticker on a date (YYYY-MM-DD)."""
    # Replace <API_KEY> with your Alpha Vantage API key
    stock_url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol={ticker}&apikey=<API_KEY>"
    stock_data = requests.get(stock_url).json()
    low = stock_data['Time Series (Daily)'][date]['3. low']
    high = stock_data['Time Series (Daily)'][date]['2. high']
    return {"low": low, "high": high}
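The lookup above raises a bare `KeyError` if the response lacks the daily series (for example, an error or rate-limit payload) or if the requested date has no entry, such as a weekend. A more defensive extraction step might look like the sketch below. `extract_low_high` is a hypothetical helper, and the sample payload uses made-up prices purely to exercise the code.

```python
def extract_low_high(stock_data: dict, date: str) -> dict:
    """Pull the low/high for one date out of a TIME_SERIES_DAILY style response."""
    series = stock_data.get("Time Series (Daily)")
    if series is None:
        # The response held something other than price data.
        raise ValueError(f"No daily series in response; keys: {sorted(stock_data)}")
    day = series.get(date)
    if day is None:
        raise KeyError(f"No data for {date} (market closed or date out of range?)")
    return {"low": float(day["3. low"]), "high": float(day["2. high"])}


# Made-up sample payload in the same shape the function above indexes into.
sample = {
    "Time Series (Daily)": {
        "2024-10-07": {"2. high": "101.50", "3. low": "99.25"},
    }
}
print(extract_low_high(sample, "2024-10-07"))
```

Converting the values to `float` is a design choice here; the original function returns the strings as received.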
Weather Information Retrieval
import requests
def get_current_weather(location: str) -> dict:
    """Return a short description and temperature (Celsius) for a location."""
    # Replace <API_KEY> with your OpenWeatherMap API key
    weather_url = f"http://api.openweathermap.org/data/2.5/weather?q={location}&appid=<API_KEY>&units=metric"
    weather_data = requests.get(weather_url).json()
    return {"description": weather_data['weather'][0]['description'], "temperature": weather_data['main']['temp']}
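Interpolating `location` straight into the URL can break for names containing spaces or special characters, such as "New York". One way to handle this is to URL-encode the query parameters, as in the sketch below. `build_weather_url` is a hypothetical helper and `DEMO_KEY` is a placeholder, not a working key.

```python
from urllib.parse import urlencode

WEATHER_ENDPOINT = "http://api.openweathermap.org/data/2.5/weather"


def build_weather_url(location: str, api_key: str, units: str = "metric") -> str:
    """Build the request URL with properly encoded query parameters."""
    query = urlencode({"q": location, "appid": api_key, "units": units})
    return f"{WEATHER_ENDPOINT}?{query}"


print(build_weather_url("New York", "DEMO_KEY"))
```

Passing the same dictionary through the `params` argument of `requests.get` achieves the same encoding without building the string by hand.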
Calling the Function
import json
output = '<function_call>{"name": "get_stock_price", "arguments": {"ticker": "IBM", "date": "2024-10-07"}}'
function_call = json.loads(output.split('<function_call>')[-1])
# Route the parsed call to the matching Python function
if function_call['name'] == 'get_stock_price':
    result = get_stock_price(**function_call['arguments'])
    print("Result:", result)
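As more functions are added, a chain of `if` checks becomes hard to maintain. A registry that maps function names to callables is one alternative, sketched below. The `<function_call>` marker follows the example above; `parse_function_call`, `dispatch`, and the stub `fake_get_stock_price` are hypothetical names used so the example runs without network access.

```python
import json
from typing import Any, Callable, Dict


def parse_function_call(model_output: str, marker: str = "<function_call>") -> Dict[str, Any]:
    """Extract and decode the JSON payload that follows the marker."""
    if marker not in model_output:
        raise ValueError("No function call found in model output")
    payload = model_output.split(marker)[-1].strip()
    call = json.loads(payload)
    if "name" not in call:
        raise ValueError("Function call is missing a 'name' field")
    return call


def dispatch(call: Dict[str, Any], registry: Dict[str, Callable[..., Any]]) -> Any:
    """Look up the requested function and invoke it with the given arguments."""
    name = call["name"]
    if name not in registry:
        raise KeyError(f"Model requested an unregistered function: {name}")
    return registry[name](**call.get("arguments", {}))


def fake_get_stock_price(ticker: str, date: str) -> dict:
    """Stub standing in for the real API-backed function."""
    return {"ticker": ticker, "date": date, "low": "0.00", "high": "0.00"}


registry = {"get_stock_price": fake_get_stock_price}
output = '<function_call>{"name": "get_stock_price", "arguments": {"ticker": "IBM", "date": "2024-10-07"}}'
print("Result:", dispatch(parse_function_call(output), registry))
```

Only functions placed in the registry can be invoked, which also keeps the model from triggering arbitrary code by naming an unexpected function.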
Use Cases for IBM Granite 3.0
Text Summarization
Generate concise summaries from extensive documents or articles, streamlining information consumption.
Customer Support
Develop chatbots that interact with users in real-time, utilizing API calls for live data to provide accurate and timely responses.
Code Generation and Completion
Leverage Granite 3.0’s training on 116 programming languages to assist developers with code generation and completion, enhancing productivity.
Business Analytics
Automate the extraction and analysis of key business metrics and trends from both structured and unstructured data, facilitating informed decision-making.
Risk Management
Utilize Granite Guardian models to detect potentially harmful or non-compliant behavior in AI outputs, ensuring adherence to regulatory standards.
Conclusion
IBM Granite 3.0 marks a significant advancement in enterprise AI technology. Its multi-model approach, enhanced safety features, and open-source licensing offer businesses the tools needed to build robust AI-driven applications. By enabling dynamic function calling, Granite 3.0 provides interactive and real-time responses, further expanding its capabilities. From enhancing customer support to automating business analytics, Granite 3.0 stands out as a pivotal solution in the evolving landscape of AI-driven enterprise applications.