repository: https://github.com/BerriAI/litellm
homepage: https://litellm.ai
LiteLLM is an LLM API integration layer that provides homogenised access to commercial and hosted models including OpenAI's GPT models, Anthropic Claude, Groq-hosted Llama 3 and Google Gemini via a single interface.
There are two ‘flavours’ that you can use:
1. A Python library (SDK) that can be imported into your program (similar to LangChain).
2. A proxy server that can combine multiple services and provides API management capabilities including API key redistribution, team budgeting and usage monitoring.
Using LiteLLM SDK with Proxy Server
import os

from dotenv import load_dotenv
from litellm import completion

# Load LLM_API_KEY (a LiteLLM virtual key) from a local .env file
load_dotenv()

response = completion(
    model="openai/llama3.1-8b",  # "openai/" prefix: the proxy speaks the OpenAI protocol
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
    api_key=os.environ["LLM_API_KEY"],
    base_url="https://litellm.example.com",  # our LiteLLM proxy instance
)
print(response)
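Because the proxy exposes an OpenAI-compatible API, you don't strictly need the LiteLLM SDK at all; the standard openai client can be pointed at the proxy instead. A minimal sketch, reusing the same placeholder URL and key as above:

import os

from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy instead of api.openai.com
client = OpenAI(
    api_key=os.environ["LLM_API_KEY"],       # a LiteLLM virtual key
    base_url="https://litellm.example.com",  # the proxy from the example above
)

response = client.chat.completions.create(
    model="llama3.1-8b",  # a model_name exposed by the proxy; no "openai/" prefix needed here
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response.choices[0].message.content)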
LiteLLM Proxy Server
Docker Compose LiteLLM + PostgreSQL Configuration
services:
  db:
    image: postgres:15
    restart: always
    shm_size: 128mb
    ports:
      - 5432:5432
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: somesecretpw
      POSTGRES_DB: litellm
      PGDATA: /var/lib/postgresql/data/pgdata

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    restart: always
    depends_on:
      - db
    ports:
      - 4444:4000
    volumes:
      - ./litellm/config.yaml:/app/config.yaml
    command: --port 4000 --config /app/config.yaml
    env_file: .env.docker
    environment:
      - DATABASE_URL=postgresql://postgres:somesecretpw@db:5432/litellm
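With the compose file, config.yaml and env file in place, docker compose up -d brings the stack up. Since host port 4444 maps to 4000 in the container, the admin UI should then be reachable at http://localhost:4444/ui, where you can log in with the master key from .env.docker.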
We also provide a separate env file (.env.docker) which contains two important details:
1. Our master key, which is used to log into the admin portal and to authenticate API calls (it can also mint per-user virtual keys, as sketched below).
2. API keys for any external commercial models you might want to call.
LITELLM_MASTER_KEY=sk-somesecretvalue123
...
OPENAI_API_KEY=sk-blahblah
ANTHROPIC_API_KEY=sk-blahblahblah
...
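The master key can be used against the proxy's /key/generate endpoint to mint scoped virtual keys for users. A sketch, assuming the proxy from the compose setup is reachable at https://litellm.example.com; the model list and budget are illustrative values:

import os

import requests

# Mint a virtual key restricted to one model with a small spend budget
resp = requests.post(
    "https://litellm.example.com/key/generate",
    headers={"Authorization": f"Bearer {os.environ['LITELLM_MASTER_KEY']}"},
    json={"models": ["gpt-4o"], "max_budget": 10.0},
)
resp.raise_for_status()
print(resp.json()["key"])  # the sk-... virtual key to hand out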
We can also pass a minimal config.yaml file which lists the models that users are allowed to access (the master key itself is set via LITELLM_MASTER_KEY in the env file above):
model_list:
  - model_name: claude-3-5-sonnet-20240620
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-3-5-sonnet-20240620 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # does os.getenv("ANTHROPIC_API_KEY")
  - model_name: claude-3-opus ### RECEIVED MODEL NAME ###
    litellm_params:
      model: claude-3-opus-20240229 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: groq-llama-70b
    litellm_params:
      api_base: https://api.groq.com/openai/v1
      api_key: "os.environ/GROQ_API_KEY"
      model: openai/llama3-70b-8192
After the first start, we can add more models through the admin UI anyway.
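To check which models the proxy actually exposes, you can query the OpenAI-compatible /v1/models endpoint. A quick sketch using requests, with the placeholder URL and key from the earlier examples:

import os

import requests

# List the models the proxy exposes (OpenAI-compatible endpoint)
resp = requests.get(
    "https://litellm.example.com/v1/models",
    headers={"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"},
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])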
See Open Web UI for instructions on integrating LiteLLM with Open Web UI. Also see Complete Self Hosted AI Setup for a comprehensive guide to using these components in conjunction with ollama.
drop_params option
Some libraries pass unsupported parameters to models. We can tell LiteLLM to strip those parameters on a per-model basis.
First, get the UUID for the model from the /model/info endpoint:
GET https://litellm.host.example.com/model/info

{
  "data": [
    {
      "model_name": "llama3.1-8b",
      "litellm_params": {
        "api_base": "http://something:11434",
        "model": "ollama/llama3.1:8b"
      },
      "model_info": {
        "id": "some-uuid-string-here",
        ... more stuff here
      }
    }
  ]
}
Then set drop_params for that model via the /model/update endpoint, referencing that id:

POST https://litellm.host.example.com/model/update

{
  "litellm_params": {
    "drop_params": true
  },
  "model_info": {
    "id": "some-uuid-string-here"
  }
}
This should prevent error messages when using function calling and some other extended functionality.
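The two requests can be wrapped in a small helper. A sketch, assuming the placeholder proxy URL from above and that LITELLM_MASTER_KEY is available in the environment (the /model/info and /model/update endpoints require the master key):

import os

import requests

PROXY = "https://litellm.host.example.com"  # placeholder proxy URL from above
HEADERS = {"Authorization": f"Bearer {os.environ['LITELLM_MASTER_KEY']}"}

def enable_drop_params(model_name: str) -> None:
    # Look up the model's UUID via /model/info
    info = requests.get(f"{PROXY}/model/info", headers=HEADERS)
    info.raise_for_status()
    matches = [m for m in info.json()["data"] if m["model_name"] == model_name]
    if not matches:
        raise ValueError(f"no model named {model_name!r} on the proxy")
    model_id = matches[0]["model_info"]["id"]

    # Tell LiteLLM to strip unsupported parameters for this model
    update = requests.post(
        f"{PROXY}/model/update",
        headers=HEADERS,
        json={"litellm_params": {"drop_params": True}, "model_info": {"id": model_id}},
    )
    update.raise_for_status()

enable_drop_params("llama3.1-8b")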
See also: ollama function calling.