
LiteLLM is an LLM API integration layer that provides unified access to commercial models, including OpenAI GPT, Anthropic Claude, Groq-hosted Llama 3 and Google Gemini, via a single interface.

There are two ‘flavours’ you can use:

1. A Python library that can be imported into your program (similar to LangChain), as sketched just below.
2. A proxy server that can combine multiple services and provides API management capabilities including API key distribution, team budgeting and usage monitoring.
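
A minimal sketch of the first flavour, calling a commercial model directly from Python; it assumes OPENAI_API_KEY is exported in the environment and that gpt-4o is available on your account:

# Flavour 1: call a commercial model directly through the LiteLLM library.
# Assumes OPENAI_API_KEY is set in the environment.
from litellm import completion

response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)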

Using the LiteLLM SDK with the Proxy Server

import os

from dotenv import load_dotenv
from litellm import completion

# Pull LLM_API_KEY (a key issued by the proxy) from a local .env file.
load_dotenv()

response = completion(
    # "openai/" routes the call through the proxy's OpenAI-compatible API;
    # "llama3.1-8b" is the model_name configured on the proxy.
    model="openai/llama3.1-8b",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    api_key=os.environ["LLM_API_KEY"],
    base_url="https://litellm.example.com",
)

print(response)
 

LiteLLM Proxy Server

Docker Compose LiteLLM + PostgreSQL Configuration

services:
  db:
    image: postgres:15
    restart: always
    shm_size: 128mb
    ports:
      - 5432:5432
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: somesecretpw
      POSTGRES_DB: litellm
      PGDATA: /var/lib/postgresql/data/pgdata

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    restart: always
    depends_on:
      - db
    ports:
      - 4444:4000
    volumes:
      - ./litellm/config.yaml:/app/config.yaml
    command: --port 4000 --config /app/config.yaml
    env_file: .env.docker
    environment:
      - DATABASE_URL=postgresql://postgres:somesecretpw@db:5432/litellm
 

We also provide a separate env file (.env.docker) which contains two important details:

  1. Our master key, which is used to log into the admin portal and to authenticate API calls.
  2. API keys for any external commercial models you might want to call.

LITELLM_MASTER_KEY=sk-somesecretvalue123
...
OPENAI_API_KEY=sk-blahblah
ANTHROPIC_API_KEY=sk-blahblahblah
...

We also pass a minimal config.yaml file which lists the models that users are allowed to access (the master admin API key itself comes from LITELLM_MASTER_KEY in the env file above):

model_list:
  - model_name: claude-3-5-sonnet-20240620
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-3-5-sonnet-20240620 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # resolves to os.getenv("ANTHROPIC_API_KEY")
     
  - model_name: claude-3-opus ### RECEIVED MODEL NAME ###
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-3-opus-20240229 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # resolves to os.getenv("ANTHROPIC_API_KEY")
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: groq-llama-70b
    litellm_params:
      api_base: https://api.groq.com/openai/v1
      api_key: "os.environ/GROQ_API_KEY"
      model: openai/llama3-70b-8192
 

After the first start, more models can be added via the admin UI anyway.
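
Each model_name above is the identifier clients send to the proxy. A sketch of requesting the Groq alias through the proxy, reusing the proxy URL and key conventions from the SDK example earlier (both are assumptions, not fixed values):

# Call the "groq-llama-70b" alias defined in config.yaml through the proxy.
# The proxy URL and LLM_API_KEY follow the earlier SDK example.
import os
from litellm import completion

response = completion(
    model="openai/groq-llama-70b",  # "openai/" = use the proxy's OpenAI-compatible API
    messages=[{"role": "user", "content": "One sentence on why the sky is blue."}],
    api_key=os.environ["LLM_API_KEY"],
    base_url="https://litellm.example.com",
)
print(response.choices[0].message.content)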

Connecting To Open Web UI

See Open Web UI for instructions on integrating LiteLLM with Open Web UI. Also refer to Complete Self Hosted AI Setup for a comprehensive guide to using these components in conjunction with Ollama.

drop_params option

Some libraries pass unsupported parameters to models. We can tell LiteLLM to strip those parameters with the drop_params option.

First, get the UUID for the model from the /model/info endpoint:

GET https://litellm.host.example.com/model/info

{
  "data": [
    {
      "model_name": "llama3.1-8b",
      "litellm_params": {
        "api_base": "http://something:11434",
        "model": "ollama/llama3.1:8b"
      },
      "model_info": {
        "id": "some-uuid-string-here",
        ... more stuff here
      }
    }
  ]
}

Then enable drop_params for that model via the /model/update endpoint:

POST https://litellm.host.example.com/model/update

{
    "litellm_params": {
      "drop_params": true
    },
    "model_info": {
      "id":"some-uuid-string-here"
    }
}

This should prevent error messages when doing function calling and using some other extended functionality.
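
The same two calls can be scripted. A sketch using the requests library, assuming the proxy URL above and that the master key from .env.docker (or any admin-scoped key) is sent as the bearer token:

# Enable drop_params for the "llama3.1-8b" model via the proxy REST API.
# Assumes LITELLM_MASTER_KEY holds an admin-scoped key for the proxy.
import os

import requests

BASE_URL = "https://litellm.host.example.com"
HEADERS = {"Authorization": f"Bearer {os.environ['LITELLM_MASTER_KEY']}"}

# Look up the UUID of the model we want to change.
info = requests.get(f"{BASE_URL}/model/info", headers=HEADERS).json()
model_id = next(
    m["model_info"]["id"] for m in info["data"] if m["model_name"] == "llama3.1-8b"
)

# Tell the proxy to drop unsupported parameters for that model.
resp = requests.post(
    f"{BASE_URL}/model/update",
    headers=HEADERS,
    json={"litellm_params": {"drop_params": True}, "model_info": {"id": model_id}},
)
resp.raise_for_status()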

ollama function calling