repository: https://github.com/BerriAI/litellm
homepage: https://litellm.ai
LiteLLM is an LLM API integration layer that provides homogenised access to commercial and hosted models including OpenAI's GPT models, Anthropic Claude, Groq-hosted Llama 3 and Google Gemini via a single interface.
There are two ‘flavours’ that you can use:
1. A Python library (SDK) that can be imported into your program (similar to LangChain).
2. A proxy server that can combine multiple services and provides API management capabilities including API key redistribution, team budgeting and usage monitoring.
Using LiteLLM SDK with Proxy Server
import os

from dotenv import load_dotenv
from litellm import completion

# Load LLM_API_KEY (a LiteLLM virtual key) from a local .env file
load_dotenv()

response = completion(
    model="openai/llama3.1-8b",  # "openai/" prefix: the proxy speaks the OpenAI protocol
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
    api_key=os.environ["LLM_API_KEY"],
    base_url="https://litellm.example.com",  # our LiteLLM proxy instance
)
print(response)
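Because the proxy exposes an OpenAI-compatible API, you don't strictly need the LiteLLM SDK at all; the standard openai client can be pointed at the proxy instead. A minimal sketch, reusing the same placeholder URL and key as above:

import os

from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy instead of api.openai.com
client = OpenAI(
    api_key=os.environ["LLM_API_KEY"],       # a LiteLLM virtual key
    base_url="https://litellm.example.com",  # the proxy from the example above
)

response = client.chat.completions.create(
    model="llama3.1-8b",  # a model_name exposed by the proxy; no "openai/" prefix needed here
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response.choices[0].message.content)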
LiteLLM Proxy Server
Docker Compose LiteLLM + PostgreSQL Configuration
services:
  db:
    image: postgres:15
    restart: always
    shm_size: 128mb
    ports:
      - 5432:5432
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: somesecretpw
      POSTGRES_DB: litellm
      PGDATA: /var/lib/postgresql/data/pgdata

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    restart: always
    depends_on:
      - db
    ports:
      - 4444:4000
    volumes:
      - ./litellm/config.yaml:/app/config.yaml
    command: --port 4000 --config /app/config.yaml
    env_file: .env.docker
    environment:
      - DATABASE_URL=postgresql://postgres:somesecretpw@db:5432/litellm
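With the compose file, config.yaml and env file in place, docker compose up -d brings the stack up. Since host port 4444 maps to 4000 in the container, the admin UI should then be reachable at http://localhost:4444/ui, where you can log in with the master key from .env.docker.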
We also provide a separate env file (.env.docker) which contains two important details:
1. Our master key, which is used to log into the admin portal and to authenticate API calls (it can also mint per-user virtual keys, as sketched below).
2. API keys for any external commercial models you might want to call.
LITELLM_MASTER_KEY=sk-somesecretvalue123
...
OPENAI_API_KEY=sk-blahblah
ANTHROPIC_API_KEY=sk-blahblahblah
...
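The master key can be used against the proxy's /key/generate endpoint to mint scoped virtual keys for users. A sketch, assuming the proxy from the compose setup is reachable at https://litellm.example.com; the model list and budget are illustrative values:

import os

import requests

# Mint a virtual key restricted to one model with a small spend budget
resp = requests.post(
    "https://litellm.example.com/key/generate",
    headers={"Authorization": f"Bearer {os.environ['LITELLM_MASTER_KEY']}"},
    json={"models": ["gpt-4o"], "max_budget": 10.0},
)
resp.raise_for_status()
print(resp.json()["key"])  # the sk-... virtual key to hand out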
We can also pass a minimal config.yaml file which lists the models that users are allowed to access (the master key itself is set via LITELLM_MASTER_KEY in the env file above):
model_list:
  - model_name: claude-3-5-sonnet-20240620
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-3-5-sonnet-20240620 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # does os.getenv("ANTHROPIC_API_KEY")
  - model_name: claude-3-opus ### RECEIVED MODEL NAME ###
    litellm_params:
      model: claude-3-opus-20240229 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: groq-llama-70b
    litellm_params:
      api_base: https://api.groq.com/openai/v1
      api_key: "os.environ/GROQ_API_KEY"
      model: openai/llama3-70b-8192
After the first start, we can add more models through the admin UI anyway.
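To check which models the proxy actually exposes, you can query the OpenAI-compatible /v1/models endpoint. A quick sketch using requests, with the placeholder URL and key from the earlier examples:

import os

import requests

# List the models the proxy exposes (OpenAI-compatible endpoint)
resp = requests.get(
    "https://litellm.example.com/v1/models",
    headers={"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"},
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])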
See Open Web UI for instructions on integrating LiteLLM with Open Web UI. Also see Complete Self Hosted AI Setup for a comprehensive guide to using these components in conjunction with ollama.
drop_params option
Some libraries pass unsupported parameters to models. We can tell LiteLLM to strip those parameters on a per-model basis.
First, get the UUID for the model from the /model/info endpoint:
GET https://litellm.host.example.com/model/info

{
  "data": [
    {
      "model_name": "llama3.1-8b",
      "litellm_params": {
        "api_base": "http://something:11434",
        "model": "ollama/llama3.1:8b"
      },
      "model_info": {
        "id": "some-uuid-string-here",
        ... more stuff here
      }
    }
  ]
}
Then set drop_params for that model via the /model/update endpoint, referencing that id:

POST https://litellm.host.example.com/model/update

{
  "litellm_params": {
    "drop_params": true
  },
  "model_info": {
    "id": "some-uuid-string-here"
  }
}
This should prevent error messages when using function calling and some other extended functionality.
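The two requests can be wrapped in a small helper. A sketch, assuming the placeholder proxy URL from above and that LITELLM_MASTER_KEY is available in the environment (the /model/info and /model/update endpoints require the master key):

import os

import requests

PROXY = "https://litellm.host.example.com"  # placeholder proxy URL from above
HEADERS = {"Authorization": f"Bearer {os.environ['LITELLM_MASTER_KEY']}"}

def enable_drop_params(model_name: str) -> None:
    # Look up the model's UUID via /model/info
    info = requests.get(f"{PROXY}/model/info", headers=HEADERS)
    info.raise_for_status()
    matches = [m for m in info.json()["data"] if m["model_name"] == model_name]
    if not matches:
        raise ValueError(f"no model named {model_name!r} on the proxy")
    model_id = matches[0]["model_info"]["id"]

    # Tell LiteLLM to strip unsupported parameters for this model
    update = requests.post(
        f"{PROXY}/model/update",
        headers=HEADERS,
        json={"litellm_params": {"drop_params": True}, "model_info": {"id": model_id}},
    )
    update.raise_for_status()

enable_drop_params("llama3.1-8b")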
See also: ollama function calling.