GitHub Repo

A GGUF runtime server similar to Ollama and vLLM.

Pass --model, or set the MODEL_NAME environment variable when running in Docker, to tell the server which model to load. The value can be the path to a GGUF file.
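
As an illustration of how the two options could fit together, here is a minimal Python sketch that resolves the model from either source. It is not taken from this project's code: the function name `resolve_model` is hypothetical, and the precedence (the flag wins over the environment variable) is an assumption, since the text above does not say which one takes priority.

```python
import argparse


def resolve_model(argv, env):
    """Return the model to load, or None if neither source is set.

    argv: list of command-line arguments (without the program name).
    env:  mapping of environment variables, e.g. os.environ.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default=None)
    # Ignore any other arguments the real server might accept.
    args, _unknown = parser.parse_known_args(argv)
    # Assumption: --model takes precedence over MODEL_NAME.
    return args.model or env.get("MODEL_NAME")


# Hypothetical paths, for illustration only.
from_flag = resolve_model(["--model", "/models/a.gguf"], {"MODEL_NAME": "/models/b.gguf"})
from_env = resolve_model([], {"MODEL_NAME": "/models/b.gguf"})
```

In a real entry point the call would likely be `resolve_model(sys.argv[1:], os.environ)`, with an error raised when the result is `None`.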