# LiteLLM Proxy Configuration
The [[LiteLLM]] proxy is a self-hosted [[AI Gateway]] shipped as a [[Docker]] image. It's configured through a single `config.yaml` file that declares available models, provider credentials, and gateway-wide settings. Runs on `http://0.0.0.0:4000` by default.
## Docker images
- `docker.litellm.ai/berriai/litellm:main-latest` — core proxy
- `ghcr.io/berriai/litellm-database:main-latest` — bundles [[PostgreSQL]] client support for virtual keys and spend tracking
- Non-root variant available for hardened deployments
## Minimal run command
```bash
docker run \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-e AZURE_API_KEY=your-key \
-e AZURE_API_BASE=your-base \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:main-latest \
--config /app/config.yaml --detailed_debug
```
## config.yaml structure
```yaml
model_list:
- model_name: gpt-4o # client-facing alias
litellm_params:
model: azure/my_azure_deployment # <provider>/<model-id>
api_base: os.environ/AZURE_API_BASE # env var reference
api_key: os.environ/AZURE_API_KEY
api_version: "2025-01-01-preview"
- model_name: claude-sonnet
litellm_params:
model: anthropic/claude-sonnet-4
api_key: os.environ/ANTHROPIC_API_KEY
general_settings:
master_key: sk-1234 # admin token, must start with sk-
database_url: "postgresql://user:pw@host:5432/litellm"
litellm_settings:
drop_params: true # drop unsupported params silently
num_retries: 3
request_timeout: 600
```
### Key sections
- **`model_list`** — every model the gateway will serve. `model_name` is the public alias clients pass; `litellm_params.model` uses the `<provider>/<model-id>` routing prefix. Multiple entries can share the same `model_name` for load balancing across deployments.
- **`general_settings`** — `master_key` (admin auth), `database_url` (enables virtual keys + spend tracking), SSO / JWT config.
- **`litellm_settings`** — SDK-level behavior: retries, timeouts, caching, callbacks, fallbacks.
- **`router_settings`** — routing strategy (`simple-shuffle`, `least-busy`, `usage-based-routing-v2`, `latency-based-routing`).
## Virtual keys
With `database_url` configured, the master key can mint scoped keys via the admin API:
```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{"models": ["gpt-4o"], "rpm_limit": 60, "max_budget": 25.0, "duration": "30d"}'
```
Each virtual key can restrict models, set RPM/TPM caps, enforce budgets, and tie spend to a user or team.
## Production stack
A typical compose setup bundles:
- `docker-compose.yml` — LiteLLM proxy + Postgres (+ optional Redis for caching)
- `config.yaml` — model list and gateway settings
- `.env` — master key, salt key, provider credentials
- `prometheus.yml` — metrics scrape config (LiteLLM exposes `/metrics`)
All files must exist before `docker compose up` or the container exits on startup.
## Client usage
Any OpenAI-compatible SDK ([[Python]], Node, [[LangChain]], curl) works by pointing `base_url` at the proxy and using a virtual key as the API key:
```python
from openai import OpenAI
client = OpenAI(base_url="http://localhost:4000", api_key="sk-virtual-key")
client.chat.completions.create(model="claude-sonnet", messages=[...])
```
## References
- https://docs.litellm.ai/docs
- https://docs.litellm.ai/docs/proxy/docker_quick_start
- https://docs.litellm.ai/docs/proxy/configs
## Related
- [[LiteLLM]]
- [[LiteLLM Claude Code Proxy]]
- [[AI Gateway]]
- [[Docker]]
- [[PostgreSQL]]
- [[Model routing]]
- [[AI Observability]]