A FastAPI proxy that lets Claude Code (and any Anthropic- or OpenAI-compatible client) talk directly to Ollama's native `/api/chat` endpoint.
Exposes two fully compatible endpoints:

- `/v1/messages` — Anthropic Messages API (used natively by Claude Code)
- `/v1/chat/completions` — OpenAI Chat Completions API (used by LiteLLM, OpenAI SDKs, etc.)

Both translate transparently to Ollama's `/api/chat` format, handling streaming, tool calls, and optional `think` reasoning injection.
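The request translation can be pictured with a small sketch. This is a hypothetical helper, not the actual `proxy.py` implementation; field names follow each API's public schema, but the function name and flattening choices here are illustrative:

```python
def anthropic_to_ollama(payload: dict, think: bool = False) -> dict:
    """Map an Anthropic Messages request body to Ollama's /api/chat body (sketch)."""
    messages = []
    # Anthropic carries the system prompt in a top-level "system" field;
    # Ollama expects it as the first chat message.
    if payload.get("system"):
        messages.append({"role": "system", "content": payload["system"]})
    for msg in payload.get("messages", []):
        content = msg["content"]
        # Anthropic content may be a list of typed blocks; flatten the text blocks.
        if isinstance(content, list):
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    body = {
        "model": payload["model"],
        "messages": messages,
        "stream": payload.get("stream", False),
    }
    if think:
        body["think"] = True  # opt-in reasoning injection (THINK_MODELS)
    return body
```

The real adapter additionally handles tool-call blocks and streaming SSE events, but the system-prompt hoisting and content-block flattening above capture the core of the mapping.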
This adapter supports two operating modes:
```
Claude Code
    |
    v
claude-code-ollama-adapter :4000 --> Ollama :11434/api/chat
                                     (think: true injected)
```

- Bypasses LiteLLM - no budget limits, no authentication required
- Use case: local development, prototyping, testing
- Start:

```bash
uvicorn proxy:app --host 0.0.0.0 --port 4000
```
```
Claude Code / other clients
    |
    v
LiteLLM proxy :4001 (auth, routing, logging)
    |
    +-- openai/* --> Ollama :11434/v1 (Kimi, MiniMax, local coders)
    |
    +-- glm-5:cloud --> claude-code-ollama-adapter :4000 --> Ollama :11434/api/chat
                                                             (think: true injected)
```

- Managed by LiteLLM - auth, routing, logging, spend tracking
- Use case: team environments, production deployments
- Start:

```bash
litellm --config litellm_config.yaml --port 4001
```
- Anthropic `/v1/messages` endpoint — full Claude Code compatibility (streaming + tool use)
- OpenAI `/v1/chat/completions` endpoint — LiteLLM / OpenAI SDK compatible
- Tool / function calling — translates both directions (Anthropic `tool_use` ↔ Ollama `tool_calls`)
- Reasoning / thinking support — opt-in `think: true` injection for GLM-5:cloud and configurable models
- Full streaming (SSE) and non-streaming support
- `/v1/models` — proxies Ollama's model list in OpenAI format
- `/health` — health check endpoint
- Zero config needed — sensible defaults, everything overridable via env vars
- Single file — `proxy.py`, ~350 lines, easy to read and modify
- Docker support — includes `Dockerfile`
- CI/CD — GitHub Actions workflow for automated testing
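The tool-call translation can be sketched roughly as follows. This is a hypothetical helper, not code from `proxy.py`; the output field names follow Anthropic's published `tool_use` content-block schema:

```python
import json
import uuid

def ollama_tool_call_to_anthropic(tool_call: dict) -> dict:
    """Convert one Ollama tool_calls entry into an Anthropic tool_use block (sketch)."""
    fn = tool_call["function"]
    args = fn.get("arguments", {})
    # Some models emit arguments as a JSON-encoded string rather than an object.
    if isinstance(args, str):
        args = json.loads(args)
    return {
        "type": "tool_use",
        # Ollama tool calls may lack an id; Anthropic requires one, so mint it.
        "id": tool_call.get("id") or f"toolu_{uuid.uuid4().hex[:12]}",
        "name": fn["name"],
        "input": args,
    }
```

The reverse direction (Anthropic `tool_use` results back into Ollama `tool_calls` messages) follows the same shape in mirror image.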
```bash
git clone https://github.com/T72/claude-code-ollama-adapter.git
cd claude-code-ollama-adapter
docker build -t claude-code-ollama-adapter .
docker run -p 4000:4000 claude-code-ollama-adapter
```

The proxy will be available at http://localhost:4000.
```bash
pip install -r requirements.txt
uvicorn proxy:app --host 0.0.0.0 --port 4000
```

Point Claude Code at the adapter instead of Anthropic directly:

```bash
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=ollama
claude --model qwen3-coder:latest
```

In your `litellm_config.yaml`, route specific models through the adapter:
```yaml
model_list:
  - model_name: glm-5:cloud
    litellm_params:
      model: openai/glm-5:cloud
      api_base: http://localhost:4000
      api_key: ollama
```

See `litellm_config.yaml` for a full example.
For local development without LiteLLM overhead, you can bypass LiteLLM entirely and connect Claude Code directly to the adapter. This is recommended when:
- You want to avoid budget/spending limits imposed by LiteLLM
- You don't need authentication or spend tracking
- You want the simplest possible setup for local prototyping
```bash
# Terminal 1: Start the adapter
uvicorn proxy:app --host 0.0.0.0 --port 4000

# Terminal 2: Configure and run Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=ollama
claude --model qwen3-coder:latest
```

For convenience, use the provided template script:
```bash
# Copy the template
cp direct-mode.sh.example direct-mode.sh

# Customize if needed (optional)
# Edit direct-mode.sh to adjust THINK_MODELS or other settings

# Source the configuration
source direct-mode.sh

# Start the adapter (in another terminal)
uvicorn proxy:app --host 0.0.0.0 --port 4000

# Run Claude Code with any local model
claude --model qwen3-coder:latest
claude --model kimi-k2:latest
claude --model llama3:latest
```

For convenient switching between LiteLLM and Direct Mode, add these aliases to your `~/.bashrc`:
```bash
# For LiteLLM mode (port 4100) - for cloud models like glm-5:cloud
alias claude-local='ANTHROPIC_BASE_URL=http://localhost:4100 ANTHROPIC_API_KEY=sk-local-free claude'

# For Direct Mode (port 4000) - for local Ollama models
alias claude-direct='ANTHROPIC_BASE_URL=http://localhost:4000 ANTHROPIC_API_KEY=ollama claude'
```

Then you can easily switch between modes:

```bash
claude-local --model glm-5:cloud    # Uses LiteLLM on port 4100
claude-direct --model qwen3:14b     # Uses adapter directly on port 4000
```

| Feature | Direct Mode | LiteLLM Mode |
|---|---|---|
| Budget limits | None | Configurable |
| Authentication | None | Virtual keys |
| Spend tracking | No | Yes |
| Setup complexity | Minimal | Requires config |
| Best for | Local dev, prototyping | Production, teams |
| Environment variable | Default | Description |
|---|---|---|
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama server URL |
| `THINK_MODELS` | (empty) | Comma-separated model names that get `think: true` injected. Models not in this set never forward `think` to Ollama. |
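The `THINK_MODELS` gate might be implemented along these lines (an illustrative sketch; the real `proxy.py` may differ in details):

```python
import os

def should_inject_think(model: str) -> bool:
    """Return True only for models listed in the THINK_MODELS env var (sketch)."""
    # Parse the comma-separated allow-list, tolerating whitespace and empty entries.
    allowed = {
        name.strip()
        for name in os.environ.get("THINK_MODELS", "").split(",")
        if name.strip()
    }
    return model in allowed
```

With `THINK_MODELS` unset or empty, the set is empty and no model ever receives `think: true`, matching the default described in the table above.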
This repo supports two LiteLLM operating modes:
Use litellm_config.yaml for local development without a database.
- No authentication - requests pass through directly
- Works out of the box - just run `litellm --config litellm_config.yaml`
- Use case: local development, testing, prototyping

```bash
litellm --config litellm_config.yaml --port 4001
```

Use `litellm_config.secure.example.yaml` for production or shared environments requiring authentication.
- Virtual key authentication - requires `master_key` + `DATABASE_URL`
- PostgreSQL is required (secure mode does not run without a PostgreSQL database)
- Spend tracking & rate limiting - built-in LiteLLM features
- Use case: Production deployments, team environments
```bash
# Set required environment variables
export DATABASE_URL="postgresql://user:password@localhost:5432/litellm"
export LITELLM_MASTER_KEY="sk-your-secure-key"

# Start LiteLLM with secure config
litellm --config litellm_config.secure.example.yaml --port 4001
```
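For reference, the environment variables above would typically be wired into the config through a `general_settings` block like the following sketch (assuming LiteLLM's documented `os.environ/` indirection; see `litellm_config.secure.example.yaml` in the repo for the authoritative version):

```yaml
# Illustrative fragment, not the full secure config.
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY   # enables virtual-key auth
  database_url: os.environ/DATABASE_URL       # PostgreSQL, required in secure mode
```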
⚠️ Warning: Setting `master_key` without `DATABASE_URL` will cause all requests to fail with a `No connected db.` error. Use the default `litellm_config.yaml` for local development.
```bash
pip install pytest httpx "fastapi[testclient]"
pytest tests/
```

Contributions are welcome! Please read CONTRIBUTING.md before submitting a pull request.
To report a vulnerability, please see SECURITY.md.
MIT — see LICENSE.