claude-code-ollama-adapter

A FastAPI proxy that lets Claude Code (and any Anthropic or OpenAI-compatible client) talk directly to Ollama's native /api/chat endpoint.

Exposes two fully-compatible endpoints:

  • /v1/messages — Anthropic Messages API (used natively by Claude Code)
  • /v1/chat/completions — OpenAI Chat Completions API (used by LiteLLM, OpenAI SDKs, etc.)

Both translate transparently to Ollama's /api/chat format, handling streaming, tool calls, and optional think reasoning injection.
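
For example, a minimal non-streaming request to the Anthropic-style endpoint looks like the sketch below (Python with httpx; the model name and the anthropic-version header are illustrative assumptions, not requirements of the adapter):

import httpx  # pip install httpx

# Minimal sketch: a non-streaming Anthropic Messages request against the adapter.
# Use any model that your local Ollama serves.
resp = httpx.post(
    "http://localhost:4000/v1/messages",
    headers={"x-api-key": "ollama", "anthropic-version": "2023-06-01"},
    json={
        "model": "qwen3-coder:latest",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=120,
)
print(resp.json())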


Architecture

This adapter supports two operating modes:

Mode 1: Direct Mode (Recommended for Local Development)

Claude Code
    |
    v
claude-code-ollama-adapter :4000 --> Ollama :11434/api/chat
                            (think: true injected)
  • Bypasses LiteLLM - No budget limits, no authentication required
  • Use case: Local development, prototyping, testing
  • Start: uvicorn proxy:app --host 0.0.0.0 --port 4000

Mode 2: LiteLLM Mode (For Production/Team Use)

Claude Code / other clients
    |
    v
LiteLLM proxy :4001 (auth, routing, logging)
    |
    +-- openai/* --> Ollama :11434/v1 (Kimi, MiniMax, local coders)
    |
    +-- glm-5:cloud --> claude-code-ollama-adapter :4000 --> Ollama :11434/api/chat
                                                         (think: true injected)
  • Managed by LiteLLM - Auth, routing, logging, spend tracking
  • Use case: Team environments, production deployments
  • Start: litellm --config litellm_config.yaml --port 4001

Features

  • Anthropic /v1/messages endpoint — full Claude Code compatibility (streaming + tool use)
  • OpenAI /v1/chat/completions endpoint — LiteLLM / OpenAI SDK compatible
  • Tool / function calling — translates in both directions (Anthropic tool_use ↔ Ollama tool_calls); see the sketch after this list
  • Reasoning / thinking support — opt-in think: true injection for GLM-5:cloud and configurable models
  • Full streaming (SSE) and non-streaming support
  • /v1/models — proxies Ollama's model list in OpenAI format
  • /health — health check endpoint
  • Zero config needed — sensible defaults, everything overridable via env vars
  • Single file proxy.py, ~350 lines, easy to read and modify
  • Docker support — includes Dockerfile
  • CI/CD — GitHub Actions workflow for automated testing
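
As a rough illustration of the tool-calling translation mentioned above, the sketch below sends a standard OpenAI-format tool definition to the adapter. The model name and the weather tool are made-up examples; whether the model actually emits a tool call depends on the model you run.

import httpx

# Sketch: OpenAI-format tools sent to the adapter, which translates them to
# Ollama tool_calls and back. The tool itself is a hypothetical example.
resp = httpx.post(
    "http://localhost:4000/v1/chat/completions",
    json={
        "model": "qwen3-coder:latest",
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"])  # may contain tool_calls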

Quick Start

1. Clone the repo

git clone https://github.com/T72/claude-code-ollama-adapter.git
cd claude-code-ollama-adapter

2. Run with Docker (Recommended)

docker build -t claude-code-ollama-adapter .
docker run -p 4000:4000 claude-code-ollama-adapter

The proxy will be available at http://localhost:4000.
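
To confirm the proxy is up, hit the health and model-list endpoints (a quick Python sketch; curl works just as well):

import httpx

# Quick smoke test against the running adapter.
for path in ("/health", "/v1/models"):
    resp = httpx.get(f"http://localhost:4000{path}")
    print(path, resp.status_code, resp.text[:200])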

3. Manual Installation

pip install -r requirements.txt
uvicorn proxy:app --host 0.0.0.0 --port 4000

Usage with Claude Code

Point Claude Code at the adapter instead of Anthropic directly:

export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=ollama
claude --model qwen3-coder:latest

Usage with LiteLLM

In your litellm_config.yaml, route specific models through the adapter:

model_list:
  - model_name: glm-5:cloud
    litellm_params:
      model: openai/glm-5:cloud
      api_base: http://localhost:4000
      api_key: ollama

See litellm_config.yaml for a full example.
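
Once LiteLLM is running (port 4001 in this README's examples), clients send standard OpenAI-format requests to it and LiteLLM routes glm-5:cloud through the adapter. A minimal sketch (the API key value depends on whether you run the DB-free or the secure config):

import httpx

# Sketch: an OpenAI-style request to the LiteLLM proxy, which routes this model
# through the adapter per litellm_config.yaml. Adjust the key for your setup.
resp = httpx.post(
    "http://localhost:4001/v1/chat/completions",
    headers={"Authorization": "Bearer sk-your-key"},
    json={
        "model": "glm-5:cloud",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])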


Direct Mode (Skip LiteLLM)

For local development without LiteLLM overhead, you can bypass LiteLLM entirely and connect Claude Code directly to the adapter. This is recommended when:

  • You want to avoid budget/spending limits imposed by LiteLLM
  • You don't need authentication or spend tracking
  • You want the simplest possible setup for local prototyping

Setup

# Terminal 1: Start the adapter
uvicorn proxy:app --host 0.0.0.0 --port 4000

# Terminal 2: Configure and run Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=ollama
claude --model qwen3-coder:latest

Using the Helper Script

For convenience, use the provided template script:

# Copy the template
cp direct-mode.sh.example direct-mode.sh

# Customize if needed (optional)
# Edit direct-mode.sh to adjust THINK_MODELS or other settings

# Source the configuration
source direct-mode.sh

# Start the adapter (in another terminal)
uvicorn proxy:app --host 0.0.0.0 --port 4000

# Run Claude Code with any local model
claude --model qwen3-coder:latest
claude --model kimi-k2:latest
claude --model llama3:latest

Using Aliases (Recommended for Switching Modes)

For convenient switching between LiteLLM and Direct Mode, add these aliases to your ~/.bashrc:

# For LiteLLM mode (port 4001) - for cloud models like glm-5:cloud
alias claude-local='ANTHROPIC_BASE_URL=http://localhost:4001 ANTHROPIC_API_KEY=sk-local-free claude'

# For Direct Mode (port 4000) - for local Ollama models
alias claude-direct='ANTHROPIC_BASE_URL=http://localhost:4000 ANTHROPIC_API_KEY=ollama claude'

Then you can easily switch between modes:

claude-local --model glm-5:cloud      # Uses LiteLLM on port 4001
claude-direct --model qwen3:14b       # Uses adapter directly on port 4000

When to Use Direct Mode vs LiteLLM Mode

Feature             Direct Mode              LiteLLM Mode
Budget limits       None                     Configurable
Authentication      None                     Virtual keys
Spend tracking      No                       Yes
Setup complexity    Minimal                  Requires config
Best for            Local dev, prototyping   Production, teams

Configuration

Environment variable   Default                   Description
OLLAMA_BASE_URL        http://localhost:11434    Ollama server URL
THINK_MODELS           (empty)                   Comma-separated model names that get think: true injected; think is never forwarded to Ollama for models not in this list.
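
For illustration only (this is not the adapter's actual source), the documented THINK_MODELS behavior amounts to something like the following sketch:

import os

# Illustrative sketch of the documented behavior, assuming a simple
# comma-separated parse; see proxy.py for the real implementation.
THINK_MODELS = {m.strip() for m in os.getenv("THINK_MODELS", "").split(",") if m.strip()}

def build_ollama_payload(model: str, messages: list) -> dict:
    payload = {"model": model, "messages": messages}
    if model in THINK_MODELS:     # only listed models get think: true
        payload["think"] = True   # forwarded to Ollama's /api/chat
    return payload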

LiteLLM Modes

This repo supports two LiteLLM operating modes:

Mode 1: DB-Free Local Mode (Default)

Use litellm_config.yaml for local development without a database.

  • No authentication - requests pass through directly
  • Works out of the box - just run litellm --config litellm_config.yaml
  • Use case: Local development, testing, prototyping
litellm --config litellm_config.yaml --port 4001

Mode 2: DB-Backed Secure Mode (Optional)

Use litellm_config.secure.example.yaml for production or shared environments requiring authentication.

  • Virtual key authentication - requires master_key + DATABASE_URL
  • PostgreSQL is required (secure mode does not run without a PostgreSQL database)
  • Spend tracking & rate limiting - built-in LiteLLM features
  • Use case: Production deployments, team environments
# Set required environment variables
export DATABASE_URL="postgresql://user:password@localhost:5432/litellm"
export LITELLM_MASTER_KEY="sk-your-secure-key"

# Start LiteLLM with secure config
litellm --config litellm_config.secure.example.yaml --port 4001

⚠️ Warning: Setting master_key without DATABASE_URL will cause all requests to fail with a "No connected db." error. Use the default litellm_config.yaml for local development.


Testing

pip install pytest httpx fastapi[testclient]
pytest tests/
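
A minimal test might look like the sketch below, assuming proxy.py is importable from the repo root and that /health returns HTTP 200:

# tests/test_health.py - minimal sketch
from fastapi.testclient import TestClient

from proxy import app

client = TestClient(app)

def test_health_returns_ok():
    resp = client.get("/health")
    assert resp.status_code == 200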

Contributing

Contributions are welcome! Please read CONTRIBUTING.md before submitting a pull request.


Security

To report a vulnerability, please see SECURITY.md.


License

MIT — see LICENSE.
