Merged
243 changes: 18 additions & 225 deletions README.md

Large diffs are not rendered by default.

24 changes: 4 additions & 20 deletions docs/advanced/admin-endpoints.mdx
@@ -17,26 +17,10 @@ Both are on by default because observability shouldn't be opt-in. If you don't n

## Configuration

| Variable | Description | Default |
| ------------------------------------- | ------------------------------------------------ | ------- |
| `ADMIN_ENDPOINTS_ENABLED` | Enable the admin REST API | `true` |
| `ADMIN_UI_ENABLED` | Enable the admin dashboard UI | `true` |
| `DASHBOARD_LIVE_LOGS_ENABLED` | Stream realtime dashboard audit/usage previews | `true` |
| `DASHBOARD_LIVE_LOGS_BUFFER_SIZE` | In-memory replay window for live dashboard events | `10000` |
| `DASHBOARD_LIVE_LOGS_REPLAY_LIMIT` | Max events replayed to one reconnecting client | `1000` |
| `DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS` | Idle stream heartbeat interval in seconds | `15` |

Or in YAML:

```yaml
admin:
endpoints_enabled: true
ui_enabled: true
live_logs_enabled: true
live_logs_buffer_size: 10000
live_logs_replay_limit: 1000
live_logs_heartbeat_seconds: 15
```
Admin and dashboard behavior is controlled by environment variables or the
equivalent `admin:` YAML block. See
[Admin configuration](/advanced/configuration#admin) for the full table of
variables and defaults, along with the YAML form.

<Note>
The dashboard UI requires the REST API to be enabled. If you set
73 changes: 73 additions & 0 deletions docs/advanced/api-endpoints.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
---
title: "API Endpoints"
description: "Reference for GoModel's OpenAI-compatible and Anthropic-compatible endpoints, provider passthrough, and operations routes."
icon: "list-tree"
---

GoModel exposes OpenAI-compatible and Anthropic-compatible APIs, provider-native
passthrough, and a set of operations endpoints. Admin and dashboard routes are
documented separately in [Admin Endpoints](/advanced/admin-endpoints).

For request and response details, see the dedicated guides:
[Responses API](/advanced/responses-api), [Conversations API](/advanced/conversations-api),
[Anthropic Messages API](/advanced/anthropic-messages-api), and
[Audio API](/advanced/audio-api).

## OpenAI-Compatible API

| Endpoint | Method | Description |
| --------------------------------- | ------ | ------------------------------------------------------------------------------------------------------------ |
| `/v1/chat/completions` | POST | Chat completions (streaming supported) |
| `/v1/responses` | POST | Create an OpenAI Responses API response |
| `/v1/responses/{id}` | GET | Retrieve a stored response |
| `/v1/responses/{id}` | DELETE | Delete a stored response (forwards native deletion where supported) |
| `/v1/responses/{id}/cancel` | POST | Cancel an in-progress response (provider-native where supported) |
| `/v1/responses/{id}/input_items` | GET | List the input items of a stored response |
| `/v1/responses/input_tokens` | POST | Count input tokens for a Responses request |
| `/v1/responses/compact` | POST | Compact a Responses conversation (provider-native where supported) |
| `/v1/conversations` | POST | Create a conversation (gateway-managed) |
| `/v1/conversations/{id}` | GET | Retrieve a conversation |
| `/v1/conversations/{id}` | POST | Replace conversation metadata in full |
| `/v1/conversations/{id}` | DELETE | Delete a conversation |
| `/v1/embeddings` | POST | Text embeddings |
| `/v1/models` | GET | List available models |
| `/v1/audio/speech` | POST | Text-to-speech, returning binary audio |
| `/v1/audio/transcriptions` | POST | Speech-to-text from a multipart upload |
| `/v1/realtime` | GET | Realtime speech-to-speech websocket upgrade (when `REALTIME_ENABLED`) |
| `/v1/files` | POST | Upload a file (OpenAI-compatible multipart) |
| `/v1/files` | GET | List files |
| `/v1/files/{id}` | GET | Retrieve file metadata |
| `/v1/files/{id}` | DELETE | Delete a file |
| `/v1/files/{id}/content` | GET | Retrieve raw file content |
| `/v1/batches` | POST | Create a native provider batch (OpenAI-compatible schema; inline `requests` supported where provider-native) |
| `/v1/batches` | GET | List stored batches |
| `/v1/batches/{id}` | GET | Retrieve one stored batch |
| `/v1/batches/{id}/cancel` | POST | Cancel a pending batch |
| `/v1/batches/{id}/results` | GET | Retrieve native batch results when available |
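
As a rough illustration of the first row in the table, the sketch below builds a `POST /v1/chat/completions` request in Go. The base URL `http://localhost:8080`, the bearer-token header, and the helper name `newChatRequest` are assumptions made for this example and are not taken from the GoModel docs. The model ID `gpt-5.5` is the illustrative example from the provider table in this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest cover only the fields this sketch needs.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest is a hypothetical helper: it builds, but does not send,
// an OpenAI-style chat completion request aimed at the gateway.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		// Assumption: the gateway accepts a bearer token, as OpenAI clients send.
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Placeholder values, not documented defaults.
	req, err := newChatRequest("http://localhost:8080", "example-key", "gpt-5.5", "Hello")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to `http.DefaultClient.Do` would send it. The sketch stops short of that so it runs without a gateway listening.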

## Anthropic-Compatible API

| Endpoint | Method | Description |
| --------------------------- | ------ | ----------------------------------------------------------------------------- |
| `/v1/messages` | POST | Anthropic Messages API through translated model routing (streaming supported) |
| `/v1/messages/count_tokens` | POST | Heuristic Anthropic Messages input token estimate |

## Provider Passthrough

| Endpoint | Method | Description |
| ------------------- | -------------------------------------------- | ---------------------------------------------------------- |
| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |

## Admin Endpoints

Admin REST and dashboard routes (`/admin/*`) are covered in
[Admin Endpoints](/advanced/admin-endpoints).

## Operations Endpoints

| Endpoint | Method | Description |
| --------------------- | ------ | ---------------------------------------------------------------------------------- |
| `/health` | GET | Liveness check (always 200 while the process serves) |
| `/health/ready` | GET | Readiness check: pings storage (503 if down) and Redis cache (degraded, still 200) |
| `/metrics` | GET | Prometheus metrics (experimental, when enabled) |
| `/swagger/index.html` | GET | Swagger UI (when enabled) |
1 change: 1 addition & 0 deletions docs/docs.json
@@ -60,6 +60,7 @@
"advanced/configuration",
"advanced/config-yaml",
"advanced/cli",
"advanced/api-endpoints",
"advanced/resilience",
"advanced/responses-api",
"advanced/responses-compatibility",
85 changes: 62 additions & 23 deletions docs/providers/overview.mdx
@@ -13,30 +13,69 @@ quirks.

## Supported providers

| Provider | Credential | Guide |
| -------- | ---------- | ----- |
| OpenAI | `OPENAI_API_KEY` | — |
| Anthropic | `ANTHROPIC_API_KEY` | [Anthropic](/providers/anthropic) |
| Google Gemini | `GEMINI_API_KEY` | [Google Gemini](/providers/gemini) |
| Google Vertex AI | `VERTEX_PROJECT` + `VERTEX_LOCATION` + GCP credentials | [Google Vertex AI](/providers/vertex) |
| DeepSeek | `DEEPSEEK_API_KEY` | [DeepSeek](/providers/deepseek) |
| Groq | `GROQ_API_KEY` | — |
| OpenRouter | `OPENROUTER_API_KEY` | — |
| Z.ai | `ZAI_API_KEY` (`ZAI_BASE_URL` optional) | — |
| xAI (Grok) | `XAI_API_KEY` | — |
| MiniMax | `MINIMAX_API_KEY` (`MINIMAX_BASE_URL` optional) | — |
| Alibaba Cloud Model Studio (Bailian) | `BAILIAN_API_KEY` (`BAILIAN_BASE_URL` optional) | [Alibaba Cloud Model Studio](/providers/bailian) |
| Xiaomi MiMo | `XIAOMI_API_KEY` (`XIAOMI_BASE_URL` optional) | [Xiaomi MiMo](/providers/xiaomi) |
| OpenCode Go | `OPENCODE_GO_API_KEY` (`OPENCODE_GO_BASE_URL` optional) | [OpenCode Go](/providers/opencode-go) |
| Azure OpenAI | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | [Azure OpenAI](/providers/azure) |
| Amazon Bedrock | `BEDROCK_BASE_URL` (region or endpoint) + AWS credentials | [Amazon Bedrock](/providers/bedrock) |
| Oracle GenAI | `ORACLE_API_KEY` + `ORACLE_BASE_URL` | [Oracle GenAI](/providers/oracle) |
| Ollama | `OLLAMA_BASE_URL` | [Ollama](/providers/multiple-ollama) |
| vLLM | `VLLM_BASE_URL` (`VLLM_API_KEY` optional) | [vLLM](/providers/vllm) |
Example model identifiers are illustrative and subject to change; consult
provider catalogs for current models. Feature columns reflect gateway API
support, not every individual model capability exposed by an upstream provider.

See the [README provider table](https://github.com/ENTERPILOT/GoModel#supported-llm-providers)
for per-provider feature support (chat, Responses, embeddings, files, batches,
passthrough).
| Provider | Credential | Example Model | Chat | `/responses` | Embed | Files | Batches | Passthru | Guide |
| -------- | ---------- | ------------- | :--: | :----------: | :---: | :---: | :-----: | :------: | ----- |
| OpenAI | `OPENAI_API_KEY` | `gpt-5.5` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | [Anthropic](/providers/anthropic) |
| Google Gemini | `GEMINI_API_KEY` | `gemini-2.5-flash` | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | [Google Gemini](/providers/gemini) |
| Google Vertex AI | `VERTEX_PROJECT` + `VERTEX_LOCATION` + GCP credentials | `google/gemini-2.5-flash` | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | [Google Vertex AI](/providers/vertex) |
| DeepSeek | `DEEPSEEK_API_KEY` | `deepseek-v4-pro` | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | [DeepSeek](/providers/deepseek) |
| Groq | `GROQ_API_KEY` | `llama-3.3-70b-versatile` | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | — |
| OpenRouter | `OPENROUTER_API_KEY` | `google/gemini-2.5-flash` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |
| Z.ai | `ZAI_API_KEY` (`ZAI_BASE_URL` optional) | `glm-5.1` | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | — |
| xAI (Grok) | `XAI_API_KEY` | `grok-4` | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | — |
| Alibaba Cloud Model Studio (Bailian) | `BAILIAN_API_KEY` (`BAILIAN_BASE_URL` optional) | `qwen3-max` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | [Alibaba Cloud Model Studio](/providers/bailian) |
| MiniMax | `MINIMAX_API_KEY` (`MINIMAX_BASE_URL` optional) | `MiniMax-M3` | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | — |
| Xiaomi MiMo | `XIAOMI_API_KEY` (`XIAOMI_BASE_URL` optional) | `mimo-v2.5-pro` | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | [Xiaomi MiMo](/providers/xiaomi) |
| OpenCode Go | `OPENCODE_GO_API_KEY` (`OPENCODE_GO_BASE_URL` optional) | `glm-5.1` | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | [OpenCode Go](/providers/opencode-go) |
| Azure OpenAI | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | `gpt-5` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | [Azure OpenAI](/providers/azure) |
| Oracle GenAI | `ORACLE_API_KEY` + `ORACLE_BASE_URL` | `openai.gpt-oss-120b` | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | [Oracle GenAI](/providers/oracle) |
| Ollama | `OLLAMA_BASE_URL` | `llama3.2` | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | [Ollama](/providers/multiple-ollama) |
| vLLM | `VLLM_BASE_URL` (`VLLM_API_KEY` optional) | `meta-llama/Llama-3.1-8B-Instruct` | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | [vLLM](/providers/vllm) |
| Amazon Bedrock | `BEDROCK_BASE_URL` (region or endpoint) + AWS credentials | `anthropic.claude-3-5-haiku-20241022-v1:0` | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | [Amazon Bedrock](/providers/bedrock) |

Comment on lines +16 to +40

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Spot-check capability assignments for a few providers
echo "=== Checking provider capability implementations ==="
rg -n 'Anthropic.*embed|anthropic.*files' --type go | head -10
echo -e "\n"
rg -n 'vLLM.*batch|vllm.*batch' --type go | head -10
echo -e "\n"
rg -n 'OpenRouter.*passthrough|openrouter.*passthrough' --type go | head -10

Repository: ENTERPILOT/GoModel

Length of output: 495


🏁 Script executed:

#!/bin/bash
# Check OpenRouter passthrough support
echo "=== OpenRouter passthrough ==="
rg -n 'openrouter.*pass|passthrough.*openrouter' --type go -i | head -5

# Check Google Gemini Files/Batches support
echo -e "\n=== Google Gemini capabilities ==="
find . -name "*gemini*" -type f | grep -E '\.(go|ts)$' | head -5
rg -n 'Files|Batches' internal/providers/gemini --type go -B2 -A2 | head -20

# Check Z.ai implementation
echo -e "\n=== Z.ai provider structure ==="
find . -path "*zai*" -type f | grep -E '\.(go|ts)$'

# Check if Groq implements Files/Batches
echo -e "\n=== Groq Files/Batches support ==="
rg -n 'groq.*Files|groq.*Batches|groq.*Embed' --type go -i | head -10

Repository: ENTERPILOT/GoModel

Length of output: 3027


🏁 Script executed:

#!/bin/bash
# Check Groq Batches support
echo "=== Groq Batches support ==="
rg -n 'Batch' internal/providers/groq/groq.go | head -10

# Check Azure OpenAI capabilities
echo -e "\n=== Azure OpenAI provider methods ==="
rg -n 'func.*Provider.*\(Chat|Files|Embed|Batch|Passthru|Responses)' internal/providers/azure --type go | head -15

# Check Oracle GenAI capabilities
echo -e "\n=== Oracle GenAI provider methods ==="
find . -path "*oracle*" -type f -name "*.go" | head -5
rg -n 'func.*Provider.*\(Chat|Files|Embed|Batch|Passthru|Responses)' internal/providers/oracle --type go -i | head -15

# Check Amazon Bedrock capabilities
echo -e "\n=== Amazon Bedrock provider methods ==="
rg -n 'func.*Provider.*\(Chat|Files|Embed|Batch|Passthru|Responses)' internal/providers/bedrock --type go -i | head -15

# Check which providers support embeddings
echo -e "\n=== Embedding support across providers ==="
rg -n 'func.*Provider.*Embed' internal/providers --type go | wc -l
rg -n 'func.*Provider.*Embed' internal/providers --type go | head -20

Repository: ENTERPILOT/GoModel

Length of output: 4610


🏁 Script executed:

#!/bin/bash
# Check Azure OpenAI - simpler query
echo "=== Azure OpenAI capability methods ==="
ls -la internal/providers/azure/ 2>/dev/null || echo "No azure provider found"
cat internal/providers/azure/azure.go 2>/dev/null | grep -E "func.*Provider.*\(" | head -20

# Check Bedrock capabilities
echo -e "\n=== Bedrock capability methods ==="
cat internal/providers/bedrock/bedrock.go 2>/dev/null | grep -E "func.*Provider.*\(" | head -20

# Check Oracle capabilities  
echo -e "\n=== Oracle capability methods ==="
cat internal/providers/oracle/oracle.go 2>/dev/null | grep -E "func.*Provider.*\(" | head -20

# Check DeepSeek embeddings claim
echo -e "\n=== DeepSeek Embeddings implementation ==="
rg -n 'deepseek' internal/providers/deepseek/deepseek.go -A5 | grep -i embed

# Check Z.ai capabilities
echo -e "\n=== Z.ai capability methods ==="
cat internal/providers/zai/zai.go 2>/dev/null | grep -E "func.*Provider.*\(" | head -20

Repository: ENTERPILOT/GoModel

Length of output: 2864


🏁 Script executed:

#!/bin/bash
# Check Z.ai directly
echo "=== Z.ai provider full content (first 100 lines) ==="
head -100 internal/providers/zai/zai.go

# Check if Bedrock Embeddings actually works or returns error
echo -e "\n=== Bedrock Embeddings implementation detail ==="
sed -n '226,235p' internal/providers/bedrock/bedrock.go

# Verify what each capability check does
echo -e "\n=== Table capability columns mapping ==="
echo "Checking if /responses column includes StreamResponses..."
rg -n 'StreamResponses|StreamChat' internal/providers/bedrock --type go | head -5

# Comprehensive check on Z.ai methods
echo -e "\n=== Z.ai full methods list ==="
rg -n 'func \(p \*Provider\)' internal/providers/zai/zai.go

Repository: ENTERPILOT/GoModel

Length of output: 3021


🏁 Script executed:

#!/bin/bash
# Final verification: Check Z.ai inheritance chain for embeddings support
echo "=== Z.ai embedding support via OpenAI ChatCompatible ==="
rg -n 'Embeddings' internal/providers/openai/compatible_provider.go -B2 -A8

# Double-check: Does OpenAI compatible provider support embeddings?
echo -e "\n=== OpenAI compatible embeddings ==="
sed -n '337,355p' internal/providers/openai/compatible_provider.go

# Verify Bedrock is truly unsupported (not partial)
echo -e "\n=== All Bedrock method implementations ==="
grep -E "^func \(p \*Provider\)" internal/providers/bedrock/bedrock.go

Repository: ENTERPILOT/GoModel

Length of output: 1757


Bedrock Embeddings capability is incorrectly marked as supported (✅) in the matrix.

Code inspection reveals a critical discrepancy: internal/providers/bedrock/bedrock.go implements Embeddings with an explicit error return:

func (p *Provider) Embeddings(_ context.Context, _ *core.EmbeddingRequest) (*core.EmbeddingResponse, error) {
	return nil, core.NewInvalidRequestError("bedrock embeddings are not yet supported by gomodel", nil)
}

The table marks Bedrock Embeddings as ✅, but the implementation confirms it is unsupported. Change this to ❌.

Other spot-checks verified:

  • Anthropic: ❌ Embed/Files confirmed (code explicitly rejects both)
  • Gemini: ✅ Files/Batches confirmed (ListFiles, ListBatches methods present)
  • Groq: ✅ Batches/Embed/Files confirmed; ❌ Passthru confirmed (not in defaultEnabledPassthroughProviders list)
  • Z.ai: ✅ Embed confirmed (inherits working implementation from openai.ChatCompatible)
  • DeepSeek: ❌ Embed confirmed (explicit error: "deepseek does not support embeddings")
  • vLLM: ❌ Batches confirmed (test assertion: "vllm provider should not implement native batch provider")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/providers/overview.mdx` around lines 16 - 40, In the providers table in
the overview.mdx file, locate the Bedrock provider row and find the Embed
column. Change the checkmark from ✅ to ❌ because the actual implementation in
bedrock.go explicitly returns an error with the message "bedrock embeddings are
not yet supported by gomodel" when the Embeddings method is called, confirming
that embeddings functionality is not supported despite what the table currently
indicates.

✅ Supported ❌ Unsupported

## Provider notes

- **Z.ai GLM Coding Plan** — set
`ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4`.
- **Xiaomi MiMo** — TTS (`mimo-v2.5-tts*`) and ASR (`mimo-v2.5-asr`) are served
through `/v1/audio/speech` and `/v1/audio/transcriptions` (translated to
MiMo's chat-completions audio dialect) as well as directly via chat
completions; for 1M context append `[1m]` to the model ID and list it in
`XIAOMI_MODELS`.
- **OpenCode Go (OpenCode Zen)** — routes per model: most models use
OpenAI-style `/chat/completions`, while `/messages`-only models (default
`qwen3.7-max`, override with `OPENCODE_GO_MESSAGES_MODELS`) are sent to the
Anthropic-native endpoint. Set `OPENCODE_GO_API_KEY`; the base URL defaults to
`https://opencode.ai/zen/go/v1`.
- **Configured model lists** — available for every provider with
`<PROVIDER>_MODELS`, for example
`OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4` or
`ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3`. DeepSeek defaults to
`https://api.deepseek.com`; set `DEEPSEEK_BASE_URL` only when using a
compatible proxy or alternate DeepSeek endpoint. By default,
`CONFIGURED_PROVIDER_MODELS_MODE=fallback` uses those lists only when upstream
`/models` is unavailable or empty. Set
`CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to expose only configured models
for providers that define a list, skipping their upstream `/models` calls.
- **vLLM** — set `VLLM_API_KEY` only if the upstream server was started with
`--api-key`.
- **Multiple instances of one provider type** — without `config.yaml`, use
suffixed env vars such as `OPENAI_EAST_API_KEY` and `OPENAI_EAST_BASE_URL`;
add `OPENAI_EAST_MODELS` to configure that instance's model list. This
registers provider `openai-east` with type `openai`. Vertex AI follows the
same suffix pattern — `VERTEX_US_PROJECT` registers provider `vertex-us`.
Vertex project and location env vars must match the instance prefix: for a
suffixed instance such as `VERTEX_US_PROJECT`, also set `VERTEX_US_LOCATION`
and any other suffixed settings for that instance, rather than the generic
`VERTEX_PROJECT` / `VERTEX_LOCATION`. `VERTEX_AUTH_TYPE` defaults to
Application Default Credentials (`gcp_adc`).
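
Two of the notes above describe rules that are easier to follow as code: how `CONFIGURED_PROVIDER_MODELS_MODE` chooses between configured and upstream model lists, and how a suffixed env var maps to a provider name. The sketch below is one possible reading of those rules, not GoModel's implementation. The function names, the `fieldSuffixes` list, and the parsing approach are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveModels sketches the two modes. fetchUpstream stands in for the
// provider's upstream /models call.
func resolveModels(mode string, configured []string, fetchUpstream func() ([]string, error)) []string {
	if mode == "allowlist" && len(configured) > 0 {
		// allowlist: expose only configured models and skip the upstream call.
		return configured
	}
	upstream, err := fetchUpstream()
	if err != nil || len(upstream) == 0 {
		// fallback: use the configured list when upstream is unavailable or empty.
		return configured
	}
	return upstream
}

// fieldSuffixes lists setting names that appear in the notes; the real set
// is likely larger.
var fieldSuffixes = []string{"_API_KEY", "_BASE_URL", "_MODELS", "_PROJECT", "_LOCATION"}

// instanceName derives a provider name from an env var, for example
// OPENAI_EAST_API_KEY with type "openai" gives "openai-east".
func instanceName(envVar, providerType string) (string, bool) {
	prefix := strings.ToUpper(providerType)
	if !strings.HasPrefix(envVar, prefix+"_") {
		return "", false
	}
	rest := strings.TrimPrefix(envVar, prefix) // "_EAST_API_KEY" or "_API_KEY"
	for _, field := range fieldSuffixes {
		if strings.HasSuffix(rest, field) {
			middle := strings.TrimSuffix(rest, field) // "_EAST" or ""
			return providerType + strings.ToLower(strings.ReplaceAll(middle, "_", "-")), true
		}
	}
	return "", false
}

func main() {
	upstreamDown := func() ([]string, error) { return nil, fmt.Errorf("upstream unavailable") }
	fmt.Println(resolveModels("fallback", []string{"openai.gpt-oss-120b"}, upstreamDown))

	name, ok := instanceName("VERTEX_US_PROJECT", "vertex")
	fmt.Println(name, ok)
}
```

In this reading, a provider with no configured list always goes to upstream, even in `allowlist` mode, which matches the note's "for providers that define a list" wording.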

## Why some providers have dedicated pages

2 changes: 1 addition & 1 deletion docs/providers/xiaomi.mdx
@@ -1,7 +1,7 @@
---
title: "Xiaomi MiMo"
description: "Configure Xiaomi MiMo in GoModel: thinking mode, the [1m] context suffix, and how TTS/ASR map onto the standard audio endpoints."
icon: "microphone"
icon: "mic"
---

Xiaomi MiMo speaks an OpenAI-compatible chat API with a few dialect quirks: