Astrolabe is a policy-driven OpenAI-compatible routing proxy for OpenClaw.
Astrolabe sits between your agent and OpenRouter, evaluates each request, applies safety checks, picks the lowest-cost model likely to succeed, and optionally escalates once when confidence is low.
Astrolabe is built to solve a practical problem: model quality and model cost both matter, and the right model changes from request to request.
- For short, routine requests, Astrolabe keeps traffic on low-cost models.
- For harder requests, it moves to stronger tiers.
- For sensitive requests, it applies explicit safety logic and stricter routing.
- It preserves OpenAI-compatible request/response shape, so existing clients do not need protocol changes.
Astrolabe is intentionally small and stateless:
- Client layer: OpenClaw (or any OpenAI-compatible client) sends `POST /v1/chat/completions`.
- Policy layer: Astrolabe classifies request category/complexity and applies safety/cost guardrails.
- Execution layer: Astrolabe sends the upstream request to OpenRouter using the selected model and fallback chain.
- Verification layer: For non-stream responses, Astrolabe can self-check quality and escalate once if needed.
- Observability layer: Astrolabe returns routing metadata headers and emits structured logs.
Astrolabe is headless: no database, no session store, no UI required.
For each request:
- Parse request body and extract user/context features.
- Run high-stakes safety gate detection.
- Classify request into one of 12 policy categories with a complexity level.
- Apply routing profile + cost guardrails.
- Resolve initial model and candidate fallback list.
- Execute upstream request.
- If non-streaming and not forced-model mode, run self-check and optionally escalate once.
- Return upstream response plus `x-astrolabe-*` routing headers.
If ASTROLABE_FORCE_MODEL is set, classifier/self-check escalation is skipped and the forced model is used as both initial and final model.
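The force-model bypass can be sketched as a small resolver (an illustrative sketch; the function and field names are hypothetical, not Astrolabe's actual internals):

```javascript
// Sketch: resolve the effective route for a request.
// If ASTROLABE_FORCE_MODEL is set, it becomes both the initial and final
// model, and classifier/self-check escalation is disabled.
function resolveRoute(classifiedModel, forcedModel) {
  if (forcedModel) {
    return { initial: forcedModel, final: forcedModel, escalationAllowed: false };
  }
  return { initial: classifiedModel, final: classifiedModel, escalationAllowed: true };
}
```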
- 12-category request routing policy (`heartbeat`, `core_loop`, `retrieval`, `summarization`, `planning`, `orchestration`, `coding`, `research`, `creative`, `communication`, `high_stakes`, `reflection`)
- Pre-classification high-stakes safety gate
- Category + complexity classifier with heuristic fallback
- Model fallback chains when upstream model/provider is unavailable
- Confidence-scored self-check (1-5) with one-step escalation policy
- Routing metadata headers on responses
- `/health` endpoint for runtime mode visibility
```javascript
const MODELS = {
  opus: "anthropic/claude-opus-4.6",
  sonnet: "anthropic/claude-sonnet-4.6",
  m25: "minimax/minimax-m2.5",
  kimiK25: "moonshotai/kimi-k2.5",
  glm5: "z-ai/glm-5",
  grok: "x-ai/grok-4.1-fast",
  nano: "openai/gpt-5-nano",
  dsCoder: "deepseek/deepseek-v3.2",
  gemFlash: "google/gemini-3-flash-preview",
  gem31Pro: "google/gemini-3.1-pro-preview"
};
```

Tier intent:

- ULTRA-CHEAP: highest throughput and lowest unit cost
- BUDGET: `grok` for conversational and light tool-use paths
- VALUE: `m25` as the primary reasoning/workhorse route for most standard and complex text tasks
- VALUE specialists: `kimiK25` for multimodal-first routes and `glm5` for large-context engineering/text-heavy analysis
- MID-TIER: `gem31Pro` (and `gemFlash` as a conditional fallback/specialist path, not cheap-first routing)
- STANDARD: `sonnet` is escalation-focused (plus optional high-stakes budget-floor mode)
- PREMIUM: high-stakes/safety-critical floor or peak escalation
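A hypothetical sketch of how the tiers above could map onto the `MODELS` keys (assigning `nano` to ULTRA-CHEAP and `opus` to PREMIUM is an assumption based on the descriptions; the real routing table is internal to the policy layer):

```javascript
// Hypothetical tier -> MODELS-key layout; illustrative only.
const TIERS = {
  ultraCheap: ["nano"],                  // assumed: highest throughput, lowest unit cost
  budget: ["grok"],                      // conversational / light tool-use paths
  value: ["m25"],                        // primary reasoning workhorse
  valueSpecialists: ["kimiK25", "glm5"], // multimodal-first / large-context analysis
  midTier: ["gem31Pro", "gemFlash"],     // gemFlash only as conditional fallback/specialist
  standard: ["sonnet"],                  // escalation-focused
  premium: ["opus"]                      // assumed: high-stakes floor / peak escalation
};
```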
Multimodal caveat: `m25` is treated as text-first in current OpenRouter routing, so multimodal requests are routed to `kimiK25` or `gem31Pro` by policy.
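As a sketch of how that caveat might be applied, the check below looks for image content parts in OpenAI-format messages and reroutes the text-first route; the function names and the choice of `kimiK25` as the default reroute target are illustrative assumptions:

```javascript
// Hypothetical sketch: reroute multimodal requests away from text-first m25.
function hasImageContent(messages) {
  return messages.some(m =>
    Array.isArray(m.content) &&
    m.content.some(part => part.type === "image_url")
  );
}

function adjustForMultimodal(modelKey, messages) {
  // m25 is text-first in current OpenRouter routing, so multimodal
  // requests go to kimiK25 (or gem31Pro, by policy) instead.
  if (modelKey === "m25" && hasImageContent(messages)) return "kimiK25";
  return modelKey;
}
```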
- Score >= 4: keep current model response
- Score 2-3:
  - `strict` cost mode: escalate only for complex/critical/high-stakes routes
  - simple/standard routes return with low-confidence signal
- Score 1:
  - `strict` cost mode: non-critical routes escalate one tier up
  - critical/high-stakes routes escalate to Opus
- Maximum one escalation per request
- If final score remains low, response is returned with `x-astrolabe-low-confidence: true`
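The scoring rules above can be sketched for `strict` cost mode (illustrative only; the `route` shape and return fields are hypothetical stand-ins for Astrolabe's internals):

```javascript
// Sketch of the confidence-scored escalation decision in strict cost mode.
// score is the self-check result (1-5); route flags are hypothetical.
function escalationDecisionStrict(score, route) {
  if (score >= 4) return { escalate: false, lowConfidence: false };
  if (score >= 2) {
    // Only complex/critical/high-stakes routes escalate; simple/standard
    // routes return with the low-confidence signal instead.
    const escalate = Boolean(route.critical || route.complex);
    return { escalate, lowConfidence: !escalate };
  }
  // Score 1: non-critical routes go one tier up, critical routes go to Opus.
  return { escalate: true, target: route.critical ? "opus" : "one-tier-up", lowConfidence: false };
}
```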
```
cd Astrolabe
npm install
```

Copy `.env.example` to `.env` and set at least:

```
OPENROUTER_API_KEY=your_real_key_here
ASTROLABE_API_KEY=your_proxy_secret
PORT=3000
```

ASTROLABE_API_KEY should be a long random shared secret you control.
Use the same value in your client `Authorization: Bearer ...` header (or `x-api-key`).
Generate one:

```
# cross-platform (Node)
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"

# OpenSSL (macOS/Linux)
openssl rand -hex 32
```

Then start the server:

```
npm start
```

Server starts at http://localhost:3000.
```
curl -X POST http://localhost:3000/v1/chat/completions ^
  -H "Content-Type: application/json" ^
  -H "Authorization: Bearer your_proxy_secret" ^
  -d "{\"model\":\"ignored-by-astrolabe\",\"stream\":false,\"messages\":[{\"role\":\"user\",\"content\":\"Say hello in one line.\"}]}"
```

(The `^` line continuations are for Windows `cmd`; use `\` on macOS/Linux shells.)

`model` in the request is accepted for compatibility, but Astrolabe overrides it with routed policy selection unless ASTROLABE_FORCE_MODEL is set.
`strict` is used in two different settings with different behavior:

- `ASTROLABE_COST_EFFICIENCY_MODE=strict` controls budget aggressiveness for routing/escalation
- `ASTROLABE_HIGH_STAKES_CONFIRM_MODE=strict` controls high-stakes confirmation blocking
```
# balanced | budget | quality
ASTROLABE_ROUTING_PROFILE=budget

# strict | balanced | off
ASTROLABE_COST_EFFICIENCY_MODE=strict

# if false, non-high-stakes direct Sonnet/Opus routes are guarded down to cheaper models
ASTROLABE_ALLOW_DIRECT_PREMIUM_MODELS=false

# true | false
ASTROLABE_ENABLE_SAFETY_GATE=true

# prompt | strict | off
ASTROLABE_HIGH_STAKES_CONFIRM_MODE=prompt
ASTROLABE_HIGH_STAKES_CONFIRM_TOKEN=confirm

# allow Sonnet floor for high-stakes when routing profile is budget
ASTROLABE_ALLOW_HIGH_STAKES_BUDGET_FLOOR=false

# override classifier/self-check models
ASTROLABE_CLASSIFIER_MODEL_KEY=nano
ASTROLABE_SELF_CHECK_MODEL_KEY=nano

# classifier context window
ASTROLABE_CONTEXT_MESSAGES=8
ASTROLABE_CONTEXT_CHARS=2500

# optional in-memory request rate limiting
ASTROLABE_RATE_LIMIT_ENABLED=false
ASTROLABE_RATE_LIMIT_WINDOW_MS=60000
ASTROLABE_RATE_LIMIT_MAX_REQUESTS=120

# hard override all routing (full model id)
# bypasses classifier/self-check escalation and locks initial/final upstream model id
ASTROLABE_FORCE_MODEL=
```

| Setting | Values | Default | Controls |
|---|---|---|---|
| `ASTROLABE_ROUTING_PROFILE` | `budget`, `balanced`, `quality` | `budget` | Base policy aggressiveness |
| `ASTROLABE_COST_EFFICIENCY_MODE` | `strict`, `balanced`, `off` | `strict` | Cost guardrail strictness |
| `ASTROLABE_HIGH_STAKES_CONFIRM_MODE` | `prompt`, `strict`, `off` | `prompt` | High-stakes confirmation behavior |
Invalid mode values are normalized to safe defaults (budget, strict, prompt).
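That normalization can be sketched as a small helper (illustrative; the real implementation may differ):

```javascript
// Sketch: invalid mode values fall back to safe defaults.
function normalizeMode(value, allowed, fallback) {
  return allowed.includes(value) ? value : fallback;
}

// Example: profile resolves to "budget" when the env var is unset or invalid.
const profile = normalizeMode(
  process.env.ASTROLABE_ROUTING_PROFILE,
  ["budget", "balanced", "quality"],
  "budget"
);
```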
| Variable | Required | Default | Purpose |
|---|---|---|---|
| `OPENROUTER_API_KEY` | Yes | none | Required OpenRouter upstream key |
| `ASTROLABE_API_KEY` | Yes in production (recommended locally) | empty | Inbound API auth for Astrolabe (use a long random shared secret) |
| `PORT` | No | `3000` | HTTP listen port |
| `OPENROUTER_SITE_URL` | No | empty | Optional HTTP-Referer header for OpenRouter |
| `OPENROUTER_APP_NAME` | No | empty | Optional X-Title header for OpenRouter |
| `ASTROLABE_ROUTING_PROFILE` | No | `budget` | Policy profile selection |
| `ASTROLABE_COST_EFFICIENCY_MODE` | No | `strict` | Cost guardrail mode |
| `ASTROLABE_ALLOW_DIRECT_PREMIUM_MODELS` | No | `false` | Allow/block direct Sonnet/Opus on non-high-stakes routes |
| `ASTROLABE_ENABLE_SAFETY_GATE` | No | `true` | Enable high-stakes detection |
| `ASTROLABE_HIGH_STAKES_CONFIRM_MODE` | No | `prompt` | High-stakes confirmation policy |
| `ASTROLABE_HIGH_STAKES_CONFIRM_TOKEN` | No | `confirm` | Confirmation token used in strict high-stakes mode |
| `ASTROLABE_ALLOW_HIGH_STAKES_BUDGET_FLOOR` | No | `false` | Allow Sonnet floor for high-stakes in budget routing |
| `ASTROLABE_CLASSIFIER_MODEL_KEY` | No | `nano` | Primary classifier model key (`nano`, `grok`, `m25`, `sonnet`, `opus`, `dsCoder`, `gemFlash`, `gem31Pro`, `kimiK25`, `glm5`) |
| `ASTROLABE_SELF_CHECK_MODEL_KEY` | No | `nano` | Primary self-check model key (same keys as the classifier) |
| `ASTROLABE_CONTEXT_MESSAGES` | No | `8` | Classifier context message bound (3-20) |
| `ASTROLABE_CONTEXT_CHARS` | No | `2500` | Classifier context char bound (600-12000) |
| `ASTROLABE_RATE_LIMIT_ENABLED` | No | `false` | Enable in-memory request rate limiting on POST /v1/chat/completions |
| `ASTROLABE_RATE_LIMIT_WINDOW_MS` | No | `60000` | Rate limit window size in milliseconds (1000-3600000) |
| `ASTROLABE_RATE_LIMIT_MAX_REQUESTS` | No | `120` | Max requests allowed per key per window (1-100000) |
| `ASTROLABE_FORCE_MODEL` | No | empty | Hard override to one model id (no classifier/self-check escalation) |
See docs/configuration.mdx for full behavior details and preset profiles.
- `prompt` mode: high-stakes requests are force-routed and a safety system policy prompt is injected
- `strict` mode: high-stakes requests require exact confirmation token match (`x-astrolabe-confirmed: <token>` or `metadata.astrolabe_confirmed: "<token>"`)
- `off` mode: no special confirmation handling
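In `strict` mode the token match might look like this sketch (a hypothetical helper, assuming lower-cased header keys; not Astrolabe's actual code):

```javascript
// Sketch: strict-mode high-stakes confirmation check.
// Accepts the token via the x-astrolabe-confirmed header or the
// metadata.astrolabe_confirmed field in the request body.
function isConfirmed(headers, body, token) {
  const headerVal = headers["x-astrolabe-confirmed"];
  const metaVal = body && body.metadata && body.metadata.astrolabe_confirmed;
  return headerVal === token || metaVal === token;
}
```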
Astrolabe adds:
- `x-astrolabe-category`
- `x-astrolabe-complexity`
- `x-astrolabe-adjusted-complexity`
- `x-astrolabe-initial-model`
- `x-astrolabe-final-model`
- `x-astrolabe-route-label`
- `x-astrolabe-escalated`
- `x-astrolabe-confidence-score`
- `x-astrolabe-low-confidence`
- `x-astrolabe-safety-gate`
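A client can read this routing metadata from any response, for example (a sketch assuming a fetch-style `Headers` object, available globally in Node 18+):

```javascript
// Sketch: extract Astrolabe routing metadata from a fetch Response's headers.
function routingInfo(headers) {
  return {
    category: headers.get("x-astrolabe-category"),
    finalModel: headers.get("x-astrolabe-final-model"),
    escalated: headers.get("x-astrolabe-escalated") === "true",
    lowConfidence: headers.get("x-astrolabe-low-confidence") === "true"
  };
}
```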
Point OpenClaw's OpenAI-compatible base URL at Astrolabe:

- Before: `https://openrouter.ai/api/v1`
- After (local): `http://localhost:3000/v1`
- After (deploy): `https://your-host/v1`
Set OpenClaw API key to the same value as ASTROLABE_API_KEY.
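For example, with the official `openai` Node SDK (a sketch assuming the package is installed; any OpenAI-compatible client is configured the same way):

```javascript
import OpenAI from "openai";

// Point an OpenAI-compatible client at Astrolabe instead of OpenRouter.
// baseURL replaces https://openrouter.ai/api/v1; apiKey must equal the
// ASTROLABE_API_KEY configured on the proxy.
const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: process.env.ASTROLABE_API_KEY
});

// Requests then flow through Astrolabe's routing policy:
// await client.chat.completions.create({
//   model: "ignored-by-astrolabe",
//   messages: [{ role: "user", content: "Say hello in one line." }]
// });
```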
Run the test suite:

```
npm test
```

Common issues:

- Missing OPENROUTER_API_KEY: set the key in `.env` and restart.
- `high_stakes_confirmation_required`: if `ASTROLABE_HIGH_STAKES_CONFIRM_MODE=strict`, include the exact configured token in the header or body.
- Frequent escalations: increase routing profile quality or tighten category prompts.
- `rate_limit_exceeded`: increase `ASTROLABE_RATE_LIMIT_MAX_REQUESTS`, enlarge `ASTROLABE_RATE_LIMIT_WINDOW_MS`, or disable the limiter (`ASTROLABE_RATE_LIMIT_ENABLED=false`).
- `est_usd=n/a`: the upstream response omitted token usage.