Astrolabe is a policy-driven OpenAI-compatible routing proxy for OpenClaw.
Astrolabe sits between your agent and OpenRouter, evaluates each request, applies safety checks, picks the lowest-cost model likely to succeed, and optionally escalates once when confidence is low.
Astrolabe is built to solve a practical problem: model quality and model cost both matter, and the right model changes from request to request.
- For short, routine requests, Astrolabe keeps traffic on low-cost models.
- For harder requests, it moves to stronger tiers.
- For sensitive requests, it applies explicit safety logic and stricter routing.
- It preserves OpenAI-compatible request/response shape, so existing clients do not need protocol changes.
Astrolabe is intentionally small and stateless:
- Client layer: OpenClaw (or any OpenAI-compatible client) sends `POST /v1/chat/completions`.
- Policy layer: Astrolabe classifies request category/complexity and applies safety/cost guardrails.
- Execution layer: Astrolabe sends the upstream request to OpenRouter using the selected model and fallback chain.
- Verification layer: For non-stream responses, Astrolabe can self-check quality and escalate once if needed.
- Observability layer: Astrolabe returns routing metadata headers and emits structured logs.
Astrolabe is headless: no database, no session store, no UI required.
For each request:
- Parse request body and extract user/context features.
- Run high-stakes safety gate detection.
- Classify request into one of 12 policy categories with a complexity level.
- Apply routing profile + cost guardrails.
- Resolve initial model and candidate fallback list.
- Execute upstream request.
- If non-streaming and not forced-model mode, run self-check and optionally escalate once.
- Return upstream response plus `x-astrolabe-*` routing headers.
If ASTROLABE_FORCE_MODEL is set, classifier/self-check escalation is skipped and the forced model is used as both initial and final model.
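The force-model bypass can be sketched as a small resolver (an illustrative sketch; the function and field names are hypothetical, not Astrolabe's actual internals):

```javascript
// Sketch: resolve the effective route for a request.
// If ASTROLABE_FORCE_MODEL is set, it becomes both the initial and final
// model, and classifier/self-check escalation is disabled.
function resolveRoute(classifiedModel, forcedModel) {
  if (forcedModel) {
    return { initial: forcedModel, final: forcedModel, escalationAllowed: false };
  }
  return { initial: classifiedModel, final: classifiedModel, escalationAllowed: true };
}
```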
- 12-category request routing policy (`heartbeat`, `core_loop`, `retrieval`, `summarization`, `planning`, `orchestration`, `coding`, `research`, `creative`, `communication`, `high_stakes`, `reflection`)
- Pre-classification high-stakes safety gate
- Category + complexity classifier with heuristic fallback
- Model fallback chains when upstream model/provider is unavailable
- Confidence-scored self-check (1-5) with one-step escalation policy
- Routing metadata headers on responses
- `/health` endpoint for runtime mode visibility
```javascript
const MODELS = {
  opus: "anthropic/claude-opus-4.6",
  sonnet: "anthropic/claude-sonnet-4.6",
  m25: "minimax/minimax-m2.5",
  kimiK25: "moonshotai/kimi-k2.5",
  glm5: "z-ai/glm-5",
  grok: "x-ai/grok-4.1-fast",
  nano: "openai/gpt-5-nano",
  dsCoder: "deepseek/deepseek-v3.2",
  gemFlash: "google/gemini-3-flash-preview",
  gem31Pro: "google/gemini-3.1-pro-preview"
};
```

Tier intent:

- ULTRA-CHEAP: highest throughput and lowest unit cost
- BUDGET: `grok` for conversational and light tool-use paths
- VALUE: `m25` as the primary reasoning/workhorse route for most standard and complex text tasks
- VALUE specialists: `kimiK25` for multimodal-first routes and `glm5` for large-context engineering/text-heavy analysis
- MID-TIER: `gem31Pro` (and `gemFlash` as a conditional fallback/specialist path, not cheap-first routing)
- STANDARD: `sonnet` is escalation-focused (plus optional high-stakes budget-floor mode)
- PREMIUM: high-stakes/safety-critical floor or peak escalation
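A hypothetical sketch of how the tiers above could map onto the `MODELS` keys (assigning `nano` to ULTRA-CHEAP and `opus` to PREMIUM is an assumption based on the descriptions; the real routing table is internal to the policy layer):

```javascript
// Hypothetical tier -> MODELS-key layout; illustrative only.
const TIERS = {
  ultraCheap: ["nano"],                  // assumed: highest throughput, lowest unit cost
  budget: ["grok"],                      // conversational / light tool-use paths
  value: ["m25"],                        // primary reasoning workhorse
  valueSpecialists: ["kimiK25", "glm5"], // multimodal-first / large-context analysis
  midTier: ["gem31Pro", "gemFlash"],     // gemFlash only as conditional fallback/specialist
  standard: ["sonnet"],                  // escalation-focused
  premium: ["opus"]                      // assumed: high-stakes floor / peak escalation
};
```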
Multimodal caveat: `m25` is treated as text-first in current OpenRouter routing, so multimodal requests are routed to `kimiK25` or `gem31Pro` by policy.
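As a sketch of how that caveat might be applied, the check below looks for image content parts in OpenAI-format messages and reroutes the text-first route; the function names and the choice of `kimiK25` as the default reroute target are illustrative assumptions:

```javascript
// Hypothetical sketch: reroute multimodal requests away from text-first m25.
function hasImageContent(messages) {
  return messages.some(m =>
    Array.isArray(m.content) &&
    m.content.some(part => part.type === "image_url")
  );
}

function adjustForMultimodal(modelKey, messages) {
  // m25 is text-first in current OpenRouter routing, so multimodal
  // requests go to kimiK25 (or gem31Pro, by policy) instead.
  if (modelKey === "m25" && hasImageContent(messages)) return "kimiK25";
  return modelKey;
}
```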
- Score >= 4: keep current model response
- Score 2-3:
  - `strict` cost mode: escalate only for complex/critical/high-stakes routes
  - simple/standard routes return with low-confidence signal
- Score 1:
  - `strict` cost mode: non-critical routes escalate one tier up
  - critical/high-stakes routes escalate to Opus
- Maximum one escalation per request
- If final score remains low, response is returned with `x-astrolabe-low-confidence: true`
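The scoring rules above can be sketched for `strict` cost mode (illustrative only; the `route` shape and return fields are hypothetical stand-ins for Astrolabe's internals):

```javascript
// Sketch of the confidence-scored escalation decision in strict cost mode.
// score is the self-check result (1-5); route flags are hypothetical.
function escalationDecisionStrict(score, route) {
  if (score >= 4) return { escalate: false, lowConfidence: false };
  if (score >= 2) {
    // Only complex/critical/high-stakes routes escalate; simple/standard
    // routes return with the low-confidence signal instead.
    const escalate = Boolean(route.critical || route.complex);
    return { escalate, lowConfidence: !escalate };
  }
  // Score 1: non-critical routes go one tier up, critical routes go to Opus.
  return { escalate: true, target: route.critical ? "opus" : "one-tier-up", lowConfidence: false };
}
```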
```
cd Astrolabe
npm install
```

Copy `.env.example` to `.env` and set at least:

```
OPENROUTER_API_KEY=your_real_key_here
ASTROLABE_API_KEY=your_proxy_secret
PORT=3000
```

ASTROLABE_API_KEY should be a long random shared secret you control.
Use the same value in your client `Authorization: Bearer ...` header (or `x-api-key`).
Generate one:

```
# cross-platform (Node)
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"

# OpenSSL (macOS/Linux)
openssl rand -hex 32
```

Then start the server:

```
npm start
```

Server starts at http://localhost:3000.
```
curl -X POST http://localhost:3000/v1/chat/completions ^
  -H "Content-Type: application/json" ^
  -H "Authorization: Bearer your_proxy_secret" ^
  -d "{\"model\":\"ignored-by-astrolabe\",\"stream\":false,\"messages\":[{\"role\":\"user\",\"content\":\"Say hello in one line.\"}]}"
```

(The `^` line continuations are for Windows `cmd`; use `\` on macOS/Linux shells.)

`model` in the request is accepted for compatibility, but Astrolabe overrides it with routed policy selection unless ASTROLABE_FORCE_MODEL is set.
`strict` is used in two different settings with different behavior:

- `ASTROLABE_COST_EFFICIENCY_MODE=strict` controls budget aggressiveness for routing/escalation
- `ASTROLABE_HIGH_STAKES_CONFIRM_MODE=strict` controls high-stakes confirmation blocking
```
# balanced | budget | quality
ASTROLABE_ROUTING_PROFILE=budget

# strict | balanced | off
ASTROLABE_COST_EFFICIENCY_MODE=strict

# if false, non-high-stakes direct Sonnet/Opus routes are guarded down to cheaper models
ASTROLABE_ALLOW_DIRECT_PREMIUM_MODELS=false

# true | false
ASTROLABE_ENABLE_SAFETY_GATE=true

# prompt | strict | off
ASTROLABE_HIGH_STAKES_CONFIRM_MODE=prompt
ASTROLABE_HIGH_STAKES_CONFIRM_TOKEN=confirm

# allow Sonnet floor for high-stakes when routing profile is budget
ASTROLABE_ALLOW_HIGH_STAKES_BUDGET_FLOOR=false

# override classifier/self-check models
ASTROLABE_CLASSIFIER_MODEL_KEY=nano
ASTROLABE_SELF_CHECK_MODEL_KEY=nano

# classifier context window
ASTROLABE_CONTEXT_MESSAGES=8
ASTROLABE_CONTEXT_CHARS=2500

# optional in-memory request rate limiting
ASTROLABE_RATE_LIMIT_ENABLED=false
ASTROLABE_RATE_LIMIT_WINDOW_MS=60000
ASTROLABE_RATE_LIMIT_MAX_REQUESTS=120

# hard override all routing (full model id)
# bypasses classifier/self-check escalation and locks initial/final upstream model id
ASTROLABE_FORCE_MODEL=
```

| Setting | Values | Default | Controls |
|---|---|---|---|
| `ASTROLABE_ROUTING_PROFILE` | `budget`, `balanced`, `quality` | `budget` | Base policy aggressiveness |
| `ASTROLABE_COST_EFFICIENCY_MODE` | `strict`, `balanced`, `off` | `strict` | Cost guardrail strictness |
| `ASTROLABE_HIGH_STAKES_CONFIRM_MODE` | `prompt`, `strict`, `off` | `prompt` | High-stakes confirmation behavior |
Invalid mode values are normalized to safe defaults (budget, strict, prompt).
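That normalization can be sketched as a small helper (illustrative; the real implementation may differ):

```javascript
// Sketch: invalid mode values fall back to safe defaults.
function normalizeMode(value, allowed, fallback) {
  return allowed.includes(value) ? value : fallback;
}

// Example: profile resolves to "budget" when the env var is unset or invalid.
const profile = normalizeMode(
  process.env.ASTROLABE_ROUTING_PROFILE,
  ["budget", "balanced", "quality"],
  "budget"
);
```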
| Variable | Required | Default | Purpose |
|---|---|---|---|
| `OPENROUTER_API_KEY` | Yes | none | Required OpenRouter upstream key |
| `ASTROLABE_API_KEY` | Yes in production (recommended locally) | empty | Inbound API auth for Astrolabe (use a long random shared secret) |
| `PORT` | No | `3000` | HTTP listen port |
| `OPENROUTER_SITE_URL` | No | empty | Optional HTTP-Referer header for OpenRouter |
| `OPENROUTER_APP_NAME` | No | empty | Optional X-Title header for OpenRouter |
| `ASTROLABE_ROUTING_PROFILE` | No | `budget` | Policy profile selection |
| `ASTROLABE_COST_EFFICIENCY_MODE` | No | `strict` | Cost guardrail mode |
| `ASTROLABE_ALLOW_DIRECT_PREMIUM_MODELS` | No | `false` | Allow/block direct Sonnet/Opus on non-high-stakes routes |
| `ASTROLABE_ENABLE_SAFETY_GATE` | No | `true` | Enable high-stakes detection |
| `ASTROLABE_HIGH_STAKES_CONFIRM_MODE` | No | `prompt` | High-stakes confirmation policy |
| `ASTROLABE_HIGH_STAKES_CONFIRM_TOKEN` | No | `confirm` | Confirmation token used in strict high-stakes mode |
| `ASTROLABE_ALLOW_HIGH_STAKES_BUDGET_FLOOR` | No | `false` | Allow Sonnet floor for high-stakes in budget routing |
| `ASTROLABE_CLASSIFIER_MODEL_KEY` | No | `nano` | Primary classifier model key (`nano`, `grok`, `m25`, `sonnet`, `opus`, `dsCoder`, `gemFlash`, `gem31Pro`, `kimiK25`, `glm5`) |
| `ASTROLABE_SELF_CHECK_MODEL_KEY` | No | `nano` | Primary self-check model key (same keys as the classifier) |
| `ASTROLABE_CONTEXT_MESSAGES` | No | `8` | Classifier context message bound (3-20) |
| `ASTROLABE_CONTEXT_CHARS` | No | `2500` | Classifier context char bound (600-12000) |
| `ASTROLABE_RATE_LIMIT_ENABLED` | No | `false` | Enable in-memory request rate limiting on POST /v1/chat/completions |
| `ASTROLABE_RATE_LIMIT_WINDOW_MS` | No | `60000` | Rate limit window size in milliseconds (1000-3600000) |
| `ASTROLABE_RATE_LIMIT_MAX_REQUESTS` | No | `120` | Max requests allowed per key per window (1-100000) |
| `ASTROLABE_FORCE_MODEL` | No | empty | Hard override to one model id (no classifier/self-check escalation) |
See docs/configuration.mdx for full behavior details and preset profiles.
- `prompt` mode: high-stakes requests are force-routed and a safety system policy prompt is injected
- `strict` mode: high-stakes requests require exact confirmation token match (`x-astrolabe-confirmed: <token>` or `metadata.astrolabe_confirmed: "<token>"`)
- `off` mode: no special confirmation handling
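In `strict` mode the token match might look like this sketch (a hypothetical helper, assuming lower-cased header keys; not Astrolabe's actual code):

```javascript
// Sketch: strict-mode high-stakes confirmation check.
// Accepts the token via the x-astrolabe-confirmed header or the
// metadata.astrolabe_confirmed field in the request body.
function isConfirmed(headers, body, token) {
  const headerVal = headers["x-astrolabe-confirmed"];
  const metaVal = body && body.metadata && body.metadata.astrolabe_confirmed;
  return headerVal === token || metaVal === token;
}
```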
Astrolabe adds:
- `x-astrolabe-category`
- `x-astrolabe-complexity`
- `x-astrolabe-adjusted-complexity`
- `x-astrolabe-initial-model`
- `x-astrolabe-final-model`
- `x-astrolabe-route-label`
- `x-astrolabe-escalated`
- `x-astrolabe-confidence-score`
- `x-astrolabe-low-confidence`
- `x-astrolabe-safety-gate`
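A client can read this routing metadata from any response, for example (a sketch assuming a fetch-style `Headers` object, available globally in Node 18+):

```javascript
// Sketch: extract Astrolabe routing metadata from a fetch Response's headers.
function routingInfo(headers) {
  return {
    category: headers.get("x-astrolabe-category"),
    finalModel: headers.get("x-astrolabe-final-model"),
    escalated: headers.get("x-astrolabe-escalated") === "true",
    lowConfidence: headers.get("x-astrolabe-low-confidence") === "true"
  };
}
```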
Point OpenClaw's OpenAI-compatible base URL at Astrolabe:

- Before: `https://openrouter.ai/api/v1`
- After (local): `http://localhost:3000/v1`
- After (deploy): `https://your-host/v1`
Set OpenClaw API key to the same value as ASTROLABE_API_KEY.
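For example, with the official `openai` Node SDK (a sketch assuming the package is installed; any OpenAI-compatible client is configured the same way):

```javascript
import OpenAI from "openai";

// Point an OpenAI-compatible client at Astrolabe instead of OpenRouter.
// baseURL replaces https://openrouter.ai/api/v1; apiKey must equal the
// ASTROLABE_API_KEY configured on the proxy.
const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: process.env.ASTROLABE_API_KEY
});

// Requests then flow through Astrolabe's routing policy:
// await client.chat.completions.create({
//   model: "ignored-by-astrolabe",
//   messages: [{ role: "user", content: "Say hello in one line." }]
// });
```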
Run the test suite:

```
npm test
```

Common issues:

- Missing OPENROUTER_API_KEY: set the key in `.env` and restart.
- `high_stakes_confirmation_required`: if `ASTROLABE_HIGH_STAKES_CONFIRM_MODE=strict`, include the exact configured token in the header or body.
- Frequent escalations: increase routing profile quality or tighten category prompts.
- `rate_limit_exceeded`: increase `ASTROLABE_RATE_LIMIT_MAX_REQUESTS`, enlarge `ASTROLABE_RATE_LIMIT_WINDOW_MS`, or disable the limiter (`ASTROLABE_RATE_LIMIT_ENABLED=false`).
- `est_usd=n/a`: the upstream response omitted token usage.