Test it locally: fix nested Claude MAX credential read, add per-token request budget, verify end-to-end #36
Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: #35
…ement
Add an optional max_requests cap to issued tokens so an operator can bound how much of the shared Claude MAX subscription a single task may consume.
- storage: TokenRecord gains max_requests/used_requests; Lino + binary codecs round-trip them; TokenStore::try_consume_request enforces+increments.
- token: issue_token_full(...) and enforce_request_budget(...); new TokenError::LimitExceeded.
- proxy: all three handler paths (Anthropic, OpenAI, Gonka) enforce the budget after validation, returning HTTP 429 when exhausted.
- cli/http: --max-requests flag and max_requests JSON field; tokens list shows used/max.
- refactor: extract admin token endpoints into src/token_admin.rs to keep proxy.rs under the 1000-line limit.
Tests: budget enforced/unlimited/unknown-id cases.
… updates
- Add docs/case-studies/issue-35 (README analysis, requirements trace, online research, existing-components survey, redacted live evidence)
- Document nested claudeAiOauth credential layout, transparent header injection (anthropic-version + oauth-2025-04-20 beta), and the per-token request budget (max_requests) in README
- Correct stale note claiming revocations are lost on restart
- Add changelog fragment (bump: minor)
Built the image from the repo Dockerfile and ran it with a read-only copy of the real Claude MAX credentials (original never modified). Verified nested credential read, transparent passthrough (count_tokens HTTP 200 with only a la_sk_ token), per-token budget 429, 401s, and no OAuth token in container logs.
Working session summary
All work is complete. Here's the final summary.
Issue #35 "Test it locally": Done
PR #36 (ready for review, all CI green): #36
What was found and fixed
Verified end-to-end (native and Docker), against
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost: $12.935831
📊 Context and tokens usage:
Claude Opus 4.8: (3 sub-sessions)
Total: (30.2K new + 321.8K cache writes + 15.4M cache reads) input tokens, 119.6K output tokens, $12.876728 cost
Claude Haiku 4.5:
Total: 34.2K input tokens, 975 output tokens, $0.059103 cost
🤖 Models used:
📎 Log file uploaded as Gist (6224KB)
The working session has now ended; feel free to review and add any feedback on the solution draft.
✅ Ready to merge
This pull request is now ready to be merged:
Monitored by hive-mind with --auto-restart-until-mergeable flag
This reverts commit 0f0ad04.
Summary
Fixes #35.
This PR uses the local Docker + Claude MAX access to run the router end-to-end,
verifies that everything the docs claim actually works, fixes the one genuine
code bug found, and adds the "limit how much each task can use" capability the
issue asks for.
Three outcomes:
- … to ~/.claude/.credentials.json nested under a claudeAiOauth object. The router only read a flat accessToken, so against an actual login it found no token. src/oauth.rs now reads both the nested and flat layouts.
- … api.anthropic.com, both natively and through the Docker image. A client sending only a la_sk_ token (no anthropic-version, no OAuth beta header) gets a working upstream response; the real OAuth token never appears in logs or client-visible output.
- … separate task with a hard cap on how many upstream requests it may make.
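To illustrate the first outcome above, here is a minimal sketch of a lookup that accepts both credential layouts. It is not the code in src/oauth.rs: the Json enum is a std-only stand-in for whatever JSON library the router actually uses, and find_access_token and obj are hypothetical names chosen for this example. Preferring the nested layout over the flat one is also an assumption.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed JSON value (assumption: the real code
/// uses a JSON library; this enum only keeps the sketch std-only).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Obj(BTreeMap<String, Json>),
}

impl Json {
    /// Look up a key if this value is an object.
    fn get(&self, key: &str) -> Option<&Json> {
        match self {
            Json::Obj(map) => map.get(key),
            _ => None,
        }
    }

    /// Return the string slice if this value is a non-empty string.
    fn as_nonempty_str(&self) -> Option<&str> {
        match self {
            Json::Str(s) if !s.is_empty() => Some(s.as_str()),
            _ => None,
        }
    }
}

/// Hypothetical helper: try the nested claudeAiOauth.accessToken first,
/// then fall back to a flat top-level accessToken.
fn find_access_token(root: &Json) -> Option<&str> {
    root.get("claudeAiOauth")
        .and_then(|nested| nested.get("accessToken"))
        .and_then(Json::as_nonempty_str)
        .or_else(|| root.get("accessToken").and_then(Json::as_nonempty_str))
}

/// Build an object from key/value pairs (convenience for the example only).
fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

fn main() {
    // Placeholder strings only; no real credential material.
    let nested = obj(vec![(
        "claudeAiOauth",
        obj(vec![("accessToken", Json::Str("placeholder-nested".into()))]),
    )]);
    let flat = obj(vec![("accessToken", Json::Str("placeholder-flat".into()))]);
    let empty = obj(vec![]);

    assert_eq!(find_access_token(&nested), Some("placeholder-nested"));
    assert_eq!(find_access_token(&flat), Some("placeholder-flat"));
    assert_eq!(find_access_token(&empty), None);
}
```

A reader that only handles the flat case returns None for the first value, which matches the "found no token" symptom described above.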
Changes
Fixed: nested Claude MAX credential layout (src/oauth.rs)
- extract_token() / expires_at_ms() accept both the nested claudeAiOauth object (real Claude Code) and the flat {"accessToken": ...} layout.
- doctor probes the credential file and reports found, token OK / found, NO TOKEN / MISSING.
Added: per-token request budget (max_requests)
- src/storage.rs: TokenRecord gains max_requests: Option<u64> and used_requests: u64 (both #[serde(default)], backward compatible); the Lino text codec round-trips (max_requests N) / (used_requests N); TokenStore::try_consume_request checks-and-increments.
- src/token.rs: issue_token_full(...) writes the cap; enforce_request_budget(...) returns TokenError::LimitExceeded once hit.
- src/proxy.rs: every forwarding path (Anthropic, OpenAI, Gonka) enforces the budget after token validation and returns 429 rate_limit_error when exhausted. Admin token endpoints were extracted to a new src/token_admin.rs to stay under the 1000-line per-file CI limit.
- src/cli.rs / src/main.rs / src/token_admin.rs: tokens issue --max-requests, the POST /api/tokens max_requests field, and a used/max column in tokens list.
Docs
- … (anthropic-version default + anthropic-beta: oauth-2025-04-20), and the per-token budget; corrected the stale note claiming revocations are lost on restart (records are persisted).
- docs/case-studies/issue-35/: full case study: requirement-by-requirement trace, online research (primary sources), existing-components survey (LiteLLM virtual keys/budgets, Portkey, Kong AI Gateway, community Claude proxies), and redacted live + Docker evidence.
- changelog.d/20260609_233000_issue_35_local_testing.md (bump: minor).
How it was verified (live + Docker)
Performed against https://api.anthropic.com with a copy of the real Claude MAX credentials. The original ~/.claude/.credentials.json was only read/copied, never modified or deleted (confirmed unchanged at 471 bytes).
… la_sk_ token to count_tokens … {"input_tokens":13}
… Token has reached its request limit (no upstream request_id)
… (max_requests 2) (used_requests 2); tokens list → 2/2
… Dockerfile) with copied creds mounted :ro
Evidence: docs/case-studies/issue-35/raw/ (native) and docs/case-studies/issue-35/raw/docker/ (container).
Tests
- src/token.rs: test_unlimited_token_never_hits_budget, test_request_budget_enforced (caps at 3, 4th = LimitExceeded, usage persisted), test_budget_for_unknown_token_is_permitted.
- src/storage.rs round-trip literals updated for the new fields.
- src/oauth.rs tests cover nested + flat layouts.
Local CI gate (all green)
cargo fmt --check · cargo clippy --all-targets --all-features · file-size check (all src/*.rs < 1000 lines) · cargo test --all-features (141 tests pass) · cargo test --doc · cargo build --release.
Version bump is intentionally not hand-edited in Cargo.toml; the repo derives it from the changelog.d fragment (bump: minor), enforced by the prevent_manual_version_modification policy.
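For reviewers who want a picture of the budget behaviour the tests above exercise (cap at 3, fourth request rejected, unknown ids permitted), the following is a rough in-memory sketch. It is not the implementation in src/storage.rs or src/token.rs: BudgetRecord, BudgetStore, BudgetError, try_consume and rejection_status are hypothetical names. Counting requests for unlimited tokens as well is an assumption; the PR does not say whether it does.

```rust
use std::collections::HashMap;

/// Only the two budget fields named in this PR; the real TokenRecord has more.
#[derive(Debug, Default, Clone)]
struct BudgetRecord {
    max_requests: Option<u64>, // None means unlimited
    used_requests: u64,
}

#[derive(Debug, PartialEq)]
enum BudgetError {
    LimitExceeded,
}

/// Hypothetical in-memory store keyed by token id.
#[derive(Default)]
struct BudgetStore {
    records: HashMap<String, BudgetRecord>,
}

impl BudgetStore {
    /// Check the cap and, if there is room, count this request.
    /// Unknown ids are permitted, as in test_budget_for_unknown_token_is_permitted.
    fn try_consume(&mut self, token_id: &str) -> Result<(), BudgetError> {
        let Some(rec) = self.records.get_mut(token_id) else {
            return Ok(());
        };
        if let Some(max) = rec.max_requests {
            if rec.used_requests >= max {
                return Err(BudgetError::LimitExceeded);
            }
        }
        // Assumption: usage is counted even when no cap is set.
        rec.used_requests += 1;
        Ok(())
    }
}

/// HTTP status for a rejected request, per the 429 behaviour described above.
fn rejection_status(err: &BudgetError) -> u16 {
    match err {
        BudgetError::LimitExceeded => 429,
    }
}

fn main() {
    let mut store = BudgetStore::default();
    store.records.insert(
        "capped".to_string(),
        BudgetRecord { max_requests: Some(3), used_requests: 0 },
    );

    // The first three requests fit within the cap.
    for _ in 0..3 {
        assert_eq!(store.try_consume("capped"), Ok(()));
    }
    // The fourth is rejected and does not change the recorded usage.
    let fourth = store.try_consume("capped");
    assert_eq!(fourth, Err(BudgetError::LimitExceeded));
    assert_eq!(rejection_status(&BudgetError::LimitExceeded), 429);
    assert_eq!(store.records["capped"].used_requests, 3);

    // Ids with no record are let through.
    assert_eq!(store.try_consume("unknown"), Ok(()));
}
```

Taking &mut self makes the check and the increment a single exclusive step in this sketch; a store shared across concurrent handlers would need a lock or an atomic update to give the same guarantee.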