feat(issue-37): multi-provider subscription engine (Codex, Gemini, Qwen) + token refresh by konard · Pull Request #38 · link-assistant/router

konard · 2026-06-17T17:08:59Z

What this PR does

Implements multi-provider subscription support and API routing in the router
engine, adopting the best practices from heyhuynhgiabuu/proxypal / CLIProxyAPI,
so the router can "fully support claude, codex, gemini, qwen, and their
subscriptions with all our features and more."

Per the governing directive on this PR, this is implemented as a CLI / Docker /
library engine with no UI, on the latest stable Rust with all
dependencies updated.

Engine changes

- Multi-vendor OAuth credential reader (src/subscription.rs): reads each
  vendor CLI's credential file read-only: ~/.codex/auth.json,
  ~/.gemini/oauth_creds.json, ~/.qwen/oauth_creds.json (Claude via the
  existing ~/.claude reader). The router never writes to vendor locations.
- Subscription routing (src/subscription_proxy.rs, src/gemini.rs): new
  UpstreamProvider::{Codex, Gemini, Qwen} upstreams route
  /v1/chat/completions, /v1/responses, and /v1/models to the correct
  vendor backend (ChatGPT Codex Responses API, Gemini Code Assist generateContent,
  DashScope OpenAI-compatible).
- Dialect translation matrix: OpenAI Chat Completions ↔ OpenAI Responses
  (Codex) ↔ Gemini Code Assist generateContent, including SSE synthesis when
  a client requests streaming from Gemini.
- In-memory OAuth refresh (src/refresh.rs): expired Codex/Gemini/Qwen
  tokens are refreshed via each vendor's public OAuth client and cached in
  memory only; vendor files stay read-only and secrets are never logged. No
  OAuth client secret is hardcoded; Gemini's installed-app secret is supplied at
  runtime via the GEMINI_OAUTH_CLIENT_SECRET env var.
- router doctor now probes each subscription credential file and reports
  present / token OK / EXPIRED / NO TOKEN / MISSING.
- Rate-limit relay: Retry-After and x-ratelimit-* headers from
  subscription upstreams are passed through so clients can back off intelligently.
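The SSE synthesis mentioned above can be pictured roughly as follows. This is a minimal sketch, not the code from src/gemini.rs: it assumes the Gemini reply has already been buffered into one string and that the client expects OpenAI-style `chat.completion.chunk` events terminated by `data: [DONE]`. The names `json_escape` and `synthesize_sse`, and the idea of slicing by character count, are hypothetical; real chunks would also carry fields such as an id and model name, which are omitted here.

```rust
/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Turn a fully buffered reply into a sequence of SSE events
/// (hypothetical helper): one event per `chunk_chars` characters,
/// followed by the `[DONE]` sentinel.
fn synthesize_sse(text: &str, chunk_chars: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut events = Vec::new();
    for piece in chars.chunks(chunk_chars.max(1)) {
        let piece: String = piece.iter().collect();
        events.push(format!(
            "data: {{\"object\":\"chat.completion.chunk\",\"choices\":[{{\"index\":0,\"delta\":{{\"content\":\"{}\"}}}}]}}\n\n",
            json_escape(&piece)
        ));
    }
    events.push("data: [DONE]\n\n".to_string());
    events
}
```

Each returned string is one complete SSE event, so a streaming handler could write them to the response body in order.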

Dependencies & toolchain

Updated all dependencies to latest and adopted Rust edition 2024 on the
latest stable toolchain (build(deps): ... commit).

Case-study package

The evidence-backed analysis under
docs/case-studies/issue-37/
collects the data, analyzes ProxyPal/CLIProxyAPI, lists every requirement
(requirements.md, F1–F12), proposes a plan per requirement
(solution-plans.md), and surveys components (components-survey.md). Online
research with cited primary sources and redacted secrets is in
online-research.md.

Key finding

ProxyPal is a UX shell, not an engine — a Tauri/SolidJS GUI wrapping the Go
CLIProxyAPI binary. "Adopt the best of ProxyPal" therefore resolves to adopting
CLIProxyAPI's engine architecture (per-provider OAuth, dialect translation,
account rotation/cooldown) natively in our Rust engine — which this PR does — while
its GUI/auto-config UX is explicitly out of scope per the no-UI directive.

Requirement status (functional, F1–F12)

Done: F1 Claude, F2 Codex, F3 Gemini, F4 Qwen, F5 provider abstraction, F7 token refresh, F11 dialect matrix
Partial: F8 account pool (rate-limit relay + Claude pool), F10 quota observability (doctor)
Deferred / Out of scope: F6 native login (delegated to vendor CLIs by design), F9 auto-config, F12 GUI (no-UI directive)

Full trace in requirements.md.

Security

- Vendor credential files are opened read-only: the router never writes to
  ~/.codex, ~/.gemini, ~/.qwen, or ~/.claude.
- Refresh tokens, access tokens, and private keys are never logged.
- No OAuth client secret is committed to source; Gemini's is supplied via env var.
- unsafe_code = "forbid"; clippy all/pedantic/nursery clean.
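One common way to back a "never logged" guarantee in Rust is a wrapper type whose `Debug` output is redacted. The sketch below illustrates that pattern only; the PR does not state how src/refresh.rs achieves it, and `Secret` and `TokenPair` are hypothetical names.

```rust
use std::fmt;

/// Wrapper whose `Debug` output never contains the secret value.
pub struct Secret(String);

impl Secret {
    pub fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    /// Explicit accessor: the only way to read the underlying value.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}

/// Deriving `Debug` here is safe because both token fields are `Secret`.
#[derive(Debug)]
pub struct TokenPair {
    pub provider: &'static str,
    pub access_token: Secret,
    pub refresh_token: Secret,
}
```

With this shape, an accidental `{:?}` in a log statement prints `Secret(<redacted>)`, and reading the token requires a visible call to `expose()`.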

Testing

- cargo clippy --all-targets --all-features → 0 warnings.
- cargo test → 170 tests pass (111 lib + 49 integration + 10 release-tooling),
  including unit tests for the refresh layer, rate-limit header selection, and URL
  joining per provider.
- A changelog.d/ fragment (bump: minor) documents the new capabilities.

Closes #37

Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: #37

Adopt the best experience from heyhuynhgiabuu/proxypal to plan full support for Claude, Codex, Gemini, and Qwen subscriptions.

- README.md: analysis + gap matrix (ProxyPal is a UX shell over CLIProxyAPI)
- requirements.md: process + functional requirements trace
- solution-plans.md: file-level phased plan per requirement + execution order
- components-survey.md: build-vs-borrow survey (CLIProxyAPI, oauth2 crate, ...)
- online-research.md: cited per-provider OAuth endpoints/tokens/quotas
- proxypal-analysis.md: deep inventory of ProxyPal + CLIProxyAPI engine
- raw/: issue JSON + ProxyPal snapshot metadata

Remove placeholder .gitkeep now that real content exists.

Refs #37

The v0.19.0 release (ac24fe4) bumped Cargo.toml to 0.19.0 but left Cargo.lock's link-assistant-router entry at 0.18.0. 'cargo package --list' re-syncs the lock and then fails on the resulting dirty working tree, breaking the Build Package CI job. Sync the lock to match Cargo.toml.

konard · 2026-06-17T17:38:40Z

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

Public pricing estimate: $24.007025
Calculated by Anthropic: $25.833627
Difference: $1.826602 (+7.61%)

📊 Context and tokens usage:

Claude Opus 4.8: (2 sub-sessions)

116.6K / 1M (12%) input tokens, 38.0K / 128K (30%) output tokens
52.4K / 1M (5%) input tokens, 11.8K / 128K (9%) output tokens

Total: (253.1K new + 1.6M cache writes + 7.1M cache reads) input tokens, 271.5K output tokens, $21.651696 cost

Claude Haiku 4.5:

1.3M / 200K (646%) input tokens, 52.6K / 64K (82%) output tokens

Total: 1.3M input tokens, 52.6K output tokens, $2.355329 cost

🤖 Models used:

Tool: Anthropic Claude Code
Requested: opus
Main model: Claude Opus 4.8 (claude-opus-4-8)
Additional models:
- Claude Haiku 4.5 (claude-haiku-4-5-20251001)

📎 Log file uploaded as Gist (6824KB)

View complete solution draft log

The working session has now ended. Feel free to review and add any feedback on the solution draft.

konard · 2026-06-17T17:41:03Z

✅ Ready to merge

This pull request is now ready to be merged:

All CI checks have passed
No merge conflicts
No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag

konard · 2026-06-17T17:52:22Z

We need to implement all the features of the underlying tech to support subscriptions, API routing, and so on, with no UI support: only the best practices for the CLI/Docker/library.

We also need to update all our dependencies to their latest versions (using the latest stable Rust only).

konard · 2026-06-17T17:53:50Z

🤖 AI Work Session Started

Starting automated work session at 2026-06-17T17:53:47.507Z

The PR has been converted to draft mode while work is in progress.

This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.

- Bump edition 2021 -> 2024, rust-version 1.70 -> 1.85 (latest stable Rust)
- Major dependency bumps: reqwest 0.12->0.13, jsonwebtoken 9->10, tower-http 0.6->0.7, log-lazy 0.1->0.2, sha2 0.10->0.11; minor bumps for hyper, tokio, bytes, http and others via cargo update
- jsonwebtoken 10 requires explicit crypto provider: enable rust_crypto (pure-Rust, no aws-lc C deps) to keep the single-binary build portable
- Reformat imports to edition 2024 rustfmt style
- Replace io::Error::new(ErrorKind::Other, _) with io::Error::other(_)

All tests green (82 + 49 + 10); clippy clean; fmt clean.
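For readers unfamiliar with the `io::Error::other` change mentioned above, the two spellings are equivalent. The shorthand has been in the standard library since Rust 1.74. The function names `old_style` and `new_style` are illustrative only.

```rust
use std::io;

/// Older spelling: construct an `Other`-kind error explicitly.
fn old_style(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::Other, msg.to_string())
}

/// Shorthand with the same resulting kind and message.
fn new_style(msg: &str) -> io::Error {
    io::Error::other(msg.to_string())
}
```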

Introduce src/subscription.rs: a generalized, fully unit-tested reader for Claude/Codex/Gemini/Qwen subscription OAuth tokens cached by their vendor CLIs. Normalizes the four on-disk layouts (Claude nested claudeAiOauth, Codex tokens block + chatgpt-account-id, Gemini/Qwen flat oauth_creds.json with expiry_date/resource_url) into a single SubscriptionToken with base-URL resolution. Foundation for F2/F3/F4/F5.
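The normalization described above might look roughly like this. It is a sketch under stated assumptions, not the contents of src/subscription.rs: JSON parsing is left out, so `RawCredential` stands in for the already-parsed layouts. The field names of `SubscriptionToken`, the millisecond expiry unit, the `resolve_base_url` helper, and the rule of prefixing `https://` to a scheme-less `resource_url` are all hypothetical.

```rust
/// Already-parsed credential layouts (JSON parsing omitted).
pub enum RawCredential {
    /// Claude: fields nested under `claudeAiOauth`.
    Claude { access_token: String, expires_at_ms: Option<u64> },
    /// Codex: a `tokens` block plus a ChatGPT account id.
    Codex { access_token: String, account_id: Option<String> },
    /// Gemini / Qwen: flat `oauth_creds.json` with `expiry_date` and `resource_url`.
    Flat { access_token: String, expiry_date_ms: Option<u64>, resource_url: Option<String> },
}

/// Single normalized shape (field set is an assumption).
#[derive(Debug, PartialEq)]
pub struct SubscriptionToken {
    pub access_token: String,
    pub expires_at_ms: Option<u64>,
    pub account_id: Option<String>,
    pub base_url: String,
}

/// Prefer a per-account `resource_url`; otherwise fall back to the default.
fn resolve_base_url(resource_url: Option<&str>, default_base: &str) -> String {
    match resource_url.map(str::trim).filter(|s| !s.is_empty()) {
        Some(u) if u.starts_with("http://") || u.starts_with("https://") => {
            u.trim_end_matches('/').to_string()
        }
        Some(u) => format!("https://{}", u.trim_end_matches('/')),
        None => default_base.to_string(),
    }
}

pub fn normalize(raw: RawCredential, default_base: &str) -> SubscriptionToken {
    match raw {
        RawCredential::Claude { access_token, expires_at_ms } => SubscriptionToken {
            access_token,
            expires_at_ms,
            account_id: None,
            base_url: default_base.to_string(),
        },
        RawCredential::Codex { access_token, account_id } => SubscriptionToken {
            access_token,
            expires_at_ms: None,
            account_id,
            base_url: default_base.to_string(),
        },
        RawCredential::Flat { access_token, expiry_date_ms, resource_url } => SubscriptionToken {
            access_token,
            expires_at_ms: expiry_date_ms,
            account_id: None,
            base_url: resolve_base_url(resource_url.as_deref(), default_base),
        },
    }
}
```

Collapsing the layouts into one struct means routing code only has to handle a single token type, whichever vendor CLI wrote the file.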

…eams

Wire the SubscriptionReader into AppState and dispatch chat-completions, responses, and model-listing routes to the right vendor backend:

- Codex (ChatGPT): translate Chat Completions -> Responses API and forward to the ChatGPT backend with chatgpt-account-id / openai-beta headers.
- Qwen: forward OpenAI-compatible requests to DashScope, honoring the per-account resource_url base.
- Gemini: translate to/from the Code Assist generateContent dialect, synthesizing SSE when the client requested streaming.

Adds UpstreamProvider::{Codex,Gemini,Qwen} variants plus subscription_provider() mapping, openai::chat_completion_to_responses, src/gemini.rs, and src/subscription_proxy.rs.
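As an illustration of the request-side dialect translation for Gemini, the sketch below maps Chat Completions messages onto the generateContent shape: system text moves into a separate system instruction, and the `assistant` role becomes `model`. It is a simplified sketch, not src/gemini.rs. The struct names and `chat_to_gemini` are hypothetical, only plain-text content is handled, and folding every other role (including tool messages) into `user` is a deliberate simplification.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

#[derive(Debug, PartialEq)]
pub struct GeminiContent {
    /// Gemini uses "user" and "model" as conversation roles.
    pub role: &'static str,
    pub text: String,
}

#[derive(Debug, PartialEq, Default)]
pub struct GeminiRequest {
    pub system_instruction: Option<String>,
    pub contents: Vec<GeminiContent>,
}

pub fn chat_to_gemini(messages: &[ChatMessage]) -> GeminiRequest {
    let mut req = GeminiRequest::default();
    let mut system_parts: Vec<String> = Vec::new();
    for m in messages {
        match m.role.as_str() {
            // System-level text is carried outside the conversation turns.
            "system" | "developer" => system_parts.push(m.content.clone()),
            "assistant" => req.contents.push(GeminiContent {
                role: "model",
                text: m.content.clone(),
            }),
            // Simplification: every other role is treated as user input.
            _ => req.contents.push(GeminiContent {
                role: "user",
                text: m.content.clone(),
            }),
        }
    }
    if !system_parts.is_empty() {
        req.system_instruction = Some(system_parts.join("\n\n"));
    }
    req
}
```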

Extend `router doctor` to report each vendor subscription credential file: whether it is present, and whether the token is readable, valid, or expired.
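The status decision behind such a probe can be sketched as a pure function. The labels match the ones listed in the PR description (token OK / EXPIRED / NO TOKEN / MISSING), but `CredentialStatus`, `classify`, the millisecond expiry unit, and the choice to treat a missing expiry as valid are assumptions for illustration rather than the actual doctor implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CredentialStatus {
    Missing,
    NoToken,
    Expired,
    TokenOk,
}

impl CredentialStatus {
    /// Human-readable label as it might appear in a report line.
    pub fn label(self) -> &'static str {
        match self {
            Self::Missing => "MISSING",
            Self::NoToken => "NO TOKEN",
            Self::Expired => "EXPIRED",
            Self::TokenOk => "token OK",
        }
    }
}

/// Classify one credential file from facts gathered by a read-only probe.
pub fn classify(
    file_exists: bool,
    access_token: Option<&str>,
    expires_at_ms: Option<u64>,
    now_ms: u64,
) -> CredentialStatus {
    if !file_exists {
        return CredentialStatus::Missing;
    }
    if !access_token.is_some_and(|t| !t.trim().is_empty()) {
        return CredentialStatus::NoToken;
    }
    match expires_at_ms {
        Some(exp) if exp <= now_ms => CredentialStatus::Expired,
        // Assumption: a token without a recorded expiry is reported as valid.
        _ => CredentialStatus::TokenOk,
    }
}
```

Passing `now_ms` in as a parameter keeps the function deterministic, which makes the expiry boundary easy to unit-test.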

When an on-disk Codex/Gemini/Qwen token has expired, exchange its refresh_token for a fresh access token using each vendor's public OAuth client and cache the result in memory. Vendor credential files stay read-only; secrets are never logged. Wired into both the OpenAI-shaped subscription forwarder and the Gemini Code Assist path via a shared TokenCache on AppState. This keeps the proxy working even when the vendor CLI is not running to refresh its own credential file (the key ProxyPal behavior).
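The refresh-and-cache flow described above can be sketched without any network code by injecting the refresh step as a closure. This is a hedged illustration, not src/refresh.rs: the source names a shared `TokenCache`, but its real fields and methods are unknown, so `CachedToken`, `get_or_refresh`, the `skew_ms` safety margin, and the string error type are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct CachedToken {
    pub access_token: String,
    pub expires_at_ms: u64,
}

/// In-memory cache keyed by provider name; nothing is written to disk.
#[derive(Default)]
pub struct TokenCache {
    entries: HashMap<String, CachedToken>,
}

impl TokenCache {
    /// Return a usable access token: the cached one if still valid,
    /// else the on-disk one if still valid, else a freshly refreshed one.
    pub fn get_or_refresh<F>(
        &mut self,
        provider: &str,
        on_disk: &CachedToken,
        now_ms: u64,
        skew_ms: u64,
        refresh: F,
    ) -> Result<String, String>
    where
        F: FnOnce() -> Result<CachedToken, String>,
    {
        // A token counts as fresh only if it outlives `now + skew`.
        let fresh = |t: &CachedToken| t.expires_at_ms > now_ms.saturating_add(skew_ms);

        if let Some(hit) = self.entries.get(provider) {
            if fresh(hit) {
                return Ok(hit.access_token.clone());
            }
        }
        if fresh(on_disk) {
            return Ok(on_disk.access_token.clone());
        }
        // The vendor file is stale: exchange the refresh token (injected here)
        // and remember the result in memory only.
        let renewed = refresh()?;
        let token = renewed.access_token.clone();
        self.entries.insert(provider.to_string(), renewed);
        Ok(token)
    }
}
```

In a real server the cache would sit behind a lock on shared state and the closure would perform the OAuth token exchange. The closure parameter keeps the sketch testable offline.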

Forward Retry-After and x-ratelimit-* headers from subscription upstreams to the client on both streaming and buffered responses, so callers can honor provider throttling and back off.
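The header selection can be expressed as a small case-insensitive filter. The sketch below assumes headers are available as name/value string pairs; `is_rate_limit_header` and `relay_rate_limit_headers` are hypothetical names, and the real code presumably works on the HTTP library's header map instead.

```rust
/// True for headers that tell a client when, or how much, to back off.
pub fn is_rate_limit_header(name: &str) -> bool {
    let n = name.trim().to_ascii_lowercase();
    n == "retry-after" || n.starts_with("x-ratelimit-")
}

/// Keep only the rate-limit headers from an upstream response, preserving order.
pub fn relay_rate_limit_headers(upstream: &[(String, String)]) -> Vec<(String, String)> {
    upstream
        .iter()
        .filter(|pair| is_rate_limit_header(&pair.0))
        .cloned()
        .collect()
}
```

Using an allow-list means unrelated upstream headers (cookies, hop-by-hop headers) are not forwarded by accident.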

Add a changelog fragment (minor bump) and README coverage: the new UPSTREAM_PROVIDER values, credential locations, in-memory refresh, and dialect translation for the vendor subscription upstreams.
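To make the configuration surface concrete, the sketch below parses a provider name and resolves the credential file it would read. The three paths come from the PR description. The accepted spellings ("codex", "gemini", "qwen") are an assumption, since the thread does not list the actual UPSTREAM_PROVIDER values, and this enum is a reduced, hypothetical copy of `UpstreamProvider` containing only the new variants.

```rust
use std::path::{Path, PathBuf};
use std::str::FromStr;

/// Reduced, illustrative copy of the provider enum (new variants only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpstreamProvider {
    Codex,
    Gemini,
    Qwen,
}

impl FromStr for UpstreamProvider {
    type Err = String;

    /// Assumed spellings; matching is case-insensitive.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "codex" => Ok(Self::Codex),
            "gemini" => Ok(Self::Gemini),
            "qwen" => Ok(Self::Qwen),
            other => Err(format!("unknown upstream provider: {other}")),
        }
    }
}

impl UpstreamProvider {
    /// Credential file written by the vendor CLI, relative to `home`.
    pub fn credential_path(self, home: &Path) -> PathBuf {
        match self {
            Self::Codex => home.join(".codex").join("auth.json"),
            Self::Gemini => home.join(".gemini").join("oauth_creds.json"),
            Self::Qwen => home.join(".qwen").join("oauth_creds.json"),
        }
    }
}
```

A caller could combine the two, for example by parsing the value of the UPSTREAM_PROVIDER environment variable and passing the user's home directory to `credential_path`.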

Mark F2-F5/F7/F11 as Done (subscription reading, routing, dialect translation, in-memory refresh), F8/F10 as Partial, and F6/F9/F12 as deferred/out-of-scope per the no-UI governing directive.

src/openai.rs exceeded the repo's 1000-line file cap after the Codex Responses-API translation was added. Move the Responses-API request/response translators (OpenAIResponseRequest, response_to_anthropic, chat_completion_to_responses, anthropic_to_response) into a dedicated src/responses.rs module, sharing the field-shaping helpers (extract_text, translate_tools) via pub(crate). No behaviour change.

konard · 2026-06-17T19:03:17Z

Working session summary

All tasks are complete. Here's the final status.

Summary

The blocker — a leaked Google OAuth client secret committed in src/refresh.rs (commit c009fae) that GitHub push protection rejected — is resolved, and PR #38 is now fully implemented, green in CI, and marked ready for review.

What I did this session

Removed the leaked secret from history. The hardcoded Gemini OAuth client secret was already replaced in the working tree with env-var resolution (GEMINI_OAUTH_CLIENT_SECRET). I rebuilt the branch from the parent of the offending commit via cherry-pick — re-committing the now-secret-free refresh layer and the three following commits — so the secret exists in no commit anywhere (git log -S confirms empty). Force-pushed successfully; GitHub accepted it.
Fixed the CI failure. The first CI run after the push passed all platform tests but failed the file-size lint — src/openai.rs had grown to 1051 lines (repo cap is 1000) after the Codex Responses-API translation was added. I extracted the Responses-API translators (OpenAIResponseRequest, response_to_anthropic, chat_completion_to_responses, anthropic_to_response) into a new src/responses.rs module, sharing extract_text/translate_tools via pub(crate), and updated the four call sites in proxy.rs. openai.rs is now 860 lines. No behavior change; added a unit test for the projection.
Updated the PR. Rewrote the stale docs-only title/description to reflect the actual implemented engine (multi-provider subscriptions, routing, dialect translation, in-memory refresh, doctor, rate-limit relay), with a security and requirement-status section.

Final verification

cargo clippy --all-targets --all-features → 0 warnings
cargo fmt --check → clean
cargo test → 171 tests pass (112 lib + 49 integration + 10 tooling)
CI run 27712450678 (commit a41c937): all jobs success — Lint and Format Check, Test on macOS/Ubuntu/Windows, Build Package
Working tree clean; 12 commits ahead of main; branch up to date with main
PR feat(issue-37): multi-provider subscription engine (Codex, Gemini, Qwen) + token refresh #38 marked ready for review

PR: #38

This summary was automatically extracted from the AI working session output.

konard · 2026-06-17T19:03:29Z

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

Public pricing estimate: $16.175775
Calculated by Anthropic: $18.091184
Difference: $1.915410 (+11.84%)

📊 Context and tokens usage:

Claude Opus 4.8: (5 sub-sessions)

110.8K / 1M (11%) input tokens, 5.6K / 128K (4%) output tokens
114.4K / 1M (11%) input tokens, 24.9K / 128K (19%) output tokens
117.0K / 1M (12%) input tokens, 33.1K / 128K (26%) output tokens
116.5K / 1M (12%) input tokens, 42.9K / 128K (34%) output tokens
83.2K / 1M (8%) input tokens, 25.9K / 128K (20%) output tokens

Total: (93.8K new + 510.8K cache writes + 17.0M cache reads) input tokens, 160.6K output tokens, $16.175775 cost

🤖 Models used:

Tool: Anthropic Claude Code
Requested: opus
Model: Claude Opus 4.8 (claude-opus-4-8)

📎 Log file uploaded as Gist (7692KB)

View complete solution draft log

The working session has now ended. Feel free to review and add any feedback on the solution draft.

konard · 2026-06-17T19:06:01Z

✅ Ready to merge

This pull request is now ready to be merged:

All CI checks have passed
No merge conflicts
No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag

Initial commit with task details

0c29c0f


konard self-assigned this Jun 17, 2026

konard changed the title ~~[WIP] Make sure we use all the best experience from https://github.com/heyhuynhgiabuu/proxypal~~ docs(issue-37): ProxyPal multi-provider case study and roadmap Jun 17, 2026

konard marked this pull request as ready for review June 17, 2026 17:29

konard marked this pull request as draft June 17, 2026 17:53

konard added 8 commits June 17, 2026 18:03

feat(doctor): probe Codex/Gemini/Qwen subscription credentials

5b4de62


feat(subscriptions): relay rate-limit headers from upstreams

b879f58


docs(issue-37): document Codex/Gemini/Qwen subscription support

868f31e


docs(issue-37): update requirements trace to reflect implemented engine

ad8d721


konard changed the title ~~docs(issue-37): ProxyPal multi-provider case study and roadmap~~ feat(issue-37): multi-provider subscription engine (Codex, Gemini, Qwen) + token refresh Jun 17, 2026

konard marked this pull request as ready for review June 17, 2026 19:02

konard merged commit 4c33f70 into main Jun 17, 2026
11 checks passed

konard merged 12 commits into main from issue-37-69d3f0803294
