
feat(issue-37): multi-provider subscription engine (Codex, Gemini, Qwen) + token refresh #38

Merged
konard merged 12 commits into main from issue-37-69d3f0803294 on Jun 17, 2026

Conversation

@konard

@konard konard commented Jun 17, 2026

Contributor

What this PR does

Implements multi-provider subscription support and API routing in the router
engine, adopting the best practices from
heyhuynhgiabuu/proxypal /
CLIProxyAPI, so the router can
"fully support claude, codex, gemini, qwen, and their subscriptions with all our
features and more."

Per the governing directive on this PR, this is implemented as a CLI / Docker /
library engine with no UI, on the latest stable Rust with all dependencies
updated.

Engine changes

  • Multi-vendor OAuth credential reader (src/subscription.rs) — reads each
    vendor CLI's credential file read-only: ~/.codex/auth.json,
    ~/.gemini/oauth_creds.json, ~/.qwen/oauth_creds.json (Claude via the
    existing ~/.claude reader). The router never writes to vendor locations.
  • Subscription routing (src/subscription_proxy.rs, src/gemini.rs) — new
    UpstreamProvider::{Codex, Gemini, Qwen} upstreams route
    /v1/chat/completions, /v1/responses, and /v1/models to the correct
    vendor backend (ChatGPT Codex Responses API, Gemini Code Assist generateContent,
    DashScope OpenAI-compatible).
  • Dialect translation matrix — OpenAI Chat Completions ↔ OpenAI Responses
    (Codex) ↔ Gemini Code Assist generateContent, including SSE synthesis when
    a client requests streaming from Gemini.
  • In-memory OAuth refresh (src/refresh.rs) — expired Codex/Gemini/Qwen
    tokens are refreshed via each vendor's public OAuth client and cached in
    memory only; vendor files stay read-only and secrets are never logged. No
    OAuth client secret is hardcoded — Gemini's installed-app secret is supplied at
    runtime via the GEMINI_OAUTH_CLIENT_SECRET env var.
  • router doctor now probes each subscription credential file and reports
    present / token OK / EXPIRED / NO TOKEN / MISSING.
  • Rate-limit relay: Retry-After and x-ratelimit-* headers from
    subscription upstreams are passed through so clients can back off intelligently.
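The SSE synthesis mentioned in the dialect translation bullet can be illustrated with a small, std-only sketch. It is not the code in src/gemini.rs: the function names `json_escape` and `synthesize_sse`, the two-frame layout, and the minimal chunk fields are all assumptions made for illustration. The idea is that a fully buffered upstream reply is re-emitted as OpenAI-style `chat.completion.chunk` frames followed by a `[DONE]` sentinel.

```rust
/// Escape a string for embedding inside a JSON string literal.
/// Hand-rolled only to keep the sketch dependency-free.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical helper: turn one buffered reply into SSE frames.
/// Frame 1 carries the whole text as a delta, frame 2 carries the
/// finish reason, and the last frame is the `[DONE]` sentinel.
fn synthesize_sse(id: &str, model: &str, text: &str) -> Vec<String> {
    let frame = |delta: &str, finish: &str| {
        format!(
            "data: {{\"id\":\"{}\",\"object\":\"chat.completion.chunk\",\"model\":\"{}\",\"choices\":[{{\"index\":0,\"delta\":{},\"finish_reason\":{}}}]}}\n\n",
            json_escape(id),
            json_escape(model),
            delta,
            finish
        )
    };
    vec![
        frame(
            &format!(
                "{{\"role\":\"assistant\",\"content\":\"{}\"}}",
                json_escape(text)
            ),
            "null",
        ),
        frame("{}", "\"stop\""),
        "data: [DONE]\n\n".to_string(),
    ]
}
```

Each returned string is a complete SSE event (terminated by a blank line), so a streaming response body could write them in order. A real implementation would likely use a JSON library and might split the text into several deltas.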

Dependencies & toolchain

  • Updated all dependencies to latest and adopted Rust edition 2024 on the
    latest stable toolchain (build(deps): ... commit).

Case-study package

The evidence-backed analysis under
docs/case-studies/issue-37/
collects the data, analyzes ProxyPal/CLIProxyAPI, lists every requirement
(requirements.md, F1–F12), proposes a plan per requirement
(solution-plans.md), and surveys components (components-survey.md). Online
research with cited primary sources and redacted secrets is in
online-research.md.

Key finding

ProxyPal is a UX shell, not an engine — a Tauri/SolidJS GUI wrapping the Go
CLIProxyAPI binary. "Adopt the best of ProxyPal" therefore resolves to adopting
CLIProxyAPI's engine architecture (per-provider OAuth, dialect translation,
account rotation/cooldown) natively in our Rust engine — which this PR does — while
its GUI/auto-config UX is explicitly out of scope per the no-UI directive.

Requirement status (functional, F1–F12)

  • Done: F1 Claude, F2 Codex, F3 Gemini, F4 Qwen, F5 provider abstraction, F7 token refresh, F11 dialect matrix
  • Partial: F8 account pool (rate-limit relay + Claude pool), F10 quota observability (doctor)
  • Deferred / Out of scope: F6 native login (delegated to vendor CLIs by design), F9 auto-config, F12 GUI (no-UI directive)

Full trace in requirements.md.

Security

  • Vendor credential files are read read-only — the router never writes
    ~/.codex, ~/.gemini, ~/.qwen, ~/.claude.
  • Refresh tokens / access tokens / private keys are never logged.
  • No OAuth client secret is committed to source — Gemini's is supplied via env var.
  • unsafe_code = "forbid"; clippy all/pedantic/nursery clean.
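Two of the security points above (the secret comes from an env var, and secrets are never logged) can be sketched together. This is an illustrative, std-only sketch and not the contents of src/refresh.rs: the `Secret` wrapper, `resolve_client_secret`, and the error message are hypothetical; only the variable name GEMINI_OAUTH_CLIENT_SECRET comes from this PR.

```rust
use std::fmt;

/// Hypothetical wrapper whose Debug output never reveals the inner value,
/// so an accidental `{:?}` in a log line cannot leak it.
struct Secret(String);

impl Secret {
    /// Explicit accessor for the one place that really needs the value.
    fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}

/// Resolve the client secret through a lookup function. Passing the
/// lookup in (instead of calling `std::env::var` directly) keeps the
/// sketch testable without touching the process environment.
fn resolve_client_secret<F>(lookup: F) -> Result<Secret, String>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup("GEMINI_OAUTH_CLIENT_SECRET") {
        Some(v) if !v.trim().is_empty() => Ok(Secret(v)),
        _ => Err(
            "GEMINI_OAUTH_CLIENT_SECRET is not set; Gemini token refresh is unavailable"
                .to_string(),
        ),
    }
}
```

In production the lookup would presumably be `|k| std::env::var(k).ok()`. Treating an empty or whitespace-only value as unset is a design choice of this sketch, not something the PR states.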

Testing

  • cargo clippy --all-targets --all-features → 0 warnings.
  • cargo test → 170 tests pass (111 lib + 49 integration + 10 release-tooling),
    including unit tests for the refresh layer, rate-limit header selection, and URL
    joining per provider.
  • A changelog.d/ fragment (bump: minor) documents the new capabilities.

Closes #37

Adding .gitkeep for PR creation (default mode).
This file will be removed when the task is complete.

Issue: #37
@konard konard self-assigned this Jun 17, 2026
Adopt the best experience from heyhuynhgiabuu/proxypal to plan full
support for Claude, Codex, Gemini, and Qwen subscriptions.

- README.md: analysis + gap matrix (ProxyPal is a UX shell over CLIProxyAPI)
- requirements.md: process + functional requirements trace
- solution-plans.md: file-level phased plan per requirement + execution order
- components-survey.md: build-vs-borrow survey (CLIProxyAPI, oauth2 crate, ...)
- online-research.md: cited per-provider OAuth endpoints/tokens/quotas
- proxypal-analysis.md: deep inventory of ProxyPal + CLIProxyAPI engine
- raw/: issue JSON + ProxyPal snapshot metadata

Remove placeholder .gitkeep now that real content exists.

Refs #37
@konard konard changed the title from "[WIP] Make sure we use all the best experience from https://github.com/heyhuynhgiabuu/proxypal" to "docs(issue-37): ProxyPal multi-provider case study and roadmap" Jun 17, 2026
@konard konard marked this pull request as ready for review June 17, 2026 17:29
The v0.19.0 release (ac24fe4) bumped Cargo.toml to 0.19.0 but left
Cargo.lock's link-assistant-router entry at 0.18.0. 'cargo package --list'
re-syncs the lock and then fails on the resulting dirty working tree,
breaking the Build Package CI job. Sync the lock to match Cargo.toml.
@konard

konard commented Jun 17, 2026

Contributor Author

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

  • Public pricing estimate: $24.007025
  • Calculated by Anthropic: $25.833627
  • Difference: $1.826602 (+7.61%)

📊 Context and tokens usage:

Claude Opus 4.8: (2 sub-sessions)

  1. 116.6K / 1M (12%) input tokens, 38.0K / 128K (30%) output tokens
  2. 52.4K / 1M (5%) input tokens, 11.8K / 128K (9%) output tokens

Total: (253.1K new + 1.6M cache writes + 7.1M cache reads) input tokens, 271.5K output tokens, $21.651696 cost

Claude Haiku 4.5:

  • 1.3M / 200K (646%) input tokens, 52.6K / 64K (82%) output tokens

Total: 1.3M input tokens, 52.6K output tokens, $2.355329 cost

🤖 Models used:

  • Tool: Anthropic Claude Code
  • Requested: opus
  • Main model: Claude Opus 4.8 (claude-opus-4-8)
  • Additional models:
    • Claude Haiku 4.5 (claude-haiku-4-5-20251001)

📎 Log file uploaded as Gist (6824KB)


The working session has now ended; feel free to review and add any feedback on the solution draft.

@konard

konard commented Jun 17, 2026

Contributor Author

✅ Ready to merge

This pull request is now ready to be merged:

  • All CI checks have passed
  • No merge conflicts
  • No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag

@konard

konard commented Jun 17, 2026

Contributor Author

We need to implement all the features of the underlying tech to support subscriptions, API routing, and so on, with no UI support: only the best practices for the CLI/Docker/library.

And we also need to update all our dependencies to latest versions (using latest stable Rust only).

@konard konard marked this pull request as draft June 17, 2026 17:53
@konard

konard commented Jun 17, 2026

Contributor Author

🤖 AI Work Session Started

Starting automated work session at 2026-06-17T17:53:47.507Z

The PR has been converted to draft mode while work is in progress.

This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.

konard added 8 commits June 17, 2026 18:03
- Bump edition 2021 -> 2024, rust-version 1.70 -> 1.85 (latest stable Rust)
- Major dependency bumps: reqwest 0.12->0.13, jsonwebtoken 9->10,
  tower-http 0.6->0.7, log-lazy 0.1->0.2, sha2 0.10->0.11; minor bumps
  for hyper, tokio, bytes, http and others via cargo update
- jsonwebtoken 10 requires explicit crypto provider: enable rust_crypto
  (pure-Rust, no aws-lc C deps) to keep the single-binary build portable
- Reformat imports to edition 2024 rustfmt style
- Replace io::Error::new(ErrorKind::Other, _) with io::Error::other(_)

All tests green (82 + 49 + 10); clippy clean; fmt clean.
Introduce src/subscription.rs: a generalized, fully unit-tested reader for
Claude/Codex/Gemini/Qwen subscription OAuth tokens cached by their vendor
CLIs. Normalizes the four on-disk layouts (Claude nested claudeAiOauth,
Codex tokens block + chatgpt-account-id, Gemini/Qwen flat oauth_creds.json
with expiry_date/resource_url) into a single SubscriptionToken with base-URL
resolution. Foundation for F2/F3/F4/F5.
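The normalization this commit describes can be pictured with a short sketch. It assumes the JSON has already been parsed; `RawCredentials`, `normalize`, and the field names are hypothetical stand-ins rather than the types in src/subscription.rs, and the default base URL is passed in as a parameter because the real per-vendor defaults are not stated here. Prefixing `https://` onto a scheme-less resource_url is likewise an assumption.

```rust
/// Hypothetical view of the on-disk layouts after JSON parsing.
enum RawCredentials {
    /// Claude: token nested under `claudeAiOauth`.
    Claude { access_token: String, expires_at_ms: Option<u64> },
    /// Codex: `tokens` block plus a ChatGPT account id.
    Codex { access_token: String, account_id: Option<String> },
    /// Gemini / Qwen: flat oauth_creds.json.
    Flat {
        access_token: String,
        expiry_date_ms: Option<u64>,
        resource_url: Option<String>,
    },
}

/// Single normalized shape handed to the rest of the router.
struct SubscriptionToken {
    access_token: String,
    expires_at_ms: Option<u64>,
    account_id: Option<String>,
    base_url: String,
}

fn normalize(raw: RawCredentials, default_base_url: &str) -> SubscriptionToken {
    match raw {
        RawCredentials::Claude { access_token, expires_at_ms } => SubscriptionToken {
            access_token,
            expires_at_ms,
            account_id: None,
            base_url: default_base_url.to_string(),
        },
        RawCredentials::Codex { access_token, account_id } => SubscriptionToken {
            access_token,
            expires_at_ms: None,
            account_id,
            base_url: default_base_url.to_string(),
        },
        RawCredentials::Flat { access_token, expiry_date_ms, resource_url } => {
            // Prefer the per-account resource_url; fall back to the default.
            let base_url = match resource_url.as_deref().map(str::trim) {
                Some(u) if !u.is_empty() => {
                    if u.starts_with("http://") || u.starts_with("https://") {
                        u.to_string()
                    } else {
                        format!("https://{u}")
                    }
                }
                _ => default_base_url.to_string(),
            };
            SubscriptionToken {
                access_token,
                expires_at_ms: expiry_date_ms,
                account_id: None,
                base_url,
            }
        }
    }
}
```

Collapsing every layout into one struct means the routing and refresh layers only ever branch on the provider, not on file shape.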
…eams

Wire the SubscriptionReader into AppState and dispatch chat-completions,
responses, and model-listing routes to the right vendor backend:

- Codex (ChatGPT): translate Chat Completions -> Responses API and forward
  to the ChatGPT backend with chatgpt-account-id / openai-beta headers.
- Qwen: forward OpenAI-compatible requests to DashScope, honoring the
  per-account resource_url base.
- Gemini: translate to/from the Code Assist generateContent dialect,
  synthesizing SSE when the client requested streaming.

Adds UpstreamProvider::{Codex,Gemini,Qwen} variants plus
subscription_provider() mapping, openai::chat_completion_to_responses,
src/gemini.rs, and src/subscription_proxy.rs.
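The per-provider dispatch described in this commit can be summarized as a lookup from (provider, inbound path) to a translation step. The sketch below is a simplification under stated assumptions: `Dispatch` and `plan_dispatch` are hypothetical names, only the chat-completions and model-listing paths named above are covered, and anything else is reported as `Unhandled` rather than guessed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UpstreamProvider {
    Codex,
    Gemini,
    Qwen,
}

/// What the router has to do with an inbound request (hypothetical).
#[derive(Debug, PartialEq, Eq)]
enum Dispatch {
    /// Forward the OpenAI-compatible body unchanged.
    Passthrough,
    /// Translate Chat Completions into the Responses API.
    ToResponses,
    /// Translate into the Code Assist generateContent dialect.
    ToGenerateContent,
    /// Serve the model listing for this provider.
    ListModels,
    /// Not covered by this sketch.
    Unhandled,
}

fn plan_dispatch(provider: UpstreamProvider, path: &str) -> Dispatch {
    use UpstreamProvider::{Codex, Gemini, Qwen};
    match (provider, path) {
        (_, "/v1/models") => Dispatch::ListModels,
        (Codex, "/v1/chat/completions") => Dispatch::ToResponses,
        (Codex, "/v1/responses") => Dispatch::Passthrough,
        (Qwen, "/v1/chat/completions") => Dispatch::Passthrough,
        (Gemini, "/v1/chat/completions") => Dispatch::ToGenerateContent,
        _ => Dispatch::Unhandled,
    }
}
```

Keeping the decision in one pure function makes the matrix easy to unit-test separately from any HTTP plumbing.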
Extend `router doctor` to report each vendor subscription credential file:
whether it is present, and whether the token is readable, valid, or expired.
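The doctor states (present / token OK / EXPIRED / NO TOKEN / MISSING) amount to a small classification over three facts about a credential file. The following is a sketch only: `CredentialStatus`, `classify`, the exact label strings, and the choice to treat a token without an expiry as OK are assumptions, not the actual `router doctor` code.

```rust
#[derive(Debug, PartialEq, Eq)]
enum CredentialStatus {
    Missing,
    NoToken,
    Expired,
    TokenOk,
}

impl CredentialStatus {
    /// Human-readable label; the wording here is illustrative.
    fn label(&self) -> &'static str {
        match self {
            CredentialStatus::Missing => "MISSING",
            CredentialStatus::NoToken => "present / NO TOKEN",
            CredentialStatus::Expired => "present / EXPIRED",
            CredentialStatus::TokenOk => "present / token OK",
        }
    }
}

/// Classify one vendor credential file from already-extracted facts.
/// Times are milliseconds since the Unix epoch.
fn classify(
    file_present: bool,
    access_token: Option<&str>,
    expires_at_ms: Option<u64>,
    now_ms: u64,
) -> CredentialStatus {
    if !file_present {
        return CredentialStatus::Missing;
    }
    match access_token {
        None => CredentialStatus::NoToken,
        Some(t) if t.is_empty() => CredentialStatus::NoToken,
        Some(_) => match expires_at_ms {
            Some(exp) if exp <= now_ms => CredentialStatus::Expired,
            // Assumption: no recorded expiry is reported as OK.
            _ => CredentialStatus::TokenOk,
        },
    }
}
```

Because the function takes the current time as an argument, each state can be exercised in a unit test without a real clock or file system.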
When an on-disk Codex/Gemini/Qwen token has expired, exchange its
refresh_token for a fresh access token using each vendor's public OAuth
client and cache the result in memory. Vendor credential files stay
read-only; secrets are never logged. Wired into both the OpenAI-shaped
subscription forwarder and the Gemini Code Assist path via a shared
TokenCache on AppState.

This keeps the proxy working even when the vendor CLI is not running to
refresh its own credential file (the key ProxyPal behavior).
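The "refresh once, cache in memory, never write back" behavior can be sketched with a mutex-guarded map. This is a synchronous, std-only illustration and not the TokenCache on AppState: the method name `get_or_refresh`, the 60-second skew, and the closure-based refresh hook are assumptions, and the real refresh is an HTTP call to the vendor's OAuth endpoint.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Refresh slightly early so a token does not expire mid-request (assumed value).
const SKEW: Duration = Duration::from_secs(60);

struct CachedToken {
    access_token: String,
    expires_at: Instant,
}

/// In-memory only: nothing here ever touches the vendor credential files.
struct TokenCache {
    inner: Mutex<HashMap<String, CachedToken>>,
}

impl TokenCache {
    fn new() -> Self {
        TokenCache { inner: Mutex::new(HashMap::new()) }
    }

    /// Return a cached token if it is still fresh; otherwise call `refresh`
    /// (which yields a new token and its lifetime) and cache the result.
    fn get_or_refresh<F>(&self, provider: &str, now: Instant, refresh: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<(String, Duration), String>,
    {
        let mut map = self
            .inner
            .lock()
            .map_err(|_| "token cache lock poisoned".to_string())?;
        if let Some(t) = map.get(provider) {
            if t.expires_at > now + SKEW {
                return Ok(t.access_token.clone());
            }
        }
        let (access_token, ttl) = refresh()?;
        map.insert(
            provider.to_string(),
            CachedToken { access_token: access_token.clone(), expires_at: now + ttl },
        );
        Ok(access_token)
    }
}
```

Holding the lock across the refresh keeps the sketch short and prevents duplicate refreshes; an async implementation would more likely use an async-aware lock or per-provider entries so one slow vendor does not block the others.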
Forward Retry-After and x-ratelimit-* headers from subscription upstreams
to the client on both streaming and buffered responses, so callers can
honor provider throttling and back off.
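The header selection behind this relay is essentially a filter on header names. A minimal sketch, assuming headers are available as name/value string pairs (the function names are hypothetical, and the real code presumably works on the HTTP library's header map):

```rust
/// True for the headers this PR says are relayed: `Retry-After` and
/// anything starting with `x-ratelimit-`. HTTP header names are
/// case-insensitive, so compare in lowercase.
fn is_rate_limit_header(name: &str) -> bool {
    let n = name.to_ascii_lowercase();
    n == "retry-after" || n.starts_with("x-ratelimit-")
}

/// Pick out the upstream headers that should be copied to the client
/// response, preserving their original order and spelling.
fn relay_rate_limit_headers(upstream: &[(String, String)]) -> Vec<(&str, &str)> {
    upstream
        .iter()
        .filter(|(k, _)| is_rate_limit_header(k))
        .map(|(k, v)| (k.as_str(), v.as_str()))
        .collect()
}
```

Using an allow-list rather than copying all upstream headers avoids leaking unrelated upstream details (cookies, internal tracing headers) to the client.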
Add a changelog fragment (minor bump) and README coverage: the new
UPSTREAM_PROVIDER values, credential locations, in-memory refresh, and
dialect translation for the vendor subscription upstreams.
Mark F2-F5/F7/F11 as Done (subscription reading, routing, dialect
translation, in-memory refresh), F8/F10 as Partial, and F6/F9/F12 as
deferred/out-of-scope per the no-UI governing directive.
@konard konard changed the title from "docs(issue-37): ProxyPal multi-provider case study and roadmap" to "feat(issue-37): multi-provider subscription engine (Codex, Gemini, Qwen) + token refresh" Jun 17, 2026
src/openai.rs exceeded the repo's 1000-line file cap after the Codex
Responses-API translation was added. Move the Responses-API request/response
translators (OpenAIResponseRequest, response_to_anthropic,
chat_completion_to_responses, anthropic_to_response) into a dedicated
src/responses.rs module, sharing the field-shaping helpers (extract_text,
translate_tools) via pub(crate). No behaviour change.
@konard konard marked this pull request as ready for review June 17, 2026 19:02
@konard

konard commented Jun 17, 2026

Contributor Author

Working session summary

All tasks are complete. Here's the final status.

Summary

The blocker — a leaked Google OAuth client secret committed in src/refresh.rs (commit c009fae) that GitHub push protection rejected — is resolved, and PR #38 is now fully implemented, green in CI, and marked ready for review.

What I did this session

  1. Removed the leaked secret from history. The hardcoded Gemini OAuth client secret was already replaced in the working tree with env-var resolution (GEMINI_OAUTH_CLIENT_SECRET). I rebuilt the branch from the parent of the offending commit via cherry-pick — re-committing the now-secret-free refresh layer and the three following commits — so the secret exists in no commit anywhere (git log -S confirms empty). Force-pushed successfully; GitHub accepted it.

  2. Fixed the CI failure. The first CI run after the push passed all platform tests but failed the file-size lint: src/openai.rs had grown to 1051 lines (repo cap is 1000) after the Codex Responses-API translation was added. I extracted the Responses-API translators (OpenAIResponseRequest, response_to_anthropic, chat_completion_to_responses, anthropic_to_response) into a new src/responses.rs module, sharing extract_text/translate_tools via pub(crate), and updated the four call sites in proxy.rs. openai.rs is now 860 lines. No behavior change; added a unit test for the projection.

  3. Updated the PR. Rewrote the stale docs-only title/description to reflect the actual implemented engine (multi-provider subscriptions, routing, dialect translation, in-memory refresh, doctor, rate-limit relay), with a security and requirement-status section.

Final verification

  • cargo clippy --all-targets --all-features → 0 warnings
  • cargo fmt --check → clean
  • cargo test → 171 tests pass (112 lib + 49 integration + 10 tooling)
  • CI run 27712450678 (commit a41c937): all jobs success — Lint and Format Check, Test on macOS/Ubuntu/Windows, Build Package
  • Working tree clean; 12 commits ahead of main; branch up to date with main
  • PR feat(issue-37): multi-provider subscription engine (Codex, Gemini, Qwen) + token refresh #38 marked ready for review

PR: #38


This summary was automatically extracted from the AI working session output.

@konard

konard commented Jun 17, 2026

Contributor Author

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

  • Public pricing estimate: $16.175775
  • Calculated by Anthropic: $18.091184
  • Difference: $1.915410 (+11.84%)

📊 Context and tokens usage:

Claude Opus 4.8: (5 sub-sessions)

  1. 110.8K / 1M (11%) input tokens, 5.6K / 128K (4%) output tokens
  2. 114.4K / 1M (11%) input tokens, 24.9K / 128K (19%) output tokens
  3. 117.0K / 1M (12%) input tokens, 33.1K / 128K (26%) output tokens
  4. 116.5K / 1M (12%) input tokens, 42.9K / 128K (34%) output tokens
  5. 83.2K / 1M (8%) input tokens, 25.9K / 128K (20%) output tokens

Total: (93.8K new + 510.8K cache writes + 17.0M cache reads) input tokens, 160.6K output tokens, $16.175775 cost

🤖 Models used:

  • Tool: Anthropic Claude Code
  • Requested: opus
  • Model: Claude Opus 4.8 (claude-opus-4-8)

📎 Log file uploaded as Gist (7692KB)


The working session has now ended; feel free to review and add any feedback on the solution draft.

@konard

konard commented Jun 17, 2026

Contributor Author

✅ Ready to merge

This pull request is now ready to be merged:

  • All CI checks have passed
  • No merge conflicts
  • No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag

@konard konard merged commit 4c33f70 into main Jun 17, 2026
11 checks passed