feat: native OpenAI Codex (ChatGPT Plus) OAuth provider #22
nsyring wants to merge 28 commits into agent0ai:main
Conversation
Adds the headless `_core/openai_codex/` module with Codex endpoint detection, Cloudflare-compatible header helpers, and the module contract documenting the ChatGPT Plus OAuth transport.
- request.js defines CODEX_BASE_URL plus prefix-match detection so `/responses`, `/models`, and future sub-endpoints share one matcher
- applyCodexHeaders() sets the required User-Agent plus originator headers; the Cloudflare layer in front of the Codex endpoint rejects requests without these regardless of token validity
- extractChatGPTAccountId() tolerantly parses the OAuth JWT claim `https://api.openai.com/auth.chatgpt_account_id` using atob() for the browser request-mutation path; malformed tokens silently return "" so the header is just omitted (see the sketch after this list)
- AGENTS.md documents the Cloudflare header requirement, the Chat-Completions-to-Responses request-shape conversion rules, the SSE event mapping tables, the persisted token shape, and the OAuth URL constants
- root AGENTS.md index gains the new module path
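A minimal sketch of the tolerant claim parser, assuming a standard three-part JWT; whether the account id sits under the flat claim name or a nested `auth` object is an assumption here, so the sketch checks both:

```js
// Hypothetical sketch, not the shipped request.js implementation.
function extractChatGPTAccountId(accessToken) {
  try {
    const payloadPart = String(accessToken).split(".")[1] || "";
    // atob() because this runs on the browser request-mutation path.
    const claims = JSON.parse(
      atob(payloadPart.replace(/-/g, "+").replace(/_/g, "/")),
    );
    const id =
      claims["https://api.openai.com/auth.chatgpt_account_id"] ??
      claims["https://api.openai.com/auth"]?.chatgpt_account_id;
    return typeof id === "string" ? id : "";
  } catch {
    return ""; // malformed token: return "" so the header is simply omitted
  }
}
```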
Pure stateless converter from OpenAI Chat-Completions bodies into
Codex Responses-API bodies, with 11 unit tests covering the shape
rules documented in the module AGENTS.md.
- the first system message lifts into the top-level `instructions`
string and drops from `input`
- remaining user/assistant messages become `input[]` entries with
`content: [{ type: "input_text", text }]`
- multimodal `text` parts stay as `input_text`, `image_url` parts
become `input_image`
- `max_output_tokens`, `temperature`, and other Chat-Completions-only
fields are stripped because Codex rejects them with HTTP 400
- `store: false` is always forced so Codex does not retain completions
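To make the shape rules concrete, a hypothetical before/after pair (field values are illustrative, not taken from the code):

```js
// Chat-Completions input:
const chatBody = {
  model: "gpt-5.4-mini",
  temperature: 0.7,        // Chat-Completions-only: stripped (Codex rejects with 400)
  max_output_tokens: 1024, // likewise stripped
  messages: [
    { role: "system", content: "You are helpful." },
    { role: "user", content: "Hi" },
  ],
};

// chatToResponsesRequest(chatBody) then yields roughly:
const responsesBody = {
  model: "gpt-5.4-mini",
  instructions: "You are helpful.", // first system message lifted, dropped from input
  input: [{ role: "user", content: [{ type: "input_text", text: "Hi" }] }],
  store: false, // always forced so Codex does not retain completions
};
```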
Pure stateless mapper from Codex Responses-API SSE events into the
Chat-Completions-shaped delta frames the existing space-agent SSE
parsers already understand, with 15 unit tests covering single-event
mapping plus realistic multi-event sequences.
- `response.output_text.delta` and `response.refusal.delta` emit
`{ choices: [{ delta: { content: delta }, index: 0 }] }` frames
- `response.completed` synthesizes a finish frame with mapped usage
tokens plus a `[DONE]` marker so the existing Chat-Completions
stream reader terminates cleanly; the Responses-API does not emit
a native `[DONE]` line
- `response.incomplete` maps the Codex reason onto the closest
Chat-Completions finish_reason (`max_output_tokens` -> `length`,
`content_filter` passes through)
- `response.failed` and standalone `error` events throw with the
upstream message so the transport layer surfaces the error
- all other events (content_part.added/done, output_item.added/done,
reasoning, audio, tool-calls, code_interpreter, file_search,
web_search, image_gen, mcp, queued, annotations, custom_tool_call
input, future unknowns) are skipped silently to avoid unknown-event
log noise
- text is accumulated live from delta events rather than from the
final `response.completed.response.output` because the Codex
endpoint has been observed returning an empty `output` array even
when deltas streamed correctly
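A hedged illustration of the mapping contract; the exact event payload fields here are assumptions, while the frame shapes follow the rules above:

```js
// Delta events become Chat-Completions-style frames:
mapCodexEventToChatFrames({ type: "response.output_text.delta", delta: "Hel" });
// -> [{ choices: [{ delta: { content: "Hel" }, index: 0 }] }]

// response.completed synthesizes a finish frame plus "[DONE]", since the
// Responses API never emits a native [DONE] line:
mapCodexEventToChatFrames({ type: "response.completed", response: { usage: {} } });

// Everything unmapped is skipped silently:
mapCodexEventToChatFrames({ type: "response.output_item.added" }); // -> []
```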
Adds three authenticated endpoints that own the OAuth device-code flow
and refresh-token rotation against `auth.openai.com`, backed by a new
`server/lib/openai_codex/` helper subsystem.
Endpoints:
- `POST /api/openai_codex_auth_start` -> returns
`{ deviceAuthId, userCode, verificationUrl, interval, expiresIn }`
- `POST /api/openai_codex_auth_poll` -> returns `{ status: "pending" }`
until the user authorizes, then `{ status: "complete", tokens }`
- `POST /api/openai_codex_token_refresh` -> returns a refreshed token
payload; maps OAuth `invalid_grant` onto HTTP 401 so the frontend can
prompt a fresh login when the refresh token was consumed elsewhere
Why this lives on the server:
OpenAI refresh tokens use single-use rotation. A frontend-only
implementation cannot serialize concurrent tab refreshes safely: both
calls post the same single-use token, one succeeds, one returns
`invalid_grant`, and the only valid refresh token is lost. Full
re-authentication becomes the only recovery. This is a shared-data
integrity concern under the rule in `/server/AGENTS.md`.
The server layer provides:
- `oauth_client.js` pure transport functions for the three OAuth calls
with defensive JSON body parsing, JWT account-id extraction from the
`access_token` (never `id_token`), and `502` mapping for upstream
failures
- `refresh_lock.js` in-process single-writer coalescer keyed by the
refresh-token string; documented limitation: in clustered runtime
(WORKERS>1) different workers still race, which is acceptable for
the single-user single-browser-profile scenario and documented in
`server/lib/openai_codex/AGENTS.md`
Token persistence stays on the frontend under `userCrypto:`-prefixed
encryption; the server never reads or writes tokens from the app tree.
No revoke endpoint is exposed because logout clears the encrypted
config entry and access tokens expire within about an hour.
Docs:
- `server/lib/openai_codex/AGENTS.md` new contract doc
- `server/api/AGENTS.md` documents the new endpoint family and its
backend-ownership rationale
- `server/AGENTS.md` index and structure updated
- root `AGENTS.md` index updated
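A minimal sketch of the single-writer coalescing idea in `refresh_lock.js`, keyed by the refresh-token string as described; names and shape are illustrative:

```js
// Concurrent refreshes presenting the same single-use token share one
// upstream POST, so the token is only ever consumed once per rotation.
const inFlight = new Map();

function coalesceRefresh(refreshToken, doRefresh) {
  let pending = inFlight.get(refreshToken);
  if (!pending) {
    pending = Promise.resolve()
      .then(doRefresh)
      .finally(() => inFlight.delete(refreshToken));
    inFlight.set(refreshToken, pending);
  }
  return pending; // every caller receives the same rotated-token payload
}
```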
Browser-side helper that wraps the three OAuth backend endpoints and
guarantees the locally persisted refresh token is the one actually
used at refresh time, avoiding single-use-rotation loss across tabs.
- `ensureFreshCodexAccessToken({ loadTokens, saveTokens, ... })`
re-reads persisted tokens on every call rather than trusting an
in-memory copy; another tab or process may have rotated the refresh
token, and the single-use rotation rule means a stale in-memory
refresh_token would fail with invalid_grant on next refresh
- refresh is triggered when the access token is within the default
300s safety margin of expiry; the persisted `expires_at` timestamp
is computed server-side from the OAuth `expires_in` response so both
chat surfaces share one source of truth
- concurrent refresh calls that observe the same stale refresh_token
are coalesced into one network request via an in-module map of
in-flight promises, on top of the separate server-side mutex in
`server/lib/openai_codex/refresh_lock.js`
- `saveTokens` failures are logged via console.warn but do not block
the active request; the refreshed tokens are still returned so the
LLM call can proceed
- thin wrappers `startCodexDeviceAuthorization` and
`pollCodexDeviceAuthorization` expose the first two OAuth endpoints
for the upcoming settings UI
- 11 unit tests cover not-expiring, refreshing, always-fresh read,
concurrent coalesce, missing tokens, invalid_grant propagation, and
save-failure resilience
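A sketch of the refresh trigger under the stated 300 s margin; that `expires_at` is a millisecond epoch timestamp is an assumption:

```js
const EXPIRY_SAFETY_MARGIN_MS = 300 * 1000;

async function accessTokenNeedsRefresh(loadTokens) {
  // Always re-read persisted tokens: another tab may have rotated them.
  const tokens = await loadTokens();
  if (!tokens?.access_token) return true;
  return Date.now() >= (tokens.expires_at ?? 0) - EXPIRY_SAFETY_MARGIN_MS;
}
```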
Adds the shipped Codex model catalog used by the settings UI and the stateful controller that drives the device-code login UX against the three OAuth backend endpoints. Also extends the overlay config enum with the `openai-codex` provider variant so upcoming UI wiring has a third tab to bind to.
- `models.js` ships the 6 Codex models observable from ChatGPT Plus subscriptions and exports `CODEX_DEFAULT_MODEL_ID` (`gpt-5.4-mini`) as the recommended default: cheapest, fastest, and the most quota-friendly option. Live discovery via `GET /backend-api/codex/models?client_version=1.0.0` is left as follow-up rather than MVP because it would add a second network call to the login UX.
- `auth_flow.js` emits a small finite state machine (`STARTING` -> `PENDING` -> `COMPLETE` | `FAILED`) so the settings UI can display the verification URL and user code while polling runs, and aborts cleanly via `AbortSignal` when the user cancels. It uses the poll interval returned by the OAuth server but enforces a 3-second floor so we never hammer the endpoint if the server sends something unusable.
- `onscreen_agent/config.js` gains the `openai-codex` provider enum value plus new settings fields (`codexModel`, `codexTokens`), and `normalizeOnscreenAgentLlmProvider` now recognizes the third variant. UI and transport wiring land in the next two commits.
Integrates the `openai-codex` provider into the overlay chat surface: a new LLM-client subclass with a Codex-aware SSE reader, the request-mutation hook that swaps Chat-Completions shape for Responses-API shape and adds the Cloudflare-required headers, and the settings UI that drives the device-code login.
Transport:
- `OnscreenAgentCodexLlmClient` extends the shared base client and uses a dedicated `readCodexStreamingResponse` that feeds raw SSE event blocks through `mapCodexEventToChatFrames` so the upstream per-delta callback contract stays identical for all providers
- the OpenRouter-style Chat-Completions SSE reader is left untouched; Codex is an additive subclass rather than a core refactor
- `createOnscreenAgentLlmClient` now dispatches on three provider variants (API / CODEX / LOCAL)
Request hook (`ext/js/_core/onscreen_agent/api.js/prepareOnscreenAgentApiRequest/end/openai-codex.js`):
- detects Codex-provider settings and rewrites the prepared request in place: `requestUrl` becomes the Codex `/responses` endpoint, `requestBody` goes through `chatToResponsesRequest`, headers add the Cloudflare originator plus extracted `ChatGPT-Account-ID`
- ensures a fresh access token through the always-fresh-read `ensureFreshCodexAccessToken` helper, loading persisted tokens from the `userCrypto:`-encrypted `codex_tokens` entry in `~/conf/onscreen-agent.yaml` and saving refreshed tokens back into the same file so other tabs pick up the rotation
Settings UI:
- third segmented-control tab labeled `ChatGPT` next to the existing `API` and `Local` tabs
- three UI states: logged-out (explainer + Sign in button), login-pending (verification URL + user code + Cancel), and signed-in (account summary + model dropdown + Sign out)
- login flow runs through `runCodexDeviceAuthorizationFlow` with an AbortController so Cancel stops polling immediately
- model dropdown is populated from `CODEX_MODEL_CATALOG` and defaults to `gpt-5.4-mini`
Docs:
- onscreen_agent AGENTS.md now documents the three-tab contract and references the Codex flow + token-storage rule
Mirrors the overlay integration for the admin chat surface. The admin runtime is function-based rather than class-based, so the shape differs from the overlay wiring but the user-facing behavior and transport contract are identical.
Transport:
- `streamAdminAgentCodexCompletion` is a new transport function added next to `streamAdminAgentApiCompletion`; it validates Codex-specific settings and dispatches through the shared Codex request hook so the outbound body and headers end up in Responses-API shape with the Cloudflare-required originator
- `readCodexStreamingResponse` owns SSE decoding through the shared `mapCodexEventToChatFrames` mapper; the Chat-Completions SSE reader is unchanged so OpenRouter and other OpenAI-compatible providers are untouched
- `streamAdminAgentCompletion` now dispatches on three provider variants (LOCAL / CODEX / API)
- the existing SSE reader is duplicated in admin and overlay scopes; that duplication is pre-existing and intentionally left alone for this PR to keep the change surface narrow
Request hook (`ext/js/_core/admin/views/agent/api.js/prepareAdminAgentApiRequest/end/openai-codex.js`):
- detects Codex-provider settings and rewrites the prepared request to the Codex `/responses` endpoint with shape conversion, fresh-token injection, and Cloudflare headers
- reads and writes persisted tokens in `~/conf/admin-chat.yaml` under `codex_tokens` as `userCrypto:`-encrypted ciphertext; admin and overlay keep separate token files so the two surfaces can technically use different ChatGPT accounts, and refresh-token rotation on one surface never races the other
Settings UI:
- third segmented-control tab labeled `ChatGPT` next to `API` and `Local`, with the same three UI states (logged-out / pending / signed-in) and abort-controlled login flow as the overlay
- adds a scope notice on the logged-out state pointing to the overlay settings so users know the login does not carry across surfaces, which mirrors the separate token-file decision above
Docs:
- admin AGENTS.md now documents the three-tab contract and the admin-scoped Codex login rules
Adds a public-facing section to the README describing the new `ChatGPT` provider tab in the overlay and admin chat, with setup steps, cross-surface scope notes, and troubleshooting for the four failure modes most likely to confuse operators:
- Cloudflare `cf-mitigated: challenge` 403 when a downstream edit strips the required originator headers
- `invalid_grant` after a parallel Codex CLI or VS Code extension session rotated the same refresh token
- `response.completed.response.output` being empty despite streamed deltas (we accumulate from deltas by design)
- Chat-Completions-only body fields being rejected with HTTP 400
Also adds an explicit `Requirements` subsection naming ChatGPT Plus as the verified plan, and a ToS disclaimer covering the use of the official OpenAI Codex OAuth flow.
Two bugs blocking the device-code flow against the live endpoint:
1. The OpenAI device-code response uses `interval` as a string (e.g.
`"5"`) and encodes expiry as an ISO-8601 `expires_at` timestamp
rather than an `expires_in` seconds field. The previous
`Number.isFinite(payload.interval)` check rejected the string and
fell back to the 3-second minimum; `expires_in` was always missing
so the flow quietly used the 900-second default. Parse both
defensively and derive seconds from `expires_at` when `expires_in`
is absent.
2. The poll endpoint returned `{ status: "pending" }` and
`{ status: "complete", tokens }`. The router's HTTP-response-shape
heuristic (`server/router/responses.js`) treats any top-level
`status` key as an HTTP status code, so `Number("pending")` became
`NaN` and `writeHead` crashed with `ERR_HTTP_INVALID_STATUS_CODE`.
Rename the semantic field to `state` on both the server return and
the frontend poll handler.
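A sketch of the defensive parse for point 1; field names come from the live response described above, fallback values from this commit:

```js
// interval may be the string "5"; expiry may arrive only as ISO-8601 expires_at.
const intervalSeconds = Math.max(3, Number(payload.interval) || 0);
const expiresInSeconds =
  Number(payload.expires_in) ||
  Math.max(0, Math.round((Date.parse(payload.expires_at) - Date.now()) / 1000)) ||
  900; // last-resort default, as before
```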
… for user
Codex's Responses API rejects `input_text` parts that sit under a `role: "assistant"` input entry with HTTP 400 `invalid_value` on `input[n].content[0]`. The two text-types are source-scoped:
- user-authored turns use `input_text`
- assistant-authored turns use `output_text`
`convertContentParts` now takes the target text-type from its caller, and `extractInstructionsAndInput` derives that type per message from the role. System messages still render as plain text for the `instructions` field, so the internal pass uses `input_text` there as a neutral default.
The previously recorded test that asserted `input_text` for assistant strings was itself the bug frozen into an expectation; it is updated to assert the correct split. A separate regression test now walks the full `input[]` and fails fast if any assistant entry ever emits `input_text` again.
The hook files previously reached directly into the file API to read
and write the encrypted `codex_tokens` entry of each surface's YAML
config. That bypassed the store, so `saveSettingsFromDialog` never
copied `codexTokens` into `this.settings` and
`buildStoredConfigPayload` never encoded the tokens at all. The result: users stayed
signed-in for the duration of the dialog draft, but every reload
reverted to the signed-out state because the tokens were never
persisted through the canonical save path.
Clean the contract up so token persistence follows the same pattern
as `api_key`:
- New `app/L0/_all/mod/_core/openai_codex/token_envelope.js` owns the
encrypt/decrypt envelope, lock-state preservation, and the
SINGLE_USER_APP bypass, mirroring the existing `encodeStoredApiKey`
/ `decodeStoredApiKey` helpers rather than duplicating that logic in
each surface's storage.
- `onscreen_agent/storage.js` and `admin/views/agent/storage.js` now
persist `codex_model` and `codex_tokens` in their YAML payload, and
load them back through the new envelope helper. Lock-state
bookkeeping (`storedCodexTokensLocked`, `storedCodexTokensValue`)
is preserved across saves so a session that cannot currently
decrypt the ciphertext does not wipe it on save.
- Both stores now include `codexModel`, `codexTokens`, and the
stored-lock bookkeeping fields in their `settings` and
`settingsDraft` templates, and `saveSettingsFromDialog` copies those
fields from draft to live settings. A new
`applyRefreshedCodexTokens(tokens)` method accepts rotated tokens
from the hook, updates live settings, keeps the dialog draft in
sync while the dialog is open, and awaits `persistConfig` so the
refreshed refresh_token is written back before the next request.
- The overlay and admin request hooks now read tokens from
`settings.codexTokens` (already decrypted by storage.js during
load) and, after a refresh, hand the new tokens to
`Alpine.store("onscreenAgent" | "adminAgent").applyRefreshedCodexTokens(tokens)`.
No hook ever reads or writes the YAML file
directly anymore.
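With that contract, the hook-side handoff after a refresh reduces to a store call (store names per this commit; the surrounding plumbing is elided):

```js
// Inside the request hook, after ensureFreshCodexAccessToken rotates tokens:
const store = Alpine.store("onscreenAgent"); // "adminAgent" on the admin surface
await store.applyRefreshedCodexTokens(tokens); // persists before the next request
```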
Adds runtime discovery of the Codex model catalog against `https://chatgpt.com/backend-api/codex/models?client_version=1.0.0`. The live list is account-scoped: a ChatGPT Plus account may expose different models than the hardcoded 6-entry fallback, and new models appear upstream faster than we can refresh a static list.
Split into two files so the parser is independently testable:
- `models_parser.js` is a pure `parseCodexModelsResponse(payload)` that filters entries with `supported_in_api === false` or `visibility` in {`hide`, `hidden`} (case-insensitive), keeps the `slug` as the outbound `id`, prefers `description` over `display_name` for the subtitle, and sorts by `(priority, slug)` ascending to match the reference Codex-rs ordering. No filter on model-id prefix, so future model names (`gpt-5.5`, `gpt-6`, ...) flow through without a code change.
- `models_discovery.js` wraps the parser with a browser fetch helper that reuses `applyCodexHeaders()` for the Cloudflare originator plus bearer/account-id headers, tries a direct fetch first, falls back to the existing `/api/proxy` (space-agent outbound-proxy infrastructure, not a new backend endpoint) on CORS or network errors, and silently returns `[]` on any failure so the settings UI can fall back to the static `CODEX_MODEL_CATALOG`.
7 parser tests cover happy-path parsing, hidden-visibility filtering, unsupported-in-api filtering, priority+slug sort ordering, missing slug rejection, `display_name` fallback for descriptions, and malformed-payload tolerance.
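A hedged sketch of the documented parser rules; the payload envelope (`payload.models`) and the exact field handling are assumptions:

```js
// Sketch only: filters, id/description selection, and ordering per the rules above.
function parseCodexModelsResponse(payload) {
  const entries = Array.isArray(payload?.models) ? payload.models : [];
  return entries
    .filter((m) => typeof m?.slug === "string" && m.slug) // missing slug rejected
    .filter((m) => m.supported_in_api !== false)
    .filter((m) => !["hide", "hidden"].includes(String(m.visibility ?? "").toLowerCase()))
    .sort((a, b) => ((a.priority ?? 0) - (b.priority ?? 0)) || a.slug.localeCompare(b.slug))
    .map((m) => ({
      id: m.slug, // slug is kept as the outbound id
      description: m.description || m.display_name || "",
    }));
}
```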
Both chat-surface stores now merge the discovered runtime catalog with
the static fallback and expose it through the existing
`codexModelCatalog` getter, so panel.html needs no conditional
branches. Runtime entries win on matching `id` so live descriptions
and ordering are preserved; static-only entries trail so the dropdown
always has content even when discovery fails.
- `refreshCodexModelCatalog({ force = false })` is the one entry point.
It caches the runtime catalog in memory with a 10-minute TTL,
coalesces concurrent callers through a shared in-flight promise,
and returns `[]` with the TTL armed when no access token is
available (so we do not re-probe for a token that will never
arrive during the current session).
- The settings dialog triggers a non-forced refresh on open; a
non-authenticated user transparently no-ops. A forced refresh
happens on successful login so the dropdown reflects the live list
before the user picks a model.
- New `isCodexSelectedModelInCatalog` getter + UI fallback option
keeps the user's currently-configured model selectable and readable
even when the live catalog no longer lists it (account downgrade,
deprecation). A field-note warns the user that chats may fail and
suggests switching.
Admin and overlay receive mirror changes: parallel state fields,
parallel merge helpers, parallel getter + dialog trigger. The admin
runtime still does not expose a runtime namespace, so the Alpine
store is the contract between hook and store the same way the
token-rotation refactor introduced it.
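The merge rule itself is small; a sketch (helper name illustrative):

```js
// Runtime entries win on matching id; static-only entries trail so the
// dropdown always has content even when discovery fails.
function mergeCodexModelCatalogs(runtimeCatalog, staticCatalog) {
  const runtimeIds = new Set(runtimeCatalog.map((m) => m.id));
  return [...runtimeCatalog, ...staticCatalog.filter((m) => !runtimeIds.has(m.id))];
}
```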
…proxy
The previous direct-first / proxy-second fallback was doomed: the
`chatgpt.com/backend-api/codex/models` endpoint sends no
`Access-Control-Allow-Origin` header, so every direct browser fetch
ends with a CORS block. The block surfaces differently depending on
the preflight outcome (opaque `Response` with `status: 0` on some
browsers, synthetic HTTP 400, or a generic `TypeError` on others),
and the fallback condition `err.message.includes("CORS")` was brittle
across those variants.
Route every discovery request through the space-agent outbound proxy
(`space.proxy.buildUrl(...)` -> `/api/proxy`) unconditionally. This is
the existing cross-origin-read infrastructure shared by other frontend
modules, does not require a new backend endpoint, and the proxied URL
is same-origin so CORS is never a factor. When the runtime does not
expose `space.proxy.buildUrl` (test environment, stripped runtime)
the helper now short-circuits to an empty array so callers still
receive the static catalog.
AGENTS.md updated to describe the single-path behavior and the reason
the direct-first attempt was removed.
The proxied model-discovery request landed on `/api/proxy` with
`credentials: "omit"`, so the browser did not attach the
`space_session` cookie. The space-agent authenticated-by-default
API routing then rejected the request with HTTP 401
`{"error": "Authentication required"}` before it could reach the
Codex endpoint, and the signed-in user kept seeing only the static
model catalog.
Switch to `credentials: "same-origin"` — the request URL is always
same-origin because `space.proxy.buildUrl(...)` rewrites to the
space-agent `/api/proxy` endpoint. The proxy explicitly strips the
`cookie` header before forwarding upstream (see
`UPSTREAM_REQUEST_HEADERS_TO_STRIP` in `server/router/proxy.js`), so
this change does not leak space-agent session state to `chatgpt.com`.
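The fix is one fetch option; a sketch of the discovery call as described, assuming `applyCodexHeaders` mutates a plain headers object:

```js
const headers = {};
applyCodexHeaders(headers); // Cloudflare originator + bearer/account-id headers

// The proxied URL is same-origin, so "same-origin" attaches space_session to
// /api/proxy only; the proxy strips the cookie header before forwarding.
const response = await fetch(space.proxy.buildUrl(CODEX_MODELS_ENDPOINT), {
  credentials: "same-origin", // previously "omit", which caused the HTTP 401
  headers,
});
```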
The Codex `/models` endpoint requires the `client_version` query
parameter; omitting it returns HTTP 400 `invalid_request_error` with
`loc: ('query', 'client_version'), msg: 'Field required'`. The
previous `CODEX_MODELS_ENDPOINT` constant resolved to the bare path
without the query, so every live discovery call was rejected and the
settings dropdown kept falling back to the static catalog.
The value itself is not account-scoped and only identifies the caller
surface; we advertise `0.0.0` to match the User-Agent prefix we
already send for Cloudflare compliance.
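The constant now bakes the query in; a sketch reflecting the advertised value:

```js
// client_version identifies the caller surface only; 0.0.0 matches the
// User-Agent prefix already sent for Cloudflare compliance.
const CODEX_MODELS_ENDPOINT =
  "https://chatgpt.com/backend-api/codex/models?client_version=0.0.0";
```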
`summarizeOnscreenAgentLlmSelection` and `summarizeAdminAgentLlmSelection` fell through to `summarizeLlmConfig(apiEndpoint, model)` for any non-local provider, which meant a signed-in Codex session displayed the unconfigured API-tab default (`anthropic/claude-sonnet-4.6`) next to the composer and throughout the thread view, even though the active transport was Codex with `gpt-5.4` (or whichever Codex model the user selected).
Add an explicit `CODEX` branch to both summaries that reads `settings.codexModel`, normalized through `normalizeCodexModelId()` so empty or missing values fall back to the shipped default `gpt-5.4-mini` rather than an empty string. The API branch remains unchanged so existing OpenRouter and other OpenAI-compatible setups are unaffected.
Saving Codex settings previously flashed the status text `API chat settings updated.` (overlay) or `API LLM settings updated.` (admin), which is misleading because the active provider at save time is Codex, not the OpenAI-compatible API tab. Both stores now branch explicitly on `CODEX` and surface `ChatGPT settings updated.` so the confirmation matches the tab the user just saved from.
The module cannot be fully exercised without an active ChatGPT Plus subscription, so the AGENTS.md now spells out explicitly what a reviewer can and cannot verify without credentials. Pure-function tests, endpoint auth-gate registration, and module-hierarchy import cleanliness are reviewable without a subscription; the OAuth flow, live chat, refresh, and live catalog discovery are not. This prevents reviewers from either (a) rejecting the PR because they cannot reach end-to-end verification locally or (b) trying to use personal credentials and leaving them in temp state afterwards.
Commit c637738 renamed the poll endpoint's return field from `status` to `state` to avoid collision with the shared router's HTTP-status heuristic in `server/router/responses.js`, but the two AGENTS.md descriptions of the endpoint contract were not updated. This brings server/api/AGENTS.md and server/lib/openai_codex/AGENTS.md in line with the code and notes the reason for the unusual field name so future readers do not try to "fix" it back to `status`.
`OnscreenAgentCodexLlmClient.validateSettings` checked `settings.model`, but the Codex request hook reads the model slug from `settings.codexModel` (with `settings.model` only as a last-resort fallback when `codexModel` is empty). The check therefore passed for any user whose API tab still held the default OpenRouter model `anthropic/claude-sonnet-4.6`, even when Codex was the active provider and `codexModel` was blank, and the subsequent Codex request would fail with an unrelated upstream error.
Switch the check to the admin-side pattern (`!codexModel && !model`) so both fields must be empty before the "choose a Codex model" error surfaces. The default in `DEFAULT_ONSCREEN_AGENT_SETTINGS.codexModel` means this rarely fires in practice, but the asymmetry between the overlay and admin clients was a real drift.
The exported `isCodexEndpoint` matcher was never called — the two extension hooks gate on `settings.provider === "openai-codex"` directly, which is stricter (provider intent) than URL matching (could false-positive if a user pointed the API tab at Codex manually). Per the project policy against unused compat shims and mirrored code paths, remove the helper and its AGENTS.md paragraph and add a short explanation of why the gate is the settings field rather than a URL matcher.
Both surface stores defined near-identical local helpers (`parseCodexTokensDraft`/`serializeCodexTokensDraft` in the overlay, the `parseAdminCodexTokensDraft`/`serializeAdminCodexTokensDraft` pair in the admin store) to turn the encrypted YAML `codex_tokens` payload into an in-memory object and back. Promote the two helpers into `token_envelope.js` as `parseCodexTokens` and `serializeCodexTokens`, extend `parseCodexTokens` to also accept a plain object (for the pre-parsed in-memory case), and have both stores import them under their previous local names so call sites stay unchanged. This removes four duplicate helper definitions without touching the persistence contract, and replaces the dangling re-exports in `token_envelope.js` with actual consumers, satisfying the project rule against unused compat shims.
…anel
The admin settings panel has a field-note explaining that the Codex sign-in is scope-local to the admin chat and that the overlay needs its own sign-in; the overlay panel had no such reverse note, so a user who signed in through the overlay first had no in-UI hint that the admin chat would still require a separate login. Mirror the note on the overlay panel with symmetric wording so the scope separation is visible from either starting point.
The poll loop previously awaited `pollIntervalSeconds * 1000` at the top of every iteration, including the first. A user who entered the code into the ChatGPT browser before the space-agent UI had painted the pending panel still waited the full interval (default 5 s) for the first server poll. Skip the wait on the first iteration and keep it on every subsequent iteration so fast humans see completion near-instantly without changing the poll cadence for the common case.
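The loop change is a one-line guard; a sketch with placeholder helpers (`sleep` and `pollOnce` are illustrative; `state` follows the renamed poll field):

```js
// Skip the delay only on the first iteration; cadence is unchanged afterwards.
for (let attempt = 0; ; attempt += 1) {
  if (attempt > 0) await sleep(pollIntervalSeconds * 1000);
  const result = await pollOnce();
  if (result.state !== "pending") break;
}
```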
Both storage layers built the YAML payload with a truthy-guarded `codex_tokens` assignment: when the user signed out the key was simply omitted from the new payload. Because `fileWrite` replaces the file contents outright this was sufficient today, but it leaves the guarantee implicit; a future change that starts merging payloads, or an accidental in-place patch helper, could leave the previous ciphertext in place after sign-out. Make the clearing explicit by `delete`-ing the key in the sign-out branch so the contract "empty in memory means empty on disk" is visible to future maintainers.
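A sketch of the explicit sign-out branch; the payload key is from this commit, while the encode helper name is illustrative:

```js
// "Empty in memory means empty on disk", made visible even though
// fileWrite currently replaces the whole file anyway.
if (settings.codexTokens) {
  payload.codex_tokens = encodeCodexTokensEnvelope(settings.codexTokens);
} else {
  delete payload.codex_tokens;
}
```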
The standard prepareOnscreenAgentCompletionRequest / prepareAdminAgentApiRequest flow already routes its API URL through `space.proxy.buildUrl(...)` for proxyable external endpoints. The Codex hooks override `requestUrl` with the bare CODEX_RESPONSES_ENDPOINT and therefore re-introduce the proxy bypass: every Codex chat call paid a failed-direct-fetch roundtrip rescued by `installFetchProxy(...)`'s fallback retry, and emitted a red `Access-Control-Allow-Origin` CORS error in the DevTools console on the first call of every page load.
Mirror the pattern already used by models_discovery.js: route CODEX_RESPONSES_ENDPOINT through `space.proxy.buildUrl(...)` explicitly and set `requestInit.credentials = "same-origin"` so the proxy endpoint receives the browser session cookie. The proxy strips the cookie header before forwarding upstream, so this does not leak `space_session` to chatgpt.com. The existing `applyCodexHeaders()` Cloudflare originator plus User-Agent block continues to run with the proxied URL.
Document the proxy routing in openai_codex/AGENTS.md alongside the existing Cloudflare-header anti-refactor warning so a future cleanup pass does not silently revert the routing.
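The hook change mirrors models_discovery.js; a sketch, with the prepared-request object shape assumed:

```js
// Route the chat call through the outbound proxy instead of fetching direct.
prepared.requestUrl = space.proxy.buildUrl(CODEX_RESPONSES_ENDPOINT);
prepared.requestInit.credentials = "same-origin"; // cookie stops at /api/proxy
applyCodexHeaders(prepared.requestInit.headers); // Cloudflare block still applied
```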
ThomasR101 left a comment
Approved. I'm building a Claude Code subscription provider layer too, similar to this; thanks for the initial planning of the implementation for adding providers.
Thanks for the approval — and very cool that the provider scaffolding is already useful for the Claude Code subscription layer. Two things I'd like to add to main, if useful:
Cheers,
@nsyring works great locally, but when I try to run it over Caddy (reverse proxy) to serve it over DNS, it throws a 403 after authenticating.
@yieldf can you please provide more information about your setup? docker-compose.yml, or similar? I will check it and provide a fix, if possible.
Summary
Adds a third LLM provider (`openai-codex`) that routes chat completions through the official OpenAI Codex OAuth device-code flow, so users with a ChatGPT Plus subscription can use it directly inside space-agent without a separate OpenAI Platform API key.
The provider is additive: the existing OpenRouter / OpenAI-compatible API path and the Hugging Face local path are byte-compatible with `main`. The Chat-Completions SSE parser is not modified. A dedicated LLM-client subclass (overlay) and transport function (admin) own the Responses-API SSE event stream, so OpenRouter regressions are structurally impossible from this change.
This provider uses your ChatGPT Plus subscription via the official OpenAI Codex OAuth flow, the same device-code flow the `codex` CLI and the Codex VS Code extension use. OpenAI's terms of service apply. Use at your own risk.
Screenshots
1. Logged-out state (overlay settings, ChatGPT tab)

2. Login-pending state (device code + verification URL)

3. Signed-in state (live model catalog from discovery)

(Attaching after PR opens — logged-out / login-pending / signed-in / successful Codex chat / DevTools headers proof)
Tested with
- `gpt-5.4-mini` and `gpt-5.4`, overlay chat + admin chat, fresh-login + refresh + long-session paths
- (`npm run desktop:dev`) — both run the new provider end-to-end
- `node --test tests/openai_codex_*_test.mjs`
- `node space serve` starts cleanly; `curl` against the three `/api/openai_codex_*` endpoints without a session returns HTTP 401 with the standard space-agent auth-gate body
- `/api/openai_codex_token_refresh`: the rotated `refresh_token` lands in the encrypted config, and chat continues without user interaction
Why this touches the server
Space Agent prefers frontend implementations. This feature adds three authenticated server endpoints (`openai_codex_auth_start`, `openai_codex_auth_poll`, `openai_codex_token_refresh`) because OpenAI refresh tokens use single-use rotation: if two browser tabs refresh concurrently, one succeeds and the other returns `invalid_grant`, discarding the only valid refresh token and forcing a full re-login. A frontend-only implementation cannot provide the serialization needed to prevent that loss.
This falls under shared-data integrity per `/server/AGENTS.md`. The server-side layer is intentionally thin: it owns OAuth HTTP traffic and an in-process single-writer mutex (`server/lib/openai_codex/refresh_lock.js`), and never reads or writes tokens itself — token persistence stays on the frontend under `userCrypto:`-encrypted entries in `~/conf/onscreen-agent.yaml` and `~/conf/admin-chat.yaml`, via the same envelope pattern the existing `api_key` field already uses.
Known limitation: the mutex is in-process, so clustered deployments (`WORKERS>1`) can still race between workers. This is acceptable for the typical single-user single-browser-profile scenario; clustered-runtime hardening is noted as a follow-up in `server/lib/openai_codex/AGENTS.md`.
Proxy Interaction (important for reviewers)
The Codex endpoint sits behind Cloudflare bot-detection that rejects standard browser User-Agent strings. Direct browser fetches to `chatgpt.com/backend-api/codex/*` are blocked; the request must route through the existing `server/router/proxy.js`, which forwards the explicit `User-Agent: codex_cli_rs/...` and `originator: codex_cli_rs` headers set in `applyCodexHeaders()` (browsers silently ignore JavaScript-set `User-Agent` values, so the proxy path is the only way the Cloudflare originator arrives at the upstream). Without the proxy, this feature cannot function — it is a factual dependency of the implementation, not a preference.
The proxy is not a new backend endpoint; it is the existing space-agent outbound proxy shared with other cross-origin reads. The Cookie header is stripped by the proxy before the upstream forward, so `space_session` never leaks to `chatgpt.com`.
Architecture
Mirrored wiring across the two chat surfaces. The overlay uses the existing client-subclass pattern; the admin runtime is function-based so Codex ships as a new transport function alongside the existing API / Local dispatcher:
| | Overlay | Admin |
| --- | --- | --- |
| Entry point | `OnscreenAgentCodexLlmClient` subclass | `streamAdminAgentCodexCompletion` function |
| SSE reader | `readCodexStreamingResponse` | `readCodexStreamingResponse` (duplicated) |
| Request hook | `ext/js/_core/onscreen_agent/.../openai-codex.js` | `ext/js/_core/admin/views/agent/.../openai-codex.js` |
| Token file | `~/conf/onscreen-agent.yaml` | `~/conf/admin-chat.yaml` |
| Settings tab | `ChatGPT` | `ChatGPT` |
The two chat surfaces keep their existing Chat-Completions SSE parser duplication (pre-existing before this PR). Consolidating that parser is tracked as a follow-up; this PR mirrors Codex into both surfaces analogously to keep the change surface narrow.
Request-shape transformation (`chatToResponsesRequest`) and SSE event mapping (`mapCodexEventToChatFrames`) live in pure, testable helpers under `app/L0/_all/mod/_core/openai_codex/`. Each is fully covered by unit tests, including regression guards for the bugs surfaced during live testing.
No new runtime dependencies
Implementation uses only Node built-ins (`crypto`, `http`, `Buffer`) and existing space-agent utilities (`space.api`, `space.utils.userCrypto`, `space.utils.yaml`, `space.proxy`). Zero new npm dependencies added. `package.json` and `package-lock.json` are unchanged.
No regression for the existing API path
OpenRouter and other OpenAI-compatible API-provider configurations are byte-compatible with `main`:
- `api_key` encoding/decoding is untouched (the new `codex_tokens` path runs alongside)
- `createOnscreenAgentLlmClient` / `streamAdminAgentCompletion` gain one additional provider branch; existing branches are unchanged
- `API` and `Local` tabs render and save identically
Browser support matrix unchanged: no new polyfills; the new code uses `atob()`, `fetch()`, and `AbortController`, all widely supported since ~2019. No Electron-specific hooks were added; the packaged desktop build works because it uses the same renderer code.
Known pitfalls handled
… `app/L0/_all/mod/_core/openai_codex/AGENTS.md`.
- `response.output` on completion: text is accumulated live from `response.output_text.delta`; the final `response.completed` payload is only consulted for `usage` and `finish_reason`. Documented as the ignore-final-output contract.
- Chat-Completions-only body fields (`max_output_tokens`, `temperature`, `tools`, etc.): all stripped in `chatToResponsesRequest`; the list is maintained explicitly so future additions are a one-liner.
- `input_text` under an assistant entry (must be `output_text`): `convertContentParts(content, textType)` takes the role-aware type, with a regression test that walks the full input array and rejects any misplaced `input_text`.
- `ensureFreshCodexAccessToken` re-reads persisted tokens on every call (no in-memory cache), and an in-module single-flight map coalesces concurrent refreshes within one tab on top of the server-side mutex.
- `client_version` is a required query parameter on the Codex `/models` endpoint; omitting it returns HTTP 400 `invalid_request_error`. Baked into `CODEX_MODELS_ENDPOINT`.
Follow-ups (not in this PR)
- `stateSystem.js` named-lock refresh mutex for clustered deployments (`WORKERS>1`).
- … `main`).
Test plan
Reviewers without a ChatGPT Plus subscription can verify the non-subscription parts (see the "Testing This Locally" section in the module's AGENTS.md):
- `node --test tests/openai_codex_*_test.mjs` — 51 tests pass
- `node space serve` starts cleanly
- `curl -X POST http://127.0.0.1:3000/api/openai_codex_auth_start` returns HTTP 401 with `{"error":"Authentication required"}` (auth gate works)
- `ChatGPT` tab with the logged-out state (explainer + Sign-in button + scope note)
With a ChatGPT Plus subscription (only author verified):
- `auth.openai.com/codex/device` → tokens persisted encrypted in `~/conf/onscreen-agent.yaml` under `codex_tokens`
- `gpt-5.4-mini`, streamed response rendered correctly
- `/api/openai_codex_token_refresh`
- `codex_tokens` removed from config file
- `~/conf/admin-chat.yaml`

🤖 Generated with Claude Code