fix(codex): normalize Responses body for ChatGPT backend (strip max_output_tokens, require instructions) #40

Open
skulidropek wants to merge 5 commits into link-assistant:main from skulidropek:fix/codex-responses-normalization

@skulidropek

Fixes #39.

Problem

With UPSTREAM_PROVIDER=codex, the subscription proxy forwards the client's /v1/responses body to the ChatGPT Codex backend verbatim (only forcing stream). The Codex backend rejects bodies that standard OpenAI Responses clients send:

  • max_output_tokens → 400 {"detail":"Unsupported parameter: max_output_tokens"}
  • missing instructions → 400 {"detail":"Instructions are required"}

So clients like OpenClaw (and any standard OpenAI Responses/Chat Completions client) cannot use a Codex subscription through the router — every request 400s. Credential/transport are fine; only the body shape is wrong. See #39 for full reproduction.

Change

Add normalize_codex_responses_body in src/subscription_proxy.rs, applied at the single chokepoint where both /v1/responses and projected /v1/chat/completions requests converge. For Codex it:

  • forces stream: true (unchanged behavior),
  • strips max_output_tokens (unsupported by the backend),
  • injects a default instructions only when the caller omitted one (preserves caller intent; required by the backend).

Qwen and other providers are unaffected (guarded by provider == Codex).
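
The three rules above can be sketched as follows. This is a minimal, std-only illustration, not the code in src/subscription_proxy.rs: the `Json` enum, the `Body` alias, the `Provider` enum, the function signature, and the `DEFAULT_INSTRUCTIONS` text are all assumptions made for the example (a real implementation would more likely operate on a `serde_json::Value`). Treating an empty `instructions` string as omitted is also an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON value, kept tiny so the sketch needs no crates.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Bool(bool),
    Num(i64),
    Str(String),
}

/// Hypothetical flat model of a Responses request body.
type Body = BTreeMap<String, Json>;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Codex,
    Qwen,
}

/// Placeholder text; the default actually used by the PR is not shown in the source.
const DEFAULT_INSTRUCTIONS: &str = "You are a helpful assistant.";

/// Applies the Codex-only normalization; other providers pass through unchanged.
fn normalize_codex_responses_body(provider: Provider, mut body: Body) -> Body {
    if provider != Provider::Codex {
        return body;
    }
    // The proxy always streams from the Codex backend.
    body.insert("stream".to_string(), Json::Bool(true));
    // The backend rejects this parameter with HTTP 400.
    body.remove("max_output_tokens");
    // Keep caller-provided instructions; only fill in a default when absent or blank.
    let has_instructions =
        matches!(body.get("instructions"), Some(Json::Str(s)) if !s.trim().is_empty());
    if !has_instructions {
        body.insert(
            "instructions".to_string(),
            Json::Str(DEFAULT_INSTRUCTIONS.to_string()),
        );
    }
    body
}
```

Fields the function does not name (for example `store` or `reasoning`) are left untouched, which matches the behavior the unit tests below describe.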

Tests

Added unit tests:

  • codex_normalizes_responses_body_for_chatgpt_backend — OpenClaw-style body (has max_output_tokens, no instructions) → stripped + default instructions + stream:true; other fields (store, reasoning) preserved.
  • codex_preserves_caller_instructions — caller-provided instructions kept; max_output_tokens stripped.

Manual verification

Against a live Codex subscription via the router:

  • Before: OpenClaw's exact body → 400 Unsupported parameter: max_output_tokens.
  • After (equivalent normalized body): 200, streamed "ROUTER_OK".

A changelog.d entry (bump: patch) is included.

…utput_tokens, require instructions)

The ChatGPT Codex backend is stricter than the generic OpenAI Responses API:
it rejects `max_output_tokens` (HTTP 400 "Unsupported parameter") and requires a
non-empty `instructions` field (HTTP 400 "Instructions are required"). Standard
OpenAI Responses clients (e.g. OpenClaw) send `max_output_tokens` and omit
`instructions`, so the subscription proxy forwarded bodies the backend rejected.

Add `normalize_codex_responses_body` which (for Codex only) forces streaming,
strips `max_output_tokens`, and injects a default `instructions` when the caller
omits one. Applied at the single chokepoint where both `/v1/responses` and
projected `/v1/chat/completions` requests converge. Covered by unit tests.
@skulidropek

End-to-end verification with the patched binary

Built this branch (cargo build) and ran the resulting link-assistant-router serve with UPSTREAM_PROVIDER=codex against a real ChatGPT Codex subscription, then pointed an unmodified OpenClaw agent at it (provider api: "openai-responses").

OpenClaw emits its stock Responses body — {model, input, stream:true, store:false, max_output_tokens:8192, reasoning:{effort:"none"}} (no instructions) — which fails with 400 against an unpatched router. With this patch:

$ openclaw infer model run --local --model router/gpt-5.5 --prompt "Reply with exactly: ROUTER_OK"
provider: router
model: gpt-5.5
outputs: 1
ROUTER_OK

The proxy stripped max_output_tokens, injected a default instructions, and the ChatGPT backend returned the real completion. cargo check --tests and cargo test --lib codex_ both pass.

The ChatGPT Codex backend streams Server-Sent Events but labels the response
`application/json`. SSE-aware clients (e.g. OpenClaw's gateway) then parse the
body as a single JSON object and fail with an incomplete/terminal-less result,
even though the stream ends with a proper `response.completed` event. Re-label
streamed Codex responses as `text/event-stream` so clients treat them as a stream.
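
The relabeling described in this commit message could look roughly like the sketch below. The `Provider` enum and the `downstream_content_type` helper are hypothetical names chosen for illustration; the PR's actual header handling is not shown in the source.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Codex,
    Other,
}

/// Picks the Content-Type to send to the client.
/// Streamed Codex responses are relabeled because the backend tags SSE as
/// `application/json`; everything else keeps the upstream value.
fn downstream_content_type<'a>(provider: Provider, streamed: bool, upstream: &'a str) -> &'a str {
    if provider == Provider::Codex && streamed {
        "text/event-stream"
    } else {
        upstream
    }
}
```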
@skulidropek

skulidropek commented Jun 22, 2026

AI Session Backup

Commit: ec59b6d
Status: success
Files: 26 (511.82 MB)

git status

On branch fix/codex-responses-normalization
Your branch is up to date with 'fork/fix/codex-responses-normalization'.

nothing to commit, working tree clean

Codex always streams SSE (we force stream:true) but labels it application/json.
When the client requested stream:false the response fell through to the buffered
branch and reached SSE-aware clients as application/json, which they parse as a
single JSON object and reject as an incomplete result. Always route Codex through
the streaming branch and label it text/event-stream.
@skulidropek

skulidropek commented Jun 22, 2026

AI Session Backup

Commit: e7eb176
Status: success
Files: 26 (511.82 MB)

Codex only streams SSE even when the client requests stream:false, labeling the
body application/json. Non-streaming clients (OpenClaw gateway) then parse the raw
event stream as one JSON object and fail with incomplete_result. For non-streaming
codex responses, collapse the SSE into the terminal response.completed payload and
return it as a single JSON Responses object. Streaming clients still get the SSE
passthrough relabeled text/event-stream.
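
One way to picture the collapse step is a small SSE scan that returns the `data:` payload of the terminal event. This is a std-only sketch under assumptions: the helper name `completed_event_data` is hypothetical, and it assumes the terminal event carries `response.completed` on its `event:` line. Extracting the inner Responses object from that payload would need a JSON parser, which is left out here.

```rust
/// Scans a buffered SSE body and returns the `data:` payload of the
/// `response.completed` event, or `None` if the stream never completed.
fn completed_event_data(sse: &str) -> Option<String> {
    // Normalize CRLF so events are always separated by a blank line ("\n\n").
    let normalized = sse.replace("\r\n", "\n");
    for block in normalized.split("\n\n") {
        let mut event_name: Option<&str> = None;
        let mut data_lines: Vec<&str> = Vec::new();
        for line in block.lines() {
            if let Some(rest) = line.strip_prefix("event:") {
                event_name = Some(rest.trim());
            } else if let Some(rest) = line.strip_prefix("data:") {
                // SSE allows one optional space after the colon.
                data_lines.push(rest.strip_prefix(' ').unwrap_or(rest));
            }
        }
        if event_name == Some("response.completed") {
            // Multiple data lines in one event are joined with newlines.
            return Some(data_lines.join("\n"));
        }
    }
    None
}
```

A caller serving a `stream:false` client would buffer the upstream body, run it through a helper like this, and return the resulting JSON with `application/json`; a `None` result signals that the stream ended without a terminal event and should be surfaced as an error rather than an empty success.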
@skulidropek

skulidropek commented Jun 22, 2026

AI Session Backup

Commit: e39e414
Status: success
Files: 26 (511.82 MB)

The ChatGPT Codex backend rejects system/developer messages inside the Responses
`input` array (HTTP 400 "System messages are not allowed"); system content must
live in the top-level `instructions` field. Clients like OpenClaw's gateway place
the system prompt as a system message in input, so hoist any system/developer
turns out of input and merge them into instructions (falling back to a default
only when nothing remains).
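
The hoisting described above can be sketched as follows. The `Message` struct, the `hoist_system_messages` name, the `DEFAULT_INSTRUCTIONS` text, the merge order (caller instructions first, then hoisted turns), and the blank-line separator are all assumptions for illustration; the source does not show how the PR merges the text.

```rust
/// Hypothetical simplified input item: a role plus plain-text content.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Placeholder text; the default actually used by the PR is not shown in the source.
const DEFAULT_INSTRUCTIONS: &str = "You are a helpful assistant.";

/// Moves system/developer turns out of `input` and merges them into the
/// instructions string. Returns the merged instructions and the remaining input.
fn hoist_system_messages(
    instructions: Option<String>,
    input: Vec<Message>,
) -> (String, Vec<Message>) {
    let mut parts: Vec<String> = Vec::new();
    if let Some(text) = instructions {
        if !text.trim().is_empty() {
            parts.push(text);
        }
    }
    let mut remaining = Vec::new();
    for msg in input {
        if msg.role == "system" || msg.role == "developer" {
            // Hoisted turns never reach the backend's `input` array.
            if !msg.content.trim().is_empty() {
                parts.push(msg.content);
            }
        } else {
            remaining.push(msg);
        }
    }
    // Fall back to the default only when nothing at all remains.
    let merged = if parts.is_empty() {
        DEFAULT_INSTRUCTIONS.to_string()
    } else {
        parts.join("\n\n")
    };
    (merged, remaining)
}
```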
@skulidropek

skulidropek commented Jun 22, 2026

AI Session Backup

Commit: 9ecd9ea
Status: success
Files: 26 (511.82 MB)

Development

Successfully merging this pull request may close these issues.

Codex subscription proxy forwards Responses bodies the ChatGPT backend rejects (max_output_tokens / missing instructions → HTTP 400)
