fix(codex): normalize Responses body for ChatGPT backend (strip max_output_tokens, require instructions) #40
Open
skulidropek wants to merge 5 commits into
…utput_tokens, require instructions) The ChatGPT Codex backend is stricter than the generic OpenAI Responses API: it rejects `max_output_tokens` (HTTP 400 "Unsupported parameter") and requires a non-empty `instructions` field (HTTP 400 "Instructions are required"). Standard OpenAI Responses clients (e.g. OpenClaw) send `max_output_tokens` and omit `instructions`, so the subscription proxy forwarded bodies the backend rejected. Add `normalize_codex_responses_body` which (for Codex only) forces streaming, strips `max_output_tokens`, and injects a default `instructions` when the caller omits one. Applied at the single chokepoint where both `/v1/responses` and projected `/v1/chat/completions` requests converge. Covered by unit tests.
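The normalization described in this commit could be sketched roughly as follows. This is not the code from the patch: the `Json` enum stands in for a real JSON value type, `DEFAULT_INSTRUCTIONS` is a hypothetical placeholder for whatever default the patch injects, and treating an empty `instructions` string as missing is an assumption based on the backend requiring a non-empty field.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (assumption: the real proxy uses a JSON library).
#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)]
enum Json {
    Bool(bool),
    Num(i64),
    Str(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
#[allow(dead_code)]
enum Provider {
    Codex,
    Qwen,
}

/// Hypothetical default text; the value used by the actual patch is not shown in the PR.
const DEFAULT_INSTRUCTIONS: &str = "You are a helpful assistant.";

/// Reshape a Responses body so the Codex backend accepts it.
/// Bodies for other providers are left untouched.
fn normalize_codex_responses_body(provider: Provider, body: &mut BTreeMap<String, Json>) {
    if provider != Provider::Codex {
        return;
    }
    // Codex only streams, so always request a stream.
    body.insert("stream".to_string(), Json::Bool(true));
    // The backend rejects this parameter with HTTP 400.
    body.remove("max_output_tokens");
    // Keep caller-provided instructions; inject a default only when absent or blank.
    let has_instructions = matches!(
        body.get("instructions"),
        Some(Json::Str(s)) if !s.trim().is_empty()
    );
    if !has_instructions {
        body.insert(
            "instructions".to_string(),
            Json::Str(DEFAULT_INSTRUCTIONS.to_string()),
        );
    }
}
```

Fields the function does not name (for example `store`) pass through unchanged, which matches the behavior the unit tests in this PR are described as checking.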
Author
End-to-end verification with the patched binary
Built this branch ( … OpenClaw emits its stock Responses body … The proxy stripped …
The ChatGPT Codex backend streams Server-Sent Events but labels the response `application/json`. SSE-aware clients (e.g. OpenClaw's gateway) then parse the body as a single JSON object and fail with an incomplete/terminal-less result, even though the stream ends with a proper `response.completed` event. Re-label streamed Codex responses as `text/event-stream` so clients treat them as a stream.
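A minimal sketch of the relabeling decision, assuming the proxy knows whether the upstream is Codex and whether the response is being streamed. The function name and parameters are hypothetical and not taken from the patch.

```rust
/// Pick the Content-Type to send to the client.
/// Streamed Codex responses are SSE regardless of what the upstream header says.
fn client_content_type(is_codex: bool, is_streamed: bool, upstream_content_type: &str) -> String {
    if is_codex && is_streamed {
        "text/event-stream".to_string()
    } else {
        // Other providers and buffered responses keep the upstream label.
        upstream_content_type.to_string()
    }
}
```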
Codex always streams SSE (we force stream:true) but labels it application/json. When the client requested stream:false the response fell through to the buffered branch and reached SSE-aware clients as application/json, which they parse as a single JSON object and reject as an incomplete result. Always route Codex through the streaming branch and label it text/event-stream.
Codex only streams SSE even when the client requests stream:false, labeling the body application/json. Non-streaming clients (OpenClaw gateway) then parse the raw event stream as one JSON object and fail with incomplete_result. For non-streaming codex responses, collapse the SSE into the terminal response.completed payload and return it as a single JSON Responses object. Streaming clients still get the SSE passthrough relabeled text/event-stream.
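Collapsing the stream for non-streaming clients might look like the sketch below. It assumes the whole SSE body has been buffered as text and that events are separated by a blank line. The function name is hypothetical. It returns the raw `data:` payload of the `response.completed` event; extracting the Responses object from that payload would need a JSON parser and is left out.

```rust
/// Return the `data:` payload of the terminal `response.completed` event, if present.
fn completed_event_data(sse: &str) -> Option<String> {
    // Normalize CRLF so events split on a blank line either way.
    let text = sse.replace("\r\n", "\n");
    for event in text.split("\n\n") {
        let mut name: Option<&str> = None;
        let mut data: Vec<&str> = Vec::new();
        for line in event.lines() {
            if let Some(rest) = line.strip_prefix("event:") {
                name = Some(rest.trim());
            } else if let Some(rest) = line.strip_prefix("data:") {
                // SSE drops a single leading space after the colon.
                data.push(rest.strip_prefix(' ').unwrap_or(rest));
            }
        }
        if name == Some("response.completed") {
            // Multi-line data fields are joined with newlines.
            return Some(data.join("\n"));
        }
    }
    None
}
```

A `None` result means the stream ended without a terminal event, which the caller would presumably surface as an error rather than an empty success.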
The ChatGPT Codex backend rejects system/developer messages inside the Responses `input` array (HTTP 400 "System messages are not allowed"); system content must live in the top-level `instructions` field. Clients like OpenClaw's gateway place the system prompt as a system message in input, so hoist any system/developer turns out of input and merge them into instructions (falling back to a default only when nothing remains).
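The hoisting step can be sketched as below, under simplifying assumptions: each input item is reduced to a role plus plain text, existing `instructions` come first in the merged result, and parts are joined with a blank line. `InputItem`, `FALLBACK_INSTRUCTIONS`, and the merge order are illustrative choices, not details confirmed by the patch.

```rust
/// Simplified input item (assumption: real items carry structured content).
#[derive(Debug, Clone, PartialEq)]
struct InputItem {
    role: String,
    text: String,
}

/// Hypothetical fallback used only when no instructions remain after merging.
const FALLBACK_INSTRUCTIONS: &str = "You are a helpful assistant.";

/// Move system/developer turns out of `input` and merge them into `instructions`.
/// Returns the merged instructions and the remaining input items.
fn hoist_system_messages(
    instructions: Option<&str>,
    input: Vec<InputItem>,
) -> (String, Vec<InputItem>) {
    let mut parts: Vec<String> = Vec::new();
    if let Some(existing) = instructions {
        if !existing.trim().is_empty() {
            parts.push(existing.to_string());
        }
    }
    let mut kept = Vec::new();
    for item in input {
        if item.role == "system" || item.role == "developer" {
            // Hoisted turns are dropped from input; blank ones add nothing.
            if !item.text.trim().is_empty() {
                parts.push(item.text);
            }
        } else {
            kept.push(item);
        }
    }
    let merged = if parts.is_empty() {
        FALLBACK_INSTRUCTIONS.to_string()
    } else {
        parts.join("\n\n")
    };
    (merged, kept)
}
```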
Fixes #39.
Problem
With `UPSTREAM_PROVIDER=codex`, the subscription proxy forwards the client's `/v1/responses` body to the ChatGPT Codex backend verbatim (only forcing `stream`). The Codex backend rejects bodies that standard OpenAI Responses clients send:
- `max_output_tokens` present → `400 {"detail":"Unsupported parameter: max_output_tokens"}`
- `instructions` missing → `400 {"detail":"Instructions are required"}`

So clients like OpenClaw (and any standard OpenAI Responses/Chat Completions client) cannot use a Codex subscription through the router: every request fails with a 400. The credential and transport are fine; only the body shape is wrong. See #39 for full reproduction.
Change
Add `normalize_codex_responses_body` in `src/subscription_proxy.rs`, applied at the single chokepoint where both `/v1/responses` and projected `/v1/chat/completions` requests converge. For Codex it:
- forces `stream: true` (unchanged behavior),
- strips `max_output_tokens` (unsupported by the backend),
- injects a default `instructions` only when the caller omitted one (preserves caller intent; required by the backend).

Qwen and other providers are unaffected (guarded by `provider == Codex`).
Tests
Added unit tests:
- `codex_normalizes_responses_body_for_chatgpt_backend`: OpenClaw-style body (has `max_output_tokens`, no `instructions`) → stripped + default instructions + `stream:true`; other fields (`store`, `reasoning`) preserved.
- `codex_preserves_caller_instructions`: caller-provided `instructions` kept; `max_output_tokens` stripped.
Manual verification
Against a live Codex subscription via the router:
- Before: `400 Unsupported parameter: max_output_tokens`.
- After: `200`, streamed `"ROUTER_OK"`.

A `changelog.d` entry (bump: patch) is included.