
feat(agent-server): harden OpenAI gateway client compatibility#3626

Merged: enyst merged 24 commits into main from openhands/openai-gateway-client-compat on Jun 10, 2026

@enyst enyst commented Jun 10, 2026

Member

HUMAN:
This PR introduces streaming, a must-have for an LLM completions endpoint. It is intentionally applied to the last agent response. It also accepts a few more parameters that clients expect.

  • A human has tested these changes.

AGENT:

Why

Summary

Fixes compatibility-hardening scope for #3540 on top of #3545.

Fixed #3597

  • accept stream: true chat completions and return OpenAI-compatible SSE chunks
  • support stream_options.include_usage with a final usage chunk before [DONE]
  • accept developer role messages and fold them into gateway system instructions
  • add live-server coverage using the OpenAI Python SDK streaming path
  • include .pr/openai-gateway-client-compat.md with temporary live-test notes for Open WebUI, LibreChat-shaped requests, and Pipecat
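
As a rough illustration of the developer-role bullet above, the sketch below folds `system` and `developer` messages into one instruction string and leaves the rest of the conversation untouched. The helper name `fold_instructions`, the `INSTRUCTION_ROLES` set, and the blank-line join are assumptions made for this example; the PR's actual code in the gateway service may be structured differently.

```python
from typing import Iterable

# Assumption: both roles are treated as gateway system instructions.
INSTRUCTION_ROLES = {"system", "developer"}


def fold_instructions(messages: Iterable[dict]) -> tuple[str, list[dict]]:
    """Split messages into (system instructions, remaining conversation).

    Hypothetical helper, not the gateway's real function.
    """
    instructions: list[str] = []
    conversation: list[dict] = []
    for message in messages:
        if message["role"] in INSTRUCTION_ROLES:
            # This sketch only handles plain-string content.
            instructions.append(str(message.get("content", "")))
        else:
            conversation.append(message)
    # Join non-empty instruction blocks with a blank line between them.
    return "\n\n".join(part for part in instructions if part), conversation
```

Keeping the original message order inside the folded string means a `developer` message that follows a `system` message still reads as a later instruction.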

Testing

  • TMUX_TMPDIR=/tmp/oh-test-tmux-precommit-$RANDOM uv run pre-commit run --files openhands-agent-server/openhands/agent_server/openai_models.py openhands-agent-server/openhands/agent_server/openai_service.py openhands-agent-server/openhands/agent_server/openai_router.py tests/cross/test_remote_conversation_live_server.py .pr/openai-gateway-client-compat.md
  • TMUX_TMPDIR=/tmp/oh-test-tmux-$RANDOM uv run pytest tests/cross/test_remote_conversation_live_server.py::test_openai_chat_completions_gateway_over_real_server tests/cross/test_remote_conversation_live_server.py::test_openai_gateway_replays_frozen_llm_fixtures -q

Live compatibility probes

  • OpenAI Python SDK non-streaming and streaming smoke tests passed against a local agent-server gateway.
  • Open WebUI v0.9.6 was installed via uvx, opened in the browser, fetched gateway models, selected openhands_compat, and completed a chat through the gateway.
  • LibreChat source was cloned and dependencies installed; full browser startup was blocked by the sandbox's unavailable Docker/Mongo runtime, so the documented LibreChat custom-endpoint request shape (stream: true plus common OpenAI params) was exercised directly and passed.
  • Pipecat OpenAILLMService was installed/imported and both one-shot and streaming OpenAI-compatible voice-framework paths passed.

Full live-tests

Please read full description of the tests I performed:
#3540 (comment)

PR-only Live-test Artifacts

This PR intentionally includes .pr/openai-gateway-client-compat.md as temporary reviewer evidence. The repository's .github/workflows/pr-artifacts.yml / PR Artifacts workflow is expected to remove .pr/ through its approval-triggered cleanup path before final merge, matching the repo workflow documented on #3545.

This PR was created by an AI agent (OpenHands) on behalf of the user.



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
--- | --- | --- | ---
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:f8781e7-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-f8781e7-python \
  ghcr.io/openhands/agent-server:f8781e7-python

All tags pushed for this build

ghcr.io/openhands/agent-server:f8781e7-golang-amd64
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-golang-amd64
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-golang-amd64
ghcr.io/openhands/agent-server:f8781e7-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:f8781e7-golang-arm64
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-golang-arm64
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-golang-arm64
ghcr.io/openhands/agent-server:f8781e7-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:f8781e7-java-amd64
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-java-amd64
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-java-amd64
ghcr.io/openhands/agent-server:f8781e7-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:f8781e7-java-arm64
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-java-arm64
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-java-arm64
ghcr.io/openhands/agent-server:f8781e7-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:f8781e7-python-amd64
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-python-amd64
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-python-amd64
ghcr.io/openhands/agent-server:f8781e7-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:f8781e7-python-arm64
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-python-arm64
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-python-arm64
ghcr.io/openhands/agent-server:f8781e7-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:f8781e7-golang
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-golang
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-golang
ghcr.io/openhands/agent-server:f8781e7-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:f8781e7-java
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-java
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-java
ghcr.io/openhands/agent-server:f8781e7-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:f8781e7-python
ghcr.io/openhands/agent-server:f8781e7048cde47f0c79954b7156b155c6bbcfa3-python
ghcr.io/openhands/agent-server:openhands-openai-gateway-client-compat-python
ghcr.io/openhands/agent-server:f8781e7-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., f8781e7-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., f8781e7-python-amd64) are also available if needed

enyst and others added 20 commits June 6, 2026 08:30
Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

Coverage

Coverage Report •
File | Stmts | Miss | Cover | Missing
--- | --- | --- | --- | ---
openhands-agent-server/openhands/agent_server/openai/router.py | 42 | 2 | 95% | 49, 63
openhands-agent-server/openhands/agent_server/openai/service.py | 184 | 39 | 78% | 61, 70–71, 75–76, 80–81, 110–112, 126, 130–140, 142, 146–148, 152, 167, 236, 242, 248, 251, 263, 271, 280, 373–374, 401, 437
TOTAL | 30717 | 13405 | 56% |

@all-hands-bot all-hands-bot left a comment

Collaborator


⚠️ QA Report: PASS WITH ISSUES

Gateway compatibility behavior works end-to-end with a fake OpenAI backend and real OpenAI-compatible client requests; one non-functional PR description CI check is still failing.

Does this PR achieve its stated goal?

Yes. The PR set out to harden the agent-server OpenAI gateway for streaming chat completions, stream_options.include_usage, developer messages, and common OpenAI-compatible client request shapes. I reproduced the base-branch failures (stream: true returns 400; developer role returns 422), then verified the PR branch accepts those same requests through the OpenAI Python SDK and returns OpenAI-compatible SSE chunks, including a parsed final usage chunk.

Phase | Result
--- | ---
Environment Setup | uv sync --dev completed; local agent-server ran with isolated /tmp/ohqa_home profile and fake OpenAI-compatible backend.
CI Status | ⚠️ Core test/pre-commit/doc checks are green; PR Description Check / Validate PR description is failing and QA Changes by OpenHands was in progress when checked.
Functional Verification | ✅ OpenAI SDK non-streaming, SDK streaming with developer + include_usage, and LibreChat-shaped streaming HTTP request all worked on the PR branch.

Functional Verification

Test 1: OpenAI SDK streaming and developer-role compatibility

Step 1 — Reproduce / establish baseline (without the fix):
Checked out base branch openhands/openai-chat-completions-gateway at 88a7d457, started openhands.agent_server on 127.0.0.1:19083, and ran a real OpenAI Python SDK request with stream=True, stream_options={"include_usage": True}, and then a request containing a developer role:

models: ['openhands_qa']
stream_error_type: BadRequestError
stream_error: Error code: 400 - {'detail': 'Streaming chat completions are not supported yet'}
developer_error_type: UnprocessableEntityError
developer_error: Error code: 422 - {'detail': [{'type': 'literal_error', 'loc': ['body', 'messages', 0, 'role'], 'msg': "Input should be 'system', 'user', 'assistant' or 'tool'", 'input': 'developer', 'ctx': {'expected': "'system', 'user', 'assistant' or 'tool'"}}]}

This confirms the compatibility gap existed on the base branch: streaming requests were rejected, and modern OpenAI developer messages were rejected during request validation.

Step 2 — Apply the PR's changes:
Checked out openhands/openai-gateway-client-compat at 4a4bfe5c, restarted openhands.agent_server on 127.0.0.1:19082, using the same isolated profile and fake OpenAI-compatible backend.

Step 3 — Re-run with the fix in place:
Ran the OpenAI Python SDK against /v1/chat/completions with a normal request, then a streaming request containing developer and stream_options.include_usage:

models: ['openhands_qa']
nonstream_content: non-streaming gateway request works
nonstream_usage: {'completion_tokens': 7, 'prompt_tokens': 11, 'total_tokens': 18, 'completion_tokens_details': None, 'prompt_tokens_details': None}
stream_chunk_count: 4
stream_text: developer role reached the backing LLM
stream_usage_chunks: [{'completion_tokens': 7, 'prompt_tokens': 11, 'total_tokens': 18, 'completion_tokens_details': None, 'prompt_tokens_details': None}]

This shows the OpenAI SDK streaming path now works: the SDK successfully iterated four streamed chunks, reconstructed the expected assistant content, and parsed the final usage chunk requested by stream_options.include_usage.

Test 2: LibreChat-shaped streaming request

Step 1 — Establish baseline:
The same base-branch SDK baseline above showed stream: true requests returned HTTP 400 before the PR. That is the key compatibility blocker for LibreChat-style custom OpenAI endpoints, which commonly send stream: true plus extra OpenAI parameters.

Step 2 — Apply the PR's changes:
Used the PR branch server at 127.0.0.1:19082.

Step 3 — Re-run with the fix in place:
Sent a real HTTP request shaped like a LibreChat custom endpoint call, including stream: true, user, temperature, top_p, presence_penalty, and frequency_penalty:

HTTP/1.1 200 OK
x-openhands-serverconversation-id: f62a7e58-2bd6-4dd0-aeca-0d121a201386
content-type: text/event-stream; charset=utf-8

data: {"id":"chatcmpl-a44d0374b37442a3bb808f84351471d4","choices":[{"delta":{"role":"assistant"},"index":0}],"created":1781088371,"model":"openhands_qa","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-a44d0374b37442a3bb808f84351471d4","choices":[{"delta":{"content":"LibreChat-shaped streaming request works"},"index":0}],"created":1781088371,"model":"openhands_qa","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-a44d0374b37442a3bb808f84351471d4","choices":[{"delta":{},"finish_reason":"stop","index":0}],"created":1781088371,"model":"openhands_qa","object":"chat.completion.chunk"}

data: [DONE]

This shows the gateway now accepts common OpenAI-compatible extra request fields and returns a proper text/event-stream response ending in [DONE].
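
For readers who want to check a raw response like the one above without an SDK, here is a minimal standard-library sketch that decodes the `data:` lines and rebuilds the assistant text. The function names are hypothetical, and the parser only handles the single-line `data:` events seen in this response, not the full SSE grammar.

```python
import json


def parse_sse_chunks(body: str) -> tuple[list[dict], bool]:
    """Return (decoded JSON chunks, whether the [DONE] sentinel was seen)."""
    chunks = []
    done = False
    for line in body.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            done = True
            break
        chunks.append(json.loads(payload))
    return chunks, done


def stream_text(chunks: list[dict]) -> str:
    """Concatenate delta.content across chunks; chunks without choices add nothing."""
    return "".join(
        choice.get("delta", {}).get("content") or ""
        for chunk in chunks
        for choice in chunk.get("choices", [])
    )
```

Applied to the response body shown above, this would yield three chunks, a seen `[DONE]` sentinel, and the text carried by the single content delta.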

Test 3: Upstream fake-backend evidence

I also inspected the fake backend request log after the PR-branch probes:

{'request': 5, 'stream': None, 'model': 'qa-model', 'roles': ['system', 'user'], 'developer_sentinel_seen': True, 'librechat_seen': False}
{'request': 6, 'stream': None, 'model': 'qa-model', 'roles': ['system', 'user'], 'developer_sentinel_seen': False, 'librechat_seen': True}

This confirms the developer instruction reached the backing LLM path and the LibreChat-shaped prompt was processed. The upstream request is intentionally non-streaming while the gateway synthesizes OpenAI-compatible client-side SSE, which matches the PR's documented scope.

Issues Found

  • 🟡 Minor / non-functional: PR Description Check / Validate PR description is failing. I did not edit the human-only PR description fields; a human should update the PR description if that validator requires human content.

This review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PASS WITH ISSUES

Base automatically changed from openhands/openai-chat-completions-gateway to main June 10, 2026 17:06
@enyst

enyst commented Jun 10, 2026

Member Author

@OpenHands Understand the goal of this PR. It was originally stacked on top of another PR that we merged. Please merge main into this branch and fix conflicts.

@openhands-ai

openhands-ai Bot commented Jun 10, 2026


I'm on it! enyst can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

PR Artifacts Cleaned Up

The .pr/ directory has been automatically removed.

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

enyst commented Jun 10, 2026

Member Author

Merged main into this branch and resolved the conflicts by moving the compatibility changes onto the new openhands.agent_server.openai package layout from the merged base PR. Pushed commit 1c836e074db6b01ff0e13207999db252a6b89220.

Validation run:

  • TMUX_TMPDIR=/tmp/oh-test-tmux-precommit-$RANDOM uv run pre-commit run --files .pr/openai-gateway-client-compat.md openhands-agent-server/openhands/agent_server/openai/models.py openhands-agent-server/openhands/agent_server/openai/router.py openhands-agent-server/openhands/agent_server/openai/service.py tests/cross/test_remote_conversation_live_server.py
  • TMUX_TMPDIR=/tmp/oh-test-tmux-$RANDOM uv run pytest tests/cross/test_remote_conversation_live_server.py::test_openai_chat_completions_gateway_over_real_server tests/cross/test_remote_conversation_live_server.py::test_openai_gateway_replays_frozen_llm_fixtures -q
  • TMUX_TMPDIR=/tmp/oh-test-tmux-auth-$RANDOM uv run pytest tests/agent_server/test_api_authentication.py::test_openai_routes_accept_bearer_session_key -q

PR is mergeable from GitHub's perspective; checks are pending.

This comment was created by an AI agent (OpenHands) on behalf of the user.

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@OpenHands OpenHands deleted a comment from openhands-ai Bot Jun 10, 2026
@enyst enyst added the review-this This label triggers a PR review by OpenHands label Jun 10, 2026

all-hands-bot commented Jun 10, 2026

Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Collaborator


Code Review: OpenAI Gateway Client Compatibility

Taste Rating

🟡 Acceptable - Functional and well-tested, with minor areas for improvement.


Analysis

[CRITICAL ISSUES]

None. No breaking issues found.


[IMPROVEMENT OPPORTUNITIES]

  • [openhands-agent-server/openhands/agent_server/openai/service.py, Line 311] Simplification: the if content: check before yielding the content chunk is unnecessary. ChoiceDelta already accepts None for content, so you could always yield this chunk and let an empty string be a no-op, eliminating one conditional.

[TESTING GAPS]

  • [tests/cross/test_remote_conversation_live_server.py] Edge Case Missing: The test exercises stream=True with stream_options={"include_usage": True}, but there is no test for stream=True without stream_options (the default case). Consider adding a second assertion block that omits stream_options to ensure backward compatibility and correct behavior when include_usage is False.

[RISK ASSESSMENT]

  • [Overall PR] Risk Assessment: 🟢 LOW

This PR adds OpenAI-compatible streaming support and developer role handling to the gateway. The changes are backward-compatible (new fields are optional, defaults are preserved), and the security improvement to cookie lifetime (10 years -> 8 hours) is a net positive. The streaming implementation is straightforward SSE generation. The documentation in .pr/openai-gateway-client-compat.md is thorough and includes evidence of real-world testing with OpenAI SDK, Open WebUI, LibreChat, and Pipecat.


VERDICT:
Worth merging: Core logic is sound. The streaming implementation follows the OpenAI SSE protocol correctly, and the test coverage is reasonable for a compatibility-focused PR.

KEY INSIGHT:
The streaming implementation is architecturally correct but currently buffered - it only emits chunks after the agent run completes, not progressively. This is documented as a known limitation and appropriately scoped for this compatibility PR.
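
To make the "buffered" point concrete: once the agent run has finished and the final reply is known, the whole SSE stream can be generated in one pass. The generator below is a sketch under that assumption; `iter_buffered_sse` and its parameters are hypothetical names and it is not the PR's `iter_openai_chat_completion_sse`. The empty `choices` list on the usage chunk follows the usual OpenAI convention and is assumed here.

```python
import json
import time
import uuid
from typing import Iterator, Optional


def iter_buffered_sse(
    content: str,
    model: str,
    usage: Optional[dict] = None,
    include_usage: bool = False,
) -> Iterator[str]:
    """Yield SSE events for one already-completed assistant reply."""
    base = {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
    }

    def event(payload: dict) -> str:
        # Each SSE event is one "data:" line followed by a blank line.
        return f"data: {json.dumps({**base, **payload})}\n\n"

    # Order: role, content, finish_reason, optional usage, then [DONE].
    yield event({"choices": [{"index": 0, "delta": {"role": "assistant"}}]})
    yield event({"choices": [{"index": 0, "delta": {"content": content}}]})
    yield event({"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]})
    if include_usage and usage is not None:
        yield event({"choices": [], "usage": usage})
    yield "data: [DONE]\n\n"
```

Because every chunk is derived from the finished reply, a client sees valid streaming framing but no token-by-token progress, which is the limitation described above.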


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread openhands-agent-server/openhands/agent_server/openai/service.py
Comment thread tests/cross/test_remote_conversation_live_server.py
@enyst

enyst commented Jun 10, 2026

Member Author

@OpenHands you have reviews inline and top level, please /iterate

@openhands-ai

openhands-ai Bot commented Jun 10, 2026


I'm on it! enyst can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

enyst commented Jun 10, 2026

Member Author

Addressed the review feedback in b16c44e and resolved the inline threads.

Validation run locally:

  • TMUX_TMPDIR=/tmp/oh-test-tmux-precommit-$RANDOM uv run pre-commit run --files openhands-agent-server/openhands/agent_server/openai/service.py tests/cross/test_remote_conversation_live_server.py
  • TMUX_TMPDIR=/tmp/oh-test-tmux-$RANDOM uv run pytest tests/cross/test_remote_conversation_live_server.py::test_openai_chat_completions_gateway_over_real_server tests/cross/test_remote_conversation_live_server.py::test_openai_gateway_replays_frozen_llm_fixtures -q

This comment was created by an AI agent (OpenHands) on behalf of the user.

@enyst enyst requested a review from all-hands-bot June 10, 2026 19:46

all-hands-bot commented Jun 10, 2026

Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Collaborator


Code Review Summary

🟢 Good taste — Clean, focused implementation that solves the stated problem well.

Files Reviewed

File | Changes | Assessment
--- | --- | ---
openhands-agent-server/openhands/agent_server/openai/models.py | +13 -2 | ✅ Well-structured type aliases and request model
openhands-agent-server/openhands/agent_server/openai/service.py | +81 -1 | ✅ Clean SSE generation with proper event ordering
openhands-agent-server/openhands/agent_server/openai/router.py | +19 -3 | ✅ Correct streaming response handling
tests/cross/test_remote_conversation_live_server.py | +49 -0 | ✅ Good test coverage for both usage-included and default cases

Key Observations

Strengths:

  • SSE event sequence is correct: role → content → finish_reason → (optional usage) → [DONE]
  • The iter_openai_chat_completion_sse function is a clean generator that properly handles JSON serialization
  • Test coverage includes both include_usage: true and default cases with proper assertions
  • developer role is correctly folded into system instructions alongside system
  • Previous review feedback (content guard removal) was properly addressed in commit b16c44e3

Implementation Quality:

  • The streaming implementation follows OpenAI's SSE specification correctly
  • Proper use of model_dump(mode="json", exclude_none=True) for clean JSON output
  • stream_options.include_usage is correctly handled as an optional feature

Minor Notes

  • The test assertions for token counts (prompt_tokens==7, completion_tokens==5, total_tokens==12) are hardcoded values. These are fine for the test but would need updating if the prompt changes. This is acceptable for a smoke test.

  • The .pr/openai-gateway-client-compat.md artifact is noted in the PR description as temporary and will be cleaned up by the PR Artifacts workflow before merge.


[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is a focused feature addition for OpenAI-compatible streaming. The implementation:

  • Adds new functionality without modifying existing behavior
  • Has comprehensive test coverage
  • Was validated against real clients (OpenAI SDK, Open WebUI, LibreChat, Pipecat)
  • Follows established patterns from the codebase

No breaking changes, no security concerns, no dependency issues.


VERDICT:
Worth merging — Implementation is sound, well-tested, and addresses the compatibility requirements.

KEY INSIGHT:
The SSE implementation correctly handles the three-chunk pattern (role, content, finish) with optional usage, matching OpenAI's streaming protocol. Previous review feedback has been addressed.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@enyst enyst enabled auto-merge (squash) June 10, 2026 19:53

@all-hands-bot all-hands-bot left a comment

Collaborator


✅ QA Report: PASS

The OpenAI-compatible gateway changes work end-to-end through a real local agent-server and the OpenAI Python SDK streaming client.

Does this PR achieve its stated goal?

Yes. On origin/main, the same user-facing requests failed: developer messages were rejected with 422 and stream: true requests returned 400 Streaming chat completions are not supported yet. On this PR, the gateway accepted a developer role plus common client parameters, returned OpenAI SDK-consumable streaming chunks, emitted a final usage chunk only when stream_options.include_usage was set, and preserved non-streaming behavior.

Phase | Result
--- | ---
Environment Setup | uv created the environment and both base/PR agent-server instances started locally.
CI Status | ⏳ At review time: 19 successful checks, 8 in progress, 1 queued, 1 skipped, no failures observed.
Functional Verification | ✅ Real OpenAI SDK requests verified streaming, usage chunks, developer role, extra params, and non-streaming behavior.

Functional Verification

Test 1: Baseline failures on origin/main

Step 1 — Reproduce / establish baseline (without the fix):
Checked out origin/main, started the real server with:

env -u TMUX TMUX_TMPDIR=/tmp/oh-qa-tmux-base \
  OPENHANDS_SUPPRESS_BANNER=1 \
  LLM_MODEL=openhands/claude-haiku-4-5-20251001 \
  uv run python -m openhands.agent_server --host 127.0.0.1 --port 12002

Then ran an OpenAI Python SDK request using stream=True, stream_options={"include_usage": True}, user="qa-user", and a developer message:

models ['openhands_qa']
developer_stream_error_type UnprocessableEntityError
developer_stream_error Error code: 422 - {'detail': [{'type': 'literal_error', 'loc': ['body', 'messages', 0, 'role'], 'msg': "Input should be 'system', 'user', 'assistant' or 'tool'", 'input': 'developer', 'ctx': {'expected': "'system', 'user', 'assistant' or 'tool'"}}]}

This confirms the old gateway rejected newer OpenAI-compatible developer messages before the agent could run.

I also sent a user-only stream=True request on the base branch:

user_stream_error_type BadRequestError
user_stream_error Error code: 400 - {'detail': 'Streaming chat completions are not supported yet'}

This confirms the baseline did not support streaming chat completions.

Step 2 — Apply the PR's changes:
Checked out openhands/openai-gateway-client-compat at b16c44e3201c2525ff20a0078099045096a1d49c, started the real server with:

env -u TMUX TMUX_TMPDIR=/tmp/oh-qa-tmux-pr \
  OPENHANDS_SUPPRESS_BANNER=1 \
  LLM_MODEL=openhands/claude-haiku-4-5-20251001 \
  uv run python -m openhands.agent_server --host 127.0.0.1 --port 12000

Saved a real gateway profile through /api/profiles/qa and verified it was visible through /v1/models:

save_profile_status 201
{"name":"qa","message":"Profile 'qa' saved"}
models_status 200
{"object":"list","data":[{"id":"openhands_qa","created":0,"object":"model","owned_by":"openhands"}]}

Step 3 — Re-run with the fix in place:
Ran the same OpenAI SDK streaming path against http://127.0.0.1:12000/v1 using model="openhands_qa", a developer message, stream=True, stream_options={"include_usage": True}, user, temperature, top_p, presence_penalty, and frequency_penalty:

chunk_count 4
stream_text GATEWAY_QA_STREAM_OK
usage_chunk_count 1
usage_last {'completion_tokens': 94, 'prompt_tokens': 4956, 'total_tokens': 5050, 'completion_tokens_details': None, 'prompt_tokens_details': None}
finish_reasons [None, None, 'stop']

This shows the OpenAI SDK could consume the response as a stream, the final content arrived correctly, and include_usage produced exactly one usage-bearing chunk before completion.

Test 2: Default stream behavior and non-stream regression check

Ran one streaming request without stream_options.include_usage, then a normal non-streaming completion:

default_stream_chunk_count 3
default_stream_text GATEWAY_QA_NO_USAGE_OK
default_stream_usage_chunks 0
nonstream_text GATEWAY_QA_NONSTREAM_OK
nonstream_usage_total 5004

This confirms default streaming omits usage chunks, and the existing non-streaming gateway path still returns an ordinary OpenAI SDK chat completion with usage.
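
The counts reported in Tests 1 and 2 (chunk count, usage chunk count, finish reasons) can be derived from the received chunks with a small summary function. The sketch below works on plain dicts and uses a hypothetical name, `summarize_stream`. With the OpenAI Python SDK, each chunk object could first be converted via `chunk.model_dump()`, which is an assumption about how the QA script gathered its numbers.

```python
def summarize_stream(chunks: list[dict]) -> dict:
    """Summarize a list of chat.completion.chunk dicts for assertions."""
    finish_reasons = []
    text_parts = []
    usage_chunks = []
    for chunk in chunks:
        # A usage-only chunk has an empty choices list and contributes no text.
        for choice in chunk.get("choices", []):
            finish_reasons.append(choice.get("finish_reason"))
            text_parts.append(choice.get("delta", {}).get("content") or "")
        if chunk.get("usage") is not None:
            usage_chunks.append(chunk["usage"])
    return {
        "chunk_count": len(chunks),
        "text": "".join(text_parts),
        "usage_chunk_count": len(usage_chunks),
        "finish_reasons": finish_reasons,
    }
```

With `include_usage` set, the expected shape is four chunks, one usage chunk, and three finish reasons ending in `stop`; without it, three chunks and no usage chunk.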

Issues Found

None.

Final verdict: PASS.

This QA review was generated by an AI agent (OpenHands) on behalf of the user.

@enyst enyst merged commit e01557e into main Jun 10, 2026
31 checks passed
@enyst enyst deleted the openhands/openai-gateway-client-compat branch June 10, 2026 19:56
@openhands-ai

openhands-ai Bot commented Jun 10, 2026


OpenHands encountered an error: **Failed to send message to agent server: HTTP 503 error: no available server**

See the conversation for more information.


Labels

review-this This label triggers a PR review by OpenHands


Development

Successfully merging this pull request may close these issues.

feat(agent-server): harden OpenAI gateway compatibility for common clients

4 participants