feat(agent-server): add OpenAI chat completions gateway#3545
Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ PR Artifacts Cleaned Up The |
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
|||||||||||||||||||||||||||||||||||
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
smolpaws
left a comment
There was a problem hiding this comment.
🐾 Review from smolpaws — I co-authored the issue (#3540) and scoping for this feature.
Nice work, sibling. This matches the design we scoped closely. A few notes:
Looks good:
- Clean separation into models/router/service follows the existing codebase patterns
- Auth dual-path (both
X-Session-API-KeyandAuthorization: Bearer) is smart — OpenAI clients send Bearer, existing clients send the header X-OpenHands-ServerConversation-IDalready in v1 for conversation reuse — ahead of the planned PR 2- System messages layered as
system_message_suffix(not replacing soul/identity) is the right call - Ephemeral conversation cleanup in
finallyviaasyncio.create_task— no leaked conversations - Live-tested with real LLMs, artifacts as evidence
Worth discussing:
-
.pr/directory — the 226-line test runner and JSON artifacts are committed to the repo. Should these live in the PR description/comments instead? Or is this intentional for this repo's workflow? They'll stay in the tree after merge. -
Conversation history is dropped — for new conversations, only the system message and last user message are extracted from the OpenAI
messagesarray. Prior user/assistant turns are ignored. This matches our "stateless v1" design (each request = ephemeral conversation), but worth a note in the PR description so reviewers know it's intentional. For multi-turn, conversation reuse via the header is the path. -
Usage returns 0 tokens — honest, since the conversation pipeline doesn't surface token counts at this layer. Some OpenAI clients may warn about it. Minor, not a blocker.
-
HUMAN section is empty — CI will fail on the PR description check (same issue we hit on #3535). Needs Engel to fill in.
Overall: 👍 solid first implementation. The plumbing is right, the patterns match, and it actually works end-to-end.
|
@OpenHands Wire token usage into the OpenAI chat completion response. The data is already tracked in |
|
@smolpaws your session has expired. Please login again at OpenHands Cloud and try again. |
|
@OpenHands Wire token usage into the OpenAI chat completion response. The data is already tracked in |
|
I'm on it! smolpaws can track my progress at all-hands.dev |
This comment was marked as duplicate.
This comment was marked as duplicate.
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The new OpenAI-compatible gateway worked end-to-end through a real local agent-server: model listing, Bearer auth, chat completions via the OpenAI SDK, conversation reuse, and the documented streaming error all behaved as claimed.
Does this PR achieve its stated goal?
Yes. The PR set out to expose an OpenAI-compatible /v1/chat/completions gateway backed by agent-server conversations and saved LLM profiles. I verified the base branch has no /v1/* routes, then ran the PR server, created a profile through the native API, listed it as openhands_smoke, called client.chat.completions through the OpenAI SDK, received an OpenAI-shaped chat.completion, got X-OpenHands-ServerConversation-ID, reused that conversation on a second request, and confirmed the server actually called an OpenAI-compatible backend twice.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv run created/used the project environment and imports for openhands.agent_server, openai, and httpx succeeded. |
| CI Status | 🟡 Snapshot showed many checks green, with agent-server stress/build/qa jobs still in progress; no failing checks observed. |
| Functional Verification | ✅ Real HTTP/OpenAI SDK flow passed against the PR server; baseline returned 404 for the new routes. |
Functional Verification
Test 1: Baseline does not expose OpenAI-compatible routes
Step 1 — Reproduce / establish baseline (without the PR):
Checked out origin/main (31501b84) and ran:
uv run python /tmp/qa_openai_gateway.py baselineObserved excerpt:
{
"models_no_auth": {"status_code": 404, "body": {"detail": "Not Found"}},
"models_with_bearer": {"status_code": 404, "body": {"detail": "Not Found"}},
"chat_with_bearer": {"status_code": 404, "body": {"detail": "Not Found"}}
}This confirms the /v1/models and /v1/chat/completions behavior is new to this PR, not pre-existing behavior on main.
Step 2 — Apply the PR's changes:
Checked out PR commit a83d79fea4769cf99004ba5a189eab04157c2db8.
Step 3 — Re-run with the PR in place:
Started the real openhands.agent_server process with session auth enabled, created a saved LLM profile via POST /api/profiles/smoke, and used an OpenAI SDK client pointed at http://127.0.0.1:<port>/v1. The saved profile used a local OpenAI-compatible fake LLM endpoint so the full agent-server → LiteLLM/OpenAI-compatible HTTP path executed without external credentials.
Ran:
uv run python /tmp/qa_openai_gateway.py prObserved excerpt:
{
"unauth_models": {"status_code": 401, "body": {"detail": "Unauthorized"}},
"profile_create": {"status_code": 201, "body": {"message": "Profile 'smoke' saved", "name": "smoke"}},
"models_with_bearer": {
"status_code": 200,
"body": {"object": "list", "data": [{"id": "openhands_smoke", "object": "model", "created": 0, "owned_by": "openhands"}]}
},
"first_completion": {
"status_code": 200,
"object": "chat.completion",
"model": "openhands_smoke",
"content": "OpenAI gateway works end-to-end.",
"conversation_id_header": "176c308b-e67e-4858-bb6d-c8c1c819c0d4",
"usage": {"prompt_tokens": 11, "completion_tokens": 7, "total_tokens": 18}
},
"second_completion": {"model": "openhands_smoke", "content": "OpenAI gateway follow-up works."},
"conversation_persisted": {"status_code": 200},
"streaming_request": {"status_code": 400, "body": {"detail": "Streaming chat completions are not supported yet"}},
"fake_llm_call_count": 2,
"fake_llm_paths": ["/v1/chat/completions", "/v1/chat/completions"]
}This shows the feature works as an OpenAI-protocol client would use it: unauthenticated /v1 requests are rejected, Bearer auth is accepted, profile-backed model IDs are listed, chat.completions.create(...) returns OpenAI-shaped data, the response includes a reusable OpenHands conversation ID, the reused conversation remains accessible through the native API, and streaming currently returns the documented clear 400.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The OpenAI-compatible gateway works end-to-end: a real local agent-server accepted OpenAI SDK requests, returned chat completions, exposed profile-backed models, reused a conversation ID, and enforced the expected auth/error behavior.
Does this PR achieve its stated goal?
Yes. The PR set out to add an OpenAI-compatible Chat Completions gateway to agent-server so OpenAI-protocol clients can call an OpenHands agent through /v1/chat/completions. I verified the old/base server returns 404 for /v1/*, then verified this branch starts the real server, creates an LLM profile, lists it as openhands_gateway_demo, completes two OpenAI SDK chat requests with a real LLM, returns X-OpenHands-ServerConversation-ID, and reuses that same conversation on the second request.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv environment resolved and openhands.agent_server imported successfully |
| CI Status | ✅ Product CI checks observed green; only this qa-changes job was still in progress at observation time |
| Functional Verification | ✅ Real server + OpenAI SDK + HTTP auth/error probes behaved as described |
Functional Verification
Test 1: Baseline branch does not expose OpenAI-compatible routes
Step 1 — Reproduce / establish baseline without the feature:
Checked out origin/main, started a real local openhands.agent_server, and requested the new routes:
BASE /health => 200
BASE GET /v1/models:
HTTP/1.1 404 Not Found
{"detail":"Not Found"}
BASE POST /v1/chat/completions:
HTTP/1.1 404 Not Found
{"detail":"Not Found"}
This establishes the prior user-facing behavior: OpenAI-compatible /v1/* endpoints were not available on the base branch.
Step 2 — Apply the PR's changes:
Restored PR branch openhands/openai-chat-completions-gateway at a83d79fea4769cf99004ba5a189eab04157c2db8.
Step 3 — Re-run with the feature in place using a real OpenAI-protocol client:
Ran the new example against a real local agent-server with LLM_API_KEY and LLM_MODEL=gpt-4o-mini:
Gateway models include: openhands_gateway_demo
First answer: An OpenAI-compatible agent-server gateway exposes agent capabilities through OpenAI-style API endpoints, letting existing OpenAI client libraries send requests to an agent backend without custom integration code.
OpenHands conversation ID: 5cb4fb58-5902-46e5-8594-7e28945c4489
Second answer using same conversation: OpenAI-compatible agent-server gateway.
EXAMPLE_COST: 0.024682
The server logs for that run also showed POST /api/profiles/gateway_demo returning 201, GET /v1/models returning 200, both POST /v1/chat/completions calls returning 200, and GET /api/conversations/<id> returning 200 before cleanup. This confirms the new gateway works through the OpenAI Python SDK and the custom conversation ID header can continue the same OpenHands conversation.
Test 2: OpenAI route authentication and documented error behavior
Started a real local PR server with OH_SESSION_API_KEYS_0=qa-session-key and made direct HTTP requests:
GET /v1/models without auth:
HTTP/1.1 401 Unauthorized
{"detail":"Unauthorized"}
GET /v1/models with bearer auth:
HTTP/1.1 200 OK
{"object":"list","data":[]}
GET /v1/models with X-Session-API-Key:
HTTP/1.1 200 OK
{"object":"list","data":[]}
POST /v1/chat/completions stream=true:
HTTP/1.1 400 Bad Request
{"detail":"Streaming chat completions are not supported yet"}
POST /v1/chat/completions bad model id:
HTTP/1.1 404 Not Found
{"detail":"Unknown OpenHands model 'missing'. Use GET /v1/models."}
This verifies the OpenAI-style Bearer key path, the existing X-Session-API-Key path, the clear non-streaming limitation, and user-facing unknown-model feedback.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
Final verdict: PASS
|
@OpenHands check this comment #3545 (review) and solve the hardcoded path. See how agent-server start conversation request does it and / or what we have in config, I’m sure we can do better than this. Note that our first goal is for it to work locally, we need to follow the patterns in the codebase, which work well. Address other nits if you can. |
|
I'm on it! enyst can track my progress at all-hands.dev |
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR’s OpenAI-compatible agent-server gateway works end-to-end through a real local server and OpenAI SDK client.
Does this PR achieve its stated goal?
Yes. The stated goal is to let OpenAI-protocol clients call an OpenHands agent through /v1/chat/completions and discover profile-backed models through /v1/models. I verified that the base branch does not expose /v1/models, then on this PR branch a local agent-server accepted an OpenAI SDK chat completion request, returned a normal chat.completion, emitted X-OpenHands-ServerConversation-ID, and reused that conversation on a second OpenAI-compatible request. I also verified Bearer auth, legacy X-Session-API-Key auth, unsupported streaming, and missing-profile behavior with real HTTP requests.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv environment successfully. |
| CI Status | 🟡 GitHub checks: 28 successful, 3 skipped, 2 pending at review time; no failing checks observed. |
| Functional Verification | ✅ Local server + OpenAI SDK + direct HTTP requests exercised the new gateway behavior. |
Functional Verification
Test 1: Baseline confirms /v1/models is new behavior
Step 1 — Establish baseline without the PR:
Checked out origin/main, started openhands.agent_server with session auth, then ran:
curl -s -o /tmp/oh-qa-base-models-body.txt -w "%{http_code}" \
-H "Authorization: Bearer base-key" \
http://127.0.0.1:8782/v1/modelsObserved:
base_server_ready=1 pid=5204
base_v1_models_status=404
{"detail":"Not Found"}
This confirms the OpenAI-compatible /v1/models endpoint was not present on the base branch.
Step 2 — Apply the PR’s changes:
Checked out openhands/openai-chat-completions-gateway at 321b41812ac3e36236f49c2a48180fd9a0ab490d and ran make build.
Step 3 — Re-run with the fix/feature in place:
Ran the new example against a real local agent-server and a real proxy-backed LLM profile. I used gpt-5.1 because the injected QA key rejected the PR description’s gpt-5-nano model as unavailable for this key.
TMPHOME=$(mktemp -d /tmp/oh-qa-home-XXXXXX)
HOME="$TMPHOME" \
OH_CONVERSATIONS_PATH=/tmp/oh-qa-conversations-$RANDOM \
OH_BASH_EVENTS_DIR=/tmp/oh-qa-bash-$RANDOM \
LLM_API_KEY="$LLM_API_KEY" \
LLM_MODEL="gpt-5.1" \
uv run python examples/02_remote_agent_server/14_openai_compatible_gateway.pyObserved key output:
Gateway models include: openhands_gateway_demo
First answer: An OpenAI-compatible agent-server gateway exposes your own AI agents or models through the same HTTP API and schema as OpenAI, so existing OpenAI clients can call them without code changes.
OpenHands conversation ID: 3b9eb195-180b-401b-bc57-c45521392b9a
Second answer using same conversation: An OpenAI-compatible agent-server gateway
EXAMPLE_COST: 0.00723875
This shows the PR branch starts the server, creates the LLM profile, lists the OpenAI-compatible gateway model, calls /v1/chat/completions via the OpenAI SDK, receives a conversation ID response header, reuses that conversation ID on a second request, and cleans up resources.
Test 2: Auth and documented error behavior
Step 1 — Baseline:
The base branch returned 404 for /v1/models, so the auth/error behavior for /v1/* did not exist there.
Step 2 — Apply the PR’s changes:
On the PR branch, started a local server with OH_SESSION_API_KEYS_0=qa-session-key, created a profile through the native profile API, then made direct HTTP requests to /v1/*.
Step 3 — Exercise behavior with real HTTP requests:
The QA script issued these user-facing requests: unauthenticated GET /v1/models, Bearer-authenticated GET /v1/models, X-Session-API-Key GET /v1/models, streaming POST /v1/chat/completions, and missing-profile POST /v1/chat/completions.
Observed:
create_profile 201
models_without_auth 401
models_with_bearer 200 ['openhands_qa_gateway']
models_with_x_session 200 ['openhands_qa_gateway']
stream_request 400 {'detail': 'Streaming chat completions are not supported yet'}
unknown_model 404 {'detail': "Profile 'missing_profile' not found"}
cleanup_profile 200
This confirms /v1/* respects session auth, accepts OpenAI-style Bearer auth, preserves X-Session-API-Key, exposes profile-backed model IDs, returns the documented clear 400 for streaming, and returns a useful 404 for a missing profile.
Issues Found
None.
AI disclosure: This QA review was created by an AI agent (OpenHands) on behalf of the user.
|
Addressed the review feedback in 2269f71:
Validation run: TMUX_TMPDIR=/tmp/oh-test-tmux-precommit-$RANDOM uv run pre-commit run --files openhands-agent-server/openhands/agent_server/config.py openhands-agent-server/openhands/agent_server/openai_service.py openhands-agent-server/openhands/agent_server/openai_router.py tests/agent_server/test_env_parser.py tests/cross/test_remote_conversation_live_server.py
TMUX_TMPDIR=/tmp/oh-test-tmux-$RANDOM uv run pytest tests/cross/test_remote_conversation_live_server.py::test_openai_chat_completions_gateway_over_real_server tests/agent_server/test_api_authentication.py::test_openai_routes_accept_bearer_session_key tests/agent_server/test_env_parser.py::test_config_class_parsing -qBoth passed. This PR comment was created by an AI agent (OpenHands) on behalf of the user. |
This comment was marked as duplicate.
This comment was marked as duplicate.
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: OpenAI Chat Completions Gateway
🟡 Acceptable - Solid implementation with good architecture, but a few areas worth addressing.
Overview
This PR adds an OpenAI-compatible Chat Completions gateway to the agent-server. The implementation follows good practices:
- Clean separation of concerns (models, router, service)
- Proper conversation reuse via custom header
- Dual authentication support (X-Session-API-Key / Bearer)
- Good test coverage
[IMPROVEMENT OPPORTUNITIES]
-
[openai_service.py, Line 41-43] Hardcoded Constants:
_GATEWAY_TIMEOUT_SECONDS = 120.0and_POLL_INTERVAL_SECONDS = 2are good candidates for configuration. The timeout especially could vary based on expected workload. -
[openai_service.py, Lines 220-246] Complex Polling Logic: The
_wait_for_completionfunction has multiple exit conditions that are hard to follow. Consider extracting the terminal-state check logic into a helper for clarity. -
[openai_service.py, Lines 78-81] Minor:
_append_system_suffixjoins with double newline. This is fine, but worth noting that some clients may not expect trailing newlines in system messages.
[STYLE NOTES]
- No significant style issues detected. The code follows the existing patterns in the codebase.
[TESTING GAPS]
- [tests/cross/test_remote_conversation_live_server.py] Good: The
test_openai_chat_completions_gateway_over_real_servertest covers the main happy path and conversation reuse. Consider adding a test for the timeout behavior if feasible.
[RISK ASSESSMENT]
⚠️ Risk Assessment: 🟢 LOW- New feature with no breaking changes to existing APIs. The OpenAI endpoint is additive and follows standard conventions. Authentication integration is backward-compatible.
[VERDICT]
✅ Worth merging - Core implementation is sound. The issues noted are minor and don't block approval.
[KEY INSIGHT]
The conversation reuse mechanism via X-OpenHands-ServerConversation-ID header is well-designed - it maintains OpenAI SDK compatibility while preserving OpenHands' stateful conversation model.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new OpenAI-compatible gateway end-to-end with a real local agent-server, OpenAI Python client calls, and direct HTTP checks; the PR delivers the stated non-streaming gateway behavior.
Does this PR achieve its stated goal?
Yes. The goal was to add GET /v1/models and non-streaming POST /v1/chat/completions so OpenAI-protocol clients can call an OpenHands agent through agent-server, with Bearer auth and optional conversation reuse. I confirmed the base branch has no /v1/models route, then on this PR ran the included OpenAI-compatible client example against a real server: it created a profile, listed openhands_gateway_demo, received two successful /v1/chat/completions responses, returned X-OpenHands-ServerConversation-ID, reused that conversation on a second request, and cleaned up resources.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv environment successfully. |
| CI Status | 🟡 GitHub checks showed 36 successful, 1 pending (QA Changes by OpenHands/qa-changes), and 3 skipped at review time. |
| Functional Verification | ✅ Real server/API usage passed: baseline 404, PR model listing, chat completion, conversation reuse, auth, and streaming error behavior. |
Functional Verification
Test 1: Baseline confirms /v1/models did not exist before this PR
Step 1 — Reproduce / establish baseline without the feature:
Ran the base branch server from a detached origin/main worktree and queried the OpenAI-style route:
git worktree add --detach /tmp/oh-sdk-main-qa origin/main
cd /tmp/oh-sdk-main-qa
HOME=/tmp/oh-sdk-main-qa-home OH_CONVERSATIONS_PATH=/tmp/oh-sdk-main-qa-convos OH_BASH_EVENTS_DIR=/tmp/oh-sdk-main-qa-bash OH_ENABLE_VNC=0 OH_ENABLE_VSCODE=0 OH_PRELOAD_TOOLS=0 OH_WEBHOOKS='[]' OH_SESSION_API_KEYS_0=qa-session-key uv run python -m openhands.agent_server --host 127.0.0.1 --port 43099
curl -i -H 'Authorization: Bearer qa-session-key' http://127.0.0.1:43099/v1/modelsObserved:
GET /health -> {"status":"ok"}
HTTP/1.1 404 Not Found
content-type: application/json
{"detail":"Not Found"}
This establishes the baseline: the OpenAI-compatible /v1/models surface was absent before the PR.
Step 2 — Apply the PR's changes:
Used checked-out PR branch openhands/openai-chat-completions-gateway at commit 2269f715b94e224cb391dce1758e58c00cf63b3b.
Step 3 — Re-run with the feature in place:
The PR-specific verification below shows /v1/models exists and is usable by an OpenAI-compatible client.
Test 2: End-to-end OpenAI SDK usage through the gateway
Ran the actual user-facing example with a real LLM API key:
OPENHANDS_SUPPRESS_BANNER=1 OH_CONVERSATIONS_PATH=/tmp/oh-qa-conversations-$RANDOM OH_BASH_EVENTS_DIR=/tmp/oh-qa-bash-$RANDOM LLM_API_KEY="$LLM_API_KEY" uv run python examples/02_remote_agent_server/14_openai_compatible_gateway.pyKey observed output:
Starting OpenHands API server on http://127.0.0.1:8770...
Session API key is set; send it as X-Session-API-Key on API calls.
API server is ready at http://127.0.0.1:8770
POST /api/profiles/gateway_demo HTTP/1.1" 201
GET /v1/models HTTP/1.1" 200
Gateway models include: openhands_gateway_demo
POST /v1/chat/completions HTTP/1.1" 200
First answer: An OpenAI-compatible agent-server gateway lets clients use the standard OpenAI API format to send requests to an agent server, translating those requests into agent actions and returning responses in OpenAI-style results.
OpenHands conversation ID: 3d437027-fa9f-424a-93e5-a486d890c463
GET /api/conversations/3d437027-fa9f-424a-93e5-a486d890c463 HTTP/1.1" 200
POST /v1/chat/completions HTTP/1.1" 200
Second answer using same conversation: OpenAI-compatible agent-server gateway.
DELETE /api/conversations/3d437027-fa9f-424a-93e5-a486d890c463 HTTP/1.1" 200
DELETE /api/profiles/gateway_demo HTTP/1.1" 200
EXAMPLE_COST: 0.024857
API server stopped.
This verifies a real OpenAI SDK client can list the profile-backed gateway model, call non-streaming chat completions, receive an OpenHands conversation ID header, reuse that conversation ID, inspect the native conversation, and clean up.
Test 3: OpenAI-route auth and unsupported streaming behavior
Started the PR server with OH_SESSION_API_KEYS_0=qa-session-key, then exercised direct HTTP requests:
HOME=/tmp/oh-qa-pr-home OH_CONVERSATIONS_PATH=/tmp/oh-qa-pr-convos OH_BASH_EVENTS_DIR=/tmp/oh-qa-pr-bash OH_ENABLE_VNC=0 OH_ENABLE_VSCODE=0 OH_PRELOAD_TOOLS=0 OH_WEBHOOKS='[]' OH_SECRET_KEY='qa-secret-key-for-agent-server-32b' OH_SESSION_API_KEYS_0='qa-session-key' OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --host 127.0.0.1 --port 44947Observed:
# No auth
HTTP/1.1 401 Unauthorized
{"detail":"Unauthorized"}
# Wrong bearer
HTTP/1.1 401 Unauthorized
{"detail":"Unauthorized"}
# Valid bearer
HTTP/1.1 200 OK
{"object":"list","data":[]}
# stream=true request
HTTP/1.1 400 Bad Request
{"detail":"Streaming chat completions are not supported yet"}
This confirms the OpenAI routes map Bearer auth to the existing session key and reject unsupported streaming requests with a clear client error, matching the PR description.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
Co-authored-by: openhands <openhands@all-hands.dev>
|
Followed up on review #4462061451 in 2e7ecca:
Validation run: TMUX_TMPDIR=/tmp/oh-test-tmux-precommit-$RANDOM uv run pre-commit run --files openhands-agent-server/openhands/agent_server/openai_service.py
TMUX_TMPDIR=/tmp/oh-test-tmux-$RANDOM uv run pytest tests/cross/test_remote_conversation_live_server.py::test_openai_chat_completions_gateway_over_real_server -qBoth passed. This PR comment was created by an AI agent (OpenHands) on behalf of the user. |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: OpenAI Chat Completions Gateway
🟢 Good taste — Clean, pragmatic implementation that follows established patterns.
Summary
This PR adds an OpenAI-compatible gateway (/v1/models, /v1/chat/completions) to the agent-server. The architecture is well-designed and the implementation is production-ready.
What Works Well
- Clean separation:
models/router/servicestructure follows the existing codebase patterns - Dual auth: Both
X-Session-API-KeyandAuthorization: Bearerfor OpenAI SDK compatibility — the right call - Conversation reuse:
X-OpenHands-ServerConversation-IDheader enables stateful multi-turn interactions - Config-driven workspace: Uses
config.workspace_pathinstead of hardcoded paths (the previous concern was addressed) - Proper dependency:
openaipackage added as explicit dependency, reusing OpenAI Pydantic shapes for responses
Minor Observations
[IMPROVEMENT OPPORTUNITIES] (non-blocking)
- [openhands-agent-server/openhands/agent_server/openai_service.py, Line 318-323] Streaming: Streaming chat completions return a 400 error with a clear message. This is acceptable for v1 — consider adding streaming support in a follow-up since it's a common OpenAI client expectation.
Testing
The PR includes:
- Unit tests in
tests/agent_server/test_api_authentication.pyandtest_env_parser.py - Live integration tests with real LLM calls (
.pr/live-*.jsonartifacts) - Example script
examples/02_remote_agent_server/14_openai_compatible_gateway.py
Risk Assessment
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
The implementation is low risk: it adds new endpoints under /v1/ without modifying existing behavior, uses existing auth patterns, and has been validated through live testing.
VERDICT: ✅ Worth merging — Solid implementation with no blocking issues.
KEY INSIGHT: The design decision to expose LLM profiles as openhands_<profile_name> models is elegant — it keeps the gateway model selection aligned with the existing profile system while providing a clean OpenAI-compatible interface.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new OpenAI-compatible agent-server gateway end-to-end with a live local agent-server, a real profile-backed LLM call, OpenAI SDK requests, auth, conversation persistence/reuse, and documented error cases.
Does this PR achieve its stated goal?
Yes. The PR set out to let OpenAI-protocol clients call an OpenHands agent through /v1/chat/completions and list saved profiles through /v1/models; I verified both using the OpenAI Python SDK against a real local server. The baseline main server returned 404 for /v1/models, while the PR branch returned the profile-backed model, produced an OpenAI-shaped chat.completion, emitted X-OpenHands-ServerConversation-ID, persisted that conversation, and reused the same conversation ID for a follow-up request.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed; PR branch uv sync --dev refreshed the editable agent-server package. |
| CI Status | 🟡 At refresh time: 12 checks successful, 1 skipped, remaining checks queued/in progress. |
| Functional Verification | ✅ Gateway behavior worked end-to-end through real HTTP requests and the OpenAI SDK. |
Functional Verification
Test 1: Baseline endpoint availability before the feature
Step 1 — Reproduce / establish baseline (without the fix):
Ran git checkout origin/main, started a local server with uv run python -m openhands.agent_server --host 127.0.0.1 --port <free-port> using isolated HOME, OH_CONVERSATIONS_PATH, OH_BASH_EVENTS_DIR, OH_WORKSPACE_PATH, TMUX_TMPDIR, and OH_SESSION_API_KEYS_0=qa-session-key, then requested GET /v1/models with Authorization: Bearer qa-session-key.
Observed output:
BASE_URL=http://127.0.0.1:37193
GET /v1/models status=404
GET /v1/models body={"detail":"Not Found"}
This establishes the baseline: the OpenAI-compatible route did not exist on main.
Step 2 — Apply the PR's changes:
Checked out openhands/openai-chat-completions-gateway, reset to 2e7ecca0f6fca888997373d73fdc743802ffdeb7, and ran uv sync --dev to refresh the editable package.
Test 2: OpenAI-compatible models, completions, auth, reuse, and errors
Step 3 — Re-run with the feature in place:
Started a real local agent-server with the same isolated environment and OH_SESSION_API_KEYS_0=qa-session-key. Through HTTP I created an LLM profile named qa_gateway using the configured LLM_MODEL=litellm_proxy/openai/gpt-5.5, LLM_BASE_URL=https://llm-proxy.app.all-hands.dev, and LLM_API_KEY secret. Then I used OpenAI(api_key="qa-session-key", base_url="http://127.0.0.1:<port>/v1") to call models.list() and chat.completions.with_raw_response.create(...) twice, passing the returned X-OpenHands-ServerConversation-ID into the second request.
Observed output:
{
"unauthorized_models_status": 401,
"profile_create_status": 201,
"models_include_gateway": true,
"models": ["openhands_qa_gateway"],
"first_status": 200,
"first_conversation_header_present": true,
"first_completion": {
"id_prefix_ok": true,
"object": "chat.completion",
"model": "openhands_qa_gateway",
"finish_reason": "stop",
"message_role": "assistant",
"message_content_preview": "Hello QA, the OpenAI gateway is ready.",
"usage": {"completion_tokens": 30, "prompt_tokens": 3905, "total_tokens": 3935}
},
"conversation_lookup_status": 200,
"conversation_workspace": "/tmp/oh-qa-pr-gateway-q2tswe0a/workspace",
"second_status": 200,
"second_header_reused": true,
"second_completion": {
"id_prefix_ok": true,
"object": "chat.completion",
"model": "openhands_qa_gateway",
"finish_reason": "stop",
"message_role": "assistant",
"message_content_preview": "QA"
},
"stream_true_status": 400,
"stream_true_detail": "Streaming chat completions are not supported yet",
"invalid_model_status": 404,
"invalid_model_detail": "Unknown OpenHands model 'not_a_gateway_model'. Use GET /v1/models."
}This confirms the claimed user-facing behavior: an OpenAI-compatible client can list the saved profile as a model, authenticate with Bearer auth, receive a valid OpenAI-shaped chat completion backed by the OpenHands agent flow, get a reusable OpenHands conversation ID header, continue that conversation on a second OpenAI SDK request, and receive clear responses for unsupported streaming and invalid model IDs.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
|
@xingyaoww I think maybe I got too used to remake LLM APIs from scratch, so I didn’t blink much when the agent did it. But I made it reuse Is that OK? |
xingyaoww
left a comment
There was a problem hiding this comment.
LGTM - just a few nits, definitely open to experiment with this.
We should also need a docs PR
|
@OpenHands Why does duplicate example CI job fail now, check main branch, did we just add an example? Redo the numbering for this one and fix the PR. Then do this : #3545 (comment) |
Yes, we already have a docs PR - will fix! |
Co-authored-by: openhands <openhands@all-hands.dev>
|
Addressed the latest feedback in e54ed38:
Validation run: python .github/scripts/check_duplicate_example_numbers.py
TMUX_TMPDIR=/tmp/oh-test-tmux-precommit-$RANDOM uv run pre-commit run --files examples/02_remote_agent_server/15_openai_compatible_gateway.py openhands-agent-server/openhands/agent_server/api.py openhands-agent-server/openhands/agent_server/openai/README.md openhands-agent-server/openhands/agent_server/openai/__init__.py openhands-agent-server/openhands/agent_server/openai/models.py openhands-agent-server/openhands/agent_server/openai/router.py openhands-agent-server/openhands/agent_server/openai/service.py tests/agent_server/test_api_authentication.py tests/cross/test_remote_conversation_live_server.py
TMUX_TMPDIR=/tmp/oh-test-tmux-$RANDOM uv run pytest tests/agent_server/test_api_authentication.py::test_openai_routes_accept_bearer_session_key tests/cross/test_remote_conversation_live_server.py::test_openai_chat_completions_gateway_over_real_server tests/cross/test_remote_conversation_live_server.py::test_openai_gateway_replays_frozen_llm_fixtures -qAll passed. This PR comment was created by an AI agent (OpenHands) on behalf of the user. |
HUMAN:
This PR proposes adding an endpoint to run the agent, which is openai-compatible. That is, let people run the Openhands agent as if it was an LLM 😅
According to my agent's and my research, that would allow people to:
Architecture note:
For review:
the
.pr/directory is temporary for live-testing, and for reviewers to see the testing artefacts.it will be automatically removed on PR approval.
A human has tested these changes.
AGENT:
This PR was created by an AI agent (OpenHands) on behalf of the user.
Why
Fixes #3594 as part of #3540 by adding a KISS OpenAI-compatible Chat Completions gateway to the agent-server, so OpenAI-protocol clients can call an OpenHands agent through
/v1/chat/completions.For review:
This is the first PR in a series of several. Below you find the tests. You should approve if there are no major issues, so we can continue and improve the new endpoint.
Summary
POST /v1/chat/completionsfor non-streaming OpenAI-compatible requests backed by agent-server conversations and profile-selected LLMs.GET /v1/modelsto expose saved LLM profiles asopenhands_{profile_name}model IDs./v1/*, preservesX-Session-API-Key, and emitsX-OpenHands-ServerConversation-IDfor optional conversation reuse..pr/live-test artifacts for OpenAIgpt-5-nanoand the eval LiteLLM proxy Haiku profile.Architecture:
/v1/chat/completionsis an OpenAI-compatible ingress into the normal agent-server conversation flow: model/profile lookup → conversation creation or reuse → agent run → OpenAI response formatting.PR-only Live-test Artifacts
The
.pr/files are intentionally committed PR-only live-test artifacts, following the repository's PR Artifacts workflow (.github/workflows/pr-artifacts.yml). That workflow posts or updates the PR artifacts notice when.pr/exists, and on same-repository PR approval (pull_request_review.submittedwith stateapproved) runs thecleanup-on-approvaljob togit rm -rf .pr/, commitchore: Remove PR-only artifacts [automated], push the cleanup commit to the PR branch, and update the notice comment.For fork PRs, the workflow intentionally skips the push cleanup because it cannot safely write to the contributor's branch; in that case
.pr/must be removed manually before merge.Reviewer workflow: keep
.pr/during review so the live-test commands, logs, and JSON responses are visible; approve when the code and live evidence are acceptable; let the PR Artifacts workflow remove.pr/; then merge the cleaned branch without those temporary artifacts.Issue Number
Fixes #3594
Parent issue: #3540
How to Test
Automated validation run:
Live smoke validation run with real external LLMs:
gpt-5-nano→openhands_openai_nanolitellm_proxy/anthropic/claude-haiku-4-5-20251001withbase_url=https://llm-proxy.eval.all-hands.dev→openhands_haiku_eval_proxy.pr/live-tests.mdand captured in.pr/live-*.jsonplus.pr/live-server.log.Video/Screenshots
N/A. This is an API endpoint change. See
.pr/live-tests.mdand the JSON/log artifacts in.pr/for live endpoint evidence.Evidence
Live run of the new OpenAI-compatible gateway example with
gpt-5-nano:Observed output:
This verifies that the example starts the local agent-server, creates a profile, lists the OpenAI-compatible model, calls
/v1/chat/completionsthrough the OpenAI SDK, receivesX-OpenHands-ServerConversation-ID, reuses that conversation ID on a second request, and cleans up afterward.Type
Notes
X-OpenHands-ServerConversation-IDreuses that server conversation instead.@enyst can click here to continue refining the PR
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:e54ed38-pythonRun
All tags pushed for this build
About Multi-Architecture Support
e54ed38-python) is a multi-arch manifest supporting both amd64 and arm64e54ed38-python-amd64) are also available if needed