Add ChatGPT subscription LLM support#744
Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
all-hands-bot
left a comment
There was a problem hiding this comment.
Review Summary
🟡 Acceptable with critical issues - The subscription auth feature is well-structured overall, but has fundamental issues that must be addressed before merging.
[CRITICAL ISSUES]
-
[__tests__/api/llm-subscription-service.test.ts] Test Quality: The core test suite violates repository guidelines by mocking the entire
LLMMetadataClientHTTP client instead of testing real code paths with MSW. This means tests won't catch actual regressions when the HTTP layer, auth headers, or server contract changes. Rewrite to use the MSW handlers already available insettings-handlers.ts. -
[src/components/features/settings/llm-profiles/llm-settings-local-view.tsx:170] Complexity: The
handleSavefunction has >4 levels of nesting (try-catch → rename if → auth type if-else → subscription model if → API key if-else → base_url if → wasActive if), violating the repository's '3 levels max' guideline. Extract helpers:buildLlmConfigForAuthType(),handleProfileRename(),handleProfileReactivation(). -
[src/api/llm-subscription-service.ts:78] Type Safety: Unsafe casting
(client as unknown as LLMMetadataClientWithTransport).clientto access internal transport. If@openhands/typescript-clientchanges its internal structure, this fails at runtime with cryptic errors. Add runtime validation that the transport exists and has the expected methods before using it.
[IMPORTANT ISSUES]
-
[src/components/features/settings/llm-settings/openai-subscription-auth-card.tsx:61] User Experience: When polling returns
connected: false, an error toast is shown with a "PENDING" message. This is misleading—users think something failed when they just need to wait. Use an info/warning toast or inline status text instead. -
[src/components/features/settings/llm-settings/openai-subscription-auth-card.tsx:36] State Management: Users can get stuck in the challenge state if they start device login but close the tab or navigate away. No way to clear the challenge except completing login or logging out (which fails if not connected). Add a "Cancel" button to reset challenge state.
-
[src/routes/llm-settings.tsx:219] Data Flow: Model selection is silently changed when switching auth types. If a user has 'gpt-4o', switches to subscription to explore, then switches back, their original selection is lost forever (replaced with
defaultModel). Store previous models in component state and restore them, or at minimum show a toast explaining the change. -
[src/api/llm-subscription-service.ts:109] Validation: Missing validation for empty strings in
normalizeDeviceChallenge. A buggy server could return empty strings that pass the!deviceCodecheck. Add.trim().length > 0validation fordeviceCode,userCode, andverificationUri. -
[__tests__/api/llm-subscription-service.test.ts] Test Coverage: Missing critical test cases: error handling (401/403, 500, timeouts, malformed responses), validation (incomplete device challenges), end-to-end integration (full device flow), and edge cases (empty strings, null values, concurrent requests).
-
[src/api/llm-subscription-service.ts:87] Error Handling: Network errors from the HTTP client propagate without context. Wrap errors to provide more specific messages about which subscription operation failed.
[SUGGESTIONS]
-
[src/api/agent-server-adapter.ts:820] Documentation: Add docstring to
assertSubscriptionAuthReadynoting that it's NOT called when resuming conversations or sending additional messages, so subscriptions could expire mid-conversation. The agent-server must handle expired tokens gracefully. -
[src/api/llm-subscription-service.ts:136] Data Structure: The
expiresAtfield tries both string and number fallbacks, includingexpires_inwhich is typically a relative duration in seconds, not an absolute timestamp. Mixing relative and absolute time values in the same field could cause confusion. Document the expected behavior or convert relative durations to absolute timestamps. -
[src/hooks/query/use-llm-subscription-status.ts:14] Query Configuration:
staleTimeof 30 seconds seems short for auth status that rarely changes. Consider 5 minutes to reduce unnecessary network requests, especially since mutations properly invalidate this query.
[RISK ASSESSMENT]
🟡 MEDIUM RISK - This PR adds new authentication flows and modifies conversation creation logic. The main risks are:
- Untested error paths could cause poor UX when auth fails
- Complex nested logic increases maintenance burden
- Type safety issues could cause runtime failures with SDK updates
Not blocking auto-merge based on risk alone, but the critical issues above should be addressed.
VERDICT:
❌ Needs rework - Address the critical test quality and complexity issues, then fix the important UX and validation gaps.
KEY INSIGHT:
The subscription auth implementation follows OAuth device flow patterns correctly, and backward compatibility is preserved (API key flows still work). However, the test suite doesn't provide confidence that the integration will work in production, and the complex nested save logic creates maintenance risk. Focus first on rewriting tests to use MSW and refactoring handleSave.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/agent-canvas/actions/runs/26336381445
Co-authored-by: openhands <openhands@all-hands.dev>
- Merge /api/llm/subscription/openai/models with chatgpt/ provider models from /api/llm/models so the subscription dropdown stays in sync with the full LiteLLM registry (e.g. includes gpt-5.5) - Add gpt-5.3-codex to the hardcoded fallback constant - Replace hardcoded OPENAI_SUBSCRIPTION_MODELS array in the dropdown with a dynamic useOpenAISubscriptionModels hook (falls back to constant when endpoints are unavailable) - Device code block: remove 'Code: ' prefix, fix black-on-dark contrast by using --oh-surface-primary / --oh-text-primary tokens, add CopyToClipboardButton for one-click copy Co-authored-by: openhands <openhands@all-hands.dev>
LiteLLM's chatgpt/ provider registry does not yet include gpt-5.5, but it is available via ChatGPT Plus/Pro subscription. Add it to the hardcoded fallback so it appears in the dropdown regardless of whether the backend's chatgpt/ entries have been updated. Co-authored-by: openhands <openhands@all-hands.dev>
- Drop /api/llm/subscription/openai/models as a model source; it was added in this PR set and is redundant — LiteLLM's chatgpt/ provider is the single authoritative list - Remove the OPENAI_SUBSCRIPTION_MODELS hardcoded fallback array, OpenAISubscriptionModel type, DEFAULT_OPENAI_SUBSCRIPTION_MODEL, and isOpenAISubscriptionModel type guard — all replaced by the live subscriptionModels from the hook, or a simple pass-through in non-hook contexts (adapter, profile builder) - Update agent-server-adapter test to reflect pass-through behavior Co-authored-by: openhands <openhands@all-hands.dev>
|
I tested the ChatGPT subscription flow locally. The frontend/model-selection parts of this PR helped, but the remaining Observed behavior:
The likely root cause is that the SDK loses subscription runtime state after serializing/reloading the LLM config. A saved conversation LLM contains: { But after validating/reloading it as an SDK LLM, llm.is_subscription becomes False. That means Potential SDK fix: @Property There is a second issue after auth is fixed: LiteLLM’s streamed Responses completion can expose completed_event.response as a plain dict, while the SDK expects an object with .output. I verified the stream output can be recovered from response.output_item.done events and parsed successfully. So I think this PR may need a companion SDK fix for:
|
Co-authored-by: openhands <openhands@all-hands.dev>
|
Updated this stacked PR in 1e94188: merged current main, resolved package conflicts, kept the temporary @openhands/typescript-client branch pin at b025805, rewrote subscription service tests to use MSW/real client paths, tightened device challenge validation, added login cancel/pending UX, preserved model selection when toggling auth type, and refactored profile-save logic. Local verification: npm run lint, npm run build, and full npm test passed.\n\nThis comment was generated by an AI agent (OpenHands) on behalf of neubig. |
|
Uh oh! There was an unexpected error starting the job :( |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Review Summary
✅ Previous critical issues resolved — The three blockers from the prior review have all been addressed: tests now use MSW handlers instead of full-client mocks, handleSave complexity has been properly refactored into helper functions (buildProfileLlmConfig, renameProfileIfNeeded, reactivateProfileIfNeeded), and the unsafe (client as unknown as LLMMetadataClientWithTransport).client cast is gone. The cancel button, model restoration refs, and empty-string validation in readString are all present. This is a significant quality improvement.
The OAuth device flow pattern is correctly implemented, subscription credential omission from conversation payloads is good security practice, and withLlmClient's finally { client.close() } is solid resource management.
[IMPORTANT ISSUES]
-
Unconditional subscription model fetch (
src/routes/llm-settings.tsx:140):useOpenAISubscriptionModels()fires on every mount ofLlmSettingsScreen, including for users who never use subscription auth. The hook supports{ enabled }— passenabled: isSubscriptionModewhereisSubscriptionModeis derived from the current settings' auth type. The silent failure path means no visible bug, but it generates unnecessary traffic for all API-key users and may produce 404/500 noise in agent-server logs. -
Race condition: model saved as empty string (
src/routes/llm-settings.tsx:408): InbuildPayload, ifsubscriptionModelsis stillundefinedwhen the user saves (e.g. they switch to subscription and immediately click save before the model list loads),subscriptionModels?.[0] ?? ""evaluates to"". The same race exists inhandleAuthTypeChangeat line 235 — switching before models load firesonChange("llm.model", ""). ThesubscriptionModelValuedisplay fallback at line 181–183 hides this in the UI, but the underlying form value is"". Guard the save path: ifauthType === LLM_AUTH_TYPE_SUBSCRIPTION && !subscriptionModels?.length, show a validation message or block the save.
[SUGGESTIONS]
-
assertSubscriptionAuthReadymissing lifecycle note (src/api/agent-server-adapter.ts:844): Flagged in the previous review, still unaddressed. The function is called at conversation start and profile switch, but not on subsequent message sends or conversation resume. A one-line JSDoc noting this scope prevents future maintainers from assuming token expiry is covered end-to-end. -
@openhands/typescript-clientgit dep tracking (__tests__/package-library.test.ts:57): The comment says "temporarily allowed while this stacked PR waits for the subscription client branch to merge/release" but there is no linked issue or deadline. Link a tracking issue or PR in the comment so this exemption doesn't outlive its purpose. -
staleTimeof 30s for subscription status (src/hooks/query/use-llm-subscription-status.ts:14): Raised in the previous review. Since both poll and logout mutations already invalidate this query, the 30-second background refetch is redundant. Bumping to 5 minutes reduces background requests with no UX regression.
VERDICT: 🟡 Close to merge-ready. The race condition in buildPayload/handleAuthTypeChange can silently produce a profile with an empty model field — worth a targeted fix before merge. The unconditional model fetch is a UX polish issue. Everything else is non-blocking.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Co-authored-by: openhands <openhands@all-hands.dev>
|
@openhands-ai[bot] it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again. |
1 similar comment
|
@openhands-ai[bot] it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again. |
Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: Add ChatGPT subscription LLM support
🟡 Acceptable - The implementation is functional and the architecture is reasonable, but there are a few issues that should be addressed before merging.
[CRITICAL ISSUES]
-
[src/routes/llm-settings.tsx, ~line 450] Error thrown in callback:
buildPayloadthrowsnew Error("Subscription models are not loaded yet.")when subscription models haven't loaded. This is problematic because:- Raw
Errorthrown in a React callback won't be caught gracefully - Poor user experience - users will see a cryptic error instead of a loading state
- Consider returning a validation result object instead, or preventing form submission until models are loaded
- Raw
-
[src/mocks/workspace-handlers.ts] Unrelated file: This new file contains workspace mock handlers and appears completely unrelated to LLM subscription support. It should be removed from this PR or moved to a separate PR.
[IMPROVEMENT OPPORTUNITIES]
-
[src/routes/llm-settings.tsx,
handleAuthTypeChange] Complex branching: This function has deep nesting with multiple early returns for auth type switching. Consider extracting the model persistence/switching logic into a separate helper function to improve readability. -
[src/components/features/settings/llm-profiles/llm-settings-local-view.tsx] Duplicate auth handling: The auth type logic in
handleSaveduplicates code fromllm-settings.tsx. Consider extracting this into a shared utility to reduce code duplication. -
[src/api/agent-server-adapter.ts,
assertSubscriptionAuthReady] Vendor-specific check: This function is hardcoded to check OpenAI subscription status, but the schema suggests future subscription vendors. Consider renaming or adding a vendor parameter to make this extensible.
[TESTING GAPS]
- [tests/components/settings/llm-profiles/llm-settings-local-view.test.tsx] Test regression: The test "calls save mutation with correct payload and returns to list" was simplified from a full integration test with proper
userEventinteractions to a simpler test withfireEvent. The new test doesn't verify the actual save behavior - it just checks that the create view remains stable. Consider restoring the original test behavior or renaming the test to reflect what it actually tests.
[STYLE NOTES]
- The new
src/mocks/workspace-handlers.tsfile (89 lines) is completely unrelated to this PR. Please remove it or explain why it belongs here.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
The core functionality (ChatGPT subscription auth via device code flow) is implemented correctly. However, the error-throwing behavior inbuildPayloadposes a runtime risk that could cause unhandled exceptions. The unrelated workspace-handlers.ts file adds noise and potential confusion.
VERDICT:
❌ Needs rework: The error-throwing pattern in buildPayload must be fixed before merging. The unrelated workspace-handlers.ts file should be removed.
KEY INSIGHT:
Throwing raw Error objects inside React callbacks is an anti-pattern that leads to poor user experience; use validation result objects or conditional rendering instead.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
✅ Mock-LLM E2E Tests53/53 passed Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
🛑 Mock-LLM Docker E2E Test Results27/29 passed · 2 skipped · Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
|
Addressed the fresh AI review in
I did not remove Local validation:
|
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: Add ChatGPT subscription LLM support
Taste Rating
🟡 Acceptable — Clean implementation with minor UX and robustness gaps worth addressing before merge.
Analysis
This PR introduces support for ChatGPT subscription-based LLM authentication via OpenAI's device code flow. The implementation follows the existing patterns in the codebase and adds:
- A subscription service for device authentication
- React Query hooks for status and model fetching
- UI components for the auth card and subscription settings
- Schema injection for subscription-related fields
[IMPROVEMENT OPPORTUNITIES]
-
🟡 Manual polling UX —
openai-subscription-auth-card.tsx:64-78requires users to click "Finish Sign In" after completing browser authentication. Consider auto-polling with a retry loop to match standard device code flow patterns. -
🟡 No retry/refresh on subscription unavailable —
openai-subscription-auth-card.tsx:118-121shows "unavailable" but provides no way to retry. Consider adding a retry button or refetch on click. -
🟡 Subscription models cached indefinitely —
use-llm-subscription-models.ts:14usesstaleTime: 1000 * 60 * 5but models could change (user upgrades/downgrades). Consider shorter staleTime or adding a refetch mechanism when auth status changes. -
🟡 No cleanup for expired device challenges — If the user starts login but doesn't complete it within the challenge expiry window, the UI doesn't show this state. The
expiresAtfield is stored but never displayed.
[TESTING]
✅ Tests in __tests__/api/llm-subscription-service.test.ts cover the main service methods including error cases and normalization logic.
✅ MSW handlers in src/mocks/settings-handlers.ts properly mock the subscription endpoints.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
This is an additive feature with no breaking changes to existing functionality. The subscription auth is opt-in via a new auth type selector. Code paths are guarded by the isSubscriptionLlmConfig check, so existing API key users are unaffected. No new dependencies are introduced.
Verdict
✅ Worth merging — Core logic is sound, with suggestions for improving UX and robustness.
Key Insight
The device code flow implementation is correct per OAuth standards, but requires manual user action to poll — consider auto-polling to match expected behavior for this auth pattern.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
✅ Mock-LLM E2E Tests53/53 passed Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
🛑 Mock-LLM Docker E2E Test Results35/38 passed · 3 skipped · Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
|
| Status | Test | Duration |
|---|
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)
🛑 Mock-LLM E2E Tests13/13 passed · Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: Add ChatGPT subscription LLM support
Taste Rating: 🟡 Acceptable
The implementation adds ChatGPT subscription authentication support using OAuth Device Code flow. The core logic is sound, and the feature integrates well with the existing settings infrastructure.
Summary
This PR introduces support for authenticating with ChatGPT Plus/Pro subscriptions to use supported Codex models without requiring an OpenAI API key. The implementation includes:
- New subscription service and API endpoints
- UI components for device code authentication flow
- Schema modifications to support subscription auth type
- Comprehensive tests for the new functionality
🟡 Minor Issue
[package-lock.json] Unintentional Change: The diff shows a license field being removed from @openhands/typescript-client. This appears unintentional — npm typically adds this field automatically during install. Consider reverting this specific change.
🟡 Suggestions
[src/api/agent-server-adapter.ts, Line 424] Silent Data Loss: The toRecord function silently converts undefined or null values to {}:
function toRecord(value: unknown): SettingsRecord {
if (!value || typeof value !== 'object' || Array.isArray(value)) {
return {};
}
return structuredClone(value as SettingsRecord);
}This is pre-existing code, but it's worth noting that any caller passing undefined or null will get an empty object back with no indication of error. Consider adding a runtime check or type guard to prevent silent failures if the expected data is missing.
[src/routes/llm-settings.tsx, Lines 458-467] Model Selection Logic: The fallback logic for subscription model selection could be simplified. The double-check !fallbackSubscriptionModel in the condition is redundant since subscriptionModels?.[0] would be falsy anyway.
Test Coverage ✅
The tests for LLMSubscriptionService and the adapter modifications look comprehensive:
- Tests cover the subscription auth flow (device start, poll, logout)
- Adapter tests verify the correct payload construction for subscription vs API key auth
- Mock handlers in
settings-handlers.tsproperly simulate the subscription endpoints
Security Assessment
🟢 LOW RISK
- Uses OAuth Device Code flow, a standard and secure authentication mechanism
- No sensitive credentials stored in code
- API keys are properly handled and excluded from settings payloads
- Device code tokens have appropriate expiration handling
Verdict
✅ Worth merging: Core functionality is well-implemented. The minor issues above are suggestions for improvement rather than blockers.
Key Insight
The subscription authentication integrates cleanly with the existing settings infrastructure by adding a new auth_type field to distinguish between API key and subscription-based authentication, with the adapter properly sanitizing the payload based on auth type.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
✅ Mock-LLM E2E Tests53/53 passed Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
🔶 Mock-LLM Docker E2E Test Results48/53 passed · 5 skipped Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses) |
@/tmp/pr744-body-updated.txt
🐳 Docker images for this PR
• GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas
ghcr.io/openhands/agent-canvasghcr.io/openhands/agent-server:1.28.1-pythonopenhands-automation==1.0.0a9f96869da032d256ee6d3dfaffadc6cab69556be4Pull (multi-arch manifest)
# Multi-arch manifest — Docker automatically pulls the correct architecture docker pull ghcr.io/openhands/agent-canvas:sha-f96869dRun
All tags pushed for this build
About Multi-Architecture Support
sha-f96869d) is a multi-arch manifest supporting both amd64 and arm64sha-f96869d-amd64) are also available if needed