fix(llm): recognize claude-opus-4-8 as a reasoning model on Bedrock#3695
Open
kimnamu wants to merge 1 commit into
Open
fix(llm): recognize claude-opus-4-8 as a reasoning model on Bedrock#3695kimnamu wants to merge 1 commit into
kimnamu wants to merge 1 commit into
Conversation
LiteLLM (>=1.84.1) recognizes the first-party "anthropic/claude-opus-4-8" id as a reasoning model, but not the Bedrock cross-region inference ids (e.g. "bedrock/us.anthropic.claude-opus-4-8-v1:0"). Those forms fall through to the non-reasoning branch in chat_options.py and leave temperature/top_p in the request, which Anthropic rejects for this model. Add "claude-opus-4-8" to the REASONING_EFFORT_MODELS override list, mirroring the existing claude-fable-5 entry (OpenHands#3642). This routes the model through the reasoning_effort path that strips temperature/top_p, without taking the extended-thinking path that injects the legacy thinking.type=enabled block (which Anthropic now rejects for opus-4-8 — see reverted OpenHands#3427 / revert OpenHands#3441). Co-authored-by: openhands <openhands@all-hands.dev> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
HUMAN:
I reviewed this change locally against the pinned litellm 1.84.1, confirmed the Bedrock cross-region reasoning gap, and verified the regression tests fail before / pass after the one-line fix.
AGENT:
Why
LiteLLM 1.84.1 recognizes
anthropic/claude-opus-4-8but not its Bedrock cross-region ids (bedrock/{us,eu,apac,global}.anthropic.claude-opus-4-8-v1:0), so on Bedrock the model falls through to the non-reasoning branch and leakstemperature/top_p→ Anthropic 400. This mirrors the fable-5 gap fixed in #3642; opus-4-7's Bedrock ids are already covered by LiteLLM metadata.It deliberately uses
REASONING_EFFORT_MODELS(strips temp/top_p only), notEXTENDED_THINKING_MODELS— adding it there is what #3427 did and #3441 reverted, because that branch injects the legacythinking.type=enabledblock +interleaved-thinkingheader that Anthropic now rejects for opus-4-8.Summary
"claude-opus-4-8"toREASONING_EFFORT_MODELS(model_features.py) — one data line, mirrors the fable-5 fix (fix(llm): recognize claude-fable-5 as a reasoning model #3642).us./eu./apac./global.).Issue Number
Fixes #3694
How to Test
Run the added offline unit tests (no network):
Before/after evidence: with the source line reverted (
git stash push -- openhands-sdk/openhands/sdk/llm/utils/model_features.py), the new tests FAIL —temperatureleaks and_supports_reasoning_effort("bedrock/us.anthropic.claude-opus-4-8-v1:0")returnsFalse(5 failed, 36 passed). Restoring the line →174 passed. Localpre-commit(ruff + pyright) passes._supports_reasoning_effort("bedrock/us.anthropic.claude-opus-4-8-v1:0")anthropic/claude-opus-4-8(already worked)thinking.type=enabledblock / beta headerType
Notes
This contribution was prepared with the help of an AI agent (Claude Code); a human reviewed the change, rationale, and test results before submission.