Skip to content

Intelligent model router: classify_and_switch_llm tool + meta-profiles#3744

Open
juanmichelini wants to merge 8 commits into
mainfrom
jmj/intelligent-model-router
Open

Intelligent model router: classify_and_switch_llm tool + meta-profiles#3744
juanmichelini wants to merge 8 commits into
mainfrom
jmj/intelligent-model-router

Conversation

@juanmichelini

@juanmichelini juanmichelini commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

HUMAN:

  • I (the human author) have tested these changes end-to-end.

Evidence:

Using this simple meta-profile

  {
    "classifier_model": "minimax-2.5",
    "default_model": "minimax-m3",
    "classes": [
      {
        "description": "tasks is UI oriented or requires looking pictures",
        "model": "gpt-5.4"
      },
      {
        "description": "tasks is research oriented, requires looking at links",
        "model": "Gemini-3.1-Pro"
      },
      {
        "description": "tasks is test oriented, just adding tests",
        "model": "claude-opus-4-8"
      }
    ]
  }

starting with claude-sonnet-4-5

image

models routes successfully to gpt-5.4 and answers

image

AGENT:

Why

OpenHands users want a single conversation to automatically use the best model
for the task at hand (e.g. a cheap model for tests, a stronger model for
research) without manually switching profiles. This adds an intelligent model
router that classifies the current task and switches the conversation to the
most suitable saved LLM profile.

To make the router usable from a UI, this PR also exposes meta-profiles through
a dedicated CRUD API (previously they could only be created by hand-editing
files under ~/.openhands/meta-profiles/ or set via PATCH /api/settings).

Summary

  • Add ClassifyAndSwitchLLMAction / ClassifyAndSwitchLLMTool (runtime name classify_and_switch_llm): reads the active meta-profile, makes a single classifier-model call over the recent conversation, and switches to the chosen class's LLM profile (falls back to default_model).
  • Add MetaProfile / MetaProfileClass / MetaProfileStore reading ~/.openhands/meta-profiles/*.json, plus settings wiring (enable_classify_and_switch_llm_tool, active_meta_profile) in OpenHandsAgentSettings and agent-server PersistedSettings.
  • When the tool is enabled but no meta-profile is active, fall back to the first available meta-profile instead of erroring.
  • Meta-profile management API (new):
    • Extend MetaProfileStore with save(), delete(), and list_summaries(), plus a MetaProfileLimitExceeded guard; export the new symbols from openhands.sdk.llm.
    • Add meta_profiles_router.py (registered in api.py) exposing GET /api/meta-profiles (list summaries + active_meta_profile), GET /api/meta-profiles/{name}, POST /api/meta-profiles/{name} (create/update), DELETE /api/meta-profiles/{name}, and POST /api/meta-profiles/{name}/activate (records active_meta_profile in settings). Mirrors the existing profiles_router.py.

Issue Number

N/A

How to Test

  1. Save the LLM profiles referenced by a meta-profile into ~/.openhands/profiles/ (classifier model, default model, and each class's model).
  2. Create a meta-profile — either via the new API (POST /api/meta-profiles/<name> with {classifier_model, default_model, classes}) / the companion Agent Canvas UI, or by hand at ~/.openhands/meta-profiles/<name>.json.
  3. Build an agent with OpenHandsAgentSettings(..., enable_classify_and_switch_llm_tool=True, active_meta_profile="<name>") (or omit active_meta_profile to exercise the new first-profile fallback) and start a conversation.
  4. Send a task that clearly matches one class (e.g. "add unit tests for the parsing module") and instruct the agent to call classify_and_switch_llm.
  5. Confirm the Classified task and switched LLM observation names the expected target profile and the conversation continues on the new model.

Automated checks run locally:

uv run pytest tests/sdk/tool/test_classify_and_switch_llm.py \
  tests/sdk/llm/test_meta_profile_store.py tests/sdk/test_settings.py \
  tests/agent_server/test_active_meta_profile.py
# 150 passed

# New meta-profile management API/store coverage:
uv run pytest tests/sdk/llm/test_meta_profile_store.py \
  tests/agent_server/test_meta_profiles_router.py
# 26 passed

uv run ruff check ...        # All checks passed
uv run ruff format --check   # clean
uv run pyright ...           # 0 errors, 0 warnings

Manual end-to-end: triggered the tool in a live conversation; it classified a
research task and switched to the Gemini profile successfully. (Note: switching
into a Gemini-3 thinking model can then 400 with "Thought signature is not
valid" because inherited history lacks Gemini thought signatures — a provider
limitation, not this router's logic; use a non-thinking target profile.)

The new endpoints were also verified end-to-end over real HTTP (with session
auth) against a local agent-server: list / create (201) / get / activate / delete (200), plus 404 for a missing profile and 422 for an invalid body,
reading/writing the real ~/.openhands/meta-profiles/ store.

Review fixes (commit 6468a93)

Addressed PR-review feedback on the meta-profile management surface:

  1. Activation was not wired to the agent settings that control behavior.
    activate_meta_profile() and the delete helper mutated active_meta_profile
    via a direct field assignment in the settings-store callback, which set only
    the top-level field and never propagated into agent_settings
    (active_meta_profile + enable_classify_and_switch_llm_tool). So /activate
    reported success while the routing tool was never attached, and deleting the
    active profile left stale nested state. Both paths now route through
    PersistedSettings.update({"active_meta_profile": ...}), the single source of
    truth that performs the propagation.
  2. Startup no longer depends on mutable disk state.
    ClassifyAndSwitchLLMTool.create() deferred-loads the meta-profile: resolution
    (active-or-first-available + load) moved from agent-init time into the executor
    at invocation time. A missing/renamed file under ~/.openhands/meta-profiles
    now surfaces as a tool error observation instead of breaking
    LocalConversation._ensure_agent_ready() / conversation startup.
  3. MetaProfileStore regained the store-level concurrency protections.
    save() / delete() now hold a FileLock across the precondition check and
    the atomic replace (mirroring LLMProfileStore), so concurrent creators can't
    overshoot max_profiles (TOCTOU).

New/strengthened tests (behavior, not facade):

uv run pytest tests/agent_server/test_meta_profiles_router.py \
  tests/sdk/tool/test_classify_and_switch_llm.py \
  tests/sdk/llm/test_meta_profile_store.py tests/agent_server/test_active_meta_profile.py
# all passing
  • Router: /activate and delete now assert the nested agent_settings
    (active_meta_profile + enable_classify_and_switch_llm_tool) — catching the
    top-level/nested drift that the previous facade-only test missed.
  • Tool: create() no longer touches disk or raises on missing profiles; the
    error surfaces at invocation; a dangling active_meta_profile does not
    break _ensure_agent_ready().
  • Store: concurrent saves respect max_profiles under the lock.

Companion PR

Video/Screenshots

N/A (CLI/SDK feature; see logs above).

Type

  • Bug fix
  • Feature
  • Refactor
  • Breaking change
  • Docs / chore

Notes

  • Follow-up (out of scope): a general fix that strips provider-specific reasoning artifacts (Gemini thought signatures, Anthropic thinking blocks, OpenAI reasoning) from inherited history on model switch, which would also benefit switch_llm.
  • The agent-server update() derives enable_classify_and_switch_llm_tool = (active_meta_profile is not None), so the no-active-profile fallback is reachable via the SDK directly; decoupling that flag server-side can be a follow-up.

This PR description (AGENT section) was drafted by an AI agent (OpenHands) on behalf of the repository owner.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:716e7dd-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-716e7dd-python \
  ghcr.io/openhands/agent-server:716e7dd-python

All tags pushed for this build

ghcr.io/openhands/agent-server:716e7dd-golang-amd64
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-golang-amd64
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-golang-amd64
ghcr.io/openhands/agent-server:716e7dd-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:716e7dd-golang-arm64
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-golang-arm64
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-golang-arm64
ghcr.io/openhands/agent-server:716e7dd-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:716e7dd-java-amd64
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-java-amd64
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-java-amd64
ghcr.io/openhands/agent-server:716e7dd-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:716e7dd-java-arm64
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-java-arm64
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-java-arm64
ghcr.io/openhands/agent-server:716e7dd-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:716e7dd-python-amd64
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-python-amd64
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-python-amd64
ghcr.io/openhands/agent-server:716e7dd-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:716e7dd-python-arm64
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-python-arm64
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-python-arm64
ghcr.io/openhands/agent-server:716e7dd-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:716e7dd-golang
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-golang
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-golang
ghcr.io/openhands/agent-server:716e7dd-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:716e7dd-java
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-java
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-java
ghcr.io/openhands/agent-server:716e7dd-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:716e7dd-python
ghcr.io/openhands/agent-server:716e7dd5cf598bb22d4ac3097810c43aa2c363fd-python
ghcr.io/openhands/agent-server:jmj-intelligent-model-router-python
ghcr.io/openhands/agent-server:716e7dd-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 716e7dd-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 716e7dd-python-amd64) are also available if needed

Introduce a meta-profile driven model router:
- MetaProfile/MetaProfileStore (~/.openhands/meta-profiles) mirroring LLMProfileStore.
- ClassifyAndSwitchLLM builtin tool: reads active meta-profile, runs one
  classifier call over recent conversation messages, and switches to the
  chosen LLM profile (falls back to default_model).
- Wire tool through OpenHandsAgentSettings.create_agent via new
  enable_classify_and_switch_llm_tool + active_meta_profile fields.
- Server parity (Option A): active_meta_profile in PersistedSettings,
  SettingsResponse/SettingsUpdateRequest, and settings_router; update()
  propagates it into agent_settings and toggles the enable flag.
- Tests for store, helpers, executor paths, create_agent wiring, and server
  persistence/propagation.

Co-authored-by: openhands <openhands@all-hands.dev>
When the classify_and_switch_llm tool is enabled without an active
meta-profile, resolve the first available meta-profile from the store
instead of requiring one. Wire the tool in create_agent() whenever it is
enabled, and only error if no meta-profiles exist at all.

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-agent-server/openhands/agent_server
   api.py2762690%109, 111–116, 118, 120, 122, 157, 169, 184, 190, 243, 248, 257–259, 503, 506, 510–512, 514, 521
   meta_profiles_router.py107595%81, 160, 231–233
   settings_router.py125992%211, 264, 266–267, 368, 370–371, 409, 414
openhands-agent-server/openhands/agent_server/persistence
   models.py2032687%296, 301, 339, 387, 422–428, 430, 432, 435, 439, 465, 482–483, 528, 532, 534, 563–566, 569
openhands-sdk/openhands/sdk/llm
   meta_profile_store.py90792%93, 96, 188–190, 221–222
openhands-sdk/openhands/sdk/settings
   api_models.py32293%95, 97
   model.py7104893%102, 401, 419, 599, 609–612, 615, 628, 632, 638, 648, 654, 659, 882, 907, 909, 911, 913, 915, 917, 919, 921, 923, 1235, 1237, 1651, 1671, 1839, 1968, 2007, 2033, 2169–2171, 2173, 2227, 2259, 2269, 2271, 2276, 2294, 2307, 2309, 2311, 2313, 2320
openhands-sdk/openhands/sdk/tool/builtins
   classify_and_switch_llm.py1241885%45–47, 69–71, 73–81, 183, 267–268
TOTAL32538692278% 

Adds MetaProfileStore.save()/delete()/list_summaries() plus a
meta_profiles_router mirroring profiles_router: list, get, save,
delete, and activate (records active_meta_profile in settings).

This lets clients (e.g. Agent Canvas) manage meta-profiles for the
classify_and_switch_llm routing tool, which previously could only
have its active_meta_profile set via PATCH /api/settings.

Co-authored-by: openhands <openhands@all-hands.dev>
Fixes from PR review of the meta-profile management surface:

1. Activation/clear now route through PersistedSettings.update({...})
   instead of a direct field assignment in the settings-store callback.
   Direct assignment only set the top-level active_meta_profile and never
   propagated into agent_settings (active_meta_profile +
   enable_classify_and_switch_llm_tool), so /activate reported success
   while the routing tool was never attached. The delete helper had the
   same top-level/nested drift. (meta_profiles_router.py)

2. ClassifyAndSwitchLLMTool defers meta-profile resolution from create()
   to execution time. A missing/renamed file under ~/.openhands/
   meta-profiles no longer breaks conversation startup
   (_ensure_agent_ready); it surfaces as a tool error observation instead.

3. MetaProfileStore.save()/delete() now hold a FileLock across the
   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   concurre   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr  : a   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pr   pgation (not just the facade), guarding against drift.
- Tool: create() no longer touches disk / raises on missing profiles;
  invocation reports the error; a missing active meta-profile does not
  break _ensure_agent_ready().
- Store: concurrent saves respect max_profiles under the lock.

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Intelligent Model Router

🟡 Taste Rating: Acceptable — Core design is sound and solves a real problem. Minor improvements suggested below.


[CRITICAL ISSUES]

None identified.


[IMPROVEMENT OPPORTUNITIES]

  • [openhands-sdk/openhands/sdk/tool/builtins/classify_and_switch_llm.py, Lines 47-53] Comment Noise: The field descriptions on chosen_class, model, active_model in ClassifyAndSwitchLLMObservation are verbose restatements of the variable names. Pydantic generates adequate descriptions from field names.

  • [openhands-sdk/openhands/sdk/llm/meta_profile_store.py, Lines 36-44] Comment Noise: The module docstring explains the data model in detail, but the Pydantic class docstrings below repeat this information. The MetaProfile class docstring could be trimmed.

  • [openhands-agent-server/openhands/agent_server/persistence/models.py, Lines 254-268] Complex Nesting: The update method now has 4 levels of nesting with the new active_meta_profile block. Consider extracting the propagation logic into a helper method for readability.


[TESTING GAPS]

None — the test coverage is comprehensive:

  • Unit tests for MetaProfileStore including concurrent access
  • Router integration tests with TestClient
  • Tool executor tests with real code paths
  • Settings propagation tests

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is an additive feature with minimal risk:

  • New functionality only, no breaking changes to existing APIs
  • Proper file locking prevents race conditions
  • Input validation via Pydantic and regex patterns
  • Deferred meta-profile resolution avoids startup failures
  • Comprehensive test coverage including edge cases

VERDICT

Worth merging: Core logic is sound, well-tested, and solves a real user need. The comments flagged above are stylistic and won't affect correctness.

KEY INSIGHT

The deferred meta-profile resolution pattern (resolve at invocation, not at startup) is the right trade-off: it allows graceful fallback to the first available profile without breaking conversation startup when the active profile is missing.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 QA Report: PARTIAL

Core SDK routing and meta-profile HTTP management worked in real execution; the only unverified piece is a live external provider-backed classifier call because available credentials/models failed.

Does this PR achieve its stated goal?

Partially verified, with the exercised paths behaving as intended. The PR set out to add an intelligent model router tool and CRUD API for meta-profiles. I verified the feature was absent on origin/main, then on this PR the server exposed ClassifyAndSwitchLLMTool, the SDK attached classify_and_switch_llm, switched to the class-selected LLM profile, fell back to the default profile when the classifier returned 0, and reported a tool error instead of breaking startup for a dangling active meta-profile. The new /api/meta-profiles endpoints also worked end-to-end over real HTTP and propagated activation/deletion into nested agent_settings.

Phase Result
Environment Setup make build completed and installed the uv environment.
CI Status ⚠️ Most checks are green, but pre-commit is failing and qa-changes is in progress at review time.
Functional Verification 🟡 SDK routing + API CRUD passed; live external classifier provider call could not be verified with available credentials/models.
Functional Verification

Test 1: SDK model-routing tool before/after

Step 1 — Establish baseline without the feature:
Checked out origin/main and ran the same SDK exercise script:

git checkout --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 QA_HOME=/tmp/qa-router-base uv run python /tmp/qa_intelligent_router_sdk.py

IMPORT_STATUS=missing_or_unavailable:ModuleNotFoundError:No module named 'openhands.sdk.llm.meta_profile_store'

This confirms the meta-profile store/tool surface did not exist on the base branch.

Step 2 — Apply the PR's changes:
Checked out jmj/intelligent-model-router at 6468a931d8b3ac7f41c7e733d076b47185bbb0f0.

Step 3 — Re-run with the PR:
Ran a real SDK script that created saved LLM profiles and a meta-profile under an isolated HOME, built an OpenHandsAgentSettings(..., enable_classify_and_switch_llm_tool=True) agent, sent a user message, and executed the classify_and_switch_llm tool:

TOOLS=classify_and_switch_llm,finish,switch_llm,think
OBS_ERROR=False
OBS_TEXT=Classified task as 'UI oriented work' and switched to LLM profile 'ui-target' (model 'ui-model'). Future agent steps will use this profile.
OBS_MODEL=ui-target
OBS_CHOSEN_CLASS=UI oriented work
ACTIVE_MODEL=ui-model

This shows the tool was attached and switched the active conversation LLM from the starting model to the class-selected ui-target profile.

I also exercised the no-active-meta-profile fallback and dangling-active-profile path:

FALLBACK_OBS_ERROR=False
FALLBACK_OBS_MODEL=default-target
FALLBACK_ACTIVE_MODEL=default-target-model
MISSING_PROFILE_STARTUP_OK=True
MISSING_PROFILE_OBS_ERROR=True
MISSING_PROFILE_OBS_TEXT=Failed to resolve the active meta-profile: Meta-profile `deleted-profile` not found. Available meta-profiles: balanced

This verifies the first-available meta-profile fallback works, and a missing active file surfaces as a tool observation instead of breaking conversation startup.

Test 2: Meta-profile API before/after over real HTTP

Step 1 — Establish baseline without the feature:
Started the base branch server and called the new endpoint paths:

HOME=/tmp/qa-api-base OH_PERSISTENCE_DIR=/tmp/qa-api-base/persistence   OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --port 8124

GET /api/meta-profiles
{"detail":"Not Found"}
HTTP_STATUS=404

POST /api/meta-profiles/balanced
{"detail":"Not Found"}
HTTP_STATUS=404

This confirms the CRUD API did not exist on the base branch.

Step 2 — Apply the PR's changes:
Checked out jmj/intelligent-model-router and started the actual agent server with isolated HOME and OH_PERSISTENCE_DIR.

Step 3 — Re-run with the PR:
Server readiness included the new tool:

"usable_tools": [
  "ClassifyAndSwitchLLMTool",
  "file_editor",
  "task_tool_set"
]

Then I exercised list/create/get/activate/settings/error/delete paths with curl:

GET /api/meta-profiles
{"meta_profiles":[],"active_meta_profile":null}
HTTP_STATUS=200

POST /api/meta-profiles/balanced
{"name":"balanced","message":"Meta-profile 'balanced' saved"}
HTTP_STATUS=201

GET /api/meta-profiles/balanced
{"name":"balanced","config":{"classifier_model":"classifier-profile","default_model":"default-profile","classes":[...]}}
HTTP_STATUS=200

POST /api/meta-profiles/balanced/activate
{"name":"balanced","message":"Meta-profile 'balanced' activated"}
HTTP_STATUS=200

GET /api/settings
... "enable_classify_and_switch_llm_tool":true,"active_meta_profile":"balanced" ...
... "active_meta_profile":"balanced" ...
HTTP_STATUS=200

POST invalid body
{"detail":[{"type":"missing","loc":["body","classifier_model"],"msg":"Field required",...}]}
HTTP_STATUS=422

GET /api/meta-profiles/missing-profile
{"detail":"Meta-profile 'missing-profile' not found"}
HTTP_STATUS=404

DELETE /api/meta-profiles/balanced
{"name":"balanced","message":"Meta-profile 'balanced' deleted"}
HTTP_STATUS=200

GET /api/settings after delete
... "enable_classify_and_switch_llm_tool":false,"active_meta_profile":null ...
... "active_meta_profile":null ...
HTTP_STATUS=200

This verifies the new API works over HTTP, writes to the real store, and activation/deletion propagate to the nested agent settings that control tool attachment.

Unable to Verify

I attempted a live external classifier-model call with the available LLM_API_KEY, but the tested models were not usable in this environment:

MODEL_FAIL=openhands/gpt-4o-mini;LLMBadRequestError: ... Invalid model name passed in model=gpt-4o-mini
MODEL_FAIL=gpt-4o-mini;LLMAuthenticationError: ... Incorrect API key provided
MODEL_FAIL=openhands/claude-sonnet-4-5;LLMBadRequestError: ... Invalid model name passed in model=claude-sonnet-4-5

Because of that, I verified the routing behavior with the SDK's in-process TestLLM as the classifier response source. That exercises tool wiring, meta-profile resolution, profile switching, fallback, and error-observation behavior, but not provider-specific live classifier authentication/model availability. Future QA runs would benefit from AGENTS.md guidance naming a small model known to work with the injected QA LLM_API_KEY.

Issues Found

  • 🟠 CI: The GitHub pre-commit check is failing at review time (qa-changes is also still in progress). I did not rerun linters locally per QA instructions.
  • No functional issues found in the SDK/API paths I was able to exercise.

This PR review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PARTIAL

@juanmichelini juanmichelini marked this pull request as ready for review June 17, 2026 16:53

all-hands-bot commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@juanmichelini

juanmichelini commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator Author

This has been tested, see screenshots working with UI

image image

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Intelligent Model Router

🟡 Acceptable — Well-structured feature with solid patterns, minor suggestions below.


Overview

This PR introduces intelligent model routing through:

  • A classify_and_switch_llm tool that classifies tasks and switches LLM profiles
  • A meta-profile system for declarative routing configuration
  • HTTP endpoints for CRUD operations on meta-profiles

The implementation follows existing code patterns and integrates cleanly with the settings/persistence layer.


[IMPROVEMENT OPPORTUNITIES]

  • [openhands-sdk/openhands/sdk/tool/builtins/classify_and_switch_llm.py, Line 116-130] Incomplete filtering: _recent_messages_text only processes MessageEvent types. If other event types contain relevant conversation content, they would be excluded. Verify this is intentional, or document why only MessageEvent is relevant for classification.

  • [openhands-sdk/openhands/sdk/llm/meta_profile_store.py, Line 106-111] Implicit fallback order: list() returns alphabetically sorted names, and tool creation falls back to available[0]. This could surprise users who expect chronological or most-recent ordering. Consider documenting this behavior or adding a more explicit "last activated" timestamp.


[STYLE NOTES]

  • Generally well-documented with clear docstrings explaining the "why" behind design decisions (e.g., lazy meta-profile resolution, atomic writes).
  • Error messages are user-friendly and actionable.
  • No significant style issues observed.

[TESTING]

  • Test coverage appears comprehensive for both happy paths and edge cases.
  • Good: Concurrent save test verifies thread-safety of file locking.
  • Good: Tests cover deferred resolution (tool creation doesn't fail on missing profiles).

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is an additive feature that doesn't modify existing APIs or break existing functionality. File locking handles concurrent access safely, and error handling is thorough throughout.


VERDICT:
Worth merging — Core architecture is sound, suggestions are minor.

KEY INSIGHT:
The lazy meta-profile resolution pattern (deferring disk reads to invocation time) is a good design choice that prevents broken conversation startup when profiles are missing or misconfigured.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the intelligent model router tool and meta-profile management API by running the SDK and agent-server end-to-end; the PR delivers the stated behavior.

Does this PR achieve its stated goal?

Yes. The PR adds a usable classify_and_switch_llm routing path and meta-profile CRUD API: the PR server exposes the new tool and /api/meta-profiles endpoints, activation propagates into nested agent settings, and the SDK tool classified a test-writing task and switched the conversation to the expected saved profile using a real classifier LLM call. I also verified the no-active-meta-profile fallback and dangling-active-profile startup behavior described in the PR.

Phase Result
Environment Setup make build completed successfully and the local agent-server started on the PR branch
CI Status ⚠️ Most checks pass, but pre-commit is currently failing and qa-changes is in progress
Functional Verification ✅ API CRUD/activation/delete, SDK routing, fallback, and missing-profile startup behavior all worked
Functional Verification

Test 1: Meta-profile API exists and works over real HTTP

Step 1 — Establish baseline on origin/main:
Ran a real main-branch agent-server on port 8138 and requested the new endpoint:

GET /server_info -> usable_tools: [file_editor, task_tool_set, ...]  # no ClassifyAndSwitchLLMTool
GET /api/meta-profiles -> HTTP/1.1 404 Not Found
{"detail":"Not Found"}

This shows the endpoint/tool did not exist before this PR.

Step 2 — Apply the PR's changes:
Used checked-out PR commit 6468a931d8b3ac7f41c7e733d076b47185bbb0f0 and started python -m openhands.agent_server on port 8137.

Step 3 — Re-run with the PR:
Ran real HTTP requests against the PR server:

GET /server_info -> usable_tools includes "ClassifyAndSwitchLLMTool"
GET /api/meta-profiles -> 200 {"meta_profiles":[],"active_meta_profile":null}
POST /api/meta-profiles/balanced -> 201 {"name":"balanced","message":"Meta-profile 'balanced' saved"}
GET /api/meta-profiles/balanced -> 200 with classifier_model/default_model/classes
POST /api/meta-profiles/balanced/activate -> 200 {"name":"balanced","message":"Meta-profile 'balanced' activated"}
GET /api/settings -> active_meta_profile="balanced" and agent_settings.enable_classify_and_switch_llm_tool=true
POST invalid body -> 422 missing classifier_model
GET missing profile -> 404 "Meta-profile 'missing' not found"
DELETE /api/meta-profiles/balanced -> 200
GET /api/settings after delete -> active_meta_profile=null and enable_classify_and_switch_llm_tool=false

This confirms the new management API works end-to-end and activation/delete update the settings that actually control tool availability.

Test 2: SDK tool classifies a task and switches profiles

Step 1 — Establish baseline on origin/main:
Ran a script that imports and uses the new meta-profile store/tool path:

ModuleNotFoundError: No module named 'openhands.sdk.llm.meta_profile_store'

This confirms the SDK functionality is new in the PR.

Step 2 — Apply the PR's changes:
On the PR branch, created real saved LLM profiles under a temporary HOME, created a meta-profile matching the PR example, enabled enable_classify_and_switch_llm_tool=True, sent a user message (This task is test oriented: just adding tests for the parser module.), and invoked classify_and_switch_llm. The classifier profile used openhands/gpt-5.5 with the provided LLM_API_KEY.

Step 3 — Re-run with the PR:
Observed:

observation_error= False
observation_text= Classified task as 'tasks is test oriented, just adding tests' and switched to LLM profile 'claude-opus-4-8' (model 'openhands/gpt-5.5'). Future agent steps will use this profile.
chosen_class= tasks is test oriented, just adding tests
profile_name= claude-opus-4-8
active_usage_id= profile:claude-opus-4-8

This confirms the router made a classifier LLM call, selected the expected class, switched to the selected saved profile, and future calls would use that profile.

Test 3: No-active-meta-profile fallback works

Step 1 — Establish baseline:
Before this PR there was no MetaProfileStore or classify_and_switch_llm tool to exercise.

Step 2 — Apply the PR's changes:
Created one saved meta-profile named aaa-first, enabled the routing tool without setting active_meta_profile, sent Please add tests for this parser., and invoked the tool.

Step 3 — Re-run with the PR:
Observed:

No active meta-profile set; falling back to first available: 'aaa-first'
observation_error= False
profile_name= tests
chosen_class= test writing
active_usage_id= profile:tests

This confirms the PR's first-available fallback path works in the SDK.

Test 4: Dangling active meta-profile no longer breaks startup

Step 1 — Establish baseline:
The PR description says this was a review-fix regression: startup previously depended on mutable disk state. On main, the routing tool/meta-profile feature does not exist.

Step 2 — Apply the PR's changes:
Created an agent with enable_classify_and_switch_llm_tool=True and active_meta_profile='does-not-exist', then initialized a real LocalConversation.

Step 3 — Re-run with the PR:
Observed:

startup_ok=True
tool_error= True
tool_text= Failed to resolve the active meta-profile: Meta-profile `does-not-exist` not found. Available meta-profiles: none

This confirms conversation startup succeeds and the missing profile surfaces as a tool observation error instead of crashing initialization.

Issues Found

None from functional QA. Note: CI currently reports a failing pre-commit check, which should be addressed separately before merge.


This QA review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

…ests

The merge with main turned PersistedSettings.agent_settings into a
kind-discriminated union (OpenHandsAgentSettings | ACPAgentSettings),
so pyright rejected the direct access to active_meta_profile /
enable_classify_and_switch_llm_tool on the union, and rejected the
list[dict] passed to MetaProfile(classes=...).

- Build the fixture profile via MetaProfile.model_validate({...}) so the
  classes payload is validated as Any (still coerced to MetaProfileClass).
- Narrow agent_settings with isinstance(OpenHandsAgentSettings) before
  asserting the routing fields (the routing tool only applies to the
  OpenHands agent variant anyway).

Co-authored-by: openhands <openhands@all-hands.dev>

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ QA Report: PASS WITH ISSUES

The intelligent model router and meta-profile API worked end-to-end in real server/API and SDK executions, but GitHub CI currently has a failing pre-commit check.

Does this PR achieve its stated goal?

Yes. The PR set out to add an intelligent model router (classify_and_switch_llm) plus CRUD/activation APIs for meta-profiles. I verified the base branch did not expose the meta-profile endpoint or router tool, then on this PR created/activated/deleted a meta-profile through the running agent-server over HTTP and invoked the SDK router with a real classifier LLM call; the router classified a test-writing request and switched to the slow profile. I also verified the no-active-meta-profile fallback and dangling-active-profile startup behavior.

Phase Result
Environment Setup make build completed and installed the editable workspace successfully.
CI Status ⚠️ Latest PR checks: 20 success, 1 skipped, 7 in progress, 1 failed (pre-commit). I did not rerun CI locally.
Functional Verification ✅ Server API, settings propagation, router execution, fallback, and startup edge case all behaved as described.
Functional Verification

Test 1: Meta-profile API is new and works over real HTTP

Step 1 — Establish baseline (base branch origin/main):
Started the actual agent server with:

HOME=/tmp/oh-qa-main-home OH_PERSISTENCE_DIR=/tmp/oh-qa-main \
  uv run python -m openhands.agent_server --host 127.0.0.1 --port 18081

Then ran:

curl -i http://127.0.0.1:18081/api/meta-profiles

Observed:

HTTP/1.1 404 Not Found
{"detail":"Not Found"}
has_classify_tool= False

This confirms the base branch did not expose meta-profile management or the router tool.

Step 2 — Apply the PR changes:
Returned to commit 1ca1ebe56edcaf3b40370ac3e4ef53a9b7eec27c and started the same server entrypoint on port 18080 with isolated HOME / OH_PERSISTENCE_DIR.

Step 3 — Re-run with the PR in place:
Ran real HTTP requests against the running server:

curl -i http://127.0.0.1:18080/api/meta-profiles
curl -i -X POST http://127.0.0.1:18080/api/meta-profiles/balanced \
  -H 'Content-Type: application/json' \
  --data '{"classifier_model":"classifier","default_model":"default","classes":[{"description":"UI / images","model":"fast"},{"description":"tests","model":"slow"}]}'
curl -i http://127.0.0.1:18080/api/meta-profiles/balanced
curl -i -X POST http://127.0.0.1:18080/api/meta-profiles/balanced/activate
curl http://127.0.0.1:18080/api/settings | grep -E 'active_meta_profile|enable_classify'
curl -i -X DELETE http://127.0.0.1:18080/api/meta-profiles/balanced

Observed the expected lifecycle and settings side effects:

GET /api/meta-profiles -> 200 {"meta_profiles":[],"active_meta_profile":null}
POST /api/meta-profiles/balanced -> 201 {"name":"balanced","message":"Meta-profile 'balanced' saved"}
GET /api/meta-profiles/balanced -> 200 with classifier_model/default_model/classes
POST /api/meta-profiles/balanced/activate -> 200 {"name":"balanced","message":"Meta-profile 'balanced' activated"}
settings after activate included:
  "enable_classify_and_switch_llm_tool": true
  "active_meta_profile": "balanced"
DELETE /api/meta-profiles/balanced -> 200
settings after delete included:
  "enable_classify_and_switch_llm_tool": false
  "active_meta_profile": null

I also checked error behavior over HTTP:

POST invalid body -> 422 Unprocessable Content (missing classifier_model)
GET missing profile -> 404 Not Found (Meta-profile 'nope' not found)
server_info usable_tools included ClassifyAndSwitchLLMTool

This shows the new CRUD API, activation wiring into nested agent settings, delete cleanup, and validation/not-found paths all work through the real server.

Test 2: SDK router classifies and switches profiles with a real classifier call

Step 1 — Establish baseline:
The base server check above reported has_classify_tool=False, confirming the runtime did not previously expose this router.

Step 2 — Apply the PR changes:
On the PR branch, I created isolated real LLM profiles (classifier, default, fast, slow) and a balanced meta-profile under a temp HOME, built an OpenHandsAgentSettings(..., enable_classify_and_switch_llm_tool=True, active_meta_profile="balanced") agent, sent a user message asking to add unit tests, and executed classify_and_switch_llm. The classifier call used the configured LLM_MODEL / LLM_BASE_URL with LITELLM_PROXY_API_KEY=$LLM_API_KEY.

Step 3 — Run with the PR in place:
Ran:

LITELLM_PROXY_API_KEY=*** HOME=/tmp/oh-qa-sdk-home OPENHANDS_SUPPRESS_BANNER=1 \
  uv run python /tmp/qa_router_real.py

Observed:

observation_type= ClassifyAndSwitchLLMObservation
is_error= False
chosen_class= test-writing tasks such as adding unit tests
profile_switched_to= slow
active_model= litellm_proxy/openai/gpt-5.5
conversation_llm_model= litellm_proxy/openai/gpt-5.5
text= Classified task as 'test-writing tasks such as adding unit tests' and switched to LLM profile 'slow' ...

This confirms the router made the classifier call, matched the test-writing class, switched the conversation to the configured slow profile, and returned the expected observation.

Test 3: No-active-meta-profile fallback works

Step 1 — Baseline / prior state:
The base branch did not expose the tool, so there was no fallback behavior to exercise.

Step 2 — Apply the PR changes:
Created one meta-profile named aaa-first, enabled enable_classify_and_switch_llm_tool=True, and intentionally omitted active_meta_profile.

Step 3 — Run with the PR in place:
Ran:

LITELLM_PROXY_API_KEY=*** HOME=/tmp/oh-qa-sdk-fallback-home OPENHANDS_SUPPRESS_BANNER=1 \
  uv run python /tmp/qa_router_fallback.py

Observed:

No active meta-profile set; falling back to first available: 'aaa-first'
is_error= False
profile_switched_to= slow
chosen_class= test-writing work
text= Classified task as 'test-writing work' and switched to LLM profile 'slow' ...

This confirms the SDK fallback path works when the tool is enabled without an explicit active meta-profile.

Test 4: Dangling active meta-profile does not break startup

Step 1 — Baseline / problem scenario:
The PR description says a missing/renamed active meta-profile should surface at tool invocation time instead of breaking conversation startup.

Step 2 — Run with the PR in place:
Ran a short script that creates an agent with enable_classify_and_switch_llm_tool=True and active_meta_profile='missing-profile', then starts a LocalConversation and calls _ensure_agent_ready().

Observed:

startup_ok=True
configured_tools= ['ClassifyAndSwitchLLMTool']

This confirms conversation startup succeeds even when the active meta-profile file is dangling.

Issues Found

  • 🟡 CI: GitHub currently reports pre-commit as failed, with several other checks still in progress. Functional QA passed, but this PR is not fully green at the time of review.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Intelligent Model Router

Taste Rating

🟡 Acceptable — Solid architecture with a few opportunities for improvement


Analysis

This PR implements an intelligent model router that classifies the current task using a classifier model and switches to the most suitable LLM profile. The architecture is thoughtful: lazy loading prevents startup failures, file locking prevents TOCTOU races, and the settings propagation ensures nested state consistency.


[CRITICAL ISSUES]

None — no blocking issues found.


[IMPROVEMENT OPPORTUNITIES]

  • [openhands-agent-server/openhands/agent_server/meta_profiles_router.py, Lines 27-33] Verbose Module Docstring: The 7-line docstring is largely a restatement of the PR description ("A meta-profile names a classifier_model..."). This type of non-local documentation will drift from the code over time. Consider a brief one-liner that describes what this module is (HTTP endpoints for meta-profile CRUD) rather than what it does conceptually.

  • [openhands-sdk/openhands/sdk/llm/meta_profile_store.py, Lines 304-315] Verbose Module Docstring: Same issue — the docstring narrates the PR description rather than documenting invariants or non-obvious design decisions. A simple "Storage for meta-profiles persisted as JSON in ~/.openhands/meta-profiles." would suffice.

  • [openhands-sdk/openhands/sdk/tool/builtins/classify_and_switch_llm.py, Lines 1027-1035] Verbose Tool Description: The _DESCRIPTION constant is 9 lines that paraphrase the PR description. For tool descriptions, brevity improves AI agent comprehension. Consider: "Classify the current task and switch to the best-matching LLM profile from the active meta-profile."


[STYLE NOTES]

  • The _store_errors() context manager pattern is clean and matches existing code in the codebase.

[TESTING GAPS]

  • [tests/agent_server/test_meta_profiles_router.py] Missing concurrent delete test: You test concurrent saves respecting max_profiles, but not concurrent deletes when one is the active profile. A race between delete + activate on the same profile could expose edge cases in _set_active_meta_profile_if_matches.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is a new feature with minimal risk surface:

  • File locking prevents concurrent corruption
  • Lazy loading prevents startup failures from missing profiles
  • Error handling is comprehensive with fallbacks to default_model
  • No breaking changes to existing APIs
  • Settings propagation is properly tested

VERDICT:

Worth merging: Core logic is sound. The verbose docstrings are a minor style issue, not blockers.

KEY INSIGHT:

The deferred meta-profile resolution (loading at invocation time, not agent init) is the most important design decision — it prevents a missing profile file from breaking conversation startup, which would have been a poor user experience for an opt-in feature.


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
  2. Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Intelligent Model Router

Taste Rating

🟡 Acceptable — Solid architecture with a few opportunities for improvement


Analysis

This PR implements an intelligent model router that classifies the current task using a classifier model and switches to the most suitable LLM profile. The architecture is thoughtful: lazy loading prevents startup failures, file locking prevents TOCTOU races, and the settings propagation ensures nested state consistency.


[CRITICAL ISSUES]

None — no blocking issues found.


[IMPROVEMENT OPPORTUNITIES]

  • [openhands-agent-server/openhands/agent_server/meta_profiles_router.py, Lines 27-33] Verbose Module Docstring: The 7-line docstring is largely a restatement of the PR description ("A meta-profile names a classifier_model..."). This type of non-local documentation will drift from the code over time. Consider a brief one-liner that describes what this module is (HTTP endpoints for meta-profile CRUD) rather than what it does conceptually.

  • [openhands-sdk/openhands/sdk/llm/meta_profile_store.py, Lines 304-315] Verbose Module Docstring: Same issue — the docstring narrates the PR description rather than documenting invariants or non-obvious design decisions. A simple "Storage for meta-profiles persisted as JSON in ~/.openhands/meta-profiles." would suffice.

  • [openhands-sdk/openhands/sdk/tool/builtins/classify_and_switch_llm.py, Lines 1027-1035] Verbose Tool Description: The _DESCRIPTION constant is 9 lines that paraphrase the PR description. For tool descriptions, brevity improves AI agent comprehension. Consider: "Classify the current task and switch to the best-matching LLM profile from the active meta-profile."


[STYLE NOTES]

  • The _store_errors() context manager pattern is clean and matches existing code in the codebase.

[TESTING GAPS]

  • [tests/agent_server/test_meta_profiles_router.py] Missing concurrent delete test: You test concurrent saves respecting max_profiles, but not concurrent deletes when one is the active profile. A race between delete + activate on the same profile could expose edge cases in _set_active_meta_profile_if_matches.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is a new feature with minimal risk surface:

  • File locking prevents concurrent corruption
  • Lazy loading prevents startup failures from missing profiles
  • Error handling is comprehensive with fallbacks to default_model
  • No breaking changes to existing APIs
  • Settings propagation is properly tested

VERDICT:

Worth merging: Core logic is sound. The verbose docstrings are a minor style issue, not blockers.

KEY INSIGHT:

The deferred meta-profile resolution (loading at invocation time, not agent init) is the most important design decision — it prevents a missing profile file from breaking conversation startup, which would have been a poor user experience for an opt-in feature.


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
  2. Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Non-blocking review suggestions from PR #3744:

- meta_profiles_router/meta_profile_store: trim verbose module docstrings
  that restated the PR description down to the durable invariants.
- persistence/models.py: extract the active_meta_profile propagation out of
  PersistedSettings.update into a _apply_active_meta_profile helper to cut
  the nesting depth.
- classify_and_switch_llm.py: document why _recent_messages_text only
  consumes MessageEvent, and that the no-active fallback picks the
  alphabetically-first meta-profile (not most-recent).
- meta_profile_store.list(): document the alphabetical (not chronological)
  ordering that the fallback relies on.
- tests: add a concurrent delete(active)+activate(other) race test asserting
  the deleted profile never stays active and the nested agent_settings never
  drift from the top-level active_meta_profile.

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the new intelligent model router and meta-profile management behavior end-to-end via a real local agent-server plus SDK conversation execution.

Does this PR achieve its stated goal?

Yes. The PR set out to add meta-profile CRUD/activation APIs and a classify_and_switch_llm tool that classifies the current task and switches to the selected saved LLM profile. I verified that the new HTTP endpoints are absent on main but work on the PR, that activation propagates into nested agent settings, and that the SDK tool makes a classifier HTTP call over the recent conversation then switches the conversation model from initial-model to the routed openai/tests-routed-model.

Phase Result
Environment Setup make build completed successfully and created the project .venv.
CI Status 🟡 GitHub checks at observation time: 33 successful, 1 skipped, 1 pending (QA Changes by OpenHands/qa-changes); 0 failing.
Functional Verification ✅ Real HTTP API flows and SDK model-routing behavior worked as described.
Functional Verification

Test 1: Meta-profile API is new and works over real HTTP

Step 1 — Establish baseline on main:
Checked out origin/main (ab886d39496c93b7a2ce1a5ebcdcdb4897d533a3) and started the real server with:

OPENHANDS_SUPPRESS_BANNER=1 HOME="$tmp/home" OH_PERSISTENCE_DIR="$tmp/persist"   uv run python -m openhands.agent_server --host 127.0.0.1 --port 8810

Then called the new routes:

GET /api/meta-profiles:
HTTP/1.1 404 Not Found
{"detail":"Not Found"}

PATCH /api/settings active_meta_profile:
HTTP/1.1 400 Bad Request
{"detail":"At least one of agent_settings_diff, conversation_settings_diff, misc_settings_diff, or active_profile must be provided"}

This confirms the meta-profile API/settings surface did not exist on the base branch.

Step 2 — Apply the PR's changes:
Checked out PR commit 2f1f55e25fddfddf2370d042cdfa4bae432876ac and started the same real server on port 8811 with isolated HOME and OH_PERSISTENCE_DIR.

Step 3 — Re-run with the PR in place:
Exercised create/list/get/activate/settings/error/delete flows via curl:

1 GET empty list
{"meta_profiles":[],"active_meta_profile":null}
STATUS:200

2 POST create balanced
{"name":"balanced","message":"Meta-profile 'balanced' saved"}
STATUS:201

3 GET detail balanced
{"name":"balanced","config":{"classifier_model":"classifier","default_model":"default","classes":[{"description":"UI or image-heavy task","model":"vision"},{"description":"test-writing task","model":"test-model"}]}}
STATUS:200

4 POST activate balanced
{"name":"balanced","message":"Meta-profile 'balanced' activated"}
STATUS:200

5 GET list after activate
{"meta_profiles":[{"name":"balanced","classifier_model":"classifier","default_model":"default","num_classes":2}],"active_meta_profile":"balanced"}
STATUS:200

6 GET settings active fields
{
  "active_meta_profile": "balanced",
  "agent_active_meta_profile": "balanced",
  "agent_enable_router": true
}

7 POST invalid body
{"detail":[{"type":"missing","loc":["body","classifier_model"],"msg":"Field required","input":{"default_model":"default"}}]}
STATUS:422

8 GET missing
{"detail":"Meta-profile 'missing' not found"}
STATUS:404

9 DELETE balanced
{"name":"balanced","message":"Meta-profile 'balanced' deleted"}
STATUS:200

10 GET list after delete
{"meta_profiles":[],"active_meta_profile":null}
STATUS:200

11 GET settings after delete
{
  "active_meta_profile": null,
  "agent_active_meta_profile": null,
  "agent_enable_router": false
}

This shows the new CRUD API works, invalid/missing cases return appropriate client errors, activation persists the active meta-profile, activation enables the router in nested agent settings, and deleting the active meta-profile clears both top-level and nested settings.

Test 2: SDK router tool classifies via HTTP and switches the conversation LLM

Step 1 — Establish baseline on main:
On main, importing the new tool failed and the new agent settings fields were ignored:

BASE SDK router import check:
IMPORT_ERROR cannot import name 'ClassifyAndSwitchLLMTool' from 'openhands.sdk.tool.builtins'
SETTINGS_FIELDS {}

This confirms the router tool/settings surface was not present before the PR.

Step 2 — Apply the PR's changes:
On PR commit 2f1f55e25fddfddf2370d042cdfa4bae432876ac, I created saved LLM profiles under an isolated temp HOME, served a local OpenAI-compatible /v1/chat/completions endpoint that returns classifier choice 2, created a meta-profile mapping class 2 to the tests profile, then used a real LocalConversation and executed classify_and_switch_llm.

Step 3 — Re-run with the PR in place:
The SDK conversation produced:

TOOLS ['ClassifyAndSwitchLLMTool']
BEFORE_MODEL initial-model
OBS_ERROR False
OBS_MODEL tests
OBS_CLASS test-writing task
OBS_ACTIVE_MODEL openai/tests-routed-model
AFTER_MODEL openai/tests-routed-model
CLASSIFIER_REQUESTS 1
CLASSIFIER_PATH /v1/chat/completions
TRANSCRIPT_INCLUDED True
OBS_TEXT Classified task as 'test-writing task' and switched to LLM profile 'tests' (model 'openai/tests-routed-model'). Future agent steps will use this profile.

This confirms the tool was attached from settings, sent the recent user message to the classifier over an actual HTTP completion request, parsed classifier response 2, selected the matching class, and switched the conversation to the routed saved LLM profile.

Test 3: Router fallback and missing-active-profile behavior

No active meta-profile fallback:
With enable_classify_and_switch_llm_tool=True and no active meta-profile set, one stored meta-profile was resolved and the classifier response 0 routed to the default model:

BEFORE_MODEL initial-model
OBS_ERROR False
OBS_MODEL default
OBS_CLASS None
AFTER_MODEL openai/fallback-default
OBS_TEXT Classified task as 'default' and switched to LLM profile 'default' (model 'openai/fallback-default'). Future agent steps will use this profile.

Dangling active meta-profile:
With active_meta_profile='does-not-exist', conversation startup succeeded and the error surfaced as a tool observation at invocation time:

DANGLING_STARTUP_OK True
DANGLING_OBS_ERROR True
DANGLING_OBS_TEXT Failed to resolve the active meta-profile: Meta-profile `does-not-exist` not found. Available meta-profiles: balanced

This matches the PR's claimed deferred-load behavior and first-available fallback behavior.

Issues Found

None.


This QA review was created by an AI agent (OpenHands) on behalf of the user.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable - The feature is generally well-shaped and the API mirrors the existing profile surface, but the router currently makes a real LLM call outside the conversation accounting path. That needs to be fixed before this lands.

[CRITICAL ISSUES]

  • classify_and_switch_llm loads and calls the classifier profile directly, without registering/reusing it through the conversation's LLM registry. That makes classifier spend invisible to ConversationStats and lets it bypass the run-level budget guard.

[IMPROVEMENT OPPORTUNITIES]

  • Mirror the existing profile router's lock-timeout handling so meta-profile store contention returns a retryable API error instead of a generic 500.

[TESTING GAPS]

  • Add a regression test proving that a routing call records the classifier LLM's metrics in the conversation stats / budget path.
  • I ran the targeted tests locally: uv run pytest tests/sdk/llm/test_meta_profile_store.py tests/sdk/tool/test_classify_and_switch_llm.py tests/agent_server/test_meta_profiles_router.py → 54 passed.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
    This adds new public SDK/API surface plus a tool that makes an extra LLM call and mutates the active conversation model. The core shape is contained and covered by tests, but hidden classifier spend/budget bypass is a correctness and safety issue for production users.

VERDICT:
Needs rework: Fix classifier LLM lifecycle/accounting before merge; the rest looks close.

KEY INSIGHT:
The meta-profile data model is simple enough; the classifier call just has to participate in the same registry, metrics, and budget machinery as every other conversation LLM.

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
  2. Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Intelligent Model Router

🟡 Taste Rating: Acceptable — Feature is well-shaped and follows existing SDK patterns. Two previously-identified issues in the existing unresolved review threads are blocking merge (see below).


Overview

This PR adds three related capabilities:

  1. **** — a single-classifier-call router that reads the active meta-profile, picks the best LLM class for the current task, and calls .
  2. **** — a file-backed store (mirrors ) under .
  3. Meta-profile management API — CRUD + activate endpoints () that mirror .

The implementation is clean and follows established patterns. The deferred meta-profile loading (resolving at invocation time rather than agent-creation time) is the correct design choice. The settings propagation through is a clean single-entry-point pattern.


Existing Unresolved Issues (Blocking)

Two review threads opened in the previous review remain unresolved and must be addressed before this can merge:

  1. 🔴 CRITICAL — Classifier LLM call bypasses conversation accounting (): is called directly on the loaded profile LLM without registering it in . This means classifier token spend is invisible to and bypasses the guard in . This is a correctness/budget-enforcement issue.

  2. 🟡 SUGGESTION — missing → 503 (): / can raise from the . The current error mapper only catches , so lock contention yields a generic 500 instead of the retryable 503 used by .


What's Good

  • Architecture: The data model is clean — / / mirror patterns faithfully. No special-casing.
  • Deferred loading: does NOT read disk. A missing/renamed meta-profile surfaces as a tool-error observation rather than breaking . This is the right call.
  • Settings propagation: is the single source of truth that also propagates into nested . The guard cleanly handles ACP agents without fields for this feature.
  • Concurrency: spans the limit check + atomic rename in , preventing the TOCTOU race. The test exercises this with 30 concurrent threads against a limit of 10.
  • Test coverage: 150+ tests pass. Tests exercise real code paths (parsing, profile-store I/O, executor switching, HTTP endpoints over FastAPI TestClient) — not just mock facades. The and tests specifically catch the top-level/nested drift that previously caused the activation bug.
  • CI: All 29 checks are green (including which was previously failing).
  • Evidence: PR description includes screenshots of live model routing and test output.

Minor Observation (Non-blocking)

** scans all events forward**: The function iterates from the beginning, collecting all instances into a list, then takes . For a long conversation with many non-message events, this builds an unbounded intermediate list just to discard everything except the tail. Reversing the iteration and stopping early once messages are found would be O(limit) instead of O(all events). This is low-priority since events are in-memory and the tool is called infrequently, but worth noting for a follow-up.


[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

This PR introduces a new built-in tool that makes an LLM call mid-conversation and mutates the conversation's active LLM — both higher-risk operations than a read-only tool. The meta-profile CRUD API adds new surface area for settings state mutation. The deferred-load design and comprehensive tests reduce risk significantly. The CRITICAL classifier-accounting issue prevents approval until fixed; the budget bypass it introduces is a correctness risk in production deployments with configured.


VERDICT:
Needs rework: Address the two unresolved review threads before merging. The architecture is sound; these are focused correctness fixes, not redesigns.

KEY INSIGHT:
The feature design is clean and the test infrastructure is excellent, but the classifier LLM call being invisible to conversation accounting is a real correctness gap that must close before this ships.


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a file to your branch (or edit it if one already exists) with the trigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.
  2. Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

Was this review helpful? React with 👍 or 👎 to give feedback.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the intelligent model router SDK path and meta-profile HTTP management API end-to-end; both deliver the PR's stated feature.

Does this PR achieve its stated goal?

Yes. The PR set out to add an intelligent model router (classify_and_switch_llm) and CRUD/activation APIs for meta-profiles. I verified that main did not expose /api/meta-profiles and did not have the SDK meta-profile module, while this branch accepts real HTTP CRUD/activation requests and an SDK conversation classified a test-writing request with a real openhands/minimax-m3 classifier call, then switched to saved profile tests whose active model was openhands/minimax-2.5.

Phase Result
Environment Setup make build completed on both PR and baseline worktrees
CI Status 🟡 At time checked: 34 successful, 7 skipped, 1 qa-changes check still in progress; no failing checks reported
Functional Verification ✅ Real HTTP API flow and real SDK tool execution both worked
Functional Verification

Test 1: Meta-profile management API over real HTTP

Step 1 — Establish baseline on main:
Ran a baseline agent-server from /tmp/ohqa-base-worktree on port 18273, then requested the new endpoint:

curl -i http://127.0.0.1:18273/api/meta-profiles

HTTP/1.1 404 Not Found
{"detail":"Not Found"}

This confirms the endpoint is new behavior; main has no meta-profile API route.

Step 2 — Apply the PR's changes:
Used the checked-out PR branch jmj/intelligent-model-router at 2f1f55e25fddfddf2370d042cdfa4bae432876ac, ran make build, and started python -m openhands.agent_server with isolated HOME=/tmp/ohqa-home and OH_PERSISTENCE_DIR=/tmp/ohqa-persist on port 18274.

Step 3 — Exercise the API on the PR branch:
Ran real HTTP requests for create, read, list, activate, settings propagation, validation failure, missing profile, delete, and post-delete list:

POST /api/meta-profiles/router-qa -> HTTP 201
{"name":"router-qa","message":"Meta-profile 'router-qa' saved"}

GET /api/meta-profiles/router-qa -> HTTP 200
{"name":"router-qa","config":{"classifier_model":"classifier-profile","default_model":"default-profile","classes":[...]}}

GET /api/meta-profiles -> HTTP 200
{"meta_profiles":[{"name":"router-qa","classifier_model":"classifier-profile","default_model":"default-profile","num_classes":2}],"active_meta_profile":null}

POST /api/meta-profiles/router-qa/activate -> HTTP 200
GET /api/settings -> {"active_meta_profile": "router-qa", "agent_active_meta_profile": "router-qa", "agent_enable_classify_and_switch_llm_tool": true}

POST invalid body -> HTTP 422
GET missing profile -> HTTP 404
DELETE /api/meta-profiles/router-qa -> HTTP 200
GET /api/meta-profiles -> HTTP 200, {"meta_profiles":[],"active_meta_profile":null}

This shows the new API works as a real client would use it, including the important activation side effect that propagates into nested agent settings and delete clearing the active profile.

Test 2: SDK classify_and_switch_llm tool routes and switches a conversation

Step 1 — Establish baseline on main:
Ran the same SDK routing script against /tmp/ohqa-base-worktree:

HOME=/tmp/ohqa-sdk-base-home LITELLM_PROXY_API_KEY=$LLM_API_KEY uv run python /tmp/ohqa_sdk_route.py

ModuleNotFoundError: No module named 'openhands.sdk.llm.meta_profile_store'
EXIT_CODE=1

This confirms the SDK meta-profile/router functionality is absent on main.

Step 2 — Apply the PR's changes:
Ran the script from the PR checkout. The script created real saved LLM profiles under an isolated ~/.openhands/profiles, created a real meta-profile under ~/.openhands/meta-profiles, constructed OpenHandsAgentSettings(enable_classify_and_switch_llm_tool=True, active_meta_profile="balanced"), sent the conversation message Please add unit tests for the parser module., and executed classify_and_switch_llm.

Step 3 — Re-run with the PR branch:

HOME=/tmp/ohqa-sdk-home LITELLM_PROXY_API_KEY=$LLM_API_KEY uv run python /tmp/ohqa_sdk_route.py

Loaded profile `classifier` ...
Loaded profile `tests` ...
{
  "conversation_active_model": "openhands/minimax-2.5",
  "observation_chosen_class": "test-writing tasks",
  "observation_is_error": false,
  "observation_model": "tests",
  "observation_text": "Classified task as 'test-writing tasks' and switched to LLM profile 'tests' (model 'openhands/minimax-2.5'). Future agent steps will use this profile.",
  "saved_meta_profiles": ["balanced"]
}

This shows the tool performed the promised classifier call, selected the test-writing class for a test-writing task, loaded the target saved profile, and changed the conversation's active model from the initial openhands/minimax-m3 setup to the target profile's openhands/minimax-2.5.

Issues Found

None.

This review was generated by an AI agent (OpenHands) on behalf of the user.

…o 503

Addresses review feedback on the intelligent model router (PR #3744).

CRITICAL — classifier completion now goes through conversation accounting.
`classify_and_switch_llm` previously loaded the classifier profile straight
from the profile store and called `completion()` on it without registering it
in `conversation.llm_registry`. Its tokens/cost therefore never entered
`conversation_stats`, so the call was invisible to serialized metrics and
escaped the `max_budget_per_run` guard in `LocalConversation`. The classifier
LLM is now loaded once and registered under a stable `classifier:<profile>`
usage id (mirroring the existing `ask-agent-llm` pattern), so its spend is
tracked and budget-enforced; repeated routing calls reuse the cached entry.

SUGGESTION — meta-profile store lock contention now returns 503.
`MetaProfileStore.save()/delete()` can raise `TimeoutError` from the file
lock, but `meta_profiles_router._store_errors()` only mapped `ValueError`,
yielding a generic 500. It now mirrors `profiles_router._store_errors()` and
maps `TimeoutError` to a retryable 503.

Tests:
- classifier call is recorded in `conversation.conversation_stats` and the
  registry (regression for the budget bypass), plus a repeated-call test
  asserting one shared usage bucket.
- save/delete endpoints return 503 on store `TimeoutError`.

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the new meta-profile API and classify_and_switch_llm SDK path end-to-end with a running agent-server and a local OpenAI-compatible classifier; behavior matched the PR goal.

Does this PR achieve its stated goal?

Yes. The PR set out to add intelligent model routing plus meta-profile CRUD/activation APIs. On the PR branch, real HTTP requests created, listed, fetched, activated, validated, and deleted meta-profiles, and activation/clearing propagated into the nested agent_settings fields that enable the routing tool. The SDK tool also made one real HTTP classifier call, selected the expected class, switched the conversation to the target profile, tracked classifier usage, and handled a dangling active meta-profile as an invocation error instead of a startup failure.

Phase Result
Environment Setup make build completed for PR and baseline worktrees
CI Status ⚠️ Most checks green; unresolved-review-threads failing and qa-changes in progress at observation time
Functional Verification ✅ Agent-server API and SDK routing behavior exercised with real execution
Functional Verification

Test 1: Meta-profile CRUD API over real HTTP

Step 1 — Establish baseline (main, without the PR):
Started the main-branch agent-server on 127.0.0.1:8876 and ran curl -i http://127.0.0.1:8876/api/meta-profiles:

HTTP/1.1 404 Not Found
content-type: application/json

{"detail":"Not Found"}

This confirms the meta-profile management endpoint did not exist on the base branch.

Step 2 — Apply the PR's changes:
Checked out PR commit 716e7dd5cf598bb22d4ac3097810c43aa2c363fd, ran make build, and started the PR agent-server on 127.0.0.1:8877 with isolated HOME/OH_PERSISTENCE_DIR.

Step 3 — Re-run with the PR in place:
Ran real curl requests against /api/meta-profiles:

GET /api/meta-profiles                 -> 200 {"meta_profiles":[],"active_meta_profile":null}
POST /api/meta-profiles/balanced       -> 201 {"name":"balanced","message":"Meta-profile 'balanced' saved"}
GET /api/meta-profiles                 -> 200 summary with classifier_model=classifier, default_model=default, num_classes=2
GET /api/meta-profiles/balanced        -> 200 full config with two classes
POST /api/meta-profiles/balanced/activate -> 200 {"name":"balanced","message":"Meta-profile 'balanced' activated"}
POST invalid body                      -> 422 missing classifier_model
GET /api/meta-profiles/missing         -> 404 Meta-profile 'missing' not found
DELETE /api/meta-profiles/balanced     -> 200 deleted

After activation, GET /api/settings showed both top-level and nested agent settings wired:

"enable_classify_and_switch_llm_tool": true,
"active_meta_profile": "balanced",
...
"active_meta_profile": "balanced"

After deleting the active profile, settings were cleared consistently:

"enable_classify_and_switch_llm_tool": false,
"active_meta_profile": null,
...
"active_meta_profile": null

This confirms the new CRUD API works over HTTP and activation/deletion update the behavior-controlling settings, not just the facade field.

Test 2: Direct settings API accepts and clears active_meta_profile

Step 1 — Baseline:
The main branch has no active_meta_profile settings field exposed through the new meta-profile flow; the endpoint coverage above established the management surface was absent.

Step 2 — Apply the PR's changes:
Used the running PR agent-server.

Step 3 — Exercise the settings path:
Ran PATCH /api/settings with active_meta_profile and then with null:

PATCH {"active_meta_profile":"balanced"} -> 200
"enable_classify_and_switch_llm_tool": true,
"active_meta_profile": "balanced"

PATCH {"active_meta_profile":null} -> 200
"enable_classify_and_switch_llm_tool": false,
"active_meta_profile": null

This confirms settings updates propagate into agent_settings, which is what attaches or removes the routing tool for new conversations.

Test 3: SDK classify_and_switch_llm routes and switches profiles

Step 1 — Establish baseline (main, without the PR):
Ran the SDK exercise script in the main worktree. It failed before execution because the new routing module does not exist there:

EXIT_CODE=1
ModuleNotFoundError: No module named 'openhands.sdk.llm.meta_profile_store'

This establishes the SDK routing/meta-profile feature is new in the PR.

Step 2 — Apply the PR's changes:
Ran the same SDK exercise in the PR worktree. The script created real LLM profiles and a meta-profile under an isolated HOME, started a local OpenAI-compatible HTTP server that returned classifier category 2, created an OpenHandsAgentSettings(..., enable_classify_and_switch_llm_tool=True, active_meta_profile="balanced") agent, and invoked classify_and_switch_llm on a conversation containing a user request to add unit tests.

Step 3 — Re-run with the PR in place:
The script output these markers:

ROUTING_OBS_ERROR False
ROUTING_OBS_MODEL test-model
ROUTING_OBS_CLASS tests or QA
ROUTING_OBS_TEXT Classified task as 'tests or QA' and switched to LLM profile 'test-model' (model 'openai/test-underlying'). Future agent steps will use this profile.
ACTIVE_LLM_MODEL openai/test-underlying
CLASSIFIER_HTTP_CALLS 1
CLASSIFIER_USAGE_TRACKED True
MISSING_PROFILE_STARTUP_OK True
MISSING_PROFILE_OBS_ERROR True
MISSING_PROFILE_OBS_TEXT Failed to resolve the active meta-profile: Meta-profile `does-not-exist` not found. Available meta-profiles: balanced

This confirms the changed behavior end-to-end: the tool made a classifier HTTP call, parsed the chosen class, switched to the target saved profile, tracked classifier usage in conversation stats, and deferred missing-profile failure until tool invocation instead of breaking conversation startup.

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — The main router architecture is sound, but there is one state-consistency bug in the activation path that should be fixed before merge.

[CRITICAL ISSUES]

  • [openhands-agent-server/openhands/agent_server/persistence/models.py, Line 275] State Consistency: Activating a meta-profile can record router state for agent variants that cannot actually use the routing tool.

[IMPROVEMENT OPPORTUNITIES]

  • None beyond the blocking consistency issue above.

[TESTING GAPS]

  • Add coverage for active_meta_profile updates while agent_settings is ACPAgentSettings (and/or for switching agent kinds while a meta-profile is active). The current tests only cover the default OpenHands settings variant, so they miss the silent top-level/nested drift for non-OpenHands agents.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
    This adds a new model-routing tool that performs an LLM call and mutates the active conversation LLM, plus new persistent settings and CRUD/activation endpoints. The core design is reasonable and the tests cover the happy path well, but the activation state must not report that routing is active when the selected agent implementation cannot attach the routing tool.

VERDICT:
Needs rework: Fix the unsupported-agent state drift before merging.

KEY INSIGHT:
The top-level active_meta_profile facade and the behavior-controlling nested agent_settings need to remain an invariant, not a best-effort propagation that silently no-ops for some agent variants.


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
  2. Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

settings lack these fields, so guard on their presence.
"""
self.active_meta_profile = name
if "active_meta_profile" in type(self.agent_settings).model_fields:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: This sets the top-level active_meta_profile even when agent_settings is ACPAgentSettings, but ACP agents do not expose active_meta_profile / enable_classify_and_switch_llm_tool and never attach ClassifyAndSwitchLLMTool. In that state POST /api/meta-profiles/{name}/activate returns success and GET /api/meta-profiles reports an active router, while new conversations built from the persisted ACP settings cannot use it; switching agent kinds later can also leave the facade field stale. Please keep this invariant strict: either reject/clear active_meta_profile when the current agent settings variant cannot support the routing tool, or re-apply it when converting back to OpenHandsAgentSettings so persisted state cannot claim routing is active when it is not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants