Skip to content

feat(sdk): add ask_oracle tool#3673

Open
enyst wants to merge 6 commits into
mainfrom
feat/ask-oracle-tool
Open

feat(sdk): add ask_oracle tool#3673
enyst wants to merge 6 commits into
mainfrom
feat/ask-oracle-tool

Conversation

@enyst

@enyst enyst commented Jun 11, 2026

Copy link
Copy Markdown
Member

HUMAN:
This PR proposes an Oracle tool, for the agent to ask a more capable LLM when it encounters a difficulty, when it needs a second opinion, or when the user tells it to.

  • A human has tested these changes.

AGENT:

Why

Agents sometimes need a second opinion from a stronger or more specialized saved LLM profile without permanently switching the active conversation profile. This adds a minimal ask_oracle tool powered by an OpenHandsAgentSettings.oracle_llm_profile profile name so the agent can consult that Oracle profile statelessly and continue with its current LLM.

Summary

  • Add a built-in ask_oracle tool that asks a saved Oracle LLM profile for stateless second-opinion guidance.
  • Add OpenHandsAgentSettings.oracle_llm_profile so users can declare the saved profile that powers the Oracle tool.
  • Add unit coverage, settings schema coverage, an example, and temporary .pr/ evidence for reviewers.

Closes #3672.

Evidence

The .pr/ directory is intentional. Per this repository's PR artifact policy, temporary design notes, live-test evidence, JSON results, and reviewer-facing validation summaries that should not merge to main belong under .pr/ during review. The source for that policy is the repository guidance in AGENTS.md, section PR_ARTIFACTS: it says .pr/ is for PR-specific documents/scripts/artifacts, that reviewers are notified when it exists, and that the directory is automatically removed by workflow when the PR is approved. In other words, approving this PR will not merge the .pr/ artifacts; approval triggers the cleanup workflow to remove them before merge.

What the evidence proves:

  • .pr/ask_oracle_live_validation.json shows a live run where the regular profile used OpenAI direct openai/gpt-5-nano, the Oracle profile used litellm_proxy/openai/gpt-5-mini through https://llm-proxy.eval.all-hands.dev, and ask_oracle returned a successful non-error response from the Oracle.
  • .pr/ask_oracle_live_validation.py is the exact script used to create that JSON. It creates an isolated temporary profile store, saves an oracle profile, builds OpenHandsAgentSettings(oracle_llm_profile="oracle"), executes ask_oracle, records the response, and removes the temporary profile store in finally.
  • .pr/ask_oracle_test_results.json records the targeted pytest, example pytest, pre-commit, and live-validation commands/results.
  • .pr/ask_oracle_validation_summary.md summarizes the behavior for reviewers: the tool consults the saved Oracle profile statelessly, sends only the Oracle system prompt plus the agent's question/context (no conversation history or tools), and does not switch the active conversation LLM.

How to Test

  • uv run pre-commit run --files openhands-sdk/openhands/sdk/settings/model.py openhands-sdk/openhands/sdk/tool/builtins/__init__.py openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py tests/sdk/tool/test_ask_oracle.py tests/sdk/test_settings.py tests/examples/test_examples.py examples/01_standalone_sdk/54_ask_oracle_tool/main.py .pr/ask_oracle_live_validation.py .pr/ask_oracle_live_validation.json .pr/ask_oracle_test_results.json .pr/ask_oracle_validation_summary.md
  • uv run pytest tests/sdk/tool/test_ask_oracle.py tests/sdk/tool/test_builtins.py tests/sdk/test_settings.py::test_llm_agent_settings_export_schema_groups_sections tests/examples/test_examples.py::test_directory_example_is_discovered
  • uv run pytest tests/examples/test_examples.py --run-examples -k 54_ask_oracle_tool
  • CI=true uv run python -m pytest -q tests/sdk
  • Live validation in .pr/ask_oracle_live_validation.json using openai/gpt-5-nano as the regular profile and litellm_proxy/openai/gpt-5-mini through the eval proxy as the Oracle profile.

This PR was created by an AI agent (OpenHands) on behalf of the user.

@enyst can click here to continue refining the PR


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:994c508-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-994c508-python \
  ghcr.io/openhands/agent-server:994c508-python

All tags pushed for this build

ghcr.io/openhands/agent-server:994c508-golang-amd64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-golang-amd64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-golang-amd64
ghcr.io/openhands/agent-server:994c508-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:994c508-golang-arm64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-golang-arm64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-golang-arm64
ghcr.io/openhands/agent-server:994c508-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:994c508-java-amd64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-java-amd64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-java-amd64
ghcr.io/openhands/agent-server:994c508-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:994c508-java-arm64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-java-arm64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-java-arm64
ghcr.io/openhands/agent-server:994c508-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:994c508-python-amd64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-python-amd64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-python-amd64
ghcr.io/openhands/agent-server:994c508-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:994c508-python-arm64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-python-arm64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-python-arm64
ghcr.io/openhands/agent-server:994c508-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:994c508-golang
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-golang
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-golang
ghcr.io/openhands/agent-server:994c508-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:994c508-java
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-java
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-java
ghcr.io/openhands/agent-server:994c508-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:994c508-python
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-python
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-python
ghcr.io/openhands/agent-server:994c508-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 994c508-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 994c508-python-amd64) are also available if needed

@enyst enyst added the integration-test Runs the integration tests and comments the results label Jun 11, 2026 — with OpenHands AI
@github-actions

This comment was marked as resolved.

@github-actions

Copy link
Copy Markdown
Contributor

📁 PR Artifacts Notice

This PR contains a .pr/ directory with PR-specific documents. This directory will be automatically removed when the PR is approved.

For fork PRs: Manual removal is required before merging.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@github-actions

This comment was marked as outdated.

all-hands-bot

This comment was marked as outdated.

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst force-pushed the feat/ask-oracle-tool branch from 08d4edd to 9c2b227 Compare June 11, 2026 21:06
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/settings
   model.py7144893%103, 402, 420, 600, 610–613, 616, 629, 633, 639, 649, 655, 660, 883, 908, 910, 912, 914, 916, 918, 920, 922, 924, 1233, 1235, 1649, 1669, 1837, 1966, 2005, 2031, 2167–2169, 2171, 2225, 2257, 2267, 2269, 2274, 2292, 2305, 2307, 2309, 2311, 2318
openhands-sdk/openhands/sdk/tool/builtins
   ask_oracle.py721677%42–48, 63, 106, 124–125, 129–130, 157–158, 195
TOTAL31241856672% 

@enyst enyst added the review-this This label triggers a PR review by OpenHands label Jun 11, 2026

all-hands-bot commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

This comment was marked as outdated.

Updated the description of the Oracle to clarify its purpose and capabilities.
@enyst enyst added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels Jun 11, 2026

This comment was marked as outdated.

all-hands-bot

This comment was marked as outdated.

Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Comment thread openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py Outdated
Co-authored-by: openhands <openhands@all-hands.dev>
@enyst

This comment was marked as outdated.

Co-authored-by: openhands <openhands@all-hands.dev>

@enyst enyst left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — the core feature is small and mostly sane, but there is a real configuration footgun hiding in the data flow.

[CRITICAL ISSUES]

  • [openhands-sdk/openhands/sdk/settings/model.py, lines 1154-1163] Data Structure / two sources of truth: oracle_llm_profile is supposed to be the canonical setting that powers ask_oracle, but create_agent() refuses to wire that profile if self.tools already contains an ask_oracle entry. That silently creates an unconfigured Oracle even when the user explicitly set oracle_llm_profile="oracle":

    OpenHandsAgentSettings(
        oracle_llm_profile="oracle",
        tools=[Tool(name="ask_oracle")],
    ).create_agent()
    # ask_oracle params stay {}

    Then the agent sees the tool, calls it, and gets “The Oracle is not configured.” That's exactly the kind of special-case behavior this settings field should eliminate. Pick one source of truth: when oracle_llm_profile is set, normalize/replace the existing ask_oracle tool params with {"profile_name": ...}, or reject the conflicting configuration loudly. Add a regression test for oracle_llm_profile plus a pre-existing Tool(name="ask_oracle").

[IMPROVEMENT OPPORTUNITIES]

  • [openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py, line 215] Tool metadata lies: openWorldHint=False says the tool has a closed interaction domain, but this tool calls an external LLM provider and sends the agent's question/context out of process. That is open-world behavior. Mark it True unless there is a very deliberate reason to pretend an LLM network call is closed-world.

[TESTING GAPS]

  • The tests cover the happy-path append and omission cases, but not the configuration collision above. This is not a mock-the-world nit: it is a real OpenHandsAgentSettings.create_agent() behavior regression test that should fail if the settings field stops being the actual source of truth.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

This adds a new agent-visible tool that can call another LLM and influence agent behavior, so the blast radius is larger than the code size. The implementation is mostly contained, and there is unit/live evidence, but the config precedence bug can make a user-visible “configured but unusable” state. Also, because this touches tool calling / agent behavior, I am leaving a COMMENT review rather than approving; per repo guidelines, a human maintainer should decide after appropriate eval/CI validation.

VERDICT:
Needs rework: fix the config precedence footgun before merge.

KEY INSIGHT:
The feature is only simple if oracle_llm_profile is the single source of truth; right now tools can silently override it into a broken half-configured state.

This PR review comment was generated by an AI agent (OpenHands) on behalf of the user.

@openhands-ai

This comment was marked as duplicate.

Co-authored-by: openhands <openhands@all-hands.dev>
@VascoSch92

Copy link
Copy Markdown
Member

@enyst Should we add this in tools instead that in the builtin ones?

@OpenHands OpenHands deleted a comment from openhands-ai Bot Jun 15, 2026
@OpenHands OpenHands deleted a comment from openhands-ai Bot Jun 15, 2026
@enyst

enyst commented Jun 15, 2026

Copy link
Copy Markdown
Member Author

@OpenHands understand and do this please: #3673 (comment)

@openhands-ai

openhands-ai Bot commented Jun 15, 2026

Copy link
Copy Markdown

I'm on it! enyst can track my progress at all-hands.dev

@OpenHands OpenHands deleted a comment from openhands-ai Bot Jun 15, 2026
Co-authored-by: openhands <openhands@all-hands.dev>

enyst commented Jun 15, 2026

Copy link
Copy Markdown
Member Author

@VascoSch92 @enyst I looked into this. I don't think ask_oracle should move to openhands-tools in this PR.

Why:

  • ask_oracle does not interact with the workspace/runtime environment; it is an SDK meta-tool that calls another saved SDK LLM profile.
  • Its configuration is owned by OpenHandsAgentSettings.oracle_llm_profile, so keeping the implementation in openhands-sdk avoids making the core SDK import from openhands-tools.
  • The package dependency direction is currently openhands-tools -> openhands-sdk; moving this tool while keeping SDK settings integration would invert that dependency.
  • It is closer to optional SDK built-ins like switch_llm than to runtime tools like terminal/file editor/browser.

I pushed 994c508 to clarify the builtins package docstring: BUILT_IN_TOOLS are the default tools, while BUILT_IN_TOOL_CLASSES can also include optional SDK-resident tools such as ask_oracle. Validation: uv run pre-commit run --files openhands-sdk/openhands/sdk/tool/builtins/__init__.py passed.

This comment was created by an AI agent (OpenHands) on behalf of the user.

@openhands-ai

openhands-ai Bot commented Jun 15, 2026

Copy link
Copy Markdown

OpenHands encountered an error: **Failed to send message to agent server: HTTP 503 error: no available server
**

See the conversation for more information.

@VascoSch92

Copy link
Copy Markdown
Member

mmmh... so why do we have openhands-tools?

I'm not against the tool. Just could be cool to have some order: what goes where? :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

integration-test Runs the integration tests and comments the results review-this This label triggers a PR review by OpenHands

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ask_oracle tool backed by a configured LLM profile

4 participants