feat(sdk): add ask_oracle tool by enyst · Pull Request #3673 · OpenHands/software-agent-sdk

enyst · 2026-06-11T20:45:32Z

HUMAN:
This PR proposes an Oracle tool that lets the agent ask a more capable LLM when it encounters a difficulty, when it needs a second opinion, or when the user tells it to.

A human has tested these changes.

AGENT:

Why

Agents sometimes need a second opinion from a stronger or more specialized saved LLM profile without permanently switching the active conversation profile. This adds a minimal ask_oracle tool, configured by the profile name in OpenHandsAgentSettings.oracle_llm_profile, so the agent can consult that Oracle profile statelessly and continue with its current LLM.

Summary

Add a built-in ask_oracle tool that asks a saved Oracle LLM profile for stateless second-opinion guidance.
Add OpenHandsAgentSettings.oracle_llm_profile so users can declare the saved profile that powers the Oracle tool.
Add unit coverage, settings schema coverage, an example, and temporary .pr/ evidence for reviewers.

Closes #3672.

Evidence

The .pr/ directory is intentional. Per this repository's PR artifact policy, temporary design notes, live-test evidence, JSON results, and reviewer-facing validation summaries that should not merge to main belong under .pr/ during review. The source for that policy is the repository guidance in AGENTS.md, section PR_ARTIFACTS: it says .pr/ is for PR-specific documents/scripts/artifacts, that reviewers are notified when it exists, and that the directory is automatically removed by workflow when the PR is approved. In other words, approving this PR will not merge the .pr/ artifacts; approval triggers the cleanup workflow to remove them before merge.

What the evidence proves:

.pr/ask_oracle_live_validation.json shows a live run in which the regular profile called openai/gpt-5-nano directly through OpenAI, the Oracle profile called litellm_proxy/openai/gpt-5-mini through https://llm-proxy.eval.all-hands.dev, and ask_oracle returned a successful, non-error response from the Oracle.
.pr/ask_oracle_live_validation.py is the exact script used to create that JSON. It creates an isolated temporary profile store, saves an oracle profile, builds OpenHandsAgentSettings(oracle_llm_profile="oracle"), executes ask_oracle, records the response, and removes the temporary profile store in finally.
.pr/ask_oracle_test_results.json records the targeted pytest, example pytest, pre-commit, and live-validation commands/results.
.pr/ask_oracle_validation_summary.md summarizes the behavior for reviewers: the tool consults the saved Oracle profile statelessly, sends only the Oracle system prompt plus the agent's question/context (no conversation history or tools), and does not switch the active conversation LLM.
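
To illustrate the stateless behavior described in the validation summary, here is a minimal sketch of how a single Oracle request could be assembled. It is not the code from ask_oracle.py: the Message class, ORACLE_SYSTEM_PROMPT text, and build_oracle_messages helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a chat message; not the SDK's message type."""

    role: str
    content: str


# Hypothetical prompt text; the PR's real Oracle system prompt is not shown here.
ORACLE_SYSTEM_PROMPT = "You are the Oracle. Give concise second-opinion guidance."


def build_oracle_messages(question: str, context: str = "") -> list[Message]:
    """Build the complete request for one Oracle call.

    Only the system prompt and the agent's question/context are included.
    Conversation history and tool definitions are deliberately left out,
    which is what makes the call stateless.
    """
    user_text = question if not context else f"{question}\n\nContext:\n{context}"
    return [
        Message(role="system", content=ORACLE_SYSTEM_PROMPT),
        Message(role="user", content=user_text),
    ]


# The active conversation's history is never passed in, so it cannot leak
# into the Oracle request.
history = [Message("user", "earlier turn"), Message("assistant", "earlier reply")]
request = build_oracle_messages("Why does the test hang?", "pytest -x output ...")
```

Because the request is rebuilt from scratch on every call, the Oracle profile never needs to become the active conversation LLM.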

How to Test

uv run pre-commit run --files openhands-sdk/openhands/sdk/settings/model.py openhands-sdk/openhands/sdk/tool/builtins/__init__.py openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py tests/sdk/tool/test_ask_oracle.py tests/sdk/test_settings.py tests/examples/test_examples.py examples/01_standalone_sdk/54_ask_oracle_tool/main.py .pr/ask_oracle_live_validation.py .pr/ask_oracle_live_validation.json .pr/ask_oracle_test_results.json .pr/ask_oracle_validation_summary.md
uv run pytest tests/sdk/tool/test_ask_oracle.py tests/sdk/tool/test_builtins.py tests/sdk/test_settings.py::test_llm_agent_settings_export_schema_groups_sections tests/examples/test_examples.py::test_directory_example_is_discovered
uv run pytest tests/examples/test_examples.py --run-examples -k 54_ask_oracle_tool
CI=true uv run python -m pytest -q tests/sdk
Live validation in .pr/ask_oracle_live_validation.json using openai/gpt-5-nano as the regular profile and litellm_proxy/openai/gpt-5-mini through the eval proxy as the Oracle profile.

This PR was created by an AI agent (OpenHands) on behalf of the user.

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

```
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:994c508-python
```

Run

```
docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-994c508-python \
  ghcr.io/openhands/agent-server:994c508-python
```

All tags pushed for this build

ghcr.io/openhands/agent-server:994c508-golang-amd64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-golang-amd64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-golang-amd64
ghcr.io/openhands/agent-server:994c508-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:994c508-golang-arm64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-golang-arm64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-golang-arm64
ghcr.io/openhands/agent-server:994c508-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:994c508-java-amd64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-java-amd64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-java-amd64
ghcr.io/openhands/agent-server:994c508-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:994c508-java-arm64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-java-arm64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-java-arm64
ghcr.io/openhands/agent-server:994c508-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:994c508-python-amd64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-python-amd64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-python-amd64
ghcr.io/openhands/agent-server:994c508-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:994c508-python-arm64
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-python-arm64
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-python-arm64
ghcr.io/openhands/agent-server:994c508-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:994c508-golang
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-golang
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-golang
ghcr.io/openhands/agent-server:994c508-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:994c508-java
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-java
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-java
ghcr.io/openhands/agent-server:994c508-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:994c508-python
ghcr.io/openhands/agent-server:994c50802f6ccc0d6395763e6e5bc3554a0518bd-python
ghcr.io/openhands/agent-server:feat-ask-oracle-tool-python
ghcr.io/openhands/agent-server:994c508-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 994c508-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 994c508-python-amd64) are also available if needed

github-actions · 2026-06-11T20:45:44Z

📁 PR Artifacts Notice

This PR contains a .pr/ directory with PR-specific documents. This directory will be automatically removed when the PR is approved.

For fork PRs: Manual removal is required before merging.

github-actions · 2026-06-11T20:46:05Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

github-actions · 2026-06-11T20:46:09Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

github-actions · 2026-06-11T21:15:05Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/settings
model.py	714	48	93%	103, 402, 420, 600, 610–613, 616, 629, 633, 639, 649, 655, 660, 883, 908, 910, 912, 914, 916, 918, 920, 922, 924, 1233, 1235, 1649, 1669, 1837, 1966, 2005, 2031, 2167–2169, 2171, 2225, 2257, 2267, 2269, 2274, 2292, 2305, 2307, 2309, 2311, 2318
openhands-sdk/openhands/sdk/tool/builtins
ask_oracle.py	72	16	77%	42–48, 63, 106, 124–125, 129–130, 157–158, 195
TOTAL	31241	8566	72%

all-hands-bot · 2026-06-11T21:43:03Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

enyst

🟡 Taste Rating: Acceptable — the core feature is small and mostly sane, but there is a real configuration footgun hiding in the data flow.

[CRITICAL ISSUES]

[openhands-sdk/openhands/sdk/settings/model.py, lines 1154-1163] Data Structure / two sources of truth: oracle_llm_profile is supposed to be the canonical setting that powers ask_oracle, but create_agent() refuses to wire that profile if self.tools already contains an ask_oracle entry. That silently creates an unconfigured Oracle even when the user explicitly set oracle_llm_profile="oracle":
```
OpenHandsAgentSettings(
    oracle_llm_profile="oracle",
    tools=[Tool(name="ask_oracle")],
).create_agent()
# ask_oracle params stay {}
```
Then the agent sees the tool, calls it, and gets “The Oracle is not configured.” That's exactly the kind of special-case behavior this settings field should eliminate. Pick one source of truth: when oracle_llm_profile is set, normalize/replace the existing ask_oracle tool params with {"profile_name": ...}, or reject the conflicting configuration loudly. Add a regression test for oracle_llm_profile plus a pre-existing Tool(name="ask_oracle").
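
One way the "normalize/replace" option could look is sketched below. This is an illustration under stated assumptions, not the code in model.py: ToolSpec stands in for the SDK's Tool entry, and normalize_oracle_tool is a hypothetical helper name. The profile_name key is taken from the params shape mentioned above.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Optional

ORACLE_TOOL_NAME = "ask_oracle"


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical stand-in for a tool entry: a name plus constructor params."""

    name: str
    params: dict[str, Any] = field(default_factory=dict)


def normalize_oracle_tool(
    tools: list[ToolSpec], oracle_llm_profile: Optional[str]
) -> list[ToolSpec]:
    """Make oracle_llm_profile the single source of truth for ask_oracle.

    With a profile set, any existing ask_oracle entry is rewritten in place
    to carry that profile, extra ask_oracle entries are dropped, and the
    tool is appended if it was missing. Without a profile, the list is
    returned unchanged.
    """
    if oracle_llm_profile is None:
        return list(tools)

    params = {"profile_name": oracle_llm_profile}
    normalized: list[ToolSpec] = []
    seen = False
    for tool in tools:
        if tool.name != ORACLE_TOOL_NAME:
            normalized.append(tool)
        elif not seen:
            # Keep the entry's position but overwrite its params.
            normalized.append(replace(tool, params=params))
            seen = True
    if not seen:
        normalized.append(ToolSpec(ORACLE_TOOL_NAME, params))
    return normalized


# The colliding configuration from the snippet above now ends up wired.
wired = normalize_oracle_tool([ToolSpec("terminal"), ToolSpec("ask_oracle")], "oracle")
```

The other option, rejecting the conflicting configuration with an error, is equally valid; the point is that the outcome must not depend silently on whether tools already lists ask_oracle.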

[IMPROVEMENT OPPORTUNITIES]

[openhands-sdk/openhands/sdk/tool/builtins/ask_oracle.py, line 215] Tool metadata lies: openWorldHint=False says the tool has a closed interaction domain, but this tool calls an external LLM provider and sends the agent's question/context out of process. That is open-world behavior. Mark it True unless there is a very deliberate reason to pretend an LLM network call is closed-world.

[TESTING GAPS]

The tests cover the happy-path append and omission cases, but not the configuration collision above. This is not a mock-the-world nit: it is a real OpenHandsAgentSettings.create_agent() behavior regression test that should fail if the settings field stops being the actual source of truth.
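
A rough sketch of what such a regression check needs to distinguish, using plain dictionaries instead of the real settings classes. Both wiring functions and the check are hypothetical models written for this comment: wire_skip_if_present imitates the behavior reported above, and wire_profile_wins imitates the fixed behavior.

```python
from typing import Callable, Optional

ToolParams = dict[str, dict[str, str]]
WireFn = Callable[[ToolParams, Optional[str]], ToolParams]


def wire_skip_if_present(tools: ToolParams, profile: Optional[str]) -> ToolParams:
    """Models the reported behavior: an existing ask_oracle entry is left untouched."""
    wired = {name: dict(params) for name, params in tools.items()}
    if profile is not None and "ask_oracle" not in wired:
        wired["ask_oracle"] = {"profile_name": profile}
    return wired


def wire_profile_wins(tools: ToolParams, profile: Optional[str]) -> ToolParams:
    """Models the fix: the profile setting always determines ask_oracle's params."""
    wired = {name: dict(params) for name, params in tools.items()}
    if profile is not None:
        wired["ask_oracle"] = {"profile_name": profile}
    return wired


def profile_is_source_of_truth(wire: WireFn) -> bool:
    """Regression check: a pre-existing bare ask_oracle entry must end up wired."""
    wired = wire({"ask_oracle": {}}, "oracle")
    return wired.get("ask_oracle") == {"profile_name": "oracle"}


buggy_passes = profile_is_source_of_truth(wire_skip_if_present)
fixed_passes = profile_is_source_of_truth(wire_profile_wins)
```

A real test would call OpenHandsAgentSettings.create_agent() rather than these stand-ins; the sketch only shows that the collision case separates the two behaviors, while the existing happy-path tests pass for both.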

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

This adds a new agent-visible tool that can call another LLM and influence agent behavior, so the blast radius is larger than the code size. The implementation is mostly contained, and there is unit/live evidence, but the config precedence bug can make a user-visible “configured but unusable” state. Also, because this touches tool calling / agent behavior, I am leaving a COMMENT review rather than approving; per repo guidelines, a human maintainer should decide after appropriate eval/CI validation.

VERDICT:
❌ Needs rework: fix the config precedence footgun before merge.

KEY INSIGHT:
The feature is only simple if oracle_llm_profile is the single source of truth; right now tools can silently override it into a broken half-configured state.

This PR review comment was generated by an AI agent (OpenHands) on behalf of the user.

VascoSch92 · 2026-06-15T08:44:56Z

@enyst Should we add this to tools instead of the built-in ones?

enyst · 2026-06-15T23:29:55Z

@OpenHands understand and do this please: #3673 (comment)

openhands-ai · 2026-06-15T23:30:10Z

I'm on it! enyst can track my progress at all-hands.dev

enyst · 2026-06-15T23:36:00Z

@VascoSch92 @enyst I looked into this. I don't think ask_oracle should move to openhands-tools in this PR.

Why:

ask_oracle does not interact with the workspace/runtime environment; it is an SDK meta-tool that calls another saved SDK LLM profile.
Its configuration is owned by OpenHandsAgentSettings.oracle_llm_profile, so keeping the implementation in openhands-sdk avoids making the core SDK import from openhands-tools.
The package dependency direction is currently openhands-tools -> openhands-sdk; moving this tool while keeping SDK settings integration would invert that dependency.
It is closer to optional SDK built-ins like switch_llm than to runtime tools like terminal/file editor/browser.

I pushed 994c508 to clarify the builtins package docstring: BUILT_IN_TOOLS are the default tools, while BUILT_IN_TOOL_CLASSES can also include optional SDK-resident tools such as ask_oracle. Validation: uv run pre-commit run --files openhands-sdk/openhands/sdk/tool/builtins/__init__.py passed.

This comment was created by an AI agent (OpenHands) on behalf of the user.

openhands-ai · 2026-06-15T23:37:02Z

OpenHands encountered an error: **Failed to send message to agent server: HTTP 503 error: no available server**

See the conversation for more information.

VascoSch92 · 2026-06-16T05:47:00Z

mmmh... so why do we have openhands-tools?

I'm not against the tool. It would just be nice to have some order: what goes where? :-)

enyst added the integration-test label (Runs the integration tests and comments the results) Jun 11, 2026, with OpenHands AI

enyst mentioned this pull request Jun 11, 2026

docs(sdk): document ask_oracle tool OpenHands/docs#566

Open

feat(sdk): add ask oracle tool

9c2b227

Co-authored-by: openhands <openhands@all-hands.dev>

enyst force-pushed the feat/ask-oracle-tool branch from 08d4edd to 9c2b227 Compare June 11, 2026 21:06

enyst added the review-this label (This label triggers a PR review by OpenHands) Jun 11, 2026

Revise Oracle description for clarity and intent

3daa1ce

Updated the description of the Oracle to clarify its purpose and capabilities.

enyst added and removed the review-this label (This label triggers a PR review by OpenHands) Jun 11, 2026
