docs(sdk): document /goal judge-driven goal-completion loop #580

Open
VascoSch92 wants to merge 2 commits into main from vasco/goal-sdk

@VascoSch92

Member

Why

Companion docs page for the new SDK /goal command introduced in
OpenHands/software-agent-sdk#3769.

The /goal loop drives a conversation toward a verifiable objective: after
every agent run, a second "judge" LLM audits the transcript for authoritative
evidence (file contents, command output, test results) and either declares the
goal complete or re-prompts the agent with what is still missing — until the
objective is provably done or max_iterations is hit.
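
The loop described above can be sketched in a few lines. This is a minimal illustration only, not the SDK implementation: `Verdict`, `run_goal_loop`, `fake_agent`, and `fake_judge` are hypothetical names invented for this sketch, and the real types are the `GoalOutcome` / `GoalVerdict` objects documented on the page.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical judge result; not the SDK's GoalVerdict."""
    complete: bool
    missing: str = ""


def run_goal_loop(
    objective: str,
    run_agent: Callable[[str], str],
    judge: Callable[[str, List[str]], Verdict],
    max_iterations: int = 5,
) -> Tuple[bool, int]:
    """Run the agent, ask the judge, and re-prompt until done or capped."""
    transcript: List[str] = []
    prompt = objective
    for iteration in range(1, max_iterations + 1):
        # One agent run; its output is appended to the transcript.
        transcript.append(run_agent(prompt))
        # The judge audits the whole transcript against the objective.
        verdict = judge(objective, transcript)
        if verdict.complete:
            return True, iteration
        # Not done yet: tell the agent what evidence is still missing.
        prompt = f"Goal not yet met. Still missing: {verdict.missing}"
    return False, max_iterations


# Stand-ins for the agent and the judge LLM (assumptions for illustration).
def fake_agent(prompt: str) -> str:
    return "tests passed" if "missing" in prompt else "wrote code"


def fake_judge(objective: str, transcript: List[str]) -> Verdict:
    if any("tests passed" in entry for entry in transcript):
        return Verdict(complete=True)
    return Verdict(complete=False, missing="test results")


done, iterations = run_goal_loop("make tests pass", fake_agent, fake_judge)
```

With these stand-ins the first run produces no test evidence, the judge reports what is missing, and the second run satisfies it. An agent that never produces evidence stops at `max_iterations` instead.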

What

  • New page: sdk/guides/convo-goal.mdx — modeled after sdk/guides/critic.mdx and sdk/guides/convo-fork.mdx. Covers overview, how-it-works, quick start, GoalOutcome / GoalVerdict fields, parameters, composition with a critic, the lower-level GoalController / judge_goal building blocks, and embeds the runnable example.
  • Navigation: added sdk/guides/convo-goal to docs.json under the Conversation Features group, alongside fork, pause/resume, ask-agent, hooks, etc.

Linked Example

The page renders examples/01_standalone_sdk/54_goal_completion_loop.py (added in the SDK PR) as the ready-to-run example.

Notes

  • Marked as draft so it stays in sync with the SDK PR — once that PR lands and ships on main, this PR can be marked ready for review.
  • Branch name vasco/goal-sdk mirrors the SDK PR branch, as requested.

This PR was created by an AI agent (OpenHands) on behalf of @VascoSch92.

Adds sdk/guides/convo-goal.mdx and a navigation entry for the new
/goal SDK feature (OpenHands/software-agent-sdk#3769).

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot commented Jun 18, 2026

Contributor

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

This comment was marked as outdated.

Address review feedback on convo-goal.mdx:
- GoalController: clarify it does no conversation transport I/O but does own
  the synchronous (blocking) judge LLM call in on_run_finished().
- judge_goal: drop the misleading "pure" wording; document the real
  (judge_llm, objective, events) signature and that it calls the judge LLM.
- Sync the embedded example byte-for-byte with the upstream SDK file so the
  docs code-block sync job won't rewrite it.
- Drop unused imports (IterativeRefinementConfig, GoalContinue); fix the
  Quick Start focus lines.

mintlify Bot commented Jun 18, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| all-hands-ai | 🟢 Ready | View Preview | Jun 18, 2026, 9:58 AM |

all-hands-bot commented Jun 18, 2026

Contributor

Review complete.

@all-hands-bot all-hands-bot left a comment

Contributor

🟢 Good taste — previous review feedback has been fully addressed; the page is accurate and well-structured.

Checked the three items raised in the prior review:

  1. GoalController I/O wording — Now correctly reads: "It does no conversation transport I/O … but it does own the judge call: on_run_finished() synchronously invokes the judge LLM, so treat that call as blocking." The no-transport / yes-judge-call split is the key integration constraint, and it is now stated clearly.

  2. judge_goal pure-function claim — Removed. The current text reads: "It calls the judge LLM each time, so it is not a pure function." Correct and unambiguous.

  3. Example sync — The embedded 54_goal_completion_loop.py block is now byte-for-byte with the upstream SDK file (unused GoalContinue import dropped, model defaulting fixed), so the docs sync job will leave it alone.

One minor note before the page goes live: the Ready-to-run Example section links to blob/main/examples/01_standalone_sdk/54_goal_completion_loop.py. Make sure the companion SDK PR (#3769) has been merged to main before this docs PR is published — otherwise that link will 404. Everything else looks good.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW
    Documentation-only change. Navigation entry is additive. No runtime code is modified.

VERDICT:
Worth merging: All prior blocking issues are resolved; the page accurately documents the goal-completion loop.

KEY INSIGHT:
The revised wording makes the most important integration constraint unmistakable: on_run_finished() is a synchronous, LLM-backed call, not pure in-process logic — critical information for anyone building an async or server-side driver.
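
For an async driver, one common way to cope with a blocking call like this is to push it onto a worker thread. The sketch below is an assumption-laden illustration, not SDK code: `BlockingController` is a hypothetical stand-in, its argument-free `on_run_finished()` does not reflect the real signature, and the `time.sleep` merely simulates the judge LLM round trip.

```python
import asyncio
import time
from typing import List, Tuple


class BlockingController:
    """Hypothetical stand-in for a controller whose on_run_finished() blocks."""

    def on_run_finished(self) -> str:
        time.sleep(0.05)  # simulates the synchronous judge LLM call
        return "continue"


async def heartbeat(ticks: List[float]) -> None:
    """Other event-loop work that should keep running during the judge call."""
    for _ in range(3):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def drive() -> Tuple[str, int]:
    controller = BlockingController()
    ticks: List[float] = []
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so the event loop stays free to service the heartbeat.
    decision, _ = await asyncio.gather(
        asyncio.to_thread(controller.on_run_finished),
        heartbeat(ticks),
    )
    return decision, len(ticks)


decision, tick_count = asyncio.run(drive())
```

Calling `on_run_finished()` directly inside a coroutine would instead stall every other task on the loop for the duration of the judge call.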

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
