feat(agent-server): add /goal agent-server endpoint, background loop, and stop/resume by VascoSch92 · Pull Request #3770 · OpenHands/software-agent-sdk

VascoSch92 · 2026-06-17T08:26:35Z

HUMAN:

Added the agent-server part to create /goal command in the agent-canvas UI.

A human has tested these changes.

AGENT:

Stacked on #3769 (SDK /goal core). Base branch is vasco/goal-sdk; review/merge
that first. Kept as a draft until #3769 lands.

Why

Builds on the SDK /goal core (#3769) to make /goal usable from the agent
server / UI: start a goal on the live conversation, stream its work inline, and
control it (stop / resume / interject) — per #3569.

Summary

EventService.start_goal / _run_goal: a background driver that drives the
shared conversation (no fork) and publishes
ConversationStateUpdateEvent(key="goal") lifecycle updates
(running / complete / capped / interrupted) for a UI chip.
stop_goal / resume_goal / _last_goal_status: cancel a running goal (a
normal user message also interrupts it via the _from_goal send_message
hook) and resume from the persisted iteration — works within a session and
across a server restart (the status events are persisted).
Endpoints: POST /conversations/{id}/goal, /goal/stop, /goal/resume.

Issue Number

#3569

How to Test

uv run pytest tests/agent_server/test_goal_loop.py tests/agent_server/test_conversation_router.py -q

test_goal_loop.py is end-to-end against a real EventService +
LocalConversation (scripted TestLLM agent + judge, no mocks of the unit under
test): it starts a goal and drives it to complete/capped, stops a running
goal (asserting the interrupted status is persisted to the shared event
log), resumes from an interrupted state, verifies a user message interrupts a
running goal, and checks the out-of-credits path (agent run error → interrupted,
resumable). test_conversation_router.py exercises the HTTP endpoints
(200 / 404 / 409 / 400).

Video/Screenshots

N/A — backend. UI wiring consumes the new event: filter the stream for
kind == "ConversationStateUpdateEvent" && key == "goal" and render
value.{status, iteration, max_iterations} as a chip (see Notes).

Type

Feature

Notes

UI integration: render the goal chip from the key="goal" state-update events,
and call the stop/resume endpoints from chip controls.
stop is a graceful cancel — an in-flight LLM call finishes before the
conversation goes quiescent; hard-abort via the server's interrupt() is a
possible follow-up.
Optional follow-up: a GET /conversations/{id}/goal so the UI can fetch the
current goal status on load without scanning events.

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:6e9e355-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-6e9e355-python \
  ghcr.io/openhands/agent-server:6e9e355-python

All tags pushed for this build

ghcr.io/openhands/agent-server:6e9e355-golang-amd64
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-golang-amd64
ghcr.io/openhands/agent-server:vasco-goal-agent-server-golang-amd64
ghcr.io/openhands/agent-server:6e9e355-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:6e9e355-golang-arm64
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-golang-arm64
ghcr.io/openhands/agent-server:vasco-goal-agent-server-golang-arm64
ghcr.io/openhands/agent-server:6e9e355-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:6e9e355-java-amd64
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-java-amd64
ghcr.io/openhands/agent-server:vasco-goal-agent-server-java-amd64
ghcr.io/openhands/agent-server:6e9e355-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:6e9e355-java-arm64
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-java-arm64
ghcr.io/openhands/agent-server:vasco-goal-agent-server-java-arm64
ghcr.io/openhands/agent-server:6e9e355-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:6e9e355-python-amd64
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-python-amd64
ghcr.io/openhands/agent-server:vasco-goal-agent-server-python-amd64
ghcr.io/openhands/agent-server:6e9e355-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:6e9e355-python-arm64
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-python-arm64
ghcr.io/openhands/agent-server:vasco-goal-agent-server-python-arm64
ghcr.io/openhands/agent-server:6e9e355-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:6e9e355-golang
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-golang
ghcr.io/openhands/agent-server:vasco-goal-agent-server-golang
ghcr.io/openhands/agent-server:6e9e355-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:6e9e355-java
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-java
ghcr.io/openhands/agent-server:vasco-goal-agent-server-java
ghcr.io/openhands/agent-server:6e9e355-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:6e9e355-python
ghcr.io/openhands/agent-server:6e9e355f97e1e945837c3355c3e1cca357063eac-python
ghcr.io/openhands/agent-server:vasco-goal-agent-server-python
ghcr.io/openhands/agent-server:6e9e355-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 6e9e355-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 6e9e355-python-amd64) are also available if needed

Add openhands.sdk.conversation.goal: a conversation-level "/goal" driver that pursues an objective by running the agent, judging completion with a second LLM, and re-prompting until the goal is done or a cap is reached. - judge_goal + GoalVerdict: the reusable objective+transcript -> verdict kernel (renders the transcript, excluding the system prompt, and asks a judge LLM for a strict-JSON verdict). - GoalController: transport-agnostic continue-vs-stop decision logic and the iteration cap. - run_goal: a thin synchronous driver over the controller that composes with any existing critic (the critic governs each inner run(); this loop governs the overall objective). Self-contained, with no agent-server dependency. Includes a runnable demo under .pr/ proving the goal work lands in the same conversation history. Relates to #3569.

Wire the SDK /goal loop into the agent server so a UI can run, watch, stop, and resume goals on the live conversation (no fork). - EventService.start_goal/_run_goal: a background driver that drives the shared conversation and publishes ConversationStateUpdateEvent(key="goal") lifecycle updates (running/complete/capped/interrupted) for a UI chip. - stop_goal/resume_goal/_last_goal_status: cancel a running goal (a normal user message also interrupts it) and resume from the persisted iteration, including across a server restart. - POST /conversations/{id}/goal, /goal/stop, /goal/resume. Stacked on the SDK /goal core. Relates to #3569.

github-actions · 2026-06-17T08:29:18Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
conversation_router.py	218	13	94%	249, 331, 377, 437, 571–574, 586–589, 624
event_service.py	680	126	81%	114–115, 145, 148–149, 153–154, 161, 165, 171, 181–185, 188–191, 250, 271–272, 343, 394, 414, 421, 445–446, 450, 458, 461, 477, 509, 520, 527, 533, 587, 589, 593–595, 599, 608–609, 611, 615, 621, 623, 670, 700, 703, 758, 779, 887, 999–1002, 1006, 1035, 1039, 1046, 1060, 1075, 1122–1124, 1195, 1220, 1226, 1228, 1238, 1240, 1247, 1257, 1259–1260, 1264, 1278–1283, 1285, 1312, 1317–1320, 1324–1327, 1335–1338, 1377–1379, 1432–1433, 1435–1442, 1444–1445, 1454–1455, 1457–1458, 1465–1466, 1468–1469, 1489, 1495, 1501, 1510–1511
TOTAL	33012	6935	78%

…d future imports

all-hands-bot · 2026-06-17T13:13:54Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

Code Review: `/goal` agent-server endpoints

Taste Rating

🟡 Acceptable — Solid implementation with minor test gap

Analysis

The implementation adds a background /goal driver loop that drives a shared conversation (not a fork), with proper lifecycle events for UI integration. The architecture is sound: GoalController from the SDK owns decisions, EventService owns I/O and event persistence.

[IMPROVEMENT OPPORTUNITIES]

[test_goal_loop.py, test_user_message_stops_running_goal] Test Coverage Gap: The test creates a dummy asyncio.sleep(30) task instead of a real goal loop, then asserts send_message cancels it. This tests the cancellation path, but not the actual integration: the goal task never runs send_message with _from_goal=True. A real goal task would pass _from_goal=True on subsequent iterations, so this test should be updated to either (a) use a real goal that waits for a gate, or (b) add a comment explaining this tests the "early stop" path only.
[event_service.py, start_goal] Minor: The check if self._goal_task is not None and not self._goal_task.done() has a small race window. Concurrent calls could both pass the check. For this use case the window is negligible, but if strict serialization is needed, consider an asyncio Lock.

[TESTING GAPS]

[test_goal_loop.py] The test suite covers happy paths well (test_goal_outcomes, test_resume_from_interrupted_status) and error paths (test_start_goal_rejects_empty_objective, test_resume_without_resumable_goal_raises). The test_goal_halts_on_run_error_as_interrupted case is a good addition for the "out of credits" scenario. Overall, the testing is thorough with scripted LLMs.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW
This PR adds new endpoints behind the existing routing infrastructure, modifies send_message with a backward-compatible parameter, and introduces well-isolated async task management. No breaking changes to existing APIs. The testing uses real EventService with scripted LLMs, not mocks of the unit under test.

VERDICT

✅ Worth merging — The core implementation is solid, with appropriate separation of concerns and comprehensive end-to-end testing. The test coverage gap in test_user_message_stops_running_goal is a minor concern but doesn't block merge.

KEY INSIGHT

The design correctly isolates decisions (SDK's GoalController) from transport (EventService), making the goal loop reusable across different runtime contexts.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation.

all-hands-bot

✅ QA Report: PASS

The PR achieves its stated goal: I exercised the live agent-server over HTTP and verified /goal, /goal/stop, and /goal/resume drive the shared conversation, emit/persist goal state updates, and resume after interruption/restart.

Does this PR achieve its stated goal?

Yes. On the base branch, /api/conversations/{id}/goal was not present and returned 404; on the PR branch, the same real server flow returned 200 and produced persisted ConversationStateUpdateEvent(key="goal") lifecycle updates. Stop/resume also worked: an in-flight goal emitted interrupted, /goal/resume completed it, and the same resume flow worked after restarting the server.

Phase	Result
Environment Setup	✅ `uv sync --dev` completed; live FastAPI/uvicorn server started successfully
CI Status	🟡 13 successful, 0 failing, 1 pending (`QA Changes by OpenHands`) at time of review
Functional Verification	✅ Live HTTP scenarios passed for baseline absence, start/complete, stop/resume, and resume after restart

Functional Verification

Test 1: Baseline branch has no `/goal` endpoint

Step 1 — Reproduce / establish baseline (without the fix):
Checked out vasco/goal-sdk and ran uv run python /tmp/qa_goal_live.py baseline against a live uvicorn agent-server, creating a real conversation and POSTing to /api/conversations/{id}/goal:

{
  "goal_paths_present": [],
  "goal_post_body": "{"detail":"Not Found"}",
  "goal_post_status": 404,
  "mode": "baseline"
}

This confirms the base branch did not expose the claimed goal-control API.

Step 2 — Apply the PR's changes:
Checked out vasco/goal-agent-server at e384cd5c5a1039e2c429d412b6020f0244f58d6d.

Test 2: Start `/goal` and drive it to completion

Step 3 — Re-run with the fix in place:
Ran uv run python /tmp/qa_goal_live.py complete against the PR branch live server. The script created a conversation through POST /api/conversations, injected a scripted TestLLM into that live conversation, then called POST /api/conversations/{id}/goal with a real objective:

{
  "start_status": 200,
  "start_body": {"success": true},
  "goal_statuses": [
    {"active": true, "iteration": 0, "status": "running"},
    {"active": false, "iteration": 1, "status": "complete", "verdict": {"complete": true, "score": 1.0}}
  ]
}

This shows the new endpoint accepts the request, starts the background loop, and persists the UI-consumable goal state updates through completion.

Test 3: Stop and resume a running goal

Ran uv run python /tmp/qa_goal_live.py stop-resume. The goal was started with a gated LLM call to simulate a user stopping while work was in flight, then /goal/stop and /goal/resume were called over HTTP:

{
  "start_status": 200,
  "llm_entered": true,
  "stop_status": 200,
  "stop_was_pending_until_llm_released": true,
  "status_after_stop": {"active": false, "iteration": 0, "status": "interrupted"},
  "resume_status": 200,
  "final_goal_status": {"active": false, "iteration": 1, "status": "complete", "verdict": {"complete": true, "score": 1.0}}
}

This confirms the stop endpoint records a resumable interrupted state and resume continues the shared conversation to completion. The observed pending stop call matches the PR note that stop is graceful and waits for the in-flight LLM call to finish.

Test 4: Resume after server restart

Ran a live two-server scenario: start goal → stop/interrupted → shut down uvicorn → start a new uvicorn server → call /goal/resume for the same persisted conversation:

{
  "start_status": 200,
  "stop_status": 200,
  "status_before_restart": {"active": false, "iteration": 0, "objective": "survive restart", "status": "interrupted"},
  "resume_after_restart_status": 200,
  "final_goal_status_after_restart": {"active": false, "iteration": 1, "objective": "survive restart", "status": "complete", "verdict": {"complete": true, "score": 1.0}}
}

This verifies the persisted status events are sufficient for the new server instance to resume the goal.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

VascoSch92 · 2026-06-18T09:54:56Z

@OpenHands resolve the conflicts here

openhands-development · 2026-06-18T09:55:00Z

@VascoSch92 it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.

openhands-ai · 2026-06-18T09:55:11Z

I'm on it! VascoSch92 can track my progress at all-hands.dev

# Conflicts: # openhands-sdk/openhands/sdk/conversation/goal/controller.py # tests/sdk/conversation/goal/test_controller.py

github-actions · 2026-06-18T09:58:05Z

✅ PR Artifacts Cleaned Up

The .pr/ directory has been automatically removed.

VascoSch92 · 2026-06-18T09:58:14Z

Resolved merge conflicts with main in 610bbcd.

Conflicts were in two files (openhands-sdk/openhands/sdk/conversation/goal/controller.py and tests/sdk/conversation/goal/test_controller.py) where the goal SDK core landed on main via #3769, while this branch carries the follow-up commit c96ebb44 that adds the verdict field to GoalContinue (so the agent-server driver can publish per-round judge feedback). Kept our side in both files — the agent-server _run_goal loop relies on step.verdict to emit the per-round ConversationStateUpdateEvent(key="goal").

Verified locally: uv run pytest tests/sdk/conversation/goal/test_controller.py tests/agent_server/test_goal_loop.py tests/agent_server/test_conversation_router.py -q → 99 passed.

This comment was posted by an AI agent (OpenHands) on behalf of @VascoSch92.

github-actions · 2026-06-18T09:58:30Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-06-18T09:58:31Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

openhands-ai · 2026-06-18T09:59:19Z

OpenHands encountered an error: **Failed to send message to agent server: HTTP 503 error: no available server
**

See the conversation for more information.

all-hands-bot · 2026-06-18T10:23:11Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

🟡 Acceptable — the architecture is clean and the test suite is comprehensive; one meaningful correctness gap in the exception path needs fixing before merge.

Summary: This PR wires the SDK /goal loop into the agent server with a tidy three-endpoint design (/goal, /goal/stop, /goal/resume). The persistence-via-ConversationStateUpdateEvent approach for cross-restart resumability is elegant. Tests cover the happy path, stop/resume lifecycle, user-message interruption, and the run-error path with real EventService + scripted LLMs — solid test hygiene.

[CRITICAL ISSUES]
The except Exception handler in _run_goal swallows unexpected failures (e.g. a judge LLM network error, a GoalController internal bug) without recording a terminal status event. The last persisted event will still have active=True, status="running", so the UI will display the goal as running when it has actually died. The CancelledError path two lines above already handles this correctly — the exception path should mirror it (see inline comment).

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
New background-task logic on a live conversation with asyncio cancel/resume semantics. The logic is sound and well-tested, but the exception-path status gap means a judge LLM failure leaves persistent incorrect UI state (active=True when the loop is dead). Low probability in practice; trivially easy to fix.

VERDICT:
❌ Needs rework: Fix the exception-path status emission before merging.

KEY INSIGHT:
The except Exception handler must emit an interrupted status just like the CancelledError path does, or any unexpected failure leaves the goal permanently "running" in the UI.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.

Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.

all-hands-bot

✅ QA Report: PASS

Verified the new /goal agent-server API and goal-loop behavior by running the server and exercising real EventService flows; no functional issues found.

Does this PR achieve its stated goal?

Yes. The PR sets out to make /goal usable from agent-server/UI by adding HTTP endpoints, background goal execution, goal lifecycle events, and stop/resume behavior. I verified main lacks the endpoint, the PR exposes /goal, /goal/stop, and /goal/resume, real HTTP requests succeed, goal status events are persisted, and direct EventService execution completes, stops, resumes, and resumes after service recreation from persisted state.

Phase	Result
Environment Setup	✅ `uv sync --dev` completed and the agent server started locally.
CI Status	✅ Current checks are passing; `qa-changes` is still in progress for this review, with artifact cleanup checks skipped/success as expected.
Functional Verification	✅ HTTP endpoint behavior and goal-loop lifecycle behavior worked as described.

Functional Verification

Test 1: Baseline vs PR endpoint exposure

Step 1 — Establish baseline on main:
Started python -m openhands.agent_server on 127.0.0.1:8010 and queried the new route/OpenAPI.

POST /api/conversations/{id}/goal -> HTTP 404
OpenAPI goal paths on main: []

This shows the /goal API is not present before the PR.

Step 2 — Apply the PRs changes:
Checked out vasco/goal-agent-server at 610bbcd1abcd4f593f1afc5383429a7370e5c066 and started the agent server.

Step 3 — Re-run with the PR:
Queried server metadata and OpenAPI.

GET / -> version 1.28.0
OpenAPI goal paths:
- /api/conversations/{conversation_id}/goal
- /api/conversations/{conversation_id}/goal/stop
- /api/conversations/{conversation_id}/goal/resume

This confirms the PR exposes the intended backend entry points.

Test 2: HTTP start/stop/resume behavior

Step 1 — Baseline:
On main, the same POST /goal request returned 404, so there was no route to start a goal.

Step 2 — Apply the PRs changes:
Created a real conversation through the PR server with a local workspace and an agent payload.

POST /api/conversations -> HTTP 201
conversation id: c4d50c95-8cad-4977-9347-4da0c987770e
execution_status: idle

Step 3 — Exercise the new API:
Started, stopped, and resumed a goal via HTTP.

POST /api/conversations/c4d50c95-8cad-4977-9347-4da0c987770e/goal -> HTTP 200
{"success":true}

POST /api/conversations/c4d50c95-8cad-4977-9347-4da0c987770e/goal/stop -> HTTP 200
{"success":true}

POST /api/conversations/c4d50c95-8cad-4977-9347-4da0c987770e/goal/resume -> HTTP 200
{"success":true}

Because the HTTP conversation used a placeholder LLM instead of external credentials, the run halted as interrupted, which is the resumable failure behavior described by the PR. The persisted event stream showed the goal lifecycle updates:

[(ConversationStateUpdateEvent, goal, {active: True, status: running, iteration: 0, max_iterations: 1, ...}),
 (ConversationStateUpdateEvent, goal, {active: False, status: interrupted, iteration: 0, max_iterations: 1, ...}),
 (ConversationStateUpdateEvent, goal, {active: True, status: running, iteration: 0, max_iterations: 1, ...}),
 (ConversationStateUpdateEvent, goal, {active: False, status: interrupted, iteration: 0, max_iterations: 1, ...})]

I also verified the user-facing error case:

POST /api/conversations/{new_conversation_id}/goal/resume -> HTTP 400
{"detail":"no_resumable_goal"}

Test 3: Goal loop completion, verdict events, stop, and resume

Step 1 — Baseline:
On main, there is no agent-server goal-loop API to drive; only the PR branch includes the EventService goal driver.

Step 2 — Apply the PRs changes:
Ran a small user script against a real EventService + LocalConversation, using deterministic TestLLM responses for the agent and judge so no external credentials were required.

Step 3 — Exercise the behavior:
The goal completed after two judge rounds and emitted per-round status/verdict data:

COMPLETE_SCENARIO
event_count 1 -> 16
goal_statuses [running, running, complete]
running_verdict_missing [tests not passing]
final {active: False, status: complete, iteration: 2, max_iterations: 5, objective: Make the QA proof goal complete, verdict: {score: 1.0, complete: True, missing: }}

A running goal could be stopped and then resumed to completion:

STOP_SCENARIO
judge_entered True
stop_returned True
after_stop {active: False, status: interrupted, iteration: 1, max_iterations: 4, objective: Resume this QA goal after interruption, verdict: None}
after_resume {active: False, status: complete, iteration: 2, max_iterations: 4, objective: Resume this QA goal after interruption, verdict: {score: 1.0, complete: True, missing: }}

This confirms the background loop, per-round UI chip data, graceful stop, and same-session resume behavior.

Test 4: Resume after EventService recreation

Step 1 — Baseline:
Without the PR goal-status persistence/resume logic, there is no resumable agent-server goal state to reconstruct.

Step 2 — Apply the PRs changes:
Started a goal, stopped it mid-audit, closed the first EventService, recreated another EventService with the same persisted conversation directory and conversation id, then called resume_goal(...).

Step 3 — Re-run with recreated service:

RESTART_SCENARIO
same_conversation_id 2eccccdb-c50a-48a6-ba14-47d9a1c3bbe6
judge_entered_before_stop True
stop_returned True
before_close {active: False, status: interrupted, iteration: 1, max_iterations: 3, objective: Persist and resume this goal, verdict: None}
after_recreated_service_resume {active: False, status: complete, iteration: 2, max_iterations: 3, objective: Persist and resume this goal, verdict: {score: 1.0, complete: True, missing: }}

This verifies the PRs claim that persisted goal status can be used to resume after a server/service restart.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

… unexpected error

enyst · 2026-06-18T14:12:06Z

@OpenHands Describe to me what does it mean to start a goal, to stop a goal, and to resume. I mean, the REST API endpoints, what do they do, in essence. Keep it concise, the high level execution path

openhands-development · 2026-06-18T14:12:11Z

@enyst it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.

openhands-ai · 2026-06-18T14:12:23Z

I'm on it! enyst can track my progress at all-hands.dev

enyst · 2026-06-18T14:13:43Z

@enyst High-level execution path:

POST /conversations/{id}/goal: finds the live EventService, rejects if the conversation or another goal is already running, builds a GoalController(objective, max_iterations), then starts a background task and returns. That task emits ConversationStateUpdateEvent(key="goal", status="running"), sends the objective as a normal user message into the same conversation history, runs the agent, judges the resulting events, and either emits complete/capped or sends a continuation prompt and loops.
POST /conversations/{id}/goal/stop: cancels the background goal driver if present. The cancel path persists/publishes key="goal" with status="interrupted", so the UI can stop showing it as active and the goal remains resumable. It does not delete any conversation history.
POST /conversations/{id}/goal/resume: reads the last persisted key="goal" status from the conversation events, requires it to be resumable (not complete/capped), rebuilds the controller with the same objective and stored iteration, then starts the same background loop using a resume prompt.

A normal user message also calls stop_goal() first, so user interjection interrupts the goal before new user input is appended.

This comment was created by an AI agent (OpenHands) on behalf of the requester.

openhands-ai · 2026-06-18T14:14:52Z

OpenHands encountered an error: **Failed to send message to agent server: HTTP 503 error: no available server
**

See the conversation for more information.

openhands-development · 2026-06-18T14:32:48Z

@enyst it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.

Co-authored-by: openhands <openhands@all-hands.dev>

VascoSch92 · 2026-06-18T14:38:21Z

@OpenHands in this Pr rename every method: *_goal_conversation -> *_goal_in_conversation. Moreover, update the docstring and the comments with the description given here: #3770 (comment). Make coincise and clear docstring.

openhands-development · 2026-06-18T14:38:27Z

@VascoSch92 it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.

openhands-ai · 2026-06-18T14:38:39Z

I'm on it! VascoSch92 can track my progress at all-hands.dev

VascoSch92 · 2026-06-18T14:40:17Z

@enyst I blocked my agent. I think we are good to go if for you is good

enyst

Thank you, and sorry for the confusion! My slow brain is… yeah, slow

not anymore valid

The `_return_metrics` parameter was removed from LLM/TestLLM.completion in 2bedbd8 (removed_in 1.29.0), but tests/agent_server/test_goal_loop.py's _GatedLLM override (added later in #3770) still declared it as a positional param and forwarded it to super().completion(), which pyright flagged as an incompatible override and a 5-positional call against a 4-positional signature. Drop _return_metrics to match the current signature.

The `_return_metrics` parameter was removed from LLM/TestLLM.completion in 2bedbd8 (removed_in 1.29.0, on this release branch), but tests/agent_server/test_goal_loop.py's _GatedLLM override (added in #3770) still declared it as a positional param and forwarded it to super().completion(), which pyright flagged as an incompatible override and a 5-positional call against the new 4-positional signature. Drop _return_metrics to match.

VascoSch92 added 2 commits June 17, 2026 10:24

VascoSch92 changed the title ~~feat: add /goal agent-server endpoint, background loop, and stop/resume~~ feat(agent-server): add /goal agent-server endpoint, background loop, and stop/resume Jun 17, 2026

VascoSch92 and others added 2 commits June 17, 2026 14:58

chore(sdk): tidy /goal core — temp-dir demo persistence, drop unneede…

2e37bea

…d future imports

Merge branch 'vasco/goal-sdk' into vasco/goal-agent-server

68a7ca9

VascoSch92 requested a review from all-hands-bot June 17, 2026 13:12

all-hands-bot reviewed Jun 17, 2026

View reviewed changes

Comment thread tests/agent_server/test_goal_loop.py Outdated

Comment thread openhands-agent-server/openhands/agent_server/event_service.py

VascoSch92 added 2 commits June 17, 2026 15:17

Merge branch 'main' into vasco/goal-sdk

5ee822d

Merge branch 'vasco/goal-sdk' into vasco/goal-agent-server

e384cd5

VascoSch92 marked this pull request as ready for review June 17, 2026 13:18

all-hands-bot reviewed Jun 17, 2026

View reviewed changes

VascoSch92 requested a review from xingyaoww June 17, 2026 14:45

feat(goal): carry per-round judge verdict on goal status events

c96ebb4

Base automatically changed from vasco/goal-sdk to main June 18, 2026 09:54

Merge remote-tracking branch 'origin/main' into vasco/goal-agent-server

610bbcd

# Conflicts: # openhands-sdk/openhands/sdk/conversation/goal/controller.py # tests/sdk/conversation/goal/test_controller.py

VascoSch92 requested a review from all-hands-bot June 18, 2026 10:22

all-hands-bot previously requested changes Jun 18, 2026

View reviewed changes

Comment thread openhands-agent-server/openhands/agent_server/event_service.py

all-hands-bot reviewed Jun 18, 2026

View reviewed changes

fix(agent-server): emit interrupted status when the goal loop hits an…

a7f463b

… unexpected error

Merge branch 'main' into vasco/goal-agent-server

0131ce8

VascoSch92 enabled auto-merge (squash) June 18, 2026 13:24

VascoSch92 disabled auto-merge June 18, 2026 13:26

enyst reviewed Jun 18, 2026

View reviewed changes

Comment thread openhands-agent-server/openhands/agent_server/conversation_router.py Outdated

enyst reviewed Jun 18, 2026

View reviewed changes

Comment thread openhands-agent-server/openhands/agent_server/conversation_router.py Outdated

This was referenced Jun 18, 2026

Add REST API contract summaries to PR descriptions #3789

Open

Release v1.29.0 #3787

Open

Merge branch 'main' into vasco/goal-agent-server

2c61ff2

Clarify goal loop naming in agent server

dc3f8dc

Co-authored-by: openhands <openhands@all-hands.dev>

Merge branch 'main' into vasco/goal-agent-server

6e9e355

VascoSch92 enabled auto-merge (squash) June 18, 2026 14:41

enyst approved these changes Jun 18, 2026

View reviewed changes

VascoSch92 merged commit 1a6c10e into main Jun 18, 2026
32 of 44 checks passed

VascoSch92 deleted the vasco/goal-agent-server branch June 18, 2026 14:46

enyst mentioned this pull request Jun 18, 2026

Document conversation goal endpoints OpenHands/docs#581

Draft

VascoSch92 mentioned this pull request Jun 18, 2026

Fix _GatedLLM.completion override after _return_metrics removal #3794

Closed

VascoSch92 mentioned this pull request Jun 18, 2026

Fix _GatedLLM.completion override after _return_metrics removal (rel-1.29.0) #3795

Merged

Conversation

VascoSch92 commented Jun 17, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Summary

Issue Number

How to Test

Video/Screenshots

Type

Notes

Uh oh!

github-actions Bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

all-hands-bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Code Review: /goal agent-server endpoints

Taste Rating

Analysis

[IMPROVEMENT OPPORTUNITIES]

[TESTING GAPS]

[RISK ASSESSMENT]

VERDICT

KEY INSIGHT

Uh oh!

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Baseline branch has no /goal endpoint

Test 2: Start /goal and drive it to completion

Test 3: Stop and resume a running goal

Test 4: Resume after server restart

Issues Found

Uh oh!

VascoSch92 commented Jun 18, 2026

Uh oh!

openhands-development Bot commented Jun 18, 2026

Uh oh!

openhands-ai Bot commented Jun 18, 2026

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

VascoSch92 commented Jun 18, 2026

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

REST API breakage checks (OpenAPI) — ✅ PASSED

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python API breakage checks — ✅ PASSED

Uh oh!

openhands-ai Bot commented Jun 18, 2026

Uh oh!

all-hands-bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Baseline vs PR endpoint exposure

Test 2: HTTP start/stop/resume behavior

Test 3: Goal loop completion, verdict events, stop, and resume

Test 4: Resume after EventService recreation

Issues Found

Uh oh!

Uh oh!

Uh oh!

VascoSch92 commented Jun 17, 2026 •

edited by github-actions Bot

Loading

github-actions Bot commented Jun 17, 2026 •

edited

Loading

all-hands-bot commented Jun 17, 2026 •

edited

Loading

Code Review: `/goal` agent-server endpoints

Test 1: Baseline branch has no `/goal` endpoint

Test 2: Start `/goal` and drive it to completion

github-actions Bot commented Jun 18, 2026 •

edited

Loading

github-actions Bot commented Jun 18, 2026 •

edited

Loading

github-actions Bot commented Jun 18, 2026 •

edited

Loading

all-hands-bot commented Jun 18, 2026 •

edited

Loading