Release v1.29.0 by all-hands-bot · Pull Request #3787 · OpenHands/software-agent-sdk

all-hands-bot · 2026-06-18T12:47:53Z

HUMAN:

Cutting the v1.29.0 release. Before pushing, I verified locally that the deprecation-deadline, Python/REST API breakage, and persisted-settings-compat gates pass, along with the LLM/ACP/settings test suites.

AGENT:

Why

This is the v1.29.0 release PR. The Deprecation deadlines release gate fails once the project version reaches a feature's removed_in, so cutting 1.29.0 requires deleting every API surface scheduled for removal in 1.29.0. Two deprecated features were due:

The no-op _return_metrics / return_metrics parameter (deprecated in 1.24.0).
The acp_env field on the ACP agent/settings (deprecated in 1.24.0).
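
As a rough illustration of the rule described above, the sketch below flags an entry once the project version reaches its removed_in. It is a minimal stand-in and not the logic in .github/scripts/check_deprecations.py: the Deprecation record, parse_version, and overdue are hypothetical names, and versions are assumed to be plain X.Y.Z strings.

```python
from dataclasses import dataclass


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints that compares numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Deprecation:
    # Hypothetical record shape, used only for this illustration.
    feature: str
    deprecated_in: str
    removed_in: str


def overdue(current: str, entries: list[Deprecation]) -> list[Deprecation]:
    """Return the entries whose removal deadline the current version has reached."""
    now = parse_version(current)
    return [e for e in entries if now >= parse_version(e.removed_in)]


entries = [
    Deprecation("ACPAgentSettings.acp_env", "1.24.0", "1.29.0"),
    Deprecation("LLM.completion(_return_metrics=...)", "1.24.0", "1.29.0"),
]

print([e.feature for e in overdue("1.28.0", entries)])  # nothing is due yet
print([e.feature for e in overdue("1.29.0", entries)])  # both entries are due
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9.0" would sort after "1.29.0".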

Summary

Bump all packages to 1.29.0 (via the Prepare Release workflow).
Remove _return_metrics / return_metrics from LLM.{completion,acompletion,responses,aresponses}, RouterLLM.completion, and the TestLLM doubles. Metrics are always available via LLMResponse.metrics.
Remove acp_env end-to-end: drop the field from ACPAgentSettings and ACPAgent (field + validators + serializers), delete ACPAgentSettings.resolve_acp_env(), simplify the ACP spawn-time env build (registry → os.environ precedence; file-secret materialisation and data-dir isolation no longer honour an acp_env pin), and drop acp_env from REDACT_ALL_VALUES_KEYS. Provide arbitrary ACP subprocess env vars via the conversation secrets channel instead.
Update tests, the v1 ACP persisted-settings golden fixture, the persisted-settings-compat generator, and docstrings / AGENTS.md accordingly.
No removed_in == 1.29.0 deprecations remain. The acp_env removal is sanctioned by both breakage gates (deprecated in the 1.28.1 baseline with removed_in reached).

Issue Number

N/A — routine release.

How to Test

From the rel-1.29.0 checkout:

uv run python .github/scripts/check_deprecations.py                                  # no overdue deadlines
uv run --with packaging python .github/scripts/check_sdk_api_breakage.py             # exit 0
uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py  # exit 0
uv run python .github/scripts/check_persisted_settings_compat.py                     # exit 0
uv run pytest tests/sdk tests/agent_server --ignore=tests/agent_server/stress -q     # green

Notes

Downstream consumers (OpenHands, agent-canvas, typescript-client) that still reference acp_env will be updated separately; ACP acp_env storage was never used in production.

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:579ad28-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-579ad28-python \
  ghcr.io/openhands/agent-server:579ad28-python

All tags pushed for this build

ghcr.io/openhands/agent-server:579ad28-golang-amd64
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-golang-amd64
ghcr.io/openhands/agent-server:rel-1.29.0-golang-amd64
ghcr.io/openhands/agent-server:579ad28-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:579ad28-golang-arm64
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-golang-arm64
ghcr.io/openhands/agent-server:rel-1.29.0-golang-arm64
ghcr.io/openhands/agent-server:579ad28-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:579ad28-java-amd64
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-java-amd64
ghcr.io/openhands/agent-server:rel-1.29.0-java-amd64
ghcr.io/openhands/agent-server:579ad28-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:579ad28-java-arm64
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-java-arm64
ghcr.io/openhands/agent-server:rel-1.29.0-java-arm64
ghcr.io/openhands/agent-server:579ad28-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:579ad28-python-amd64
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-python-amd64
ghcr.io/openhands/agent-server:rel-1.29.0-python-amd64
ghcr.io/openhands/agent-server:579ad28-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:579ad28-python-arm64
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-python-arm64
ghcr.io/openhands/agent-server:rel-1.29.0-python-arm64
ghcr.io/openhands/agent-server:579ad28-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:579ad28-golang
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-golang
ghcr.io/openhands/agent-server:rel-1.29.0-golang
ghcr.io/openhands/agent-server:579ad28-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:579ad28-java
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-java
ghcr.io/openhands/agent-server:rel-1.29.0-java
ghcr.io/openhands/agent-server:579ad28-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:579ad28-python
ghcr.io/openhands/agent-server:579ad28ee95f84fda5af1579a79cdd653b3b016d-python
ghcr.io/openhands/agent-server:rel-1.29.0-python
ghcr.io/openhands/agent-server:579ad28-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 579ad28-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 579ad28-python-amd64) are also available if needed
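
The tags listed above appear to follow a regular scheme: four tags per variant (short SHA, full SHA, branch name, and short SHA plus a tag-safe form of the base image), each with an optional architecture suffix. The sketch below reproduces that pattern. build_tags and base_image_slug are hypothetical helpers inferred from the listed tags, not the build workflow's actual code.

```python
REPO = "ghcr.io/openhands/agent-server"


def base_image_slug(base_image: str) -> str:
    """Make a base image reference tag-safe, matching the listed tags:
    '/' becomes '_s_' and ':' becomes '_tag_'."""
    return base_image.replace("/", "_s_").replace(":", "_tag_")


def build_tags(short_sha, full_sha, branch, variant, base_image, arch=None):
    """Return the four tags for one variant; pass arch for per-architecture tags."""
    suffix = f"-{arch}" if arch else ""
    return [
        f"{REPO}:{short_sha}-{variant}{suffix}",
        f"{REPO}:{full_sha}-{variant}{suffix}",
        f"{REPO}:{branch}-{variant}{suffix}",
        f"{REPO}:{short_sha}-{base_image_slug(base_image)}{suffix}",
    ]


for tag in build_tags(
    "579ad28",
    "579ad28ee95f84fda5af1579a79cdd653b3b016d",
    "rel-1.29.0",
    "golang",
    "golang:1.21-bookworm",
    arch="amd64",
):
    print(tag)
```

Calling build_tags without arch yields the multi-arch manifest tags, such as 579ad28-golang.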

github-actions · 2026-06-18T12:48:02Z

Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.

github-actions · 2026-06-18T12:48:05Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2026-06-18T12:48:08Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2026-06-18T12:48:31Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-06-18T12:48:31Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

all-hands-bot

⚠️ QA Report: PASS WITH ISSUES

Release-version behavior works in the installed packages, agent-server endpoint, workflow default, and built artifacts, but the PR is not release-ready because the required deprecation-deadline CI gate is failing for v1.29.0.

Does this PR achieve its stated goal?

Partially. The PR successfully moves the user-visible package/runtime version surfaces from 1.28.0 to 1.29.0: editable installs report 1.29.0, the SDK banner imports cleanly, /server_info reports all OpenHands component versions as 1.29.0, run-eval defaults to v1.29.0, and uv build --all-packages creates 1.29.0 artifacts. However, the PR description’s release checklist includes fixing deprecation deadlines, and CI currently fails that gate with multiple APIs whose removal target is 1.29.0, so this PR does not fully prepare a mergeable release yet.

Phase	Result
Environment Setup	✅ `make build` completed successfully and installed the editable 1.29.0 packages.
CI Status	⚠️ Refreshed status: 1 failing (`Deprecation deadlines/check`), 18 successful, 16 pending, 14 skipped; I did not rerun CI.
Functional Verification	✅ Versioned package/runtime behavior works; release readiness has one blocking CI issue.

Functional Verification

Test 1: Installed package metadata and SDK import banner

Step 1 — Establish baseline on origin/main:
Ran cd /tmp/qa-sdk-main && uv sync --dev && uv run python - <<'PY' ...:

openhands-sdk=1.28.0
openhands-tools=1.28.0
openhands-workspace=1.28.0
openhands-agent-server=1.28.0
OpenHands SDK v1.28.0
sdk_imports=ok Agent LLM Tool

This shows the currently released branch exposes 1.28.0 through installed distribution metadata and the SDK runtime banner.

Step 2 — Apply the PR's changes:
Used the PR checkout at commit a3acac5b5b3f889f976f1759ac0a6915d2351655 on rel-1.29.0 and ran the same import/metadata check after make build.

Step 3 — Re-run with the PR in place:

openhands-sdk=1.29.0
openhands-tools=1.29.0
openhands-workspace=1.29.0
openhands-agent-server=1.29.0
OpenHands SDK v1.29.0
sdk_imports=ok Agent LLM Tool

This confirms a real SDK user importing the package sees 1.29.0 consistently across all four distributions.

Test 2: Agent server `/server_info` runtime versions

Step 1 — Establish baseline on origin/main:
Ran uv run python -m openhands.agent_server --host 127.0.0.1 --port 18081 and queried curl http://127.0.0.1:18081/server_info:

{
  "version": "1.28.0",
  "sdk_version": "1.28.0",
  "tools_version": "1.28.0",
  "workspace_version": "1.28.0"
}

This confirms the old server runtime reports 1.28.0 to API clients.

Step 2 — Apply the PR's changes:
Started the PR checkout’s server on a separate local port.

Step 3 — Re-run with the PR in place:
Ran uv run python -m openhands.agent_server --host 127.0.0.1 --port 18082 and queried curl http://127.0.0.1:18082/server_info:

{
  "version": "1.29.0",
  "sdk_version": "1.29.0",
  "tools_version": "1.29.0",
  "workspace_version": "1.29.0"
}

This confirms a real API client sees the agent-server and component versions as 1.29.0.
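
The same check can be scripted. The sketch below is one possible way to do it, using only the standard library: mismatched_versions and fetch_server_info are hypothetical helper names, and the four keys are taken from the /server_info payloads shown above.

```python
import json
from urllib.request import urlopen


def mismatched_versions(info: dict, expected: str) -> dict:
    """Return the /server_info fields whose value differs from `expected`."""
    keys = ("version", "sdk_version", "tools_version", "workspace_version")
    return {k: info.get(k) for k in keys if info.get(k) != expected}


def fetch_server_info(base_url: str) -> dict:
    """GET <base_url>/server_info and decode the JSON body."""
    with urlopen(f"{base_url}/server_info", timeout=10) as resp:
        return json.load(resp)


# Payload copied from the PR-checkout run above.
sample = json.loads("""
{
  "version": "1.29.0",
  "sdk_version": "1.29.0",
  "tools_version": "1.29.0",
  "workspace_version": "1.29.0"
}
""")

print(mismatched_versions(sample, "1.29.0"))  # → {}
```

Against a running server, the call would be mismatched_versions(fetch_server_info("http://127.0.0.1:18082"), "1.29.0"); an empty dict means every component reports the expected version.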

Test 3: `run-eval` workflow dispatch default

Step 1 — Establish baseline on origin/main:
Ran a YAML parse/extract check for .github/workflows/run-eval.yml:

workflow_yaml_parse=ok
run_eval_sdk_ref_default=v1.28.0

This shows the previous dispatch default pointed at the old release tag.

Step 2 — Apply the PR's changes:
Repeated the same YAML parse/extract check on the PR checkout.

Step 3 — Re-run with the PR in place:

workflow_yaml_parse=ok
run_eval_sdk_ref_default=v1.29.0

This confirms the workflow remains parseable and now defaults evaluations to v1.29.0.

Test 4: Release artifact build

Step 1 — Establish baseline:
The metadata checks above establish the pre-PR package version as 1.28.0.

Step 2 — Apply the PR's changes:
Built the PR artifacts with uv build --all-packages.

Step 3 — Verify built artifacts:

Successfully built dist/openhands_agent_server-1.29.0.tar.gz
Successfully built dist/openhands_agent_server-1.29.0-py3-none-any.whl
Successfully built dist/openhands_sdk-1.29.0.tar.gz
Successfully built dist/openhands_sdk-1.29.0-py3-none-any.whl
Successfully built dist/openhands_tools-1.29.0.tar.gz
Successfully built dist/openhands_tools-1.29.0-py3-none-any.whl
Successfully built dist/openhands_workspace-1.29.0.tar.gz
Successfully built dist/openhands_workspace-1.29.0-py3-none-any.whl

This confirms the release packaging path produces 1.29.0 wheels and sdists for all four packages.

CI evidence for the release-readiness issue

Fetched the failed Deprecation deadlines/check log with gh run view 27760514092 --repo OpenHands/software-agent-sdk --log-failed:

The following deprecated features have passed their removal deadline:

- [openhands-sdk] 'ACPAgentSettings.acp_env' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

- [openhands-sdk] 'ACPAgent.acp_env' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

- [openhands-sdk] 'LLM.completion(_return_metrics=...)' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

- [openhands-sdk] 'LLM.acompletion(_return_metrics=...)' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

- [openhands-sdk] 'LLM.responses(_return_metrics=...)' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

- [openhands-sdk] 'LLM.aresponses(_return_metrics=...)' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

- [openhands-sdk] 'RouterLLM.completion(return_metrics=...)' (warn_call)
  deprecated in: 1.24.0
  removed in:    1.29.0

Update or remove the listed features before publishing a version that meets or exceeds their removal deadline.

Issues Found

🟠 Issue: The release metadata/runtime behavior verifies correctly, but Deprecation deadlines/check is failing because several APIs have removed in: 1.29.0. This blocks the PR from fully achieving “prepare the release for version 1.29.0” until those deadlines are fixed or retargeted.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

github-actions · 2026-06-18T12:54:57Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
api.py	276	26	90%	109, 111–116, 118, 120, 122, 157, 169, 184, 190, 243, 248, 257–259, 503, 506, 510–512, 514, 521
settings_router.py	125	8	93%	262, 264–265, 366, 368–369, 407, 412
openhands-agent-server/openhands/agent_server/persistence
models.py	199	26	86%	281, 286, 324, 372, 407–413, 415, 417, 420, 424, 450, 467–468, 513, 517, 519, 548–551, 554
openhands-sdk/openhands/sdk/agent
acp_agent.py	1233	102	91%	553, 686, 901–903, 946–947, 1249–1250, 1293, 1295, 1299, 1303, 1329, 1392–1393, 1398, 1465, 1756, 1759–1760, 1777–1778, 1814, 1819, 1927, 1932, 2506–2509, 2513–2515, 2518–2522, 2524, 2772, 2786–2787, 2790–2792, 2800, 2804, 2808–2809, 2815–2816, 2828–2829, 2832, 2880, 2884–2886, 2890–2891, 2923, 3007, 3194–3196, 3199–3200, 3240, 3386, 3394–3396, 3434–3435, 3438, 3446–3448, 3450, 3452, 3456, 3459, 3468–3470, 3472, 3508–3509, 3527–3530, 3533, 3537–3539, 3541, 3545–3546, 3776–3777
openhands-sdk/openhands/sdk/context
agent_context.py	162	6	96%	337, 409–410, 478, 501, 507
openhands-sdk/openhands/sdk/llm
llm.py	943	116	87%	555, 571, 610–611, 616, 702, 718, 885, 919–920, 923–927, 929, 937–939, 943, 960–961, 965, 967–968, 970–972, 1102, 1225, 1408–1410, 1510, 1551, 1563–1565, 1568–1571, 1577, 1638, 1686, 1699–1701, 1704–1707, 1713, 1810, 1812, 1814, 1840, 1842, 1851–1852, 1902, 1965–1970, 2040, 2182–2183, 2524–2525, 2534, 2552, 2579–2580, 2582, 2584, 2586, 2594, 2597, 2599, 2601, 2612–2613, 2621, 2624, 2627–2628, 2639–2641, 2645, 2649–2650, 2655, 2665, 2670, 2734, 2736, 2738–2741, 2743–2746, 2751–2754, 2769, 2780, 2837, 2839
openhands-sdk/openhands/sdk/llm/router
base.py	42	7	83%	45, 74–75, 77, 80, 112, 119
openhands-sdk/openhands/sdk/settings
model.py	691	48	93%	101, 400, 418, 598, 608–611, 614, 627, 631, 637, 647, 653, 658, 881, 906, 908, 910, 912, 914, 916, 918, 920, 922, 1187, 1189, 1533, 1553, 1714, 1843, 1882, 1908, 2044–2046, 2048, 2102, 2134, 2144, 2146, 2151, 2169, 2182, 2184, 2186, 2188, 2195
openhands-sdk/openhands/sdk/testing
test_llm.py	67	4	94%	180, 190, 250, 324
openhands-sdk/openhands/sdk/utils
redact.py	88	14	84%	87, 226–227, 250–256, 272–275
TOTAL	33437	6814	79%

github-actions · 2026-06-18T12:59:49Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2026-06-18 13:21:52 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	24.1s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	23.0s	$0.03
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	8.8s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	35.1s	$0.03
01_standalone_sdk/09_pause_example.py	✅ PASS	10.9s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	23.9s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	31.9s	$0.03
01_standalone_sdk/12_custom_secrets.py	✅ PASS	13.3s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	36.9s	$0.05
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 17s	$0.16
01_standalone_sdk/17_image_input.py	✅ PASS	19.9s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	25.8s	$0.02
01_standalone_sdk/19_llm_routing.py	✅ PASS	16.3s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	12.7s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	9.8s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	13.0s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	1m 6s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	4m 51s	$0.34
01_standalone_sdk/25_agent_delegation.py	✅ PASS	1m 3s	$0.07
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	17.8s	$0.03
01_standalone_sdk/28_ask_agent_example.py	✅ PASS	36.7s	$0.04
01_standalone_sdk/29_llm_streaming.py	✅ PASS	38.7s	$0.02
01_standalone_sdk/30_tom_agent.py	✅ PASS	8.9s	$0.01
01_standalone_sdk/31_iterative_refinement.py	✅ PASS	4m 57s	$0.35
01_standalone_sdk/32_configurable_security_policy.py	✅ PASS	24.9s	$0.02
01_standalone_sdk/33_hooks/main.py	✅ PASS	39.0s	$0.04
01_standalone_sdk/34_critic_example.py	✅ PASS	9m 16s	$0.74
01_standalone_sdk/36_event_json_to_openai_messages.py	✅ PASS	13.0s	$0.01
01_standalone_sdk/37_llm_profile_store/main.py	✅ PASS	9.5s	$0.00
01_standalone_sdk/38_browser_session_recording.py	✅ PASS	42.3s	$0.03
01_standalone_sdk/39_llm_fallback.py	✅ PASS	9.7s	$0.01
01_standalone_sdk/40_acp_agent_example.py	✅ PASS	34.5s	$0.31
01_standalone_sdk/41_task_tool_set.py	✅ PASS	32.4s	$0.03
01_standalone_sdk/42_file_based_subagents.py	✅ PASS	52.3s	$0.05
01_standalone_sdk/43_mixed_marketplace_skills/main.py	✅ PASS	3.5s	$0.00
01_standalone_sdk/44_model_switching_in_convo.py	✅ PASS	9.0s	$0.01
01_standalone_sdk/45_parallel_tool_execution.py	✅ PASS	8m 4s	$0.62
01_standalone_sdk/46_agent_settings.py	✅ PASS	9.7s	$0.00
01_standalone_sdk/47_defense_in_depth_security.py	✅ PASS	3.3s	$0.00
01_standalone_sdk/48_conversation_fork.py	✅ PASS	13.9s	$0.00
01_standalone_sdk/49_switch_llm_tool.py	✅ PASS	8.4s	$0.03
01_standalone_sdk/50_async_cancellation.py	✅ PASS	12.3s	$0.00
01_standalone_sdk/51_agent_hooks/main.py	✅ PASS	44.2s	$0.06
01_standalone_sdk/52_dynamic_workflow.py	✅ PASS	4m 15s	$0.15
01_standalone_sdk/53_client_defined_tools.py	✅ PASS	11.1s	$0.01
01_standalone_sdk/54_goal_completion_loop.py	✅ PASS	27.4s	$0.03
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	37.8s	$0.02
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 47s	$0.05
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	1m 25s	$0.10
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	1m 21s	$0.04
02_remote_agent_server/06_custom_tool/main.py	✅ PASS	5m 42s	$0.06
02_remote_agent_server/07_convo_with_cloud_workspace.py	✅ PASS	38.4s	$0.03
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py	✅ PASS	4m 10s	$0.03
02_remote_agent_server/09_acp_agent_with_remote_runtime.py	✅ PASS	1m 2s	$0.32
02_remote_agent_server/10_cloud_workspace_share_credentials.py	✅ PASS	40.3s	$0.06
02_remote_agent_server/11_conversation_fork.py	✅ PASS	51.3s	$0.00
02_remote_agent_server/12_settings_and_secrets_api.py	✅ PASS	2m 51s	$0.04
02_remote_agent_server/13_workspace_get_llm.py	✅ PASS	50.1s	$0.03
02_remote_agent_server/14_client_defined_tools.py	✅ PASS	1m 4s	$0.04
02_remote_agent_server/15_openai_compatible_gateway.py	✅ PASS	17.7s	$0.01
02_remote_agent_server/16_deferred_init.py	❌ FAIL Exit code 1	3m 9s	--
04_llm_specific_tools/01_gpt5_apply_patch_preset.py	✅ PASS	28.8s	$0.02
04_llm_specific_tools/02_gemini_file_tools.py	✅ PASS	1m 51s	$0.06
05_skills_and_plugins/01_loading_agentskills/main.py	✅ PASS	12.7s	$0.02
05_skills_and_plugins/02_loading_plugins/main.py	✅ PASS	14.9s	$0.02

❌ Some tests failed

Total: 65 | Passed: 64 | Failed: 1 | Total Cost: $4.45

Failed examples:

examples/02_remote_agent_server/16_deferred_init.py: Exit code 1

View full workflow run

github-actions · 2026-06-18T12:59:57Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2026-06-18 13:23:08 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	34.6s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	21.4s	$0.02
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	9.7s	$0.00
01_standalone_sdk/07_mcp_integration.py	✅ PASS	33.5s	$0.02
01_standalone_sdk/09_pause_example.py	✅ PASS	11.2s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	26.2s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	31.7s	$0.03
01_standalone_sdk/12_custom_secrets.py	✅ PASS	14.3s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	30.5s	$0.04
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 8s	$0.13
01_standalone_sdk/17_image_input.py	✅ PASS	26.5s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	18.4s	$0.01
01_standalone_sdk/19_llm_routing.py	✅ PASS	21.6s	$0.01
01_standalone_sdk/20_stuck_detector.py	✅ PASS	13.5s	$0.01
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	10.7s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	14.8s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	38.9s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	6m 14s	$0.47
01_standalone_sdk/25_agent_delegation.py	✅ PASS	1m 13s	$0.06
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	20.0s	$0.03
01_standalone_sdk/28_ask_agent_example.py	✅ PASS	41.6s	$0.04
01_standalone_sdk/29_llm_streaming.py	✅ PASS	43.3s	$0.03
01_standalone_sdk/30_tom_agent.py	✅ PASS	11.1s	$0.00
01_standalone_sdk/31_iterative_refinement.py	❌ FAIL Timed out after 600 seconds	10m 0s	--
01_standalone_sdk/32_configurable_security_policy.py	✅ PASS	20.6s	$0.01
01_standalone_sdk/33_hooks/main.py	✅ PASS	40.7s	$0.04
01_standalone_sdk/34_critic_example.py	✅ PASS	3m 29s	$0.18
01_standalone_sdk/36_event_json_to_openai_messages.py	✅ PASS	10.7s	$0.00
01_standalone_sdk/37_llm_profile_store/main.py	✅ PASS	5.8s	$0.00
01_standalone_sdk/38_browser_session_recording.py	✅ PASS	40.8s	$0.03
01_standalone_sdk/39_llm_fallback.py	✅ PASS	14.3s	$0.01
01_standalone_sdk/40_acp_agent_example.py	✅ PASS	32.6s	$0.31
01_standalone_sdk/41_task_tool_set.py	✅ PASS	32.0s	$0.02
01_standalone_sdk/42_file_based_subagents.py	✅ PASS	36.3s	$0.04
01_standalone_sdk/43_mixed_marketplace_skills/main.py	✅ PASS	7.4s	$0.00
01_standalone_sdk/44_model_switching_in_convo.py	✅ PASS	7.8s	$0.01
01_standalone_sdk/45_parallel_tool_execution.py	✅ PASS	7m 39s	$0.65
01_standalone_sdk/46_agent_settings.py	✅ PASS	10.1s	$0.01
01_standalone_sdk/47_defense_in_depth_security.py	✅ PASS	3.3s	$0.00
01_standalone_sdk/48_conversation_fork.py	✅ PASS	14.1s	$0.00
01_standalone_sdk/49_switch_llm_tool.py	✅ PASS	7.2s	$0.03
01_standalone_sdk/50_async_cancellation.py	✅ PASS	12.7s	$0.00
01_standalone_sdk/51_agent_hooks/main.py	✅ PASS	54.7s	$0.06
01_standalone_sdk/52_dynamic_workflow.py	✅ PASS	6m 57s	$0.22
01_standalone_sdk/53_client_defined_tools.py	✅ PASS	13.6s	$0.01
01_standalone_sdk/54_goal_completion_loop.py	✅ PASS	54.0s	$0.04
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	38.5s	$0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 44s	$0.04
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	1m 1s	--
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	1m 45s	$0.03
02_remote_agent_server/06_custom_tool/main.py	✅ PASS	5m 25s	$0.04
02_remote_agent_server/07_convo_with_cloud_workspace.py	✅ PASS	53.7s	$0.03
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py	✅ PASS	3m 55s	$0.02
02_remote_agent_server/09_acp_agent_with_remote_runtime.py	✅ PASS	1m 10s	$0.18
02_remote_agent_server/10_cloud_workspace_share_credentials.py	✅ PASS	36.4s	$0.03
02_remote_agent_server/11_conversation_fork.py	✅ PASS	56.9s	$0.00
02_remote_agent_server/12_settings_and_secrets_api.py	✅ PASS	2m 23s	$0.02
02_remote_agent_server/13_workspace_get_llm.py	✅ PASS	41.0s	$0.01
02_remote_agent_server/14_client_defined_tools.py	✅ PASS	42.9s	$0.02
02_remote_agent_server/15_openai_compatible_gateway.py	✅ PASS	13.9s	$0.00
02_remote_agent_server/16_deferred_init.py	❌ FAIL Exit code 1	2m 30s	--
04_llm_specific_tools/01_gpt5_apply_patch_preset.py	✅ PASS	37.0s	$0.03
04_llm_specific_tools/02_gemini_file_tools.py	✅ PASS	1m 43s	$0.07
05_skills_and_plugins/01_loading_agentskills/main.py	✅ PASS	16.2s	$0.01
05_skills_and_plugins/02_loading_plugins/main.py	✅ PASS	15.3s	$0.02

❌ Some tests failed

Total: 65 | Passed: 63 | Failed: 2 | Total Cost: $3.25

Failed examples:

examples/01_standalone_sdk/31_iterative_refinement.py: Timed out after 600 seconds
examples/02_remote_agent_server/16_deferred_init.py: Exit code 1

View full workflow run

Co-authored-by: openhands <openhands@all-hands.dev>

Both features reached their scheduled removal version in 1.29.0:

- Drop the no-op _return_metrics/return_metrics parameter from LLM.{completion,acompletion,responses,aresponses}, RouterLLM.completion, and the TestLLM doubles. Metrics are always on LLMResponse.metrics.
- Remove the acp_env field end-to-end: ACPAgentSettings and ACPAgent (field/validators/serializers), ACPAgentSettings.resolve_acp_env(), the ACP spawn-time env-injection/precedence logic, and the REDACT_ALL_VALUES_KEYS entry. Arbitrary ACP subprocess env vars now ride the conversation secrets channel.

Updates tests, the v1 ACP persisted-settings golden fixture, the persisted-settings-compat generator, and docstrings/AGENTS.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

enyst · 2026-06-18T13:49:09Z

@enyst @simonrosenberg @VascoSch92 — manual REST API contract summary for the PRs discussed.

This is the same kind of concise public /api/** OpenAPI contract diff that PR #3789 is adding to PR descriptions. Issue #3790 tracks redesigning that automation as a safe two-stage workflow so fork PR descriptions can be updated too.

This comment was created by an AI agent (OpenHands) on behalf of the user.

Status at a glance

#3784 ([AgentProfile][agent-server] agent_profile_id at conversation start + LaunchedProfile provenance) is merged and adds agent_profile_id on conversation start plus launched-profile provenance on ConversationInfo.
#3788 (feat: API design: launched agent profile provenance names) is merged and renames that provenance API to the more explicit agent-profile wording (launched_agent_profile, LaunchedAgentProfile, agent_profile_id).
#3770 (feat(agent-server): add /goal agent-server endpoint, background loop, and stop/resume) is still open and adds the /goal conversation endpoints. It is not part of the release branch unless it merges and the release branch is updated.
#3787 (Release v1.29.0) currently appears stale relative to latest main: its PR contract diff would remove the #3788 API names from the release branch and restore the older #3784 names. It also drops ACPAgent.acp_env from the public schema. If #3788 should be in this release, the release branch likely needs to be regenerated or rebased before cutting.

#3784 — agent profile at conversation start

--- PR #3784 base public OpenAPI
+++ PR #3784 head public OpenAPI
@@ -978,0 +979 @@
+schema ConversationInfo property launched_profile optional schema=anyOf=[LaunchedProfile,type="null"]
@@ -1469,0 +1471,3 @@
+schema LaunchedProfile property profile_id required schema=type="string" format="uuid"
+schema LaunchedProfile property revision required schema=type="integer" minimum=0.0
+schema LaunchedProfile type="object"
@@ -1863,0 +1868 @@
+schema StartConversationRequest property agent_profile_id optional schema=anyOf=[type="string" format="uuid",type="null"]

#3788 — clarify launched agent profile provenance names

--- PR #3788 base public OpenAPI
+++ PR #3788 head public OpenAPI
@@ -979 +979 @@
-schema ConversationInfo property launched_profile optional schema=anyOf=[LaunchedProfile,type="null"]
+schema ConversationInfo property launched_agent_profile optional schema=anyOf=[LaunchedAgentProfile,type="null"]
@@ -1471,3 +1471,3 @@
-schema LaunchedProfile property profile_id required schema=type="string" format="uuid"
-schema LaunchedProfile property revision required schema=type="integer" minimum=0.0
-schema LaunchedProfile type="object"
+schema LaunchedAgentProfile property agent_profile_id required schema=type="string" format="uuid"
+schema LaunchedAgentProfile property revision required schema=type="integer" minimum=0.0
+schema LaunchedAgentProfile type="object"
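
To make the rename concrete, the sketch below reads the profile id from a ConversationInfo payload under either naming. It is a hypothetical client-side helper (launched_agent_profile_id is not part of the SDK) and assumes the payload is already decoded into a dict; the field names are the ones in the two diffs above.

```python
from typing import Optional


def launched_agent_profile_id(conversation_info: dict) -> Optional[str]:
    """Return the launched profile id from either the #3788 or the #3784 shape."""
    # Names introduced by #3788.
    new = conversation_info.get("launched_agent_profile")
    if new is not None:
        return new.get("agent_profile_id")
    # Older names introduced by #3784.
    old = conversation_info.get("launched_profile")
    if old is not None:
        return old.get("profile_id")
    # The property is optional and nullable in both versions of the schema.
    return None


example_id = "123e4567-e89b-12d3-a456-426614174000"  # placeholder UUID
print(launched_agent_profile_id(
    {"launched_agent_profile": {"agent_profile_id": example_id, "revision": 0}}
))
print(launched_agent_profile_id(
    {"launched_profile": {"profile_id": example_id, "revision": 0}}
))
```

Both calls print the same placeholder id, which is the point: only the property and schema names changed between the two PRs, not the shape of the data.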

#3770 — goal conversation endpoints

--- PR #3770 base public OpenAPI
+++ PR #3770 head public OpenAPI
@@ -69,0 +70,3 @@
+operation POST /api/conversations/{conversation_id}/goal operationId=start_goal_conversation_api_conversations__conversation_id__goal_post
+operation POST /api/conversations/{conversation_id}/goal/resume operationId=resume_goal_conversation_api_conversations__conversation_id__goal_resume_post
+operation POST /api/conversations/{conversation_id}/goal/stop operationId=stop_goal_conversation_api_conversations__conversation_id__goal_stop_post
@@ -173,0 +177,3 @@
+parameter POST /api/conversations/{conversation_id}/goal path:conversation_id required=true schema=type="string" format="uuid"
+parameter POST /api/conversations/{conversation_id}/goal/resume path:conversation_id required=true schema=type="string" format="uuid"
+parameter POST /api/conversations/{conversation_id}/goal/stop path:conversation_id required=true schema=type="string" format="uuid"
@@ -201,0 +208 @@
+requestBody POST /api/conversations/{conversation_id}/goal application/json required=true schema=StartGoalRequest
@@ -356,0 +364,11 @@
+response POST /api/conversations/{conversation_id}/goal 200 application/json schema=Success
+response POST /api/conversations/{conversation_id}/goal 404 no-content
+response POST /api/conversations/{conversation_id}/goal 409 no-content
+response POST /api/conversations/{conversation_id}/goal 422 application/json schema=HTTPValidationError
+response POST /api/conversations/{conversation_id}/goal/resume 200 application/json schema=Success
+response POST /api/conversations/{conversation_id}/goal/resume 404 no-content
+response POST /api/conversations/{conversation_id}/goal/resume 409 no-content
+response POST /api/conversations/{conversation_id}/goal/resume 422 application/json schema=HTTPValidationError
+response POST /api/conversations/{conversation_id}/goal/stop 200 application/json schema=Success
+response POST /api/conversations/{conversation_id}/goal/stop 404 no-content
+response POST /api/conversations/{conversation_id}/goal/stop 422 application/json schema=HTTPValidationError
@@ -1890,0 +1909,3 @@
+schema StartGoalRequest property max_iterations optional schema=type="integer" default=10 minimum=1.0
+schema StartGoalRequest property objective required schema=type="string"
+schema StartGoalRequest type="object"
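
Based only on the contract lines above, a client call to the proposed start endpoint could be assembled roughly as follows. build_start_goal_request and base_url are hypothetical, authentication is omitted, and because #3770 is still open the route and schema may change before merge.

```python
import json
from urllib.request import Request


def build_start_goal_request(
    base_url: str,
    conversation_id: str,
    objective: str,
    max_iterations: int = 10,  # default taken from the StartGoalRequest schema
) -> Request:
    """Build (but do not send) a POST to /api/conversations/{id}/goal."""
    if not isinstance(objective, str):
        raise TypeError("objective must be a string")
    if max_iterations < 1:  # schema minimum is 1
        raise ValueError("max_iterations must be >= 1")
    body = json.dumps(
        {"objective": objective, "max_iterations": max_iterations}
    ).encode("utf-8")
    return Request(
        f"{base_url}/api/conversations/{conversation_id}/goal",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_start_goal_request(
    "http://127.0.0.1:8000",
    "00000000-0000-0000-0000-000000000000",  # placeholder conversation UUID
    "Make the test suite pass",
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. Per the listed responses, 404, 409, and 422 would surface there as urllib.error.HTTPError.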

#3787 current release PR diff vs latest `main`

--- PR #3787 base public OpenAPI
+++ PR #3787 head public OpenAPI
@@ -421 +420,0 @@
-schema ACPAgent-Input property acp_env optional schema=type="object" additionalProperties=type="string"
@@ -446 +444,0 @@
-schema ACPAgent-Output property acp_env optional schema=type="object" additionalProperties=type="string"
@@ -979 +977 @@
-schema ConversationInfo property launched_agent_profile optional schema=anyOf=[LaunchedAgentProfile,type="null"]
+schema ConversationInfo property launched_profile optional schema=anyOf=[LaunchedProfile,type="null"]
@@ -1471,3 +1469,3 @@
-schema LaunchedAgentProfile property agent_profile_id required schema=type="string" format="uuid"
-schema LaunchedAgentProfile property revision required schema=type="integer" minimum=0.0
-schema LaunchedAgentProfile type="object"
+schema LaunchedProfile property profile_id required schema=type="string" format="uuid"
+schema LaunchedProfile property revision required schema=type="integer" minimum=0.0
+schema LaunchedProfile type="object"

enyst · 2026-06-18T14:03:10Z

Can we also pick d7392de? Please note, though, that my agent is looking into something failing on GitHub Actions.

github-actions · 2026-06-18T14:51:34Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2026-06-18T14:57:23Z

🧪 Integration Tests Results

Overall Success Rate: 97.1%
Total Cost: $1.52
Models Tested: 4
Timestamp: 2026-06-18 14:57:14 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_minimax_MiniMax_M2.7: 📥 View & Download Logs
litellm_proxy_openai_gpt_5.5: 📥 View & Download Logs
litellm_proxy_gemini_3.1_pro_preview: 📥 View & Download Logs
litellm_proxy_deepseek_deepseek_v4_flash: 📥 View & Download Logs

📊 Summary

Model	Overall	Tests Passed	Skipped	Total	Cost	Tokens
litellm_proxy_minimax_MiniMax_M2.7	100.0%	8/8	1	9	$0.00	304,554
litellm_proxy_openai_gpt_5.5	100.0%	9/9	0	9	$0.94	263,057
litellm_proxy_gemini_3.1_pro_preview	88.9%	8/9	0	9	$0.56	325,638
litellm_proxy_deepseek_deepseek_v4_flash	100.0%	8/8	1	9	$0.02	415,253
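
The headline figures can be cross-checked against the per-model rows. The sketch below assumes the overall rate is passed tests divided by executed (non-skipped) tests summed across models; that aggregation rule is an assumption, but it reproduces the reported 97.1% and $1.52.

```python
# (model, tests passed, skipped, total, cost in USD), copied from the summary table.
rows = [
    ("litellm_proxy_minimax_MiniMax_M2.7", 8, 1, 9, 0.00),
    ("litellm_proxy_openai_gpt_5.5", 9, 0, 9, 0.94),
    ("litellm_proxy_gemini_3.1_pro_preview", 8, 0, 9, 0.56),
    ("litellm_proxy_deepseek_deepseek_v4_flash", 8, 1, 9, 0.02),
]


def overall(rows):
    """Return (success rate in percent, total cost), pooled over all models."""
    passed = sum(r[1] for r in rows)           # 33
    executed = sum(r[3] - r[2] for r in rows)  # 34, skipped tests are excluded
    cost = sum(r[4] for r in rows)
    return round(100 * passed / executed, 1), round(cost, 2)


rate, cost = overall(rows)
print(f"{rate}% ${cost:.2f}")  # prints "97.1% $1.52"
```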

📋 Detailed Results

litellm_proxy_minimax_MiniMax_M2.7

Success Rate: 100.0% (8/8)
Total Cost: $0.00
Token Usage: prompt: 300,618, completion: 3,936, cache_read: 222,479, reasoning: 817
Run Suffix: litellm_proxy_minimax_MiniMax_M2.7_884c996_minimax_m2_7_run_N9_20260618_145313
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_openai_gpt_5.5

Success Rate: 100.0% (9/9)
Total Cost: $0.94
Token Usage: prompt: 258,788, completion: 4,269, cache_read: 107,520, reasoning: 1,313
Run Suffix: litellm_proxy_openai_gpt_5.5_884c996_gpt_5_5_run_N9_20260618_145408

litellm_proxy_gemini_3.1_pro_preview

Success Rate: 88.9% (8/9)
Total Cost: $0.56
Token Usage: prompt: 321,455, completion: 4,183, cache_read: 73,476, reasoning: 2,580
Run Suffix: litellm_proxy_gemini_3.1_pro_preview_884c996_gemini_3_1_pro_run_N9_20260618_145331

Failed Tests:

t08_image_file_viewing: Agent did not identify yellow color in the logo. Response: . (Cost: $0.03)

litellm_proxy_deepseek_deepseek_v4_flash

Success Rate: 100.0% (8/8)
Total Cost: $0.02
Token Usage: prompt: 410,303, completion: 4,950, cache_read: 228,224, reasoning: 1,631
Run Suffix: litellm_proxy_deepseek_deepseek_v4_flash_884c996_deepseek_v4_flash_run_N9_20260618_145322
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

github-actions · 2026-06-18T14:59:32Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2026-06-18 15:19:40 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	24.3s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	20.7s	$0.03
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	9.5s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	42.2s	$0.03
01_standalone_sdk/09_pause_example.py	✅ PASS	14.1s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	24.9s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	33.9s	$0.04
01_standalone_sdk/12_custom_secrets.py	✅ PASS	12.3s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	24.9s	$0.03
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 15s	$0.15
01_standalone_sdk/17_image_input.py	✅ PASS	22.4s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	29.5s	$0.02
01_standalone_sdk/19_llm_routing.py	✅ PASS	18.2s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	15.3s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	9.6s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	20.8s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	1m 29s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	3m 18s	$0.25
01_standalone_sdk/25_agent_delegation.py	✅ PASS	1m 12s	$0.08
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	17.3s	$0.02
01_standalone_sdk/28_ask_agent_example.py	✅ PASS	40.8s	$0.04
01_standalone_sdk/29_llm_streaming.py	✅ PASS	47.4s	$0.03
01_standalone_sdk/30_tom_agent.py	✅ PASS	9.0s	$0.01
01_standalone_sdk/31_iterative_refinement.py	✅ PASS	5m 39s	$0.40
01_standalone_sdk/32_configurable_security_policy.py	✅ PASS	17.5s	$0.02
01_standalone_sdk/33_hooks/main.py	✅ PASS	39.2s	$0.04
01_standalone_sdk/34_critic_example.py	✅ PASS	8m 11s	$0.87
01_standalone_sdk/36_event_json_to_openai_messages.py	✅ PASS	10.7s	$0.00
01_standalone_sdk/37_llm_profile_store/main.py	✅ PASS	5.7s	$0.00
01_standalone_sdk/38_browser_session_recording.py	✅ PASS	37.7s	$0.03
01_standalone_sdk/39_llm_fallback.py	✅ PASS	10.1s	$0.01
01_standalone_sdk/40_acp_agent_example.py	✅ PASS	38.4s	$0.30
01_standalone_sdk/41_task_tool_set.py	✅ PASS	27.6s	$0.03
01_standalone_sdk/42_file_based_subagents.py	✅ PASS	32.3s	$0.04
01_standalone_sdk/43_mixed_marketplace_skills/main.py	✅ PASS	3.7s	$0.00
01_standalone_sdk/44_model_switching_in_convo.py	✅ PASS	8.9s	$0.01
01_standalone_sdk/45_parallel_tool_execution.py	✅ PASS	4m 5s	$0.30
01_standalone_sdk/46_agent_settings.py	✅ PASS	10.7s	$0.01
01_standalone_sdk/47_defense_in_depth_security.py	✅ PASS	3.4s	$0.00
01_standalone_sdk/48_conversation_fork.py	✅ PASS	11.9s	$0.00
01_standalone_sdk/49_switch_llm_tool.py	✅ PASS	7.9s	$0.03
01_standalone_sdk/50_async_cancellation.py	✅ PASS	12.8s	$0.00
01_standalone_sdk/51_agent_hooks/main.py	✅ PASS	41.0s	$0.05
01_standalone_sdk/52_dynamic_workflow.py	✅ PASS	6m 38s	$0.17
01_standalone_sdk/53_client_defined_tools.py	✅ PASS	13.5s	$0.01
01_standalone_sdk/54_goal_completion_loop.py	✅ PASS	31.3s	$0.03
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	36.5s	$0.02
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 39s	$0.04
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	56.4s	$0.05
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	1m 40s	$0.03
02_remote_agent_server/06_custom_tool/main.py	✅ PASS	5m 4s	$0.04
02_remote_agent_server/07_convo_with_cloud_workspace.py	✅ PASS	36.1s	$0.03
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py	✅ PASS	4m 14s	$0.03
02_remote_agent_server/09_acp_agent_with_remote_runtime.py	✅ PASS	1m 6s	$0.35
02_remote_agent_server/10_cloud_workspace_share_credentials.py	✅ PASS	45.2s	$0.06
02_remote_agent_server/11_conversation_fork.py	✅ PASS	51.3s	$0.00
02_remote_agent_server/12_settings_and_secrets_api.py	✅ PASS	2m 50s	$0.04
02_remote_agent_server/13_workspace_get_llm.py	✅ PASS	34.8s	$0.02
02_remote_agent_server/14_client_defined_tools.py	✅ PASS	1m 7s	$0.04
02_remote_agent_server/15_openai_compatible_gateway.py	✅ PASS	17.9s	$0.01
02_remote_agent_server/16_deferred_init.py	❌ FAIL Exit code 1	2m 27s	--
04_llm_specific_tools/01_gpt5_apply_patch_preset.py	✅ PASS	21.3s	$0.02
04_llm_specific_tools/02_gemini_file_tools.py	✅ PASS	1m 6s	$0.09
05_skills_and_plugins/01_loading_agentskills/main.py	✅ PASS	14.0s	$0.02
05_skills_and_plugins/02_loading_plugins/main.py	✅ PASS	17.4s	$0.02

❌ Some tests failed

Total: 65 | Passed: 64 | Failed: 1 | Total Cost: $4.19

Failed examples:

examples/02_remote_agent_server/16_deferred_init.py: Exit code 1

enyst · 2026-06-18T15:23:07Z

@OpenHands look at the last comment on running test examples workflow and how it works. run the failed example yourself now, same llm, you can use eval proxy and eval key though. does it work now? on latest version of the branch. then /codereview this release pr. reply on github.

openhands-development · 2026-06-18T15:23:12Z

@enyst it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.

openhands-ai · 2026-06-18T15:23:22Z

I'm on it! enyst can track my progress at all-hands.dev

enyst

@enyst I checked the latest Run Examples comment and reproduced the failed example locally on the latest branch head (b86ba42cab5150c0a979f35b0d63486dc0f3da19).

Local reproduction command (same workflow model, eval proxy/key):

env -u TMUX_TMPDIR -u TMUX -u SESSION_API_KEY -u OH_SESSION_API_KEYS_0 \
  LLM_API_KEY="$LITELLM_API_KEY" \
  LLM_MODEL="openhands/claude-haiku-4-5-20251001" \
  LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  ANTHROPIC_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  ANTHROPIC_API_KEY="$LITELLM_API_KEY" \
  OPENAI_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  OPENAI_API_KEY="$LITELLM_API_KEY" \
  uv run pytest tests/examples/test_examples.py \
    --run-examples \
    --examples-results-dir .example-test-results-local \
    -k '16_deferred_init' -s

Result: still fails. The deferred-init flow itself reaches the ready state and runs the conversation, but the example's polling loop times out because the server reports execution_status: "finished", a value the example never handles: the loop only breaks on ("stopped", "paused", "error") and the final assertion only accepts ("stopped", "paused"):

AssertionError: Unexpected final status: finished

That matches the workflow log for Run Examples Scripts run 27768040721, which also failed with Unexpected final status: finished.
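
To illustrate the failure mode, here is a minimal polling sketch that treats "finished" as terminal alongside the values the example already handled. It is not the fix that landed in #3793; the function name, the `get_status` callable, and the exact terminal set are assumptions for illustration.

```python
import time
from typing import Callable

# Assumed terminal values: the three the example already handled, plus
# "finished", which the server reported in the failing run.
TERMINAL_STATUSES = frozenset({"finished", "stopped", "paused", "error"})


def wait_for_terminal_status(
    get_status: Callable[[], str],
    timeout_s: float = 120.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal value or the timeout hits."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Keeping the terminal set in one constant means the loop's exit condition and any later assertion can share it, which avoids the mismatch described above where the loop and the assertion accepted different values.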

/codereview

🟡 Taste Rating: Acceptable - The release cleanup is mostly straightforward and removes deadline-expired compatibility code, but the PR is not release-ready because required release validation is failing/stale.

[CRITICAL ISSUES]

None found in the release cleanup itself during this pass.

[TESTING GAPS]

[PR validation] Release workflow validation is not green/current: per this repo's release review policy, release PRs need current PR-specific validation for Run tests, Run Examples Scripts, and Run Integration Tests before approval. Current state:
- Run tests: ✅ current on b86ba42cab5150c0a979f35b0d63486dc0f3da19.
- Run Examples Scripts: ❌ latest PR comment/run fails examples/02_remote_agent_server/16_deferred_init.py; I reproduced the same failure locally on current HEAD with the requested model/proxy.
- Run Integration Tests: ambiguous/stale for current HEAD. Latest visible result comment is for run 27768049468 on older head 884c996f9ad515c66a7c9b9440ea9cf0a17938e6 and reports 97.1% with a Gemini t08_image_file_viewing failure, even though the GitHub workflow conclusion is success. I do not see a current green integration result for b86ba42 in the PR comments.
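
The three states in this list (current and green, failing, stale) can be made explicit with a small check. The sketch below is hypothetical: the `WorkflowResult` type and `validation_status` function are illustrative names, and it assumes each result records the head SHA it ran against.

```python
from dataclasses import dataclass

REQUIRED_WORKFLOWS = ("Run tests", "Run Examples Scripts", "Run Integration Tests")


@dataclass(frozen=True)
class WorkflowResult:
    workflow: str
    head_sha: str  # commit the workflow ran against
    passed: bool


def validation_status(results: list[WorkflowResult], head_sha: str) -> dict[str, str]:
    """Classify each required workflow as 'ok', 'failed', or 'stale' for head_sha."""
    status = {}
    for name in REQUIRED_WORKFLOWS:
        current = [r for r in results if r.workflow == name and r.head_sha == head_sha]
        if not current:
            status[name] = "stale"  # no result for this head at all
        elif all(r.passed for r in current):
            status[name] = "ok"
        else:
            status[name] = "failed"
    return status
```

Under this rule a green result on an older head counts as "stale" rather than "ok", which is the distinction the review draws for the integration tests.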

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
This is an intentional breaking release PR that removes deprecated public API surface (_return_metrics/return_metrics, acp_env). The code direction is reasonable for a release deadline cleanup, and the breakage gates are reported green, but failed/stale release workflow evidence makes auto-approval unsafe.

VERDICT:
❌ Needs rework / human maintainer decision before merge: fix or intentionally waive the failing 16_deferred_init.py example, then rerun Run Examples Scripts; also provide a current unambiguous Run Integration Tests result for the latest head.

KEY INSIGHT:
The blocking issue is not the release cleanup mechanics; it is that the release validation evidence does not currently prove the latest branch is shippable.

This PR review was created by an AI agent (OpenHands) on behalf of the requester.

enyst · 2026-06-18T16:03:21Z

@enyst confirmed — I cherry-picked the squashed main commit for #3793 onto rel-1.29.0, and nothing else.

Details:

Source commit from main: 9e290793addcf7397bce315b8c85803a8b70f804
New release-branch commit: 579ad28ee95f84fda5af1579a79cdd653b3b016d
Only changed file:
- examples/02_remote_agent_server/16_deferred_init.py

Verification run after cherry-pick:

uv run pre-commit run --files examples/02_remote_agent_server/16_deferred_init.py

Passed.

I also reran the previously failing example with the same workflow model and eval proxy/key:

env -u TMUX_TMPDIR -u TMUX -u SESSION_API_KEY -u OH_SESSION_API_KEYS_0 \
  LLM_API_KEY="$LITELLM_API_KEY" \
  LLM_MODEL="openhands/claude-haiku-4-5-20251001" \
  LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  ANTHROPIC_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  ANTHROPIC_API_KEY="$LITELLM_API_KEY" \
  OPENAI_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  OPENAI_API_KEY="$LITELLM_API_KEY" \
  uv run pytest tests/examples/test_examples.py \
    --run-examples \
    --examples-results-dir .example-test-results-local \
    -k '16_deferred_init' -q

Result:

1 passed, 65 deselected, 5 warnings in 46.08s

This update/comment was created by an AI agent (OpenHands) on behalf of the requester.

enyst

Thank you all!

all-hands-bot added the integration-test (Runs the integration tests and comments the results), test-examples (Run all applicable "examples/" files. Expensive operation.), and behavior-test labels Jun 18, 2026

github-actions Bot and others added 2 commits June 18, 2026 15:41

Release v1.29.0

a7977ff

Co-authored-by: openhands <openhands@all-hands.dev>

simonrosenberg force-pushed the rel-1.29.0 branch from 45f0a62 to 2bedbd8 Compare June 18, 2026 13:42

Merge branch 'main' into rel-1.29.0

884c996

VascoSch92 mentioned this pull request Jun 18, 2026

Fix 16_deferred_init example: poll on real execution-status values #3793

Merged

VascoSch92 mentioned this pull request Jun 18, 2026

Fix _GatedLLM.completion override after _return_metrics removal (rel-1.29.0) #3795

Merged

Fix _GatedLLM.completion override after _return_metrics removal (rel-1.29.0) (#3795)

b86ba42

simonrosenberg mentioned this pull request Jun 18, 2026

[AgentProfile] Epic: first-class Agent Profiles (named launch setups) across SDK, agent-server, canvas, cloud #3713

Open

14 tasks

enyst reviewed Jun 18, 2026

Fix 16_deferred_init example: poll on real execution-status values (#3793)

579ad28

(cherry picked from commit 9e29079) Co-authored-by: openhands <openhands@all-hands.dev>

enyst approved these changes Jun 18, 2026

enyst enabled auto-merge (squash) June 18, 2026 16:08

enyst merged commit f4feb8f into main Jun 18, 2026
32 of 33 checks passed

enyst deleted the rel-1.29.0 branch June 18, 2026 16:10

github-actions Bot commented Jun 18, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

github-actions Bot commented Jun 18, 2026

Python API breakage checks — ✅ PASSED

all-hands-bot left a comment

⚠️ QA Report: PASS WITH ISSUES
