Release v1.29.0 #3787
Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.
REST API breakage checks (OpenAPI): ✅ PASSED
Python API breakage checks: ✅ PASSED
all-hands-bot
left a comment
⚠️ QA Report: PASS WITH ISSUES
Release-version behavior works in the installed packages, agent-server endpoint, workflow default, and built artifacts, but the PR is not release-ready because the required deprecation-deadline CI gate is failing for v1.29.0.
Does this PR achieve its stated goal?
Partially. The PR successfully moves the user-visible package/runtime version surfaces from 1.28.0 to 1.29.0: editable installs report 1.29.0, the SDK banner imports cleanly, /server_info reports all OpenHands component versions as 1.29.0, run-eval defaults to v1.29.0, and uv build --all-packages creates 1.29.0 artifacts. However, the PR description’s release checklist includes fixing deprecation deadlines, and CI currently fails that gate with multiple APIs whose removal target is 1.29.0, so this PR does not fully prepare a mergeable release yet.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the editable 1.29.0 packages. |
| CI Status | ❌ Failing: Deprecation deadlines/check; 18 successful, 16 pending, 14 skipped; I did not rerun CI. |
| Functional Verification | ✅ Versioned package/runtime behavior works; release readiness has one blocking CI issue. |
Functional Verification
Test 1: Installed package metadata and SDK import banner
Step 1 — Establish baseline on origin/main:
Ran cd /tmp/qa-sdk-main && uv sync --dev && uv run python - <<'PY' ...:
openhands-sdk=1.28.0
openhands-tools=1.28.0
openhands-workspace=1.28.0
openhands-agent-server=1.28.0
OpenHands SDK v1.28.0
sdk_imports=ok Agent LLM Tool
This shows the currently released branch exposes 1.28.0 through installed distribution metadata and the SDK runtime banner.
Step 2 — Apply the PR's changes:
Used the PR checkout at commit a3acac5b5b3f889f976f1759ac0a6915d2351655 on rel-1.29.0 and ran the same import/metadata check after make build.
Step 3 — Re-run with the PR in place:
openhands-sdk=1.29.0
openhands-tools=1.29.0
openhands-workspace=1.29.0
openhands-agent-server=1.29.0
OpenHands SDK v1.29.0
sdk_imports=ok Agent LLM Tool
This confirms a real SDK user importing the package sees 1.29.0 consistently across all four distributions.
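The script behind the metadata check is elided in the report above. A minimal sketch of one way to produce the same `name=version` lines, assuming the four distribution names shown in the output and using only the standard library (the function name `report_versions` is hypothetical):

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the QA output above.
DISTRIBUTIONS = [
    "openhands-sdk",
    "openhands-tools",
    "openhands-workspace",
    "openhands-agent-server",
]


def report_versions(names):
    """Return one 'name=version' line per distribution name."""
    lines = []
    for name in names:
        try:
            lines.append(f"{name}={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment; report it instead of raising.
            lines.append(f"{name}=not-installed")
    return lines


if __name__ == "__main__":
    print("\n".join(report_versions(DISTRIBUTIONS)))
```

Run inside the project environment (for example through `uv run python`), this prints whatever versions the installed distributions actually declare.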
Test 2: Agent server /server_info runtime versions
Step 1 — Establish baseline on origin/main:
Ran uv run python -m openhands.agent_server --host 127.0.0.1 --port 18081 and queried curl http://127.0.0.1:18081/server_info:
{
"version": "1.28.0",
"sdk_version": "1.28.0",
"tools_version": "1.28.0",
"workspace_version": "1.28.0"
}
This confirms the old server runtime reports 1.28.0 to API clients.
Step 2 — Apply the PR's changes:
Started the PR checkout’s server on a separate local port.
Step 3 — Re-run with the PR in place:
Ran uv run python -m openhands.agent_server --host 127.0.0.1 --port 18082 and queried curl http://127.0.0.1:18082/server_info:
{
"version": "1.29.0",
"sdk_version": "1.29.0",
"tools_version": "1.29.0",
"workspace_version": "1.29.0"
}
This confirms a real API client sees the agent-server and component versions as 1.29.0.
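The same comparison can be scripted rather than read by eye. The sketch below assumes only the four keys visible in the JSON responses above; the helper names are hypothetical and the fetch uses the standard library instead of curl:

```python
import json
from urllib.request import urlopen

# Keys taken from the /server_info responses shown above.
VERSION_KEYS = ("version", "sdk_version", "tools_version", "workspace_version")


def mismatched_versions(info, expected):
    """Return the version keys whose value differs from `expected`."""
    return {k: info.get(k) for k in VERSION_KEYS if info.get(k) != expected}


def fetch_server_info(base_url):
    """GET <base_url>/server_info and decode the JSON body."""
    with urlopen(f"{base_url}/server_info", timeout=10) as resp:
        return json.load(resp)
```

For instance, `mismatched_versions(fetch_server_info("http://127.0.0.1:18082"), "1.29.0")` would return an empty dict when every component reports the release version.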
Test 3: run-eval workflow dispatch default
Step 1 — Establish baseline on origin/main:
Ran a YAML parse/extract check for .github/workflows/run-eval.yml:
workflow_yaml_parse=ok
run_eval_sdk_ref_default=v1.28.0
This shows the previous dispatch default pointed at the old release tag.
Step 2 — Apply the PR's changes:
Repeated the same YAML parse/extract check on the PR checkout.
Step 3 — Re-run with the PR in place:
workflow_yaml_parse=ok
run_eval_sdk_ref_default=v1.29.0
This confirms the workflow remains parseable and now defaults evaluations to v1.29.0.
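The report does not show the parse/extract script itself. A rough sketch of such a check, assuming the default lives under a `workflow_dispatch` input named `sdk_ref` (inferred from the `run_eval_sdk_ref_default` label, not confirmed by the source) and that PyYAML is available:

```python
def extract_sdk_ref_default(workflow):
    """Pull the workflow_dispatch default for the assumed `sdk_ref` input."""
    # PyYAML follows YAML 1.1, so a bare `on:` key is parsed as boolean True.
    triggers = workflow.get("on", workflow.get(True, {}))
    inputs = triggers["workflow_dispatch"]["inputs"]
    return inputs["sdk_ref"]["default"]


def load_workflow(path=".github/workflows/run-eval.yml"):
    """Parse the workflow file; raises if the YAML is malformed."""
    import yaml  # PyYAML, a third-party dependency (assumed installed)

    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh)
```

Handling both `"on"` and `True` as the trigger key keeps the lookup working regardless of which YAML loader produced the dictionary.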
Test 4: Release artifact build
Step 1 — Establish baseline:
The metadata checks above establish the pre-PR package version as 1.28.0.
Step 2 — Apply the PR's changes:
Built the PR artifacts with uv build --all-packages.
Step 3 — Verify built artifacts:
Successfully built dist/openhands_agent_server-1.29.0.tar.gz
Successfully built dist/openhands_agent_server-1.29.0-py3-none-any.whl
Successfully built dist/openhands_sdk-1.29.0.tar.gz
Successfully built dist/openhands_sdk-1.29.0-py3-none-any.whl
Successfully built dist/openhands_tools-1.29.0.tar.gz
Successfully built dist/openhands_tools-1.29.0-py3-none-any.whl
Successfully built dist/openhands_workspace-1.29.0.tar.gz
Successfully built dist/openhands_workspace-1.29.0-py3-none-any.whl
This confirms the release packaging path produces 1.29.0 wheels and sdists for all four packages.
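A build log like this can also be checked mechanically. The following sketch assumes the artifact naming visible in the log above (one sdist and one `py3-none-any` wheel per package); `missing_artifacts` is a hypothetical helper, not part of the repository:

```python
# Package file stems copied from the build log above.
PACKAGES = [
    "openhands_agent_server",
    "openhands_sdk",
    "openhands_tools",
    "openhands_workspace",
]


def missing_artifacts(filenames, version):
    """Return expected sdist/wheel names that are absent from `filenames`."""
    expected = set()
    for pkg in PACKAGES:
        expected.add(f"{pkg}-{version}.tar.gz")
        expected.add(f"{pkg}-{version}-py3-none-any.whl")
    return sorted(expected - set(filenames))
```

Passing the contents of `dist/` (for example from `os.listdir("dist")`) and the target version yields an empty list when all eight files are present.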
CI evidence for the release-readiness issue
Fetched the failed Deprecation deadlines/check log with gh run view 27760514092 --repo OpenHands/software-agent-sdk --log-failed:
The following deprecated features have passed their removal deadline:
- [openhands-sdk] 'ACPAgentSettings.acp_env' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
- [openhands-sdk] 'ACPAgent.acp_env' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
- [openhands-sdk] 'LLM.completion(_return_metrics=...)' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
- [openhands-sdk] 'LLM.acompletion(_return_metrics=...)' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
- [openhands-sdk] 'LLM.responses(_return_metrics=...)' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
- [openhands-sdk] 'LLM.aresponses(_return_metrics=...)' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
- [openhands-sdk] 'RouterLLM.completion(return_metrics=...)' (warn_call)
deprecated in: 1.24.0
removed in: 1.29.0
Update or remove the listed features before publishing a version that meets or exceeds their removal deadline.
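The rule stated in the log ("a version that meets or exceeds their removal deadline") amounts to a version comparison. The sketch below illustrates that rule only; it is not the repository's actual check script, and it assumes plain `MAJOR.MINOR.PATCH` version strings:

```python
def parse_version(text):
    """Turn '1.29.0' into (1, 29, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def overdue(features, current):
    """Return names whose removal deadline is at or below `current`.

    `features` is an iterable of (name, removed_in) pairs.
    """
    cur = parse_version(current)
    return [name for name, removed_in in features if cur >= parse_version(removed_in)]
```

Comparing tuples of integers avoids the string-comparison trap where "1.9.0" would sort after "1.29.0".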
Issues Found
- 🟠 Issue: The release metadata/runtime behavior verifies correctly, but Deprecation deadlines/check is failing because several APIs have removed in: 1.29.0. This blocks the PR from fully achieving “prepare the release for version 1.29.0” until those deadlines are fixed or retargeted.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
🔄 Running Examples with
|
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 24.1s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 23.0s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 8.8s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 35.1s | $0.03 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 10.9s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 23.9s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 31.9s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 13.3s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 36.9s | $0.05 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 17s | $0.16 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 19.9s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 25.8s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 16.3s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 12.7s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 9.8s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 13.0s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 6s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 4m 51s | $0.34 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 3s | $0.07 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 17.8s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 36.7s | $0.04 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 38.7s | $0.02 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 8.9s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 4m 57s | $0.35 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 24.9s | $0.02 |
| 01_standalone_sdk/33_hooks/main.py | ✅ PASS | 39.0s | $0.04 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 9m 16s | $0.74 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 13.0s | $0.01 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 9.5s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 42.3s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 9.7s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 34.5s | $0.31 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 32.4s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 52.3s | $0.05 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 3.5s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 9.0s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 8m 4s | $0.62 |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 9.7s | $0.00 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 3.3s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 13.9s | $0.00 |
| 01_standalone_sdk/49_switch_llm_tool.py | ✅ PASS | 8.4s | $0.03 |
| 01_standalone_sdk/50_async_cancellation.py | ✅ PASS | 12.3s | $0.00 |
| 01_standalone_sdk/51_agent_hooks/main.py | ✅ PASS | 44.2s | $0.06 |
| 01_standalone_sdk/52_dynamic_workflow.py | ✅ PASS | 4m 15s | $0.15 |
| 01_standalone_sdk/53_client_defined_tools.py | ✅ PASS | 11.1s | $0.01 |
| 01_standalone_sdk/54_goal_completion_loop.py | ✅ PASS | 27.4s | $0.03 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 37.8s | $0.02 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 47s | $0.05 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 25s | $0.10 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 21s | $0.04 |
| 02_remote_agent_server/06_custom_tool/main.py | ✅ PASS | 5m 42s | $0.06 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 38.4s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 4m 10s | $0.03 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 1m 2s | $0.32 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 40.3s | $0.06 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 51.3s | $0.00 |
| 02_remote_agent_server/12_settings_and_secrets_api.py | ✅ PASS | 2m 51s | $0.04 |
| 02_remote_agent_server/13_workspace_get_llm.py | ✅ PASS | 50.1s | $0.03 |
| 02_remote_agent_server/14_client_defined_tools.py | ✅ PASS | 1m 4s | $0.04 |
| 02_remote_agent_server/15_openai_compatible_gateway.py | ✅ PASS | 17.7s | $0.01 |
| 02_remote_agent_server/16_deferred_init.py | ❌ FAIL (Exit code 1) | 3m 9s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 28.8s | $0.02 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 1m 51s | $0.06 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 12.7s | $0.02 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 14.9s | $0.02 |
❌ Some tests failed
Total: 65 | Passed: 64 | Failed: 1 | Total Cost: $4.45
Failed examples:
- examples/02_remote_agent_server/16_deferred_init.py: Exit code 1
🔄 Running Examples with
|
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 34.6s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 21.4s | $0.02 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 9.7s | $0.00 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 33.5s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 11.2s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 26.2s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 31.7s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 14.3s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 30.5s | $0.04 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 8s | $0.13 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 26.5s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 18.4s | $0.01 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 21.6s | $0.01 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 13.5s | $0.01 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 10.7s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 14.8s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 38.9s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 6m 14s | $0.47 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 13s | $0.06 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 20.0s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 41.6s | $0.04 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 43.3s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 11.1s | $0.00 |
| 01_standalone_sdk/31_iterative_refinement.py | ❌ FAIL (Timed out after 600 seconds) | 10m 0s | -- |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 20.6s | $0.01 |
| 01_standalone_sdk/33_hooks/main.py | ✅ PASS | 40.7s | $0.04 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 3m 29s | $0.18 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 10.7s | $0.00 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 5.8s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 40.8s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 14.3s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 32.6s | $0.31 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 32.0s | $0.02 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 36.3s | $0.04 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 7.4s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 7.8s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 7m 39s | $0.65 |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 10.1s | $0.01 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 3.3s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 14.1s | $0.00 |
| 01_standalone_sdk/49_switch_llm_tool.py | ✅ PASS | 7.2s | $0.03 |
| 01_standalone_sdk/50_async_cancellation.py | ✅ PASS | 12.7s | $0.00 |
| 01_standalone_sdk/51_agent_hooks/main.py | ✅ PASS | 54.7s | $0.06 |
| 01_standalone_sdk/52_dynamic_workflow.py | ✅ PASS | 6m 57s | $0.22 |
| 01_standalone_sdk/53_client_defined_tools.py | ✅ PASS | 13.6s | $0.01 |
| 01_standalone_sdk/54_goal_completion_loop.py | ✅ PASS | 54.0s | $0.04 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 38.5s | $0.03 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 44s | $0.04 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 1s | -- |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 45s | $0.03 |
| 02_remote_agent_server/06_custom_tool/main.py | ✅ PASS | 5m 25s | $0.04 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 53.7s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 55s | $0.02 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 1m 10s | $0.18 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 36.4s | $0.03 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 56.9s | $0.00 |
| 02_remote_agent_server/12_settings_and_secrets_api.py | ✅ PASS | 2m 23s | $0.02 |
| 02_remote_agent_server/13_workspace_get_llm.py | ✅ PASS | 41.0s | $0.01 |
| 02_remote_agent_server/14_client_defined_tools.py | ✅ PASS | 42.9s | $0.02 |
| 02_remote_agent_server/15_openai_compatible_gateway.py | ✅ PASS | 13.9s | $0.00 |
| 02_remote_agent_server/16_deferred_init.py | ❌ FAIL (Exit code 1) | 2m 30s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 37.0s | $0.03 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 1m 43s | $0.07 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 16.2s | $0.01 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 15.3s | $0.02 |
❌ Some tests failed
Total: 65 | Passed: 63 | Failed: 2 | Total Cost: $3.25
Failed examples:
- examples/01_standalone_sdk/31_iterative_refinement.py: Timed out after 600 seconds
- examples/02_remote_agent_server/16_deferred_init.py: Exit code 1
Co-authored-by: openhands <openhands@all-hands.dev>
Both features reached their scheduled removal version in 1.29.0:
- Drop the no-op _return_metrics/return_metrics parameter from
LLM.{completion,acompletion,responses,aresponses}, RouterLLM.completion,
and the TestLLM doubles. Metrics are always on LLMResponse.metrics.
- Remove the acp_env field end-to-end: ACPAgentSettings and ACPAgent
(field/validators/serializers), ACPAgentSettings.resolve_acp_env(), the
ACP spawn-time env-injection/precedence logic, and the
REDACT_ALL_VALUES_KEYS entry. Arbitrary ACP subprocess env vars now ride
the conversation secrets channel.
Updates tests, the v1 ACP persisted-settings golden fixture, the
persisted-settings-compat generator, and docstrings/AGENTS.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
45f0a62 to
2bedbd8
Compare
|
@enyst @simonrosenberg @VascoSch92: manual REST API contract summary for the PRs discussed. This is the same kind of concise public
This comment was created by an AI agent (OpenHands) on behalf of the user.
Status at a glance
#3784: agent profile at conversation start
--- PR #3784 base public OpenAPI
+++ PR #3784 head public OpenAPI
@@ -978,0 +979 @@
+schema ConversationInfo property launched_profile optional schema=anyOf=[LaunchedProfile,type="null"]
@@ -1469,0 +1471,3 @@
+schema LaunchedProfile property profile_id required schema=type="string" format="uuid"
+schema LaunchedProfile property revision required schema=type="integer" minimum=0.0
+schema LaunchedProfile type="object"
@@ -1863,0 +1868 @@
+schema StartConversationRequest property agent_profile_id optional schema=anyOf=[type="string" format="uuid",type="null"]
#3788: clarify launched agent profile provenance names
--- PR #3788 base public OpenAPI
+++ PR #3788 head public OpenAPI
@@ -979 +979 @@
-schema ConversationInfo property launched_profile optional schema=anyOf=[LaunchedProfile,type="null"]
+schema ConversationInfo property launched_agent_profile optional schema=anyOf=[LaunchedAgentProfile,type="null"]
@@ -1471,3 +1471,3 @@
-schema LaunchedProfile property profile_id required schema=type="string" format="uuid"
-schema LaunchedProfile property revision required schema=type="integer" minimum=0.0
-schema LaunchedProfile type="object"
+schema LaunchedAgentProfile property agent_profile_id required schema=type="string" format="uuid"
+schema LaunchedAgentProfile property revision required schema=type="integer" minimum=0.0
+schema LaunchedAgentProfile type="object"
#3770: goal conversation endpoints
--- PR #3770 base public OpenAPI
+++ PR #3770 head public OpenAPI
@@ -69,0 +70,3 @@
+operation POST /api/conversations/{conversation_id}/goal operationId=start_goal_conversation_api_conversations__conversation_id__goal_post
+operation POST /api/conversations/{conversation_id}/goal/resume operationId=resume_goal_conversation_api_conversations__conversation_id__goal_resume_post
+operation POST /api/conversations/{conversation_id}/goal/stop operationId=stop_goal_conversation_api_conversations__conversation_id__goal_stop_post
@@ -173,0 +177,3 @@
+parameter POST /api/conversations/{conversation_id}/goal path:conversation_id required=true schema=type="string" format="uuid"
+parameter POST /api/conversations/{conversation_id}/goal/resume path:conversation_id required=true schema=type="string" format="uuid"
+parameter POST /api/conversations/{conversation_id}/goal/stop path:conversation_id required=true schema=type="string" format="uuid"
@@ -201,0 +208 @@
+requestBody POST /api/conversations/{conversation_id}/goal application/json required=true schema=StartGoalRequest
@@ -356,0 +364,11 @@
+response POST /api/conversations/{conversation_id}/goal 200 application/json schema=Success
+response POST /api/conversations/{conversation_id}/goal 404 no-content
+response POST /api/conversations/{conversation_id}/goal 409 no-content
+response POST /api/conversations/{conversation_id}/goal 422 application/json schema=HTTPValidationError
+response POST /api/conversations/{conversation_id}/goal/resume 200 application/json schema=Success
+response POST /api/conversations/{conversation_id}/goal/resume 404 no-content
+response POST /api/conversations/{conversation_id}/goal/resume 409 no-content
+response POST /api/conversations/{conversation_id}/goal/resume 422 application/json schema=HTTPValidationError
+response POST /api/conversations/{conversation_id}/goal/stop 200 application/json schema=Success
+response POST /api/conversations/{conversation_id}/goal/stop 404 no-content
+response POST /api/conversations/{conversation_id}/goal/stop 422 application/json schema=HTTPValidationError
@@ -1890,0 +1909,3 @@
+schema StartGoalRequest property max_iterations optional schema=type="integer" default=10 minimum=1.0
+schema StartGoalRequest property objective required schema=type="string"
+schema StartGoalRequest type="object"
#3787 current release PR diff vs latest
Can we also pick d7392de? Please note, though, that my agent is looking into something failing on GitHub Actions.
|
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly. |
🧪 Integration Tests Results
Overall Success Rate: 97.1%
📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results
litellm_proxy_minimax_MiniMax_M2.7
Skipped Tests:
litellm_proxy_openai_gpt_5.5
litellm_proxy_gemini_3.1_pro_preview
Failed Tests:
litellm_proxy_deepseek_deepseek_v4_flash
Skipped Tests:
|
🔄 Running Examples with
|
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 24.3s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 20.7s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 9.5s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 42.2s | $0.03 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 14.1s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 24.9s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 33.9s | $0.04 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 12.3s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 24.9s | $0.03 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 15s | $0.15 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 22.4s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 29.5s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 18.2s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 15.3s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 9.6s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 20.8s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 29s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 3m 18s | $0.25 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 12s | $0.08 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 17.3s | $0.02 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 40.8s | $0.04 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 47.4s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 9.0s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 5m 39s | $0.40 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 17.5s | $0.02 |
| 01_standalone_sdk/33_hooks/main.py | ✅ PASS | 39.2s | $0.04 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 8m 11s | $0.87 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 10.7s | $0.00 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 5.7s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 37.7s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 10.1s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 38.4s | $0.30 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 27.6s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 32.3s | $0.04 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 3.7s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 8.9s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 4m 5s | $0.30 |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 10.7s | $0.01 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 3.4s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 11.9s | $0.00 |
| 01_standalone_sdk/49_switch_llm_tool.py | ✅ PASS | 7.9s | $0.03 |
| 01_standalone_sdk/50_async_cancellation.py | ✅ PASS | 12.8s | $0.00 |
| 01_standalone_sdk/51_agent_hooks/main.py | ✅ PASS | 41.0s | $0.05 |
| 01_standalone_sdk/52_dynamic_workflow.py | ✅ PASS | 6m 38s | $0.17 |
| 01_standalone_sdk/53_client_defined_tools.py | ✅ PASS | 13.5s | $0.01 |
| 01_standalone_sdk/54_goal_completion_loop.py | ✅ PASS | 31.3s | $0.03 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 36.5s | $0.02 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 39s | $0.04 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 56.4s | $0.05 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 40s | $0.03 |
| 02_remote_agent_server/06_custom_tool/main.py | ✅ PASS | 5m 4s | $0.04 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 36.1s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 4m 14s | $0.03 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 1m 6s | $0.35 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 45.2s | $0.06 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 51.3s | $0.00 |
| 02_remote_agent_server/12_settings_and_secrets_api.py | ✅ PASS | 2m 50s | $0.04 |
| 02_remote_agent_server/13_workspace_get_llm.py | ✅ PASS | 34.8s | $0.02 |
| 02_remote_agent_server/14_client_defined_tools.py | ✅ PASS | 1m 7s | $0.04 |
| 02_remote_agent_server/15_openai_compatible_gateway.py | ✅ PASS | 17.9s | $0.01 |
| 02_remote_agent_server/16_deferred_init.py | ❌ FAIL (Exit code 1) | 2m 27s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 21.3s | $0.02 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 1m 6s | $0.09 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 14.0s | $0.02 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 17.4s | $0.02 |
❌ Some tests failed
Total: 65 | Passed: 64 | Failed: 1 | Total Cost: $4.19
Failed examples:
- examples/02_remote_agent_server/16_deferred_init.py: Exit code 1
|
@OpenHands look at the last comment on running test examples workflow and how it works. run the failed example yourself now, same llm, you can use eval proxy and eval key though. does it work now? on latest version of the branch. then /codereview this release pr. reply on github. |
|
@enyst it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again. |
|
I'm on it! enyst can track my progress at all-hands.dev |
enyst
left a comment
@enyst I checked the latest Run Examples comment and reproduced the failed example locally on the latest branch head (b86ba42cab5150c0a979f35b0d63486dc0f3da19).
Local reproduction command (same workflow model, eval proxy/key):
env -u TMUX_TMPDIR -u TMUX -u SESSION_API_KEY -u OH_SESSION_API_KEYS_0 \
LLM_API_KEY="$LITELLM_API_KEY" \
LLM_MODEL="openhands/claude-haiku-4-5-20251001" \
LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
ANTHROPIC_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
ANTHROPIC_API_KEY="$LITELLM_API_KEY" \
OPENAI_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
OPENAI_API_KEY="$LITELLM_API_KEY" \
uv run pytest tests/examples/test_examples.py \
--run-examples \
--examples-results-dir .example-test-results-local \
-k '16_deferred_init' -s
Result: still fails. The deferred-init flow itself reaches the ready state and runs the conversation, but the example's polling loop then times out: the server reports execution_status: "finished", while the loop only breaks on ("stopped", "paused", "error") and the final assertion only accepts ("stopped", "paused"):
AssertionError: Unexpected final status: finished
That matches the workflow log for Run Examples Scripts run 27768040721, which also failed with Unexpected final status: finished.
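To make the failure mode concrete, here is a minimal, self-contained sketch of a polling loop that treats "finished" as a terminal status. It is an illustration only: whether "finished" should count as success is an assumption, the names are hypothetical, and this is not the example's actual code or the fix that was later applied.

```python
import time

# Assumption: "finished" is terminal alongside the statuses the example already handled.
TERMINAL_STATUSES = {"finished", "stopped", "paused", "error"}


def wait_for_terminal_status(get_status, timeout=60.0, interval=1.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Poll `get_status()` until it returns a terminal status or time runs out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"no terminal status before timeout, last={status!r}")
        sleep(interval)
```

A loop whose terminal set omits a status the server can actually report will spin until the timeout, which is the behavior described above. Injecting `sleep` and `clock` keeps the helper testable without real delays.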
/codereview
🟡 Taste Rating: Acceptable - The release cleanup is mostly straightforward and removes deadline-expired compatibility code, but the PR is not release-ready because required release validation is failing/stale.
[CRITICAL ISSUES]
- None found in the release cleanup itself during this pass.
[TESTING GAPS]
- [PR validation] Release workflow validation is not green/current: per this repo's release review policy, release PRs need current PR-specific validation for Run tests, Run Examples Scripts, and Run Integration Tests before approval. Current state:
  - Run tests: ✅ current on b86ba42cab5150c0a979f35b0d63486dc0f3da19.
  - Run Examples Scripts: ❌ latest PR comment/run fails examples/02_remote_agent_server/16_deferred_init.py; I reproduced the same failure locally on current HEAD with the requested model/proxy.
  - Run Integration Tests: ambiguous/stale for current HEAD. Latest visible result comment is for run 27768049468 on older head 884c996f9ad515c66a7c9b9440ea9cf0a17938e6 and reports 97.1% with a Gemini t08_image_file_viewing failure, even though the GitHub workflow conclusion is success. I do not see a current green integration result for b86ba42 in the PR comments.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
This is an intentional breaking release PR that removes deprecated public API surface (_return_metrics/return_metrics, acp_env). The code direction is reasonable for a release deadline cleanup, and the breakage gates are reported green, but failed/stale release workflow evidence makes auto-approval unsafe.
VERDICT:
❌ Needs rework / human maintainer decision before merge: fix or intentionally waive the failing 16_deferred_init.py example, then rerun Run Examples Scripts; also provide a current unambiguous Run Integration Tests result for the latest head.
KEY INSIGHT:
The blocking issue is not the release cleanup mechanics; it is that the release validation evidence does not currently prove the latest branch is shippable.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.
Was this review helpful? React with 👍 or 👎 to give feedback.
This PR review was created by an AI agent (OpenHands) on behalf of the requester.
|
@enyst confirmed — I cherry-picked the squashed main commit for #3793 onto Details:
Verification run after cherry-pick:
uv run pre-commit run --files examples/02_remote_agent_server/16_deferred_init.py
Passed.
I also reran the previously failing example with the same workflow model and eval proxy/key:
env -u TMUX_TMPDIR -u TMUX -u SESSION_API_KEY -u OH_SESSION_API_KEYS_0 \
  LLM_API_KEY="$LITELLM_API_KEY" \
  LLM_MODEL="openhands/claude-haiku-4-5-20251001" \
  LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  ANTHROPIC_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  ANTHROPIC_API_KEY="$LITELLM_API_KEY" \
  OPENAI_BASE_URL="https://llm-proxy.eval.all-hands.dev" \
  OPENAI_API_KEY="$LITELLM_API_KEY" \
  uv run pytest tests/examples/test_examples.py \
  --run-examples \
  --examples-results-dir .example-test-results-local \
  -k '16_deferred_init' -q
Result:
This update/comment was created by an AI agent (OpenHands) on behalf of the requester. |
HUMAN:
Cutting the v1.29.0 release. I verified the deprecation-deadline, Python/REST API breakage, and persisted-settings-compat gates plus the LLM/ACP/settings test suites pass locally before pushing.
AGENT:
Why
This is the v1.29.0 release PR. The Deprecation deadlines release gate fails once the project version reaches a feature's removed_in, so cutting 1.29.0 requires deleting every API surface scheduled for removal in 1.29.0. Two deprecated features were due:
- The _return_metrics/return_metrics parameter (deprecated in 1.24.0).
- The acp_env field on the ACP agent/settings (deprecated in 1.24.0).
Summary
- Bump the version to 1.29.0 (Prepare Release workflow).
- Remove _return_metrics/return_metrics from LLM.{completion,acompletion,responses,aresponses}, RouterLLM.completion, and the TestLLM doubles. Metrics are always available via LLMResponse.metrics.
- Remove acp_env end-to-end: drop the field from ACPAgentSettings and ACPAgent (field + validators + serializers), delete ACPAgentSettings.resolve_acp_env(), simplify the ACP spawn-time env build (registry → os.environ precedence; file-secret materialisation and data-dir isolation no longer honour an acp_env pin), and drop acp_env from REDACT_ALL_VALUES_KEYS. Provide arbitrary ACP subprocess env vars via the conversation secrets channel instead.
- Update AGENTS.md accordingly.
- No removed_in == 1.29.0 deprecations remain. The acp_env removal is sanctioned by both breakage gates (deprecated in the 1.28.1 baseline with removed_in reached).
Issue Number
N/A — routine release.
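As a rough illustration of the Deprecation deadlines gate described under Why, the sketch below flags every deprecation whose removed_in is at or below the project version. It is not the SDK's actual gate: Deprecation, parse_version, and overdue are hypothetical names, and the registry is hand-written from the two features listed above rather than discovered from the codebase.

```python
from dataclasses import dataclass


def parse_version(text):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Deprecation:
    """Hypothetical registry entry for one deprecated feature."""
    feature: str
    deprecated_in: str
    removed_in: str


def overdue(deprecations, project_version):
    """Return features whose removal deadline has been reached."""
    current = parse_version(project_version)
    return [
        d.feature
        for d in deprecations
        if parse_version(d.removed_in) <= current
    ]


# The two features this PR names as due in 1.29.0.
DEPRECATIONS = [
    Deprecation("return_metrics", deprecated_in="1.24.0", removed_in="1.29.0"),
    Deprecation("acp_env", deprecated_in="1.24.0", removed_in="1.29.0"),
]

print(overdue(DEPRECATIONS, "1.28.1"))  # nothing is due before the bump
print(overdue(DEPRECATIONS, "1.29.0"))  # both are due once the version is 1.29.0
```

In this model the gate passes at 1.28.1 and fails at 1.29.0 until both entries are deleted, which is why the version bump and the removals land in the same PR.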
How to Test
From the rel-1.29.0 checkout:
Type
Notes
Downstream consumers (OpenHands, agent-canvas, typescript-client) that still reference acp_env will be updated separately; ACP acp_env storage was never used in production.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:579ad28-python
Run
All tags pushed for this build
About Multi-Architecture Support
- 579ad28-python) is a multi-arch manifest supporting both amd64 and arm64
- 579ad28-python-amd64) are also available if needed