Skip to content

fix(acp): mask injected secrets in tool-call output and streamed text#3463

Merged
simonrosenberg merged 1 commit into
mainfrom
fix-acp-mask-injected-secrets
Jun 3, 2026
Merged

fix(acp): mask injected secrets in tool-call output and streamed text#3463
simonrosenberg merged 1 commit into
mainfrom
fix-acp-mask-injected-secrets

Conversation

@simonrosenberg

@simonrosenberg simonrosenberg commented Jun 2, 2026

Copy link
Copy Markdown
Member

Problem

Priority: P0 (security hole, widened on cloud). Implements pre-cloud refactor P0-2OpenHands/agent-canvas#1023, part of OpenHands/agent-canvas#988.

The regular terminal masks every observation through secret_registry.mask_secrets_in_output(). The ACP agent masked nothing — so a subprocess that echoes an injected credential (env, a failing curl printing an Authorization header) landed in the persisted event stream + WebSocket relay in cleartext.

What this does

Routes ACP output through state.secret_registry.mask_secrets_in_output() before emitting:

  • Streamed textAgentMessageChunk text (on_token + accumulated_text) and AgentThoughtChunk text.
  • Tool-call fieldstitle / raw_input / raw_output / content, masked at ingestion in session_update so the early "started" event, the single terminal-transition event (post-refactor(acp): dedup tool-call progress at the source (O(n²)→O(1)) #3465 dedup), and the flush/supersede path all carry masked values, and the accumulator never holds plaintext. A recursive _mask_json_value helper masks string leaves of the arbitrary-JSON fields.
  • Persisted turn — the joined response_text / thought_text in _finalize_successful_turn is re-masked at the persistence boundary, which also catches a secret split across two stream chunks (each chunk alone won't match; only the join reassembles it).

The bridge gains a per-turn mask callable (set in _reset_client_for_turn to the live mask_secrets_in_output bound method, cleared in reset() / _clear_turn_callbacks). No-op when unset; falls back to the original value if it ever raises — matching the terminal tool's contract.

Mask set / scope

The mask set is the registry's _exported_values, seeded by get_secret_value when _start_acp_server reads state.secret_registry — the canonical channel that #3464 routes provider creds (and, on the caller/cloud path, agent_context.secrets) through. No SecretRegistry API changes. The deprecated acp_env and the canvas-local agent_context.secrets drain bypass get_secret_value and are out of scope here (both are transitional/deprecated; seeding them would mean masking general non-secret env too).

The live on_token relay still forwards a secret split mid-value across chunks as fragments; the persisted stream is fully masked. Follow-up for streaming prefix-redaction: OpenHands/agent-canvas#1049.

Tests

test_acp_agent.py: recursive masker; bridge masking of message/thought chunks and tool-call start/terminal fields (incl. accumulator stays masked); no-op when mask unset; mask-error fallback; reset() clears mask; end-to-end step() masks a secret split across chunks in the persisted FinishAction (seeded via the canonical update_secrets + get_secret_value path).

Verification

ruff + pyright clean; tests/sdk/agent/ + tests/sdk/conversation/ green.

Docs follow-up (acp_env deprecation guidance): OpenHands/docs#544.

🤖 Generated with Claude Code


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:ec5d309-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-ec5d309-python \
  ghcr.io/openhands/agent-server:ec5d309-python

All tags pushed for this build

ghcr.io/openhands/agent-server:ec5d309-golang-amd64
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-golang-amd64
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-golang-amd64
ghcr.io/openhands/agent-server:ec5d309-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:ec5d309-golang-arm64
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-golang-arm64
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-golang-arm64
ghcr.io/openhands/agent-server:ec5d309-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:ec5d309-java-amd64
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-java-amd64
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-java-amd64
ghcr.io/openhands/agent-server:ec5d309-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:ec5d309-java-arm64
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-java-arm64
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-java-arm64
ghcr.io/openhands/agent-server:ec5d309-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:ec5d309-python-amd64
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-python-amd64
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-python-amd64
ghcr.io/openhands/agent-server:ec5d309-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:ec5d309-python-arm64
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-python-arm64
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-python-arm64
ghcr.io/openhands/agent-server:ec5d309-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:ec5d309-golang
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-golang
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-golang
ghcr.io/openhands/agent-server:ec5d309-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:ec5d309-java
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-java
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-java
ghcr.io/openhands/agent-server:ec5d309-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:ec5d309-python
ghcr.io/openhands/agent-server:ec5d30926bf9e7a3e30a6f2aed37a653ba97524a-python
ghcr.io/openhands/agent-server:fix-acp-mask-injected-secrets-python
ghcr.io/openhands/agent-server:ec5d309-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., ec5d309-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., ec5d309-python-amd64) are also available if needed

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/agent
   acp_agent.py9899690%504–506, 712–713, 756, 758, 762, 766, 792, 855–856, 861, 928, 1147, 1150–1151, 1168–1169, 1198, 1203, 1279, 1289, 1650–1653, 1657–1659, 1662–1666, 1668, 1879, 1893–1894, 1897–1899, 1907, 1911, 1935–1936, 1939, 1948–1949, 1973, 1977–1979, 1983–1984, 2016, 2243–2245, 2248–2249, 2289, 2429, 2433, 2441–2443, 2481–2482, 2485, 2493–2495, 2497, 2499, 2503, 2506, 2515–2517, 2519, 2555–2556, 2574–2577, 2580, 2584–2586, 2588, 2592–2593, 2801–2802
TOTAL28746826571% 

@simonrosenberg simonrosenberg self-assigned this Jun 2, 2026

all-hands-bot commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — fix(acp): mask injected secrets in tool-call output and streamed text

This is a solid, well-structured P0 security fix. The core strategy is correct: mask at ingestion (per streaming chunk, per tool-call entry) and re-mask at the persistence boundary after joining accumulated chunks — the only point where a secret split across two streamed chunks can be caught. The _mask_json_value recursive walker is clean, the defensive error-handling in _mask_value (fall back to original value, log, never crash) matches the existing terminal tool contract, and the test coverage is thorough, including the split-chunk edge case.

A few observations flagged inline below.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py
Comment thread openhands-sdk/openhands/sdk/conversation/secret_registry.py Outdated
Comment thread tests/sdk/agent/test_acp_agent.py

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the ACPAgent path with real SDK Conversations and temporary ACP stdio subprocesses: injected ACP env secrets now mask in streamed text, tool-call raw fields, and persisted turn output.

Does this PR achieve its stated goal?

Yes. The PR set out to prevent ACP subprocess output from leaking injected credentials into streamed callbacks, tool-call events, and persisted finish/observation text. I reproduced the leak on main with a real Conversation(ACPAgent(...)) talking to an ACP stdio server that echoed ACP_QA_TOKEN, then reran the same scenarios on a2eec14e; the secret was replaced with <secret-hidden> everywhere expected, including when the secret was split across two streamed chunks and only reassembled at final persistence.

Phase Result
Environment Setup uv environment usable; SDK and ACP subprocess flow ran successfully
CI Status ⚠️ Most checks green; Python API check is failing and qa-changes was still in progress when checked
Functional Verification ✅ Real ACPAgent conversations verified direct stream/tool output and split-chunk persistence masking
Functional Verification

Test 1: ACP subprocess echoes an injected env secret in streamed text and tool-call fields

Step 1 — Reproduce / establish baseline (without the fix):
Checked out main and ran a real SDK conversation using ACPAgent(acp_env={"ACP_QA_TOKEN": "QA_SUPER_SECRET_12345"}) against /tmp/fake_acp_echo_agent.py, a small ACP stdio subprocess that read the env var and emitted it in AgentMessageChunk, ToolCallStart.rawInput, ToolCallStart.rawOutput, and ToolCallProgress.rawOutput.

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_conversation_qa.py:

token_outputs=['stream:QA_SUPER_SECRET_12345']
tool_event={'status': 'in_progress', 'raw_input': {'command': 'printf QA_SUPER_SECRET_12345'}, 'raw_output': 'starting QA_SUPER_SECRET_12345'}
tool_event={'status': 'completed', 'raw_input': {'command': 'printf QA_SUPER_SECRET_12345'}, 'raw_output': 'done QA_SUPER_SECRET_12345'}
action_event=message='stream:QA_SUPER_SECRET_12345' kind='FinishAction'
observation=content=[TextContent(cache_prompt=False, type='text', text='stream:QA_SUPER_SECRET_12345')] is_error=False kind='FinishObservation'
event_count=6

This confirms the original bug: an ACP subprocess can echo an injected credential and it reaches streamed token callbacks, ACP tool-call events, and final turn output in cleartext.

Step 2 — Apply the PR's changes:
Checked out fix-acp-mask-injected-secrets at a2eec14ed9f5b7ba36780abb74c170060283e653.

Step 3 — Re-run with the fix in place:
Ran the same command:

token_outputs=['stream:<secret-hidden>']
tool_event={'status': 'in_progress', 'raw_input': {'command': 'printf <secret-hidden>'}, 'raw_output': 'starting <secret-hidden>'}
tool_event={'status': 'completed', 'raw_input': {'command': 'printf <secret-hidden>'}, 'raw_output': 'done <secret-hidden>'}
action_event=message='stream:<secret-hidden>' kind='FinishAction'
observation=content=[TextContent(cache_prompt=False, type='text', text='stream:<secret-hidden>')] is_error=False kind='FinishObservation'
event_count=6

This shows the PR masks the injected env secret before it reaches streamed callbacks, raw tool-call input/output fields, and the final finish/observation events.

Test 2: Secret split across two stream chunks is masked at final persistence

Step 1 — Reproduce / establish baseline (without the fix):
Checked out main and ran /tmp/acp_split_conversation_qa.py against /tmp/fake_acp_split_agent.py, which emits the injected secret as two AgentMessageChunks: stream:QA_SUPER_ and SECRET_12345.

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_split_conversation_qa.py:

token_outputs=['stream:QA_SUPER_', 'SECRET_12345']
action_event=message='stream:QA_SUPER_SECRET_12345' kind='FinishAction'
observation=content=[TextContent(cache_prompt=False, type='text', text='stream:QA_SUPER_SECRET_12345')] is_error=False kind='FinishObservation'
event_count=4

This confirms the baseline leak at the persistence boundary: the full secret is reconstructed in FinishAction / FinishObservation.

Step 2 — Apply the PR's changes:
Checked out fix-acp-mask-injected-secrets at a2eec14ed9f5b7ba36780abb74c170060283e653.

Step 3 — Re-run with the fix in place:
Ran the same command:

token_outputs=['stream:QA_SUPER_', 'SECRET_12345']
action_event=message='stream:<secret-hidden>' kind='FinishAction'
observation=content=[TextContent(cache_prompt=False, type='text', text='stream:<secret-hidden>')] is_error=False kind='FinishObservation'
event_count=4

This verifies the re-mask at finalization catches a secret that individual streaming chunks cannot match.

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

Final verdict: PASS

simonrosenberg pushed a commit that referenced this pull request Jun 2, 2026
- Add a minimum-length guard to SecretRegistry.track_exported_value so
  trivially short non-secret config injected via acp_env (e.g.
  VERBOSITY=1) is not registered for masking — avoids destructive
  over-masking of common substrings. Real credentials clear it trivially.
- Mask the tool-call `title` field too (a misbehaving ACP server could
  echo a credential in the title).
- Reword _mask_value docstring to accurately describe the crash-safe
  fallback (may transiently leak on mask failure rather than "never").
- Expand the split-chunk test comment explaining the intentional
  per-chunk masking bypass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@simonrosenberg

Copy link
Copy Markdown
Member Author

Addressed all five review suggestions in 41d1b73 (min-length guard on track_exported_value, mask tool-call title, accurate _mask_value docstring, expanded test comment). All five threads resolved. Ready for another look.

all-hands-bot commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@simonrosenberg simonrosenberg added the acp About ACP label Jun 2, 2026

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — fix(acp): mask injected secrets in tool-call output and streamed text

This is a well-executed, layered P0 security fix. The strategy is correct: mask at ingestion per-chunk, mask tool-call fields at accumulation time in the _OpenHandsACPBridge accumulator, and re-mask at the persistence boundary to catch secrets split across two streamed chunks. All four feedback items from the prior review have been addressed in 41d1b73 (min-length guard, title field in _mask_tool_call_entry, docstring wording, test comment). A few additional observations below.


Summary

Correctness ✅ Masking is applied at every emission path: streamed text, thought chunks, tool-call start/progress, and the persistence join.
Test coverage ✅ Tests cover the recursive masker, each bridge path, split-chunk join-boundary masking, injection seeding, no-op when unset, and the crash-safe fallback contract.
Known gap ⚠️ ask_agent() fork path is explicitly deferred — acceptable; per PR description it is a smaller, separate surface.
Temporary seam 🟡 acp_env / agent_context.secrets bypass-seeding acknowledged as temporary pending #1022.

Observations

1. _MIN_TRACKED_VALUE_LENGTH = 4 does not protect against common 4-char config strings

The comment motivates the guard with VERBOSITY=1 (1 char), but >= 4 still admits very common 4-char non-secret config values — "true", "prod", "info", "main", "home" — that could easily appear in acp_env. If registered in the mask set, those substrings would be destructively redacted everywhere in ACP output. This is a known and documented trade-off, but the comment currently frames the constant as a general solution rather than a short-lived pragmatic guard. Noting explicitly that #1022 is the correct fix (not raising the constant) would make the intent clearer to future readers.

2. ToolCallProgress masking relies on merge-before-mask ordering — not visible in the diff

The new _mask_tool_call_entry(target) call (line 729) is placed correctly after the existing merge loop that writes progress fields into tc (raw_input, raw_output, content, etc.) and before _emit_tool_call_event(target). This ordering is essential for correctness, but the merge loop is above the diff hunk and invisible to diff-only reviewers. The test test_tool_call_progress_masks_new_raw_output validates it, but a one-line comment ("merge loop above has written progress fields into target; mask before emitting") would make the invariant self-documenting.

3. agent_context.secrets values already in env are neither injected nor seeded

The if name in env: continue guard skips names already present in env. Such values are not passed to track_exported_value, so if they originated from os.environ (rather than acp_env) their actual value would be in the subprocess environment without being registered in the mask set. In practice this is a narrow gap (values from acp_env are handled by the prior loop; ambient os.environ values were not injected by the agent), but a comment would help future readers understand the exemption.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread openhands-sdk/openhands/sdk/conversation/secret_registry.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ QA Report: PASS WITH ISSUES

The main ACP secret-masking behavior works in a real ACP conversation, but split streamed chunks still expose reconstructable secret fragments on the live token path.

Does this PR achieve its stated goal?

Partially. I verified with a real Conversation using ACPAgent against a temporary ACP stdio server that secrets injected through both acp_env and AgentContext.secrets are masked from ACP tool-call events, full streamed chunks, thought accumulation, and persisted FinishAction output. However, when the ACP server streams the same secret split across two consecutive message chunks, the persisted event is masked but the live token callback still receives adjacent fragments (split-start:qaSecretV, then alue12345:split-end), which is still reconstructable on the streamed/WebSocket path.

Phase Result
Environment Setup make build completed successfully; dependencies installed into .venv
CI Status ⚠️ At review time: 25 successful, 1 failing (Python API breakage checks/Python API), 3 pending, 17 skipped
Functional Verification ⚠️ Before/after ACP conversation confirms the main leak is fixed; split live token streaming remains an issue
Functional Verification

Test 1: ACP output leaking injected secrets before the PR

Step 1 — Reproduce / establish baseline without the fix:

I created a temporary ACP stdio server that reads QA_ACP_SECRET from its environment and emits it through:

  • AgentMessageChunk streamed text
  • AgentThoughtChunk text
  • ToolCallStart title/raw input/raw output
  • ToolCallProgress raw output
  • two consecutive split AgentMessageChunk chunks

Ran:

git checkout origin/main
uv run python /tmp/qa_acp_mask_probe.py

Relevant output excerpt:

[
  {
    "mode": "acp_env",
    "secret_in_callback_events": true,
    "secret_in_state_events": true,
    "secret_in_tokens": true,
    "accumulated_thoughts": ["thought-full:qaSecretValue12345\n"],
    "tokens": ["streamed-full:qaSecretValue12345\n", "split-start:qaSecretV", "alue12345:split-end"],
    "callback_event_summaries": [
      {"type": "ACPToolCallEvent", "title": "Run echo qaSecretValue12345", "raw_output": "started:qaSecretValue12345"},
      {"type": "ActionEvent", "action": {"message": "streamed-full:qaSecretValue12345\nsplit-start:qaSecretValue12345:split-end"}}
    ]
  },
  {
    "mode": "agent_context",
    "secret_in_callback_events": true,
    "secret_in_state_events": true,
    "secret_in_tokens": true
  }
]

This confirms the original bug: an ACP subprocess echoing an injected value put the secret into live callbacks, persisted state events, tool-call fields, thought text, and final response text.

Step 2 — Apply the PR's changes:

Ran:

git checkout fix-acp-mask-injected-secrets

Step 3 — Re-run with the fix in place:

Ran:

uv run python /tmp/qa_acp_mask_probe.py

Relevant output excerpt:

[
  {
    "mode": "acp_env",
    "secret_in_callback_events": false,
    "secret_in_state_events": false,
    "secret_in_tokens": false,
    "accumulated_thoughts": ["thought-full:<secret-hidden>\n"],
    "tokens": ["streamed-full:<secret-hidden>\n", "split-start:qaSecretV", "alue12345:split-end"],
    "callback_event_summaries": [
      {"type": "ACPToolCallEvent", "title": "Run echo <secret-hidden>", "raw_input": {"cmd": "echo <secret-hidden>", "nested": ["<secret-hidden>", 7]}, "raw_output": "started:<secret-hidden>"},
      {"type": "ACPToolCallEvent", "raw_output": "completed:<secret-hidden>"},
      {"type": "ActionEvent", "action": {"message": "streamed-full:<secret-hidden>\nsplit-start:<secret-hidden>:split-end"}}
    ]
  },
  {
    "mode": "agent_context",
    "secret_in_callback_events": false,
    "secret_in_state_events": false,
    "secret_in_tokens": false,
    "accumulated_thoughts": ["thought-full:<secret-hidden>\n"]
  }
]

This shows the PR fixes the primary leak for both injection channels and masks persisted state/event output. The remaining caveat is visible in the token list: split chunks still reveal the two pieces on the live token path, although the final persisted FinishAction is re-masked after joining.

Issues Found

  • 🟠 Issue: Secrets split across consecutive ACP streamed message chunks are still emitted as reconstructable fragments through the live token callback/WebSocket path, even though the persisted FinishAction is masked after joining.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py
@simonrosenberg simonrosenberg force-pushed the fix-acp-mask-injected-secrets branch from 41d1b73 to ac8e20c Compare June 2, 2026 21:22
@simonrosenberg

Copy link
Copy Markdown
Member Author

Rebased onto current main (force-push to ac8e20c5) to pick up the merged #3464 (route creds through secret_registry, deprecate acp_env), #3465 (tool-call dedup), and #3462. Re-applied the masking onto the new structure:

  • Registry-channel secrets now auto-seed the mask set via get_secret_value; my track_exported_value seeding now covers only the two channels that still bypass it — the agent_context.secrets drain and the deprecated acp_env.
  • Mask-at-ingestion fits refactor(acp): dedup tool-call progress at the source (O(n²)→O(1)) #3465's dedup: the started event, the single terminal-transition event, and the flush/supersede path all read masked accumulator entries.
  • All five earlier review suggestions remain addressed (min-length guard, title masking, accurate docstring, expanded test comment).

The previously-failing Python API check was a pre-existing breakage on main that those merges fixed, so it's green now too.

all-hands-bot commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — fix(acp): mask injected secrets in tool-call output and streamed text

This is a well-layered P0 security fix. Two prior automated reviews have already been posted and the author has addressed all five items from the first pass (min-length guard, title masking, docstring wording, acp_env min-length, test comment). The second review raised four additional observations (IDs 3342204848, 3342204856, 3342204874, 3342207690) that are still unresolved. This review focuses on what remains open and adds one additional observation not yet raised.


Status of Open Items from Prior Reviews

# Thread Severity Status
1 _MIN_TRACKED_VALUE_LENGTH = 4 — common 4-char strings like "prod", "true", "info" remain susceptible to over-masking 🟡 Design note Unresolved — acknowledged as a pre-existing trade-off; #1022 is the correct long-term fix
2 Merge-before-mask ordering at _mask_tool_call_entry(target) is implicit from the diff — a one-line comment would make the invariant explicit 🟡 Nit Unresolved
3 if name in env: continue skips track_exported_value for names already present from os.environ — narrow gap worth a comment 🟡 Minor Unresolved
4 Functional QA: a secret split across two consecutive on_token stream chunks still crosses the network as reconstructable plaintext fragments before the persistence-boundary re-mask fires 🟠 Important Unresolved — no author response yet

New Observation

_fork_accumulated_text is not masked (ask_agent fork path)

The ask_agent() fork path accumulates text chunks without masking:

# session_update, lines ~690-694 (current file)
if self._fork_session_id is not None and session_id == self._fork_session_id:
    if isinstance(update, AgentMessageChunk):
        if isinstance(update.content, TextContentBlock):
            self._fork_accumulated_text.append(update.content.text)  # no masking
    return

The PR description explicitly defers this ("The ask_agent() fork path is intentionally out of scope"), which is a reasonable scoping call. However, depending on how _fork_accumulated_text is later consumed (returned to the caller, persisted, or relayed), a credential in the fork response could also appear in cleartext. A one-line comment at the fork early-return would make the deliberate deferral discoverable for future readers — right now the only documentation of this decision is in the PR description, which is harder to find during a code audit.


Summary Assessment

The core design is sound and the test suite covers the essential correctness properties well (per-chunk masking, join-boundary re-mask, injection seeding, error fallback). The main actionable item before merge is resolving the question from item 4 above:

Does the security goal explicitly include preventing split-secret fragments from crossing the WebSocket/network boundary via the live streaming path, or is the guarantee limited to the persisted event stream?

If the answer is "persisted stream only," a code comment or PR description update scoping this out would close the question cleanly. If the answer is "includes live streaming," the on_token path needs buffered/rolling masking (e.g. maintaining a sliding window across the last N bytes) — a non-trivial change worth a follow-up issue.

Everything else looks correct: defensive error handling, lifecycle cleanup of mask in reset() and _clear_turn_callbacks(), both step() and astep() call sites, and the retry paths inside each are all handled consistently.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified ACP secret masking behavior with real SDK/ACP schema objects; the PR removes plaintext secret leakage from streamed text, thought accumulation, tool-call events, and tool-call accumulators in the exercised paths.

Does this PR achieve its stated goal?

Yes. The stated goal is to prevent injected ACP secrets from landing in streamed text/tool-call output and to seed exported injected values into the mask set. On the base commit, the same ACP session updates leaked acp-secret-12345 through token callbacks, accumulated text/thoughts, emitted ACPToolCallEvents, and the in-memory tool-call accumulator; on the PR commit, the plaintext secret was absent and every exercised sink contained <secret-hidden> instead. The new track_exported_value path was also available and masked the registered injected value.

Phase Result
Environment Setup make build completed successfully; no test/lint/typecheck commands were run.
CI Status ⚠️ Existing checks observed via gh pr view: core Run tests jobs green; unresolved-review-threads failing and several Agent Server/QA jobs still in progress at observation time.
Functional Verification ✅ Before/after SDK script reproduced the leak on base and verified masking on PR head.
Functional Verification

Test 1: ACP streamed text and tool-call output are masked before relay/persistence sinks

Step 1 — Reproduce / establish baseline without the fix:

Ran git switch --detach 1c385741d98b3733209984db6a2fdd1883a8e856 && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_mask_behavior.py.

Key output:

SUMMARY {"leak_locations": ["tokens", "accumulated_text", "accumulated_thoughts", "event_payloads", "accumulated_tool_calls"], "mask_sentinel_present": false, "plaintext_secret_present": true}
"tokens": ["streamed token acp-secret-12345"]
"raw_output": {"stdout": "final stdout acp-secret-12345", "lines": ["acp-secret-12345"]}
"track_exported_value_available": false

This confirms the baseline bug: even when the secret registry has the secret in its mask set, ACP streamed tokens, thoughts, emitted tool-call events, and accumulated tool-call data carried plaintext.

Step 2 — Apply the PR's changes:

Checked out PR head ac8e20c57546ce3b209b61dc0217e7daa92a4c5b.

Step 3 — Re-run with the fix in place:

Ran git switch --detach ac8e20c57546ce3b209b61dc0217e7daa92a4c5b && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_mask_behavior.py.

Key output:

SUMMARY {"leak_locations": [], "mask_sentinel_present": true, "plaintext_secret_present": false}
"tokens": ["streamed token <secret-hidden>"]
"raw_input": {"command": "echo <secret-hidden>", "nested": ["safe", "<secret-hidden>"]}
"raw_output": {"stdout": "final stdout <secret-hidden>", "lines": ["<secret-hidden>"]}
"track_exported_value_available": true
"track_exported_value_mask_result": "registered=<secret-hidden>"

This shows the fix works for the exercised user-visible sinks: streamed token relay, thought accumulation, start/progress ACPToolCallEvent payloads, nested JSON raw input/output, and the accumulator all contain the mask sentinel and no plaintext secret.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

@simonrosenberg

Copy link
Copy Markdown
Member Author

@all-hands-bot re: the QA "PASS WITH ISSUES" — split-chunk live-relay fragments.

Acknowledged and accepted as a documented best-effort boundary for this PR:

  • The persisted/at-rest event stream (the largest surface, stored server-side) is fully masked — including split secrets — via the join-boundary re-mask in _finalize_successful_turn.
  • The residual is only the transient, non-persisted live on_token/WebSocket relay, and only when an ACP server streams a secret split mid-value across consecutive chunks. Per-chunk masking can't match a fragment; a correct fix is streaming prefix-redaction (hold back only the trailing suffix that's a prefix of a tracked secret), which is net-new infra better isolated in its own change.
  • Merge Multi-Arch Manifest is broken on main #1023 itself frames the live path as best-effort; the guaranteed-masking contract is on the persisted stream.

Filed as a follow-up: OpenHands/agent-canvas#1049.

@simonrosenberg

Copy link
Copy Markdown
Member Author

Pushed 17f899f8: cut the SecretRegistry additions (track_exported_value + _MIN_TRACKED_VALUE_LENGTH) as out of scope for an ACP-masking PR. The masking now relies on the registry channel's existing auto-seeding via get_secret_value (which #3464 routes provider creds + agent_context.secrets through). The deprecated acp_env and the canvas-local drain are no longer seeded for masking — out of scope here (transitional channels), which also removes the min-length-guard / over-masking discussion. Masking behavior, tool-call/title/thought coverage, and the persisted-turn re-mask are unchanged. Docs follow-up for the acp_env deprecation: OpenHands/docs#544.

all-hands-bot commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified ACP secret masking behavior before/after: the base branch leaked an injected secret through ACP streamed text, tool-call fields, and the finalized turn message, while this PR masks those values with <secret-hidden>.

Does this PR achieve its stated goal?

Yes. The PR sets out to prevent injected ACP secrets from landing in streamed text/tool-call event output and persisted turn text. I exercised the ACP session-update path with real acp.schema message/thought/tool-call objects and a SecretRegistry seeded through update_secrets() + get_secret_value(): base leaked supersecret, while the PR produced leaked_secret=False, masked streamed text/tool-call fields, and masked a split secret after finalizing the turn.

Phase Result
Environment Setup make build completed successfully; dependencies installed via uv sync --dev.
CI Status ⏳ Observed 27 successful, 1 skipped, 4 pending, 0 failing checks via gh pr checks.
Functional Verification ✅ Before/after SDK execution confirmed the security leak is fixed for the exercised ACP paths.
Functional Verification

Test 1: ACP streamed text and tool-call event fields are masked

Step 1 — Reproduce / establish baseline (without the fix):
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_mask_qa.py with a script that sends ACP AgentMessageChunk, AgentThoughtChunk, ToolCallStart, and ToolCallProgress objects containing supersecret through _OpenHandsACPBridge after seeding SecretRegistry:

leaked_secret= True
mask_count= 0
streamed_tokens= ['message token=supersecret']
start_event_raw_input= {'argv': ['echo', 'supersecret'], 'command': 'echo supersecret'}
terminal_event_raw_output= {'stdout': 'done supersecret'}

This confirms the bug exists on base: streamed tokens and tool-call event fields preserve the raw injected secret.

Step 2 — Apply the PR's changes:
Checked out fix-acp-mask-injected-secrets at 17f899f88d817afbab31d00fe3af684feec5b0f3.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_mask_qa.py:

leaked_secret= False
mask_count= 19
streamed_tokens= ['message token=<secret-hidden>']
start_event_raw_input= {'argv': ['echo', '<secret-hidden>'], 'command': 'echo <secret-hidden>'}
terminal_event_raw_output= {'stdout': 'done <secret-hidden>'}

This shows the changed behavior works: streamed message text and nested tool-call raw input/output are masked before reaching the emitted/accumulated ACP event data.

Test 2: Finalized ACP turn masks a secret split across chunks

Step 1 — Reproduce / establish baseline (without the fix):
The same baseline run populated an ACP agent client's accumulated response with joined split super + secret value, then finalized the turn:

finish_messages= ['joined split supersecret value']

This confirms the persistence-boundary bug exists on base: the joined final response reassembles and leaks the secret.

Step 2 — Apply the PR's changes:
Checked out fix-acp-mask-injected-secrets at 17f899f88d817afbab31d00fe3af684feec5b0f3.

Step 3 — Re-run with the fix in place:
The PR run produced:

finish_messages= ['joined split <secret-hidden> value']

This confirms the final turn text is re-masked after joining chunks, covering the split-secret case described in the PR.

Issues Found

None.


This review was created by an AI agent (OpenHands) on behalf of the user.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This PR addresses a real security vulnerability: injected secrets in the ACP subprocess environment could leak back in cleartext through the persisted event stream and live on_token / on_event sinks. The fix is well-structured with a layered approach:

  • Per-chunk masking at ingestion in session_update (best-effort, handles most cases)
  • Per-frame re-masking of tool-call entries (handles cumulative progress frames)
  • Join-boundary re-masking in _finalize_successful_turn (catches secrets split across chunk boundaries)

The recursive _mask_json_value helper handles arbitrary JSON-structured tool call payloads correctly. Test coverage is comprehensive and the defensive error handling in _mask_value matches the existing masking contract.

One security gap remains that should be addressed before merging.


🔴 Critical: ask_agent() fork path does not mask accumulated text

The _fork_accumulated_text list (used by ask_agent()) is populated at line 693 without masking:

# Line 693 — unchanged by this PR, but within scope of the fix
self._fork_accumulated_text.append(update.content.text)  # ← not masked

The result is returned directly to the caller at ask_agent() line 2656:

result = "".join(client._fork_accumulated_text)  # ← may contain plaintext secret

While the fork result is not persisted to the event stream in the same way as the main turn, ask_agent() is documented as a "sidebar reply" that is typically displayed in the UI and relayed to network consumers. A secret echoed by the ACP server in the fork session would reach whoever receives the ask_agent() return value in cleartext.

Fix: Apply self._mask_value() when appending to _fork_accumulated_text:

if self._fork_session_id is not None and session_id == self._fork_session_id:
    if isinstance(update, AgentMessageChunk):
        if isinstance(update.content, TextContentBlock):
            self._fork_accumulated_text.append(self._mask_value(update.content.text))
    return

Note that self.mask is already set by _reset_client_for_turn() before step()/astep() runs, and ask_agent() runs concurrently on the shared self._client — so self.mask is already available when the fork path executes.

A test mirroring test_message_chunk_masked_in_relay_and_accumulation but routing through _fork_session_id would complete coverage.


🟡 Acknowledged residual risk: per-chunk on_token leaks split secrets

The code comment correctly documents that a secret split across two streaming chunks will reach on_token in cleartext (each half doesn't match the pattern alone). The join-boundary re-mask catches it before persistence. This is a reasonable trade-off for streaming UX. Worth keeping the comment as documentation of the known limitation.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py
The regular terminal masks every observation via
secret_registry.mask_secrets_in_output(); ACP masked nothing, so a
subprocess that echoed an injected credential (env, a failing curl
printing an Authorization header) landed in the persisted event stream
and WebSocket relay in cleartext — the largest credential-leak surface,
widened on cloud where subscription tokens are injected.

Bind the registry masker onto the bridge once in _start_acp_server
(conversation-stable; a pure read of _exported_values with none of the
cross-thread/state-lock hazards that make on_event/on_token per-turn)
and route ACP output through it before emitting:
- streamed AgentMessageChunk text (on_token + accumulated_text) and
  AgentThoughtChunk text
- tool-call title/raw_input/raw_output/content, masked at ingestion so
  the started event, the single terminal event (#3465 dedup), and the
  flush/supersede path all carry masked values and the accumulator never
  holds plaintext
- the ask_agent() fork accumulator + its joined return value, which is
  relayed to the caller/UI (binding the masker once is what makes this
  reachable even when ask_agent runs while no turn is active)
- the joined response_text/thought_text in _finalize_successful_turn
  (persistence boundary), which also catches secrets split across chunks

The mask set is the registry's _exported_values, seeded by
get_secret_value when _start_acp_server reads state.secret_registry —
the canonical channel that #3464 routes provider creds and (on the
caller/cloud path) agent_context.secrets through. No new SecretRegistry
API; the deprecated acp_env / canvas-local drain are out of scope.

A failing masker falls back to the original value (crash-safe), matching
the terminal tool contract. The live on_token relay still forwards a
secret split mid-value across chunks as fragments; the persisted stream
is fully masked (follow-up: OpenHands/agent-canvas#1049).

Implements pre-cloud refactor P0-2 (OpenHands/agent-canvas#1023).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@simonrosenberg simonrosenberg force-pushed the fix-acp-mask-injected-secrets branch from 17f899f to ec5d309 Compare June 3, 2026 09:52
@simonrosenberg

Copy link
Copy Markdown
Member Author

Addressed the CHANGES_REQUESTED in ec5d309: masked the ask_agent() fork path (🔴) and added the fork-return re-mask (🟡). Key design point — I bound the masker once in _start_acp_server rather than per-turn, because ask_agent() can run while no turn is active (where the per-turn mask would be None); this makes fork masking effective regardless of timing and also drops the per-turn threading. Both threads resolved.

Copy link
Copy Markdown
Collaborator

🔍 Review in progress…

We are performing the review through OpenHands Cloud Automation. You can log in and view the conversation here.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

ACP secret masking worked across streamed text, tool-call event fields, and persisted FinishAction output when exercised with real ACP schema updates and an ACPAgent.step() turn.

Does this PR achieve its stated goal?

Yes. The goal was to prevent ACP-emitted injected secrets from reaching streamed token output, tool-call output, or persisted turn output in cleartext. On the base branch, the same verification script exposed supersecret in all three sinks; on PR commit ec5d30926bf9e7a3e30a6f2aed37a653ba97524a, those values were replaced with <secret-hidden>, including a secret split across two response chunks and only reassembled at finalization.

Phase Result
Environment Setup make build completed successfully and installed project dependencies.
CI Status ⏳ Observed 20 passing, 8 pending, and 3 skipped checks; no failing checks at observation time.
Functional Verification ✅ Baseline leak reproduced; PR masked the exercised stream, tool-call, and persisted outputs.
Functional Verification

Test 1: ACP stream, tool-call events, and persisted turn output

I used a temporary SDK script (/tmp/qa_acp_secret_masking.py) that:

  • sends real acp.schema.AgentMessageChunk, ToolCallStart, and ToolCallProgress updates through _OpenHandsACPBridge.session_update();
  • captures the same on_token / on_event sinks that are relayed or persisted by the SDK; and
  • runs ACPAgent.step() with a simulated ACP turn whose final response is split as super + secret before the persisted FinishAction is emitted.

Step 1 — Reproduce baseline without the fix:
Ran git checkout --quiet origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_secret_masking.py:

STREAM_TOKEN= visible token: supersecret
STREAM_ACCUMULATED= visible token: supersecret
TOOL_START_TITLE= Running echo supersecret
TOOL_START_RAW_INPUT= {'command': 'echo supersecret', 'nested': ['arg=supersecret']}
TOOL_TERMINAL_RAW_OUTPUT= completed with supersecret
TOOL_ACCUMULATED= {'tool_call_id': 'tc-1', 'title': 'Running echo supersecret', 'tool_kind': 'execute', 'status': 'completed', 'raw_input': {'command': 'echo supersecret', 'nested': ['arg=supersecret']}, 'raw_output': 'completed with supersecret', 'content': None}
FINISH_MESSAGE= split value: supersecret done

This confirms the bug: the ACP bridge and final ACP turn output could expose the injected secret in cleartext.

Step 2 — Apply the PR's changes:
Checked out fix-acp-mask-injected-secrets at ec5d30926bf9e7a3e30a6f2aed37a653ba97524a.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_secret_masking.py:

STREAM_TOKEN= visible token: <secret-hidden>
STREAM_ACCUMULATED= visible token: <secret-hidden>
TOOL_START_TITLE= Running echo <secret-hidden>
TOOL_START_RAW_INPUT= {'command': 'echo <secret-hidden>', 'nested': ['arg=<secret-hidden>']}
TOOL_TERMINAL_RAW_OUTPUT= completed with <secret-hidden>
TOOL_ACCUMULATED= {'tool_call_id': 'tc-1', 'title': 'Running echo <secret-hidden>', 'tool_kind': 'execute', 'status': 'completed', 'raw_input': {'command': 'echo <secret-hidden>', 'nested': ['arg=<secret-hidden>']}, 'raw_output': 'completed with <secret-hidden>', 'content': None}
FINISH_MESSAGE= split value: <secret-hidden> done

This shows the fix works for the exercised behavior: streamed text and accumulated text are masked, nested tool-call JSON fields are masked before event emission/accumulation, and the final persisted message catches the split secret at the join boundary.

Issues Found

None.

This PR review was created by an AI agent (OpenHands) on behalf of the user.

@all-hands-bot all-hands-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — fix(acp): mask injected secrets in tool-call output and streamed text

🟢 Good taste — Well-structured security fix with proper defense-in-depth layering and solid test coverage.


Summary

This PR closes a real P0 gap: the ACP agent path previously forwarded subprocess output verbatim into the persisted event stream and live on_token/on_event sinks, while the regular terminal tool always ran output through secret_registry.mask_secrets_in_output. The fix routes ACP output through the same masker with three complementary tiers:

  1. Per-chunk masking at ingestionAgentMessageChunk, AgentThoughtChunk, and fork-session text are masked before touching accumulated_text / accumulated_thoughts / _fork_accumulated_text.
  2. Per-frame tool-call masking_mask_tool_call_entry mutates the accumulator entry in place so that the start event, every progress frame, the flush path, and the supersede path (_cancel_inflight_tool_calls) all carry masked values without needing separate masking in each emitter.
  3. Join-boundary re-masking_finalize_successful_turn and the ask_agent() fork join both re-mask the concatenated string, catching any secret split across two consecutive chunks.

The recursive _mask_json_value helper correctly handles string, dict, list, and scalar leaves. The defensive _mask_value fallback (log at DEBUG, return original on exception) matches the terminal tool masking contract and avoids crashing the agent on a broken masker.


Note on the prior CHANGES_REQUESTED review

The existing CHANGES_REQUESTED review cites a "🔴 Critical" gap — that the ask_agent() fork path does not mask _fork_accumulated_text. This issue is already addressed by the current PR. The fix is present in two places:

  • At ingestion (session_update, fork branch): self._fork_accumulated_text.append(self._mask_value(update.content.text))
  • At the join boundary (ask_agent(), fork result): result = client._mask_value("".join(client._fork_accumulated_text))

A dedicated test (test_fork_session_text_masked) covers the fork ingestion path. The CHANGES_REQUESTED verdict no longer reflects the current code; this PR is ready for re-evaluation.


Minor observations (non-blocking)

1. Stale comment in _OpenHandsACPBridge.__init__ — "set per turn" contradicts the actual lifetime

The comment at self.mask = None (lines 597–602) says "set per turn by ACPAgent", but mask is bound once in _start_acp_server for the full conversation lifetime and explicitly not cleared by reset(). The _reset_client_for_turn docstring correctly says "The secret masker is bound once in _start_acp_server (conversation-stable), not here." The __init__ comment should match that wording.

2. reset() silently skips mask — intent not documented

reset() clears on_token, on_event, and on_activity, but says nothing about mask. A future contributor cannot distinguish "forgotten" from "by design" without reading _start_acp_server and the test. A one-liner like # mask is conversation-lifetime (bound once in _start_acp_server) — NOT cleared here makes the invariant self-documenting.


Known / acknowledged residual gap

The live on_token relay can still forward individual chunks of a secret that straddles a streaming frame boundary (each fragment alone does not match the pattern; only the join reassembles it). The PR description and inline comments correctly document this and link a follow-up issue. The join-boundary re-mask fully protects the persisted stream.


Tests

Coverage is comprehensive: TestMaskJsonValue (scalar, nested dict/list), TestACPBridgeMasking (message chunk, thought chunk, tool-call start + accumulator, tool-call progress + terminal event, no-op when mask is None, error fallback, reset() preserves mask, fork-session text), TestACPStepMasksPersistedTurn (end-to-end: secret split across chunks is caught at the persistence boundary). All tests exercise real code paths without excessive mocking.


[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

Changes are additive and entirely within the ACP agent path. No existing API surfaces are modified. The defensive fallback in _mask_value prevents the masking logic from breaking the agent on unexpected errors.

VERDICT:Worth merging — security fix is correct and complete; the two comment-only nits above can be addressed in a follow-up or a quick fixup commit.

KEY INSIGHT: Setting mask once at conversation start (in _start_acp_server) rather than per-turn avoids a race between ask_agent() fork sessions (which run concurrently on the shared _client) and the main turn's reset() / callback-wiring cycle — a subtle but important design choice.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

@simonrosenberg simonrosenberg merged commit 2f83959 into main Jun 3, 2026
42 of 43 checks passed
@simonrosenberg simonrosenberg deleted the fix-acp-mask-injected-secrets branch June 3, 2026 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

acp About ACP

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants