You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The agent subprocess runs with permissionMode: "bypassPermissions" + allowDangerouslySkipPermissions: true and Bash in allowedTools (src/core/executor.ts:259-261). Destructive-action prohibitions — "NEVER force push", "NEVER call gh pr merge", "NEVER push to the base branch" — are encoded only as system-prompt directives: src/core/prompt-builder.ts:193, src/core/prompt-builder.ts:392, src/core/prompt-builder.ts:594, src/core/prompt-builder.ts:755, plus per-handler reinforcements at src/workflows/handlers/resolve.ts:392-393 and src/workflows/handlers/review.ts:326. There is no runtime gate that physically blocks the agent from executing git push --force, git reset --hard, gh pr merge, git branch -D, or the GraphQL mergePullRequest / mergeBranch mutations once the LLM decides to call Bash.
The repo's scripts/check-no-destructive-actions.ts (FR-009 static guard, wired via bun run check:no-destructive) carries the canonical forbidden-pattern set (lines 50-65) and CI-fails when our source code writes such commands inside src/workflows/ship/ or src/daemon/scoped-*-executor.ts. That guard protects us from shipping new bypass paths — but it cannot stop the runtime LLM from typing one of those exact commands into a Bash tool call after a successful prompt injection from cloned-PR content, an attacker-controlled comment body, or an MCP tool result.
The Claude Agent SDK has shipped a first-class fix for this since v0.3.0: the hooks option, with a PreToolUse matcher, lets the executor intercept every Bash tool call, scan tool_input.command, and return permissionDecision: "deny" with a structured reason that flows back into the model as a tool-result error (see the SDK docs and the example in the project's own context7 query above). hooks is the only queryOptions field documented for runtime tool-call gating, and a grep of src/ confirms zero current callers (HookCallback, PreToolUseHookInput, hookSpecificOutput are unimported). Wiring a single PreToolUse: [{ matcher: "Bash", hooks: [...] }] into the queryOptions constructed at src/core/executor.ts:257-330 — backed by reusing the existing FORBIDDEN array from scripts/check-no-destructive-actions.ts (lift it into src/utils/forbidden-bash.ts so the static guard and the runtime hook share one source of truth) — closes the gap with no architectural change and no new dependency.
Diagram
flowchart TB
subgraph Now["Current: prompt-only barrier"]
direction TB
P1[Prompt: NEVER force push<br/>NEVER gh pr merge]:::warn
A1[Claude Agent SDK<br/>bypassPermissions + Bash allowed]:::neutral
T1{Attacker-influenced<br/>context overrides directive?}:::warn
B1[Bash: git push --force]:::danger
G1[GitHub remote ref rewritten]:::danger
OK1[Safe path]:::ok
P1 --> A1 --> T1
T1 -- yes --> B1 --> G1
T1 -- no --> OK1
end
subgraph Hook["Proposed: PreToolUse hook gate"]
direction TB
P2[Same prompt]:::warn
A2[SDK with hooks.PreToolUse<br/>matcher: Bash]:::neutral
B2[Bash: git push --force]:::danger
H2{Hook scans tool_input.command<br/>against FORBIDDEN regex set}:::ok
D2[permissionDecision: deny<br/>reason posted to model]:::ok
OK2[Tool call blocked<br/>agent recovers or fails closed]:::ok
ALLOW2[allow]:::ok
P2 --> A2 --> B2 --> H2
H2 -- match --> D2 --> OK2
H2 -- no match --> ALLOW2
end
classDef neutral fill:#1f4e79,color:#ffffff,stroke:#0d2a44,stroke-width:1px
classDef warn fill:#9a6700,color:#ffffff,stroke:#5c3d00,stroke-width:1px
classDef danger fill:#a40e26,color:#ffffff,stroke:#5c0815,stroke-width:1px
classDef ok fill:#196f3d,color:#ffffff,stroke:#0d3a20,stroke-width:1px
Loading
Rationale
The CLAUDE.md "Security invariants" section already enumerates four prompt-injection-hardening contracts (subprocess env allowlist, output secret-strip, input sanitization, structured-output parsing). A fifth contract — runtime tool-call gating for destructive Bash invocations — is the natural next layer, and it is the only one of the five that has a vendor-supported SDK API rather than a custom chokepoint we maintain ourselves. Crucially, defense-in-depth is the explicit pattern this codebase already chose for idempotency (Valkey claim + idx_workflow_runs_inflight partial-unique index, see CLAUDE.md "Idempotency" section) and for output secret-stripping (regex pass + LLM scanner + deletion-only gate, src/utils/github-output-guard.ts). The current destructive-action story is the odd one out: a single prose-level barrier with no runtime backstop.
The blast radius of a single successful prompt-injected git push --force-with-lease against a PR head is meaningful — collaborators' in-flight commits get overwritten, the PR review history detaches, and force-with-lease itself is one of the FORBIDDEN patterns the static guard explicitly forbids us from emitting (scripts/check-no-destructive-actions.ts:51). A gh pr merge injection violates FR-017 ("merging is a human action") and bypasses any branch-protection check that allows the GitHub App identity. Both are credible after the recent .mcp.json auto-load bypass (#196) and .claude/settings.json SessionStart bypass (#191) showed the cloned PR tree is an active prompt-injection surface against this exact subprocess; both were closed with SDK-option fixes (strictMcpConfig: true, settingSources: []) inside the same queryOptions object that this proposal extends.
Adoption cost is low: one new field on the queryOptions literal, one new file src/utils/forbidden-bash.ts exporting the shared FORBIDDEN regex array, a one-line refactor in scripts/check-no-destructive-actions.ts to import from it, and a focused unit test exercising representative deny inputs (force push variants, gh pr merge, git reset --hard, the GraphQL mutation names already in scope). The hook is per-query() invocation, so it composes with the existing useCacheableLayout pivot and the per-call model override without rework.
References
Internal:
src/core/executor.ts:257-330 — queryOptions literal where hooks: { PreToolUse: [...] } would be added (between mcpServers at 278 and systemPrompt at 291).
src/core/executor.ts:259-268 — current permissionMode: "bypassPermissions", allowDangerouslySkipPermissions: true, allowedTools, disallowedTools: ["ToolSearch"] surface that the hook complements.
src/core/prompt-builder.ts:193, src/core/prompt-builder.ts:392, src/core/prompt-builder.ts:594, src/core/prompt-builder.ts:755 — current prompt-only "NEVER force push" directives that the hook would back with runtime enforcement.
src/workflows/handlers/resolve.ts:392-393, src/workflows/handlers/review.ts:326 — per-handler prompt-only reinforcements ("NEVER call gh pr merge", "NEVER push to the base branch").
CLAUDE.md "Security invariants (prompt-injection hardening)" section — the four-contract list this finding extends with a fifth, runtime tool-call gating.
CLAUDE.md "CI/CD Pipeline" entry for check:no-destructive — describes the existing static-guard layer this finding makes runtime.
External:
Claude Agent SDK — Intercept and control agent behavior with hooks — official SDK docs for PreToolUse matchers, deny-decision return shape, and the security rationale ("dangerous commands are physically blocked at the execution layer; the agent cannot bypass, forget, or reason its way around it").
Claude Agent SDK TypeScript — Query Function Options — hooks option signature Partial<Record<HookEvent, HookCallbackMatcher[]>>, available since the v0.3.x line pinned in package.json (@anthropic-ai/claude-agent-sdk: ^0.3.0).
Claude Code Hooks: 6 Production Patterns (2026) — production pattern catalog for PreToolUse Bash-command pattern matching including rm -rf /, DROP TABLE, curl | bash, the exact class of guard proposed here.
Suggested Next Steps
Lift FORBIDDEN into src/utils/forbidden-bash.ts as an exported readonly array of { pattern: RegExp; description: string } so the static guard and the runtime hook consume one source of truth. Update scripts/check-no-destructive-actions.ts:50 to import it.
Add forbiddenBashHook: HookCallback in a new src/core/hooks/forbidden-bash.ts that casts input to PreToolUseHookInput, extracts tool_input.command (string), tests each FORBIDDEN pattern, and returns { hookSpecificOutput: { hookEventName: "PreToolUse", permissionDecision: "deny", permissionDecisionReason: "Blocked by forbidden-bash hook: ${description}" } } on first match. Default-allow with {}.
Wire it into queryOptions at src/core/executor.ts:278 as hooks: { PreToolUse: [{ matcher: "Bash", hooks: [forbiddenBashHook] }] }. Log a structured agent.hook.denied event (delivery-correlated child logger) on every deny so the operator sees runtime gating activity, mirroring the llm_scanner_substitution_rejected log convention from src/utils/github-output-guard.ts.
Add a focused unit test at src/core/hooks/forbidden-bash.test.ts covering: git push --force, git push -f, git push --force-with-lease, git reset --hard HEAD~1, gh pr merge 123 --squash, plus three benign commands (git push origin HEAD, git status, gh pr checks 5) that MUST allow. Use the existing colocated-test pattern (issue fix(testing): scripts/test-isolated.sh glob silently excludes 4 src/**/*.test.ts files from CI #201) so CI picks it up.
Extend CLAUDE.md "Security invariants" with a fifth contract — Runtime destructive-Bash gate (src/core/hooks/forbidden-bash.ts) — and cross-link from the check:no-destructive bullet in "CI/CD Pipeline" so future contributors see both layers as paired.
Update docs/build/architecture.md and docs/operate/configuration.md alongside the code change (per the CLAUDE.md "Documentation" doc-sync rule) so the doc-citation CI gate stays green.
Areas Evaluated
Read src/core/executor.ts in full to confirm current queryOptions shape, permissionMode, allowedTools/disallowedTools surface, and absence of any hooks: field on the literal.
Read src/core/prompt-builder.ts:185-200 to confirm the prompt-only NEVER force push directive at line 193 is part of the standard commitInstructions block emitted for every PR job.
Read src/workflows/handlers/resolve.ts:385-399 to confirm the gh pr merge / base-branch push prohibitions at lines 392-393 are also prompt-only.
Read scripts/check-no-destructive-actions.ts:1-80 to confirm the static guard is source-code-only (greps our .ts files) and lifts the canonical FORBIDDEN regex array.
Grepped src/ for HookCallback|PreToolUse|hookSpecificOutput and hooks: — zero matches, confirming the SDK hooks API is unused.
Queried context7 (/nothflare/claude-agent-sdk-docs) for the current TS query() options surface, confirmed hooks is documented and matches the v0.3.x line pinned in package.json.
Cross-checked existing closed agent-sdk issue (settingSources [] / #191) to ensure no overlap; this finding addresses a distinct, complementary attack surface (Bash tool call) using the same queryOptions extension pattern.
WebSearch'd current Claude Agent SDK hook production patterns (2026) to confirm the PreToolUse + matcher: "Bash" + deny-on-pattern approach is the recommended SDK-native idiom for exactly this class of guard.
Generated by the scheduled research action on 2026-06-08
Finding
The agent subprocess runs with
permissionMode: "bypassPermissions"+allowDangerouslySkipPermissions: trueandBashinallowedTools(src/core/executor.ts:259-261). Destructive-action prohibitions — "NEVER force push", "NEVER callgh pr merge", "NEVER push to the base branch" — are encoded only as system-prompt directives:src/core/prompt-builder.ts:193,src/core/prompt-builder.ts:392,src/core/prompt-builder.ts:594,src/core/prompt-builder.ts:755, plus per-handler reinforcements atsrc/workflows/handlers/resolve.ts:392-393andsrc/workflows/handlers/review.ts:326. There is no runtime gate that physically blocks the agent from executinggit push --force,git reset --hard,gh pr merge,git branch -D, or the GraphQLmergePullRequest/mergeBranchmutations once the LLM decides to call Bash.The repo's
scripts/check-no-destructive-actions.ts(FR-009 static guard, wired viabun run check:no-destructive) carries the canonical forbidden-pattern set (lines 50-65) and CI-fails when our source code writes such commands insidesrc/workflows/ship/orsrc/daemon/scoped-*-executor.ts. That guard protects us from shipping new bypass paths — but it cannot stop the runtime LLM from typing one of those exact commands into aBashtool call after a successful prompt injection from cloned-PR content, an attacker-controlled comment body, or an MCP tool result.The Claude Agent SDK has shipped a first-class fix for this since v0.3.0: the
hooksoption, with aPreToolUsematcher, lets the executor intercept everyBashtool call, scantool_input.command, and returnpermissionDecision: "deny"with a structured reason that flows back into the model as a tool-result error (see the SDK docs and the example in the project's own context7 query above).hooksis the onlyqueryOptionsfield documented for runtime tool-call gating, and agrepofsrc/confirms zero current callers (HookCallback,PreToolUseHookInput,hookSpecificOutputare unimported). Wiring a singlePreToolUse: [{ matcher: "Bash", hooks: [...] }]into thequeryOptionsconstructed atsrc/core/executor.ts:257-330— backed by reusing the existingFORBIDDENarray fromscripts/check-no-destructive-actions.ts(lift it intosrc/utils/forbidden-bash.tsso the static guard and the runtime hook share one source of truth) — closes the gap with no architectural change and no new dependency.Diagram
flowchart TB subgraph Now["Current: prompt-only barrier"] direction TB P1[Prompt: NEVER force push<br/>NEVER gh pr merge]:::warn A1[Claude Agent SDK<br/>bypassPermissions + Bash allowed]:::neutral T1{Attacker-influenced<br/>context overrides directive?}:::warn B1[Bash: git push --force]:::danger G1[GitHub remote ref rewritten]:::danger OK1[Safe path]:::ok P1 --> A1 --> T1 T1 -- yes --> B1 --> G1 T1 -- no --> OK1 end subgraph Hook["Proposed: PreToolUse hook gate"] direction TB P2[Same prompt]:::warn A2[SDK with hooks.PreToolUse<br/>matcher: Bash]:::neutral B2[Bash: git push --force]:::danger H2{Hook scans tool_input.command<br/>against FORBIDDEN regex set}:::ok D2[permissionDecision: deny<br/>reason posted to model]:::ok OK2[Tool call blocked<br/>agent recovers or fails closed]:::ok ALLOW2[allow]:::ok P2 --> A2 --> B2 --> H2 H2 -- match --> D2 --> OK2 H2 -- no match --> ALLOW2 end classDef neutral fill:#1f4e79,color:#ffffff,stroke:#0d2a44,stroke-width:1px classDef warn fill:#9a6700,color:#ffffff,stroke:#5c3d00,stroke-width:1px classDef danger fill:#a40e26,color:#ffffff,stroke:#5c0815,stroke-width:1px classDef ok fill:#196f3d,color:#ffffff,stroke:#0d3a20,stroke-width:1pxRationale
The CLAUDE.md "Security invariants" section already enumerates four prompt-injection-hardening contracts (subprocess env allowlist, output secret-strip, input sanitization, structured-output parsing). A fifth contract — runtime tool-call gating for destructive Bash invocations — is the natural next layer, and it is the only one of the five that has a vendor-supported SDK API rather than a custom chokepoint we maintain ourselves. Crucially, defense-in-depth is the explicit pattern this codebase already chose for idempotency (Valkey claim +
idx_workflow_runs_inflightpartial-unique index, see CLAUDE.md "Idempotency" section) and for output secret-stripping (regex pass + LLM scanner + deletion-only gate,src/utils/github-output-guard.ts). The current destructive-action story is the odd one out: a single prose-level barrier with no runtime backstop.The blast radius of a single successful prompt-injected
git push --force-with-leaseagainst a PR head is meaningful — collaborators' in-flight commits get overwritten, the PR review history detaches, andforce-with-leaseitself is one of theFORBIDDENpatterns the static guard explicitly forbids us from emitting (scripts/check-no-destructive-actions.ts:51). Agh pr mergeinjection violates FR-017 ("merging is a human action") and bypasses any branch-protection check that allows the GitHub App identity. Both are credible after the recent.mcp.jsonauto-load bypass (#196) and.claude/settings.jsonSessionStart bypass (#191) showed the cloned PR tree is an active prompt-injection surface against this exact subprocess; both were closed with SDK-option fixes (strictMcpConfig: true,settingSources: []) inside the samequeryOptionsobject that this proposal extends.Adoption cost is low: one new field on the
queryOptionsliteral, one new filesrc/utils/forbidden-bash.tsexporting the sharedFORBIDDENregex array, a one-line refactor inscripts/check-no-destructive-actions.tsto import from it, and a focused unit test exercising representative deny inputs (force push variants,gh pr merge,git reset --hard, the GraphQL mutation names already in scope). The hook is per-query()invocation, so it composes with the existinguseCacheableLayoutpivot and the per-callmodeloverride without rework.References
Internal:
src/core/executor.ts:257-330—queryOptionsliteral wherehooks: { PreToolUse: [...] }would be added (betweenmcpServersat 278 andsystemPromptat 291).src/core/executor.ts:259-268— currentpermissionMode: "bypassPermissions",allowDangerouslySkipPermissions: true,allowedTools,disallowedTools: ["ToolSearch"]surface that the hook complements.scripts/check-no-destructive-actions.ts:50-65— canonicalFORBIDDENregex array (force-push variants,git reset --hard,git branch -D,git filter-branch,git filter-repo,git replace,gh pr merge,mergePullRequest,mergeBranch) to share with the runtime hook.src/core/prompt-builder.ts:193,src/core/prompt-builder.ts:392,src/core/prompt-builder.ts:594,src/core/prompt-builder.ts:755— current prompt-only "NEVER force push" directives that the hook would back with runtime enforcement.src/workflows/handlers/resolve.ts:392-393,src/workflows/handlers/review.ts:326— per-handler prompt-only reinforcements ("NEVER callgh pr merge", "NEVER push to the base branch").CLAUDE.md"Security invariants (prompt-injection hardening)" section — the four-contract list this finding extends with a fifth, runtime tool-call gating.CLAUDE.md"CI/CD Pipeline" entry forcheck:no-destructive— describes the existing static-guard layer this finding makes runtime.External:
PreToolUsematchers, deny-decision return shape, and the security rationale ("dangerous commands are physically blocked at the execution layer; the agent cannot bypass, forget, or reason its way around it").hooksoption signaturePartial<Record<HookEvent, HookCallbackMatcher[]>>, available since the v0.3.x line pinned inpackage.json(@anthropic-ai/claude-agent-sdk: ^0.3.0).PreToolUseBash-command pattern matching includingrm -rf /,DROP TABLE,curl | bash, the exact class of guard proposed here.Suggested Next Steps
FORBIDDENintosrc/utils/forbidden-bash.tsas an exportedreadonlyarray of{ pattern: RegExp; description: string }so the static guard and the runtime hook consume one source of truth. Updatescripts/check-no-destructive-actions.ts:50to import it.forbiddenBashHook: HookCallbackin a newsrc/core/hooks/forbidden-bash.tsthat castsinputtoPreToolUseHookInput, extractstool_input.command(string), tests eachFORBIDDENpattern, and returns{ hookSpecificOutput: { hookEventName: "PreToolUse", permissionDecision: "deny", permissionDecisionReason: "Blocked by forbidden-bash hook: ${description}" } }on first match. Default-allow with{}.queryOptionsatsrc/core/executor.ts:278ashooks: { PreToolUse: [{ matcher: "Bash", hooks: [forbiddenBashHook] }] }. Log a structuredagent.hook.deniedevent (delivery-correlated child logger) on every deny so the operator sees runtime gating activity, mirroring thellm_scanner_substitution_rejectedlog convention fromsrc/utils/github-output-guard.ts.src/core/hooks/forbidden-bash.test.tscovering:git push --force,git push -f,git push --force-with-lease,git reset --hard HEAD~1,gh pr merge 123 --squash, plus three benign commands (git push origin HEAD,git status,gh pr checks 5) that MUST allow. Use the existing colocated-test pattern (issue fix(testing): scripts/test-isolated.sh glob silently excludes 4 src/**/*.test.ts files from CI #201) so CI picks it up.src/core/hooks/forbidden-bash.ts) — and cross-link from thecheck:no-destructivebullet in "CI/CD Pipeline" so future contributors see both layers as paired.docs/build/architecture.mdanddocs/operate/configuration.mdalongside the code change (per the CLAUDE.md "Documentation" doc-sync rule) so the doc-citation CI gate stays green.Areas Evaluated
src/core/executor.tsin full to confirm currentqueryOptionsshape,permissionMode,allowedTools/disallowedToolssurface, and absence of anyhooks:field on the literal.src/core/prompt-builder.ts:185-200to confirm the prompt-onlyNEVER force pushdirective at line 193 is part of the standardcommitInstructionsblock emitted for every PR job.src/workflows/handlers/resolve.ts:385-399to confirm thegh pr merge/ base-branch push prohibitions at lines 392-393 are also prompt-only.scripts/check-no-destructive-actions.ts:1-80to confirm the static guard is source-code-only (greps our.tsfiles) and lifts the canonicalFORBIDDENregex array.src/forHookCallback|PreToolUse|hookSpecificOutputandhooks:— zero matches, confirming the SDK hooks API is unused./nothflare/claude-agent-sdk-docs) for the current TSquery()options surface, confirmedhooksis documented and matches the v0.3.x line pinned inpackage.json.settingSources []/#191) to ensure no overlap; this finding addresses a distinct, complementary attack surface (Bash tool call) using the samequeryOptionsextension pattern.PreToolUse+ matcher: "Bash" + deny-on-pattern approach is the recommended SDK-native idiom for exactly this class of guard.Generated by the scheduled research action on 2026-06-08