feat(approval-gate): standalone worker for human-in-the-loop function approval#260
feat(approval-gate): standalone worker for human-in-the-loop function approval#260ytallo wants to merge 14 commits into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughIntroduces the complete Changesapproval-gate Rust worker
Sequence Diagram(s)sequenceDiagram
participant Harness
participant approval_gate as approval-gate
participant policy_worker
participant state_engine
participant EventSink
Harness->>approval_gate: approval::gate (HookInput)
approval_gate->>approval_gate: validate_id / pre_policy_allow
approval_gate->>state_engine: read settings
approval_gate->>policy_worker: policy::check_permissions
policy_worker-->>approval_gate: NeedsApproval
approval_gate->>state_engine: pending::put
approval_gate-->>Harness: HookOutput::Hold
approval_gate--)EventSink: pending_created
Console->>approval_gate: approval::resolve
approval_gate->>state_engine: pending::get
approval_gate->>Harness: harness::function::resolve
approval_gate->>state_engine: pending::delete_with_gate
approval_gate--)EventSink: pending_resolved
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90+ minutes Possibly related PRs
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
skill-check — worker0 verified, 21 skipped (no docs/).
Four for four. Nicely done. |
|
Companion harness integration PR stacks on this branch: #261 (adopts this worker, retires the in-process gate). Merge this one first. |
1140eeb to
919e731
Compare
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (2)
approval-gate/README.md (2)
42-44: ⚡ Quick winConfirm cron worker requirement and sweep binding config.
The README states the sweep needs
iii-cronand references a cron trigger binding config, but does not specify the binding config key or defaults. Per the review stack (internals.md), the key isexpressionwith a 6-field cron default"0 * * * * *". If cron is not added, held calls expire only after reachingpending_timeout_ms, which may exceed user expectations.Consider adding: "The binding config key is
expression(6-field cron); see config.yaml for the default."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@approval-gate/README.md` around lines 42 - 44, The README documentation about the `iii-cron` worker requirement needs clarification on the binding configuration details. Expand the existing text around the sweep and iii-cron worker requirement to explicitly state that the binding config key is `expression` with a default 6-field cron value of `"0 * * * * *"`, and add a reference to config.yaml where this default is defined. This will help users understand how to properly configure the cron trigger binding without leaving them to infer the configuration from other documentation.
113-114: ⚡ Quick winClarify what
config.yamlcontains vs. what lives in the engine's configuration entry.Line 113-114 state "
config.yamlcarries runtime wiring only (hook binding globs/budget, sweep cron expression, per-call timeouts)" but do not specify the structure or how to edit it. This could confuse operators who need to tweak timeouts or cron schedules.Consider adding a note: "See
config.yaml(comments in the file document each field). To change deployment defaults (mode, always_allow seed), use the console's Configuration screen to edit theapproval-gateconfiguration entry; to change worker-level settings (hook timeout, sweep schedule), re-deploy with an editedconfig.yaml."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@approval-gate/README.md` around lines 113 - 114, The documentation at lines 113-114 in the README is unclear about the distinction between config.yaml settings and engine configuration entry settings, leaving operators confused about how to modify runtime parameters. Add a clarifying note after the existing text that explains: (1) config.yaml contains comments documenting each field, (2) deployment defaults (mode, always_allow seed) are edited via the console's Configuration screen for the approval-gate entry, and (3) worker-level settings (hook timeout, sweep schedule) require re-deploying with an edited config.yaml. This note should guide operators on which approach to use for different types of configuration changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@approval-gate/src/config.rs`:
- Around line 16-18: The `hook.on_error` configuration field accepts any string
value, which could allow typos or unsafe values to bypass validation and weaken
the fail-closed contract at runtime. At approval-gate/src/config.rs lines 33-35
(where on_error is parsed), add validation to constrain the accepted value to
only "fail_closed" and reject any other strings with a descriptive error during
config loading. The default_on_error function at lines 16-18 can remain as is
since it already provides the correct default. At lines 95-98 (where on_error is
used), verify that the validation at the parsing location is sufficient and no
additional runtime checks are needed, or if those lines represent another
parsing/validation point, apply the same "fail_closed" constraint enforcement
there.
In `@approval-gate/src/events.rs`:
- Around line 190-217: The code is discarding registration errors by using let _
= on both register_trigger_type calls for PENDING_CREATED and PENDING_RESOLVED,
and then unconditionally logging success. Instead of ignoring the Result from
register_trigger_type, properly handle the registration results by checking if
they succeeded before logging, or propagate any errors up the call stack. Each
RegisterTriggerType registration in the ApprovalTriggerHandler setup should have
its error checked, and the tracing::info log statement should only be reached if
all registrations succeeded.
In `@approval-gate/src/functions/gate.rs`:
- Around line 119-123: The duplicate-hold paths are returning the original
timeout duration instead of the remaining time until the persisted expiration.
In both the Ok(Some(existing)) match arm starting at line 119 and the similar
duplicate-hold path at lines 164-166, replace the calculation
(existing.expires_at - existing.pending_at).max(0) with a calculation that
computes the actual remaining time until expiration: subtract the current time
from existing.expires_at to get how much time is left before the hold expires,
ensuring the result does not go below zero with .max(0).
In `@approval-gate/src/functions/get_settings.rs`:
- Around line 29-33: The code has formatting inconsistencies that are blocking
CI checks for cargo fmt. Run cargo fmt --all to apply rustfmt normalization
across the entire codebase. This will automatically fix: (1) the import ordering
and layout in the test module at approval-gate/src/functions/get_settings.rs
lines 29-33, and (2) the imports and signature wrapping in the test helper area
at approval-gate/src/functions/list_pending.rs lines 103-112. After running the
formatter, commit the resulting formatting changes.
In `@approval-gate/src/functions/on_session_deleted.rs`:
- Around line 62-95: The test function
purges_settings_and_only_the_sessions_pending_records has formatting issues that
are causing CI to fail. Additionally, there are formatting issues in
on_turn_completed.rs. Run cargo fmt --all to automatically fix all formatting
issues in the Rust code, which will resolve the failing CI checks.
- Around line 22-30: The error handling in the on_session_deleted function
currently logs a warning when settings::clear fails but continues execution,
which silently acknowledges the session_deleted event despite incomplete
cleanup. Instead of logging and continuing when the settings::clear call fails,
propagate the error by returning it from the function so the trigger can be
retried. Keep the pending purge attempt intact but ensure the clear error is
returned rather than being suppressed.
In `@approval-gate/src/functions/resolve.rs`:
- Around line 45-67: When handling the ResolveDecision::Deny case in the resolve
function, the user_deny_envelope function applies a default reason of "Rejected
by operator." when no explicit reason is provided, but this default reason is
not being carried through to the resolved event emitted later. Extract the
effective deny reason (either from req.reason or the default used by
user_deny_envelope) and ensure that same reason string is passed to the resolved
event emission so that downstream consumers receive consistent deny reason
information in both the envelope content and the event payload.
In `@approval-gate/src/functions/sweep.rs`:
- Around line 38-44: The approval-gate/src/functions/sweep.rs file has rustfmt
formatting drift issues that are causing cargo fmt --check to fail in CI. Run
cargo fmt --all to automatically format the entire codebase according to Rust
formatting standards, which will fix the formatting violations in this file and
allow the lint and test workflows to pass.
In `@approval-gate/src/policy.rs`:
- Around line 26-31: The Rust code formatting does not comply with the project's
style standards enforced by rustfmt. Run cargo fmt --all to automatically format
all Rust code in the approval-gate workspace, which will fix the formatting
violations in policy.rs check function, session.rs, and functions/gate.rs and
resolve the CI failure.
In `@approval-gate/src/settings.rs`:
- Around line 129-133: The Rust code formatting does not comply with the
project's style standards as checked by cargo fmt. Run cargo fmt --all from the
project root to automatically format all Rust source files in the approval-gate
crate and its dependencies according to the configured formatting rules. This
will resolve all formatting failures reported by cargo fmt --check.
In `@approval-gate/src/state.rs`:
- Around line 53-65: The list function in the approval-gate/src/state.rs file
has formatting issues that do not conform to Rust's standard formatting rules
enforced by rustfmt. Run cargo fmt --all from the project root to automatically
reformat the entire codebase, including the list function and any other files
with formatting drift, to ensure the code passes the cargo fmt --check lint
requirement.
In `@approval-gate/src/testkit/engine.rs`:
- Around line 265-273: The code in the register_function call for
"policy::check_permissions" at approval-gate/src/testkit/engine.rs lines 265-273
(and similarly at lines 376-404) does not conform to Rust formatting standards,
particularly around closure and signature wrapping. Run cargo fmt --all to
automatically fix all formatting issues in the file, including the
closure/signature/call wrapping in these register_function calls and any other
formatting inconsistencies.
---
Nitpick comments:
In `@approval-gate/README.md`:
- Around line 42-44: The README documentation about the `iii-cron` worker
requirement needs clarification on the binding configuration details. Expand the
existing text around the sweep and iii-cron worker requirement to explicitly
state that the binding config key is `expression` with a default 6-field cron
value of `"0 * * * * *"`, and add a reference to config.yaml where this default
is defined. This will help users understand how to properly configure the cron
trigger binding without leaving them to infer the configuration from other
documentation.
- Around line 113-114: The documentation at lines 113-114 in the README is
unclear about the distinction between config.yaml settings and engine
configuration entry settings, leaving operators confused about how to modify
runtime parameters. Add a clarifying note after the existing text that explains:
(1) config.yaml contains comments documenting each field, (2) deployment
defaults (mode, always_allow seed) are edited via the console's Configuration
screen for the approval-gate entry, and (3) worker-level settings (hook timeout,
sweep schedule) require re-deploying with an edited config.yaml. This note
should guide operators on which approach to use for different types of
configuration changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 54f2f5d9-8b20-4b81-bfb1-fb19a127fd28
⛔ Files ignored due to path filters (1)
approval-gate/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (44)
approval-gate/Cargo.tomlapproval-gate/README.mdapproval-gate/architecture/README.mdapproval-gate/architecture/integration.mdapproval-gate/architecture/internals.mdapproval-gate/build.rsapproval-gate/config.yamlapproval-gate/iii.worker.yamlapproval-gate/src/config.rsapproval-gate/src/decision.rsapproval-gate/src/denial.rsapproval-gate/src/error.rsapproval-gate/src/events.rsapproval-gate/src/functions/add_always_allow.rsapproval-gate/src/functions/approve_always.rsapproval-gate/src/functions/clear_settings.rsapproval-gate/src/functions/gate.rsapproval-gate/src/functions/get_pending.rsapproval-gate/src/functions/get_settings.rsapproval-gate/src/functions/list_pending.rsapproval-gate/src/functions/mod.rsapproval-gate/src/functions/on_config_change.rsapproval-gate/src/functions/on_session_deleted.rsapproval-gate/src/functions/on_turn_completed.rsapproval-gate/src/functions/purge.rsapproval-gate/src/functions/remove_always_allow.rsapproval-gate/src/functions/resolve.rsapproval-gate/src/functions/set_mode.rsapproval-gate/src/functions/sweep.rsapproval-gate/src/gate_config.rsapproval-gate/src/harness.rsapproval-gate/src/lib.rsapproval-gate/src/main.rsapproval-gate/src/manifest.rsapproval-gate/src/pending.rsapproval-gate/src/policy.rsapproval-gate/src/redact.rsapproval-gate/src/session.rsapproval-gate/src/settings.rsapproval-gate/src/state.rsapproval-gate/src/testkit/engine.rsapproval-gate/src/testkit/mod.rsapproval-gate/src/types.rsapproval-gate/tests/integration.rs
| fn default_on_error() -> String { | ||
| "fail_closed".to_string() | ||
| } |
There was a problem hiding this comment.
Constrain hook.on_error to fail-closed values.
Line 34 currently accepts any string, so typos or unsafe values can pass parsing and weaken the fail-closed contract at runtime. Enforce "fail_closed" during config load.
Suggested patch
-use anyhow::{Context, Result};
+use anyhow::{ensure, Context, Result};
@@
pub fn load_config(path: &str) -> Result<WorkerConfig> {
let raw = std::fs::read_to_string(path).with_context(|| format!("read config {path}"))?;
- serde_yaml::from_str(&raw).with_context(|| format!("parse config {path}"))
+ let cfg: WorkerConfig =
+ serde_yaml::from_str(&raw).with_context(|| format!("parse config {path}"))?;
+ ensure!(
+ cfg.hook.on_error == "fail_closed",
+ "parse config {}: hook.on_error must be \"fail_closed\"",
+ path
+ );
+ Ok(cfg)
}Also applies to: 33-35, 95-98
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@approval-gate/src/config.rs` around lines 16 - 18, The `hook.on_error`
configuration field accepts any string value, which could allow typos or unsafe
values to bypass validation and weaken the fail-closed contract at runtime. At
approval-gate/src/config.rs lines 33-35 (where on_error is parsed), add
validation to constrain the accepted value to only "fail_closed" and reject any
other strings with a descriptive error during config loading. The
default_on_error function at lines 16-18 can remain as is since it already
provides the correct default. At lines 95-98 (where on_error is used), verify
that the validation at the parsing location is sufficient and no additional
runtime checks are needed, or if those lines represent another
parsing/validation point, apply the same "fail_closed" constraint enforcement
there.
| let _ = iii.register_trigger_type( | ||
| RegisterTriggerType::new( | ||
| PENDING_CREATED, | ||
| "A function call was held for human approval and its inbox record written. \ | ||
| Payload: PendingApprovalRecord (redacted args, session context, expiry). \ | ||
| Bind notification workers here.", | ||
| ApprovalTriggerHandler { | ||
| set: sets.created.clone(), | ||
| }, | ||
| ) | ||
| .trigger_request_format::<BindingConfig>(), | ||
| ); | ||
| let _ = iii.register_trigger_type( | ||
| RegisterTriggerType::new( | ||
| PENDING_RESOLVED, | ||
| "A pending approval left the inbox (outcome: allow | deny | timeout | aborted). \ | ||
| Emitted exactly once per record; lets UIs clear badges.", | ||
| ApprovalTriggerHandler { | ||
| set: sets.resolved.clone(), | ||
| }, | ||
| ) | ||
| .trigger_request_format::<BindingConfig>(), | ||
| ); | ||
|
|
||
| tracing::info!( | ||
| trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED], | ||
| "registered trigger types" | ||
| ); |
There was a problem hiding this comment.
Do not swallow trigger-type registration failures.
Lines 190 and 202 discard registration errors, and Line 214 logs success regardless. If registration fails, subscribers cannot bind and the worker reports a misleading healthy state.
Suggested patch
pub fn register_trigger_types(iii: &Arc<III>) -> TriggerSets {
let sets = TriggerSets::new();
+ let mut all_ok = true;
- let _ = iii.register_trigger_type(
+ if let Err(e) = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_CREATED,
@@
)
.trigger_request_format::<BindingConfig>(),
- );
- let _ = iii.register_trigger_type(
+ ) {
+ all_ok = false;
+ tracing::warn!(trigger_type = PENDING_CREATED, error = %e, "failed to register trigger type");
+ }
+ if let Err(e) = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_RESOLVED,
@@
)
.trigger_request_format::<BindingConfig>(),
- );
+ ) {
+ all_ok = false;
+ tracing::warn!(trigger_type = PENDING_RESOLVED, error = %e, "failed to register trigger type");
+ }
- tracing::info!(
- trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED],
- "registered trigger types"
- );
+ if all_ok {
+ tracing::info!(trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED], "registered trigger types");
+ }
sets
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let _ = iii.register_trigger_type( | |
| RegisterTriggerType::new( | |
| PENDING_CREATED, | |
| "A function call was held for human approval and its inbox record written. \ | |
| Payload: PendingApprovalRecord (redacted args, session context, expiry). \ | |
| Bind notification workers here.", | |
| ApprovalTriggerHandler { | |
| set: sets.created.clone(), | |
| }, | |
| ) | |
| .trigger_request_format::<BindingConfig>(), | |
| ); | |
| let _ = iii.register_trigger_type( | |
| RegisterTriggerType::new( | |
| PENDING_RESOLVED, | |
| "A pending approval left the inbox (outcome: allow | deny | timeout | aborted). \ | |
| Emitted exactly once per record; lets UIs clear badges.", | |
| ApprovalTriggerHandler { | |
| set: sets.resolved.clone(), | |
| }, | |
| ) | |
| .trigger_request_format::<BindingConfig>(), | |
| ); | |
| tracing::info!( | |
| trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED], | |
| "registered trigger types" | |
| ); | |
| pub fn register_trigger_types(iii: &Arc<III>) -> TriggerSets { | |
| let sets = TriggerSets::new(); | |
| let mut all_ok = true; | |
| if let Err(e) = iii.register_trigger_type( | |
| RegisterTriggerType::new( | |
| PENDING_CREATED, | |
| "A function call was held for human approval and its inbox record written. \ | |
| Payload: PendingApprovalRecord (redacted args, session context, expiry). \ | |
| Bind notification workers here.", | |
| ApprovalTriggerHandler { | |
| set: sets.created.clone(), | |
| }, | |
| ) | |
| .trigger_request_format::<BindingConfig>(), | |
| ) { | |
| all_ok = false; | |
| tracing::warn!(trigger_type = PENDING_CREATED, error = %e, "failed to register trigger type"); | |
| } | |
| if let Err(e) = iii.register_trigger_type( | |
| RegisterTriggerType::new( | |
| PENDING_RESOLVED, | |
| "A pending approval left the inbox (outcome: allow | deny | timeout | aborted). \ | |
| Emitted exactly once per record; lets UIs clear badges.", | |
| ApprovalTriggerHandler { | |
| set: sets.resolved.clone(), | |
| }, | |
| ) | |
| .trigger_request_format::<BindingConfig>(), | |
| ) { | |
| all_ok = false; | |
| tracing::warn!(trigger_type = PENDING_RESOLVED, error = %e, "failed to register trigger type"); | |
| } | |
| if all_ok { | |
| tracing::info!(trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED], "registered trigger types"); | |
| } | |
| sets | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@approval-gate/src/events.rs` around lines 190 - 217, The code is discarding
registration errors by using let _ = on both register_trigger_type calls for
PENDING_CREATED and PENDING_RESOLVED, and then unconditionally logging success.
Instead of ignoring the Result from register_trigger_type, properly handle the
registration results by checking if they succeeded before logging, or propagate
any errors up the call stack. Each RegisterTriggerType registration in the
ApprovalTriggerHandler setup should have its error checked, and the
tracing::info log statement should only be reached if all registrations
succeeded.
| if let Err(e) = settings::clear( | ||
| deps.iii.as_ref(), | ||
| &event.session_id, | ||
| deps.cfg.state_timeout_ms, | ||
| ) | ||
| .await | ||
| { | ||
| tracing::warn!(session_id = %event.session_id, error = %e, "settings purge failed"); | ||
| } |
There was a problem hiding this comment.
Don’t acknowledge session::deleted when settings purge fails.
This path logs and continues on settings::clear failure, which can leave per-session approval state behind after deletion. Keep the pending purge attempt, but return the clear error so the trigger can be retried instead of silently accepting partial cleanup.
Suggested fix
- if let Err(e) = settings::clear(
- deps.iii.as_ref(),
- &event.session_id,
- deps.cfg.state_timeout_ms,
- )
- .await
- {
- tracing::warn!(session_id = %event.session_id, error = %e, "settings purge failed");
- }
+ let clear_res = settings::clear(
+ deps.iii.as_ref(),
+ &event.session_id,
+ deps.cfg.state_timeout_ms,
+ )
+ .await;
let purged = purge::purge_matching(deps, |r| r.session_id == event.session_id).await;
+ if let Err(e) = clear_res {
+ tracing::warn!(session_id = %event.session_id, error = %e, "settings purge failed");
+ return Err(e);
+ }
tracing::info!(session_id = %event.session_id, purged, "session deleted: approval records purged");🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@approval-gate/src/functions/on_session_deleted.rs` around lines 22 - 30, The
error handling in the on_session_deleted function currently logs a warning when
settings::clear fails but continues execution, which silently acknowledges the
session_deleted event despite incomplete cleanup. Instead of logging and
continuing when the settings::clear call fails, propagate the error by returning
it from the function so the trigger can be retried. Keep the pending purge
attempt intact but ensure the clear error is returned rather than being
suppressed.
| pub async fn check( | ||
| iii: &III, | ||
| function_id: &str, | ||
| args: &Value, | ||
| timeout_ms: u64, | ||
| ) -> PolicyOutcome { |
There was a problem hiding this comment.
Run rustfmt before merge (CI is currently red).
cargo fmt --all is required; rustfmt check is failing and blocking this PR (also reported for approval-gate/src/session.rs and approval-gate/src/functions/gate.rs).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@approval-gate/src/policy.rs` around lines 26 - 31, The Rust code formatting
does not comply with the project's style standards enforced by rustfmt. Run
cargo fmt --all to automatically format all Rust code in the approval-gate
workspace, which will fix the formatting violations in policy.rs check function,
session.rs, and functions/gate.rs and resolve the CI failure.
Source: Pipeline failures
There was a problem hiding this comment.
♻️ Duplicate comments (1)
approval-gate/src/functions/gate.rs (1)
118-123:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReturn remaining hold timeout from persisted expiry, not full duration.
In duplicate-hold branches, Line 121 and Line 164 return full-duration values instead of the remaining time until
expires_at. That can re-arm an aging/expired hold on redelivery. Use(record.expires_at - now_ms()).max(0)in both branches.Suggested fix
Ok(Some(existing)) => { return HookOutput::Hold { - pending_timeout_ms: (existing.expires_at - existing.pending_at).max(0), + pending_timeout_ms: (existing.expires_at - now_ms()).max(0), }; } @@ - Ok(Some(_prior)) => HookOutput::Hold { - pending_timeout_ms: defaults.pending_timeout_ms, - }, + Ok(Some(prior)) => HookOutput::Hold { + pending_timeout_ms: (prior.expires_at - now_ms()).max(0), + },Also applies to: 163-165
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@approval-gate/src/functions/gate.rs` around lines 118 - 123, The pending_timeout_ms calculation is returning the full duration between pending_at and expires_at instead of the remaining time until expiry, which can re-arm expired holds on redelivery. In the match branch starting at line 118 (the Ok(Some(existing)) case) and the similar branch at line 163, replace the timeout calculation to use the current time obtained from now_ms() and compute the remaining duration as (expires_at - now_ms()).max(0) instead of (expires_at - pending_at).max(0). This ensures the returned timeout represents only the time remaining until actual expiry, not the original full duration.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@approval-gate/src/functions/gate.rs`:
- Around line 118-123: The pending_timeout_ms calculation is returning the full
duration between pending_at and expires_at instead of the remaining time until
expiry, which can re-arm expired holds on redelivery. In the match branch
starting at line 118 (the Ok(Some(existing)) case) and the similar branch at
line 163, replace the timeout calculation to use the current time obtained from
now_ms() and compute the remaining duration as (expires_at - now_ms()).max(0)
instead of (expires_at - pending_at).max(0). This ensures the returned timeout
represents only the time remaining until actual expiry, not the original full
duration.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 55f54017-321f-4a26-8800-2c5200b20eb0
📒 Files selected for processing (32)
approval-gate/README.mdapproval-gate/architecture/internals.mdapproval-gate/src/error.rsapproval-gate/src/events.rsapproval-gate/src/functions/add_always_allow.rsapproval-gate/src/functions/approve_always.rsapproval-gate/src/functions/clear_settings.rsapproval-gate/src/functions/gate.rsapproval-gate/src/functions/get_pending.rsapproval-gate/src/functions/get_settings.rsapproval-gate/src/functions/list_pending.rsapproval-gate/src/functions/mod.rsapproval-gate/src/functions/on_config_change.rsapproval-gate/src/functions/on_session_deleted.rsapproval-gate/src/functions/on_turn_completed.rsapproval-gate/src/functions/purge.rsapproval-gate/src/functions/remove_always_allow.rsapproval-gate/src/functions/resolve.rsapproval-gate/src/functions/set_mode.rsapproval-gate/src/functions/sweep.rsapproval-gate/src/gate_config.rsapproval-gate/src/harness.rsapproval-gate/src/lib.rsapproval-gate/src/main.rsapproval-gate/src/pending.rsapproval-gate/src/policy.rsapproval-gate/src/session.rsapproval-gate/src/settings.rsapproval-gate/src/state.rsapproval-gate/src/testkit/engine.rsapproval-gate/src/testkit/mod.rsapproval-gate/tests/integration.rs
✅ Files skipped from review due to trivial changes (2)
- approval-gate/architecture/internals.md
- approval-gate/README.md
🚧 Files skipped from review as they are similar to previous changes (26)
- approval-gate/src/harness.rs
- approval-gate/src/functions/add_always_allow.rs
- approval-gate/src/functions/set_mode.rs
- approval-gate/src/testkit/mod.rs
- approval-gate/src/functions/on_config_change.rs
- approval-gate/src/state.rs
- approval-gate/src/lib.rs
- approval-gate/src/functions/approve_always.rs
- approval-gate/src/functions/clear_settings.rs
- approval-gate/src/functions/remove_always_allow.rs
- approval-gate/src/policy.rs
- approval-gate/src/functions/mod.rs
- approval-gate/src/functions/purge.rs
- approval-gate/src/functions/on_session_deleted.rs
- approval-gate/src/error.rs
- approval-gate/src/session.rs
- approval-gate/src/functions/resolve.rs
- approval-gate/src/functions/on_turn_completed.rs
- approval-gate/src/events.rs
- approval-gate/src/gate_config.rs
- approval-gate/src/functions/list_pending.rs
- approval-gate/src/functions/sweep.rs
- approval-gate/src/pending.rs
- approval-gate/tests/integration.rs
- approval-gate/src/testkit/engine.rs
- approval-gate/src/settings.rs
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 161-178: The iii engine version is inconsistent across CI jobs
that are supposed to mirror each other. The current "Install iii CLI + engine"
step sets VERSION to 0.19.2, but the interface-smoke step below uses VERSION
0.17.0. Update the VERSION environment variable in one of these locations to
match the other, ensuring both the engine-backed tests and interface-smoke tests
validate against the same iii engine version.
In `@approval-gate/src/functions/resolve.rs`:
- Around line 45-57: The expiry check at record.expires_at <= now_ms() is not
atomic with the subsequent harness::function::resolve call at lines 86-90,
creating a TOCTOU window where a record could expire between the check and the
harness invocation, allowing both the sweep and human-resolve to claim ownership
and call harness for the same record. To fix this, implement an atomic state
transition before any harness call is made: instead of just checking and
returning early, update the record state atomically (in the state store or
database) to claim ownership for the human resolver path before proceeding to
the harness call, ensuring that only one path (human or sweep) can successfully
proceed while the other encounters the claimed ownership state.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ac9031de-32b1-4c65-9566-5d77774cf25f
📒 Files selected for processing (3)
.github/workflows/ci.ymlapproval-gate/src/events.rsapproval-gate/src/functions/resolve.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- approval-gate/src/events.rs
| // An expired record belongs exclusively to the timeout sweep, which | ||
| // collects every record with `expires_at <= now` (see `sweep::handle`). | ||
| // Resolving it here would race that sweep: both paths would call | ||
| // `harness::function::resolve` for the same call, and a late timeout | ||
| // could clobber this human decision. Treat a post-deadline resolve as a | ||
| // benign no-op — the sweep delivers (or already delivered) the timeout. | ||
| // The boundary matches the sweep exactly, so no record is owned by both. | ||
| if record.expires_at <= now_ms() { | ||
| return Ok(ResolveResponse { | ||
| resolved: false, | ||
| turn_resumed: None, | ||
| }); | ||
| } |
There was a problem hiding this comment.
Close the expiry TOCTOU window before calling the harness.
The expiry check at Line 52 is not atomic with the harness resolve at Lines 86-90. A request can pass the check just before deadline, then expire while in flight, letting sweep and human-resolve both call harness::function::resolve for the same record. That breaks the “no shared ownership” guarantee in this block.
Please move to an atomic ownership claim (human vs sweep) in state before either path calls harness, so only one resolver can proceed.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@approval-gate/src/functions/resolve.rs` around lines 45 - 57, The expiry
check at record.expires_at <= now_ms() is not atomic with the subsequent
harness::function::resolve call at lines 86-90, creating a TOCTOU window where a
record could expire between the check and the harness invocation, allowing both
the sweep and human-resolve to claim ownership and call harness for the same
record. To fix this, implement an atomic state transition before any harness
call is made: instead of just checking and returning early, update the record
state atomically (in the state store or database) to claim ownership for the
human resolver path before proceeding to the harness call, ensuring that only
one path (human or sweep) can successfully proceed while the other encounters
the claimed ownership state.
747c877 to
6e75e9a
Compare
…06-agentic spec The policy and decision surface for human-held function calls (tech-specs/2026-06-agentic/approval-gate.md): - approval::gate pre_dispatch hook: human-only defense, per-session mode/allow-list evaluation (ported exactly from the proven turn-orchestrator consultBefore semantics), policy::check_permissions fallback (unparseable reply degrades to hold; transport failure fails closed), synchronous pending-record write before hold (never hold blind), idempotent on function_call_id. - approval::resolve: allow releases via harness::function::resolve action execute, deny delivers an is_error DenialEnvelope; no decision records persisted - resolve first, delete, then emit. - Ephemeral pending inbox (approval_pending scope) with four deletion paths (resolve, turn terminal, session deleted, cron sweep on expires_at); deletion is the emit gate, so pending_resolved fires exactly once per record. state::set null is the atomic gate plus a follow-up state::delete to clean the engine's null tombstone. - Per-session settings (approval_settings scope) with lazy seeding: reads never write; first mutation materializes from the deployment defaults (seed entries granted_by seed). - Two custom trigger types (approval::pending_created / approval::pending_resolved) with session_id/metadata tenancy filters. - approval-gate configuration entry (default_mode, always_allow_seed, pending_timeout_ms) with reactive reload; built-in defaults when the configuration worker is absent. - Recursive argument redaction (secret keys, 256-codepoint clip, depth cap) so records are safe for notification channels. All harness/session-manager trigger bindings are best-effort: the greenfield harness contracts are not implemented yet, so the worker boots standalone and the engine-backed integration suite fakes policy::check_permissions, harness::function::resolve, and session::get. 118 unit tests (full prior-art permission matrix) + 6 engine-backed integration scenarios; 94% line coverage.
…ation contract) Follows docs/architecture/per-worker-architecture.md and the session-manager/architecture reference layout: internals.md for maintainers (gate control flow, emit-gate deletion mechanics, lazy seeding, verified engine facts), integration.md for consumers (function and trigger contract, harness handoff requirements, deployment notes, agent-exposure guidance), plus the index README with system overview and vocabulary. Links the folder from the consumer README.
…against a real engine Replace the Bus trait seam (bus.rs + testkit/fake_bus.rs) with direct iii.trigger() calls through thin per-target wrappers (harness.rs, session.rs, state.rs), matching the llm-router convention. Rewrite the unit and integration tests to run against a real spawned iii engine via testkit/engine.rs instead of the in-memory FakeBus/MemoryState doubles. The public RPC contract is unchanged: all approval::* function ids, payloads, and the harness::function::resolve envelope are identical. Only internal signatures move (Deps.iii: Arc<III>, Emitter::new). Engine-backed tests self-skip when no iii binary is available (III_ENGINE_BIN); CI must provide one or coverage drops.
Engine-backed tests spawn a real iii engine and self-skip when no binary is on PATH, so the approval-gate behavior suite (fail-closed, exactly-once emission, sweep) passed in CI without actually running. Install the iii CLI before `cargo test` so those tests execute. Pinned to 0.19.2 to match the worker's iii-sdk; workers with only pure-unit tests are unaffected.
A human resolve arriving after a hold's deadline raced the timeout sweep: both paths called harness::function::resolve for the same call, and a late timeout could clobber the human decision. Reject a resolve once the record has expired (expires_at <= now), matching the sweep's boundary exactly, so an expired record is owned solely by the sweep. Updates two test seeds that used a past-epoch expires_at to a live deadline, and adds a test that an expired record is left for the sweep.
Engine-backed tests each spawn their own iii engine; running them in parallel on a 2-core CI runner starves the engines until state ops hit their timeout. Run with --test-threads=1 so one engine runs at a time. Pure-unit-only workers are unaffected (their tests are fast either way).
The prior change ran globally in the rust matrix, so it installed iii for every changed worker — which made other workers' engine-gated integration tests (console, iii-directory, session-manager) attempt a real engine dial and fail. Gate both the install and --test-threads=1 on the approval-gate worker; every other worker keeps its prior command and behavior.
Per docs/sops/new-worker.md §3 and §6: add the Modules table row, the create-tag.yml worker option, and the release.yml `approval-gate/v*` tag pattern. Without these the worker has no README entry, can't be picked in the manual release dropdown, and pushing an `approval-gate/vX.Y.Z` tag would trigger nothing.
…od build The engine-backed tests live inline in src/, so testkit must stay in the crate — but it should not compile into the production lib/binary. Gate `mod testkit` on `any(test, feature = "testkit")` and add the feature; a self dev-dependency enables it for the integration tests so `cargo test` still works without `--all-features`.
Snapshot the request/response schemas of the 14 approval::* functions and the 2 trigger payloads under tests/golden/schemas/*.json, generated with SchemaSettings::draft07() — the same generator iii-sdk uses at registration, so the goldens pin exactly what the engine publishes. A catalog()/trigger_catalog() is the single source of truth; function ids and descriptions are now shared consts used by both register_all and the catalog. Regenerate with UPDATE_GOLDENS=1.
install.iii.dev's release index can briefly 404 an existing engine version (observed right after a new release published), which failed the job even though 0.19.2 is valid. Retry the install up to 3x; the final `iii --version` still hard-fails if all attempts miss.
The install script resolves the engine release via the GitHub API; with no token it shares the 60/hr unauthenticated limit, so under CI load it fails with "rate limit hit" (and earlier surfaced as a misleading "release tag not found"). Pass the Actions GITHUB_TOKEN (5000/hr) to both the rust-job engine install and the pre-existing interface-smoke install, which flaked across workers for the same reason. The rust-step retry stays as a genuine-network-blip guard.
Restore .github/workflows/ci.yml to main. The engine-backed tests will self-skip in CI again (no iii on PATH); they still run locally / via `cargo test` with an engine available, and the pure unit + golden-schema tests still run in CI. Engine-test CI can be handled outside this worker PR.
6e75e9a to
749b0ca
Compare
…ng SOP
Conform the worker's wire surface to the new-worker / binary-worker SOP
(§7 Naming: multi-word segments are kebab-case, never snake_case), matching
context-manager (context::count-tokens, context::on-config-change).
Renamed function ids: approval::{list_pending, get_pending, set_mode,
add_always_allow, remove_always_allow, approve_always, get_settings,
clear_settings, on_config_change, on_session_deleted, on_turn_completed}
and trigger ids approval::{pending_created, pending_resolved} to their
kebab equivalents. Updated the harness trigger bindings the worker reads
(harness::hook::pre-dispatch, harness::turn-completed).
Golden wire-schema files renamed and regenerated; gate/resolve/sweep are
single-word and unchanged. The hook payload discriminant value
point:"pre_dispatch" is data, not an id, and is left as-is.
Summary
approval-gateis a new standalone iii worker that owns the policy and decision surface for human-in-the-loop function calls. It evaluates every dispatched call against a per-session permission model and either lets itcontinue,denys it, orholds it for a human — maintaining an ephemeral inbox of held calls and notifying consumers via two trigger types. It ships as its own crate (approval-gate/, libapproval_gate+ binapproval-gate) so the gate can be deployed, versioned, and operated independently of the harness.What it does
The worker exposes three surfaces backed by one permission model:
approval::gate) — apre_dispatchhook the worker binds to itself at boot. For each call it takes one settings snapshot and evaluates, in order:approval::*/configuration::*target → deny (human_only_function, even underfull— self-escalation defense), checked before the snapshot.full→ continue.approved_alwayshit → continue (honoured in every mode).auto+always_allowhit → continue (equality fast-path or*-glob).policy::check_permissions:allow→ continue ·deny→ deny ·needs_approval/ unparseable reply → hold · transport failure / timeout → deny (gate_unavailable— fail closed, never an unattended hold).approval::resolveapplies a decision to a held call;set_mode,add_always_allow/remove_always_allow,approve_always,get_settings,clear_settingsmanage per-session settings.approval::list_pending(tenancy filters + cursor) andapproval::get_pendingread the ephemeral index of held calls.Modes: per-session
manual/auto/full, defaulting to the deployment'sdefault_mode.Resolve flow:
resolve { decision: "allow" }callsharness::function::resolvewithaction: "execute"to release the call back through the dispatch pipeline;decision: "deny"calls it withaction: "deliver", is_error: trueand a renderedDenialEnvelopethe model can read and adapt to. Resolve ordering is harness-first, then delete, then emit, so a crash never loses a decision (it leaks one record for the sweep to collect). An unknown{ session_id, function_call_id }returns{ resolved: false }, not an error — duplicate decisions race benignly.Settings: lazily seeded. Reads (including the gate hot path) never write; the first mutation materializes the record from the current deployment defaults and writes it once. The gate reads tolerantly (a state outage degrades to defaults, which never widen access); mutations read strictly.
Lifecycle / GC: a pending record exists only while a call is held — no resolved-approval history is kept. Every record has explicit deletion paths (resolve,
harness::turn_completed,session::deleted) plus a cronapproval::sweepbackstop that expires records pastexpires_at. All deletions funnel through a singledelete_with_gatehelper that uses an atomic null-set to decide exactly-once emission.Events:
approval::pending_createdfires (off the hot path) when a call is held, carrying a self-sufficient, redacted record for notification copy;approval::pending_resolvedfires exactly once per record when it leaves the inbox, withoutcome: allow | deny | timeout | aborted. Bindings filter onsession_idand subset-equalitymetadatafor multi-tenant fan-out.Architecture
[workspace]:cargo buildproduces theapproval-gatebinary;--manifestemits the registry-publish manifest;--config/--urlconfigure the YAML wiring and engine socket.state.rs(state::get/set/delete/list),harness.rs(harness::function::resolve),session.rs(session::get) — each a smalliii.trigger()wrapper. There is no bus/transport abstraction; handlers receiveDeps { iii, sink, defaults, cfg }and call iii directly.src/functions/, registered through a single typed helper infunctions/mod.rsthat mapsApprovalErrorto the wirecode: messageshape. Pure logic (evaluation order, redaction, denial rendering, settings math) lives in dedicated modules (decision.rs,redact.rs,denial.rs,settings.rs,pending.rs).approval_pending/<sid>/<cid>(the inbox) andapproval_settings/<sid>(per-session settings) — both with explicit deletion paths./is the reserved key separator, sosession_id/function_call_idare rejected if they contain it.main.rs): register the two trigger types → register the 14 functions → best-effort bind the hook / configuration /session::deleted/turn_completed/ cron triggers → register theapproval-gateconfiguration entry and read defaults once. Bindings are best-effort: on an engine without those trigger types the worker still boots, serves its RPCs, and logstrigger_type_not_foundfor the absent bindings (restart to re-bind after the sibling lands).default_mode,always_allow_seed,pending_timeout_ms) live in theapproval-gateconfiguration entry with reactive reload viaapproval::on_config_change; field-wise tolerant parsing never fails the gate open.config.yamlcarries runtime wiring only (hook globs/budget, sweep cronexpression, per-call timeouts).Public contract
A consumer (notably the integration PR stacking on this) depends on:
Functions (
approval::*):gateresolve,set_mode,add_always_allow,remove_always_allow,approve_always,get_settings,clear_settings,list_pending,get_pendingon_config_change,on_session_deleted,on_turn_completed,sweepTrigger types:
approval::pending_created,approval::pending_resolved(binding config{ session_id?, metadata? },deny_unknown_fields).Errors:
code: messagewithapproval/invalid_payload,approval/state_unavailable,approval/harness_unavailable.Harness handoff (
approval::resolve→harness::function::resolve): allow ⇒{ session_id, turn_id, function_call_id, action: "execute" }; deny ⇒{ ..., action: "deliver", is_error: true, content, details }. The gate binds itself via theharness::hook::pre_dispatchtrigger type. These harness surfaces are greenfield: the current harness does not implement them, so until the stacked integration PR lands the bindings are inert and the gate is exercised against in-process fakes (see Testing).Testing
approval-gate/tests/integration.rs): spawn a real iii engine viasrc/testkit/engine.rswithconfiguration+iii-statereal, faking the not-yet-built siblings (policy::check_permissions,harness::function::resolve,session::get) in-process. They cover the hold→record→single-emit idempotency path, allow/deny resolution through the fake harness, sweep expiry with exactly-oncetimeoutemission, reactive configuration reload, lazy settings seeding against real state, and full-stack human-only denial.approval::*function and drive the same production code path through a real engine viatestkit::engine. Pure-logic modules (decision,redact,denial,settings,types, …) are tested with no engine.III_ENGINE_BINoriiion PATH and self-skip (printing a notice) when neither is present, so they never fail in an engine-less CI lane — CI must provide an engine binary or that coverage silently drops.cargo check/cargo buildclean; 94/94 tests pass. One integration test can flake under the parallel multi-thread engine harness (timeout); green when run single-threaded (--test-threads=1).Reviewer notes
architecture/integration.md(the consumer-facing function + trigger contract and the harness handoff) andarchitecture/internals.md(evaluation order, the delete/emit gate, lazy seeding, the engine facts the code relies on).src/functions/gate.rs(control flow / fail-closed posture) andsrc/functions/resolve.rs(crash ordering, the execute/deliver envelope) — these define the load-bearing behavior.src/pending.rs(delete_with_gate— relies onstate::setstoring a literal null tombstone, hence the two-step set-null-then-delete); the fail-closed mapping inpolicy.rs(transport failure/timeout ⇒ deny, unparseable ⇒ hold); the tolerant-vs-strict settings reads insettings.rs; and thehuman_only_functionprefix check running before the settings snapshot ingate.rs.state.rs,harness.rs,session.rs) are intentionally thin and untyped at the boundary — the typed shapes live intypes.rs.Summary by CodeRabbit