Skip to content

feat(approval-gate): standalone worker for human-in-the-loop function approval#260

Open
ytallo wants to merge 14 commits into
mainfrom
feat/approval-gate-worker
Open

feat(approval-gate): standalone worker for human-in-the-loop function approval#260
ytallo wants to merge 14 commits into
mainfrom
feat/approval-gate-worker

Conversation

@ytallo

@ytallo ytallo commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

approval-gate is a new standalone iii worker that owns the policy and decision surface for human-in-the-loop function calls. It evaluates every dispatched call against a per-session permission model and either lets it continue, denys it, or holds it for a human — maintaining an ephemeral inbox of held calls and notifying consumers via two trigger types. It ships as its own crate (approval-gate/, lib approval_gate + bin approval-gate) so the gate can be deployed, versioned, and operated independently of the harness.

What it does

The worker exposes three surfaces backed by one permission model:

  • The gate (approval::gate) — a pre_dispatch hook the worker binds to itself at boot. For each call it takes one settings snapshot and evaluates, in order:
    1. approval::* / configuration::* target → deny (human_only_function, even under full — self-escalation defense), checked before the snapshot.
    2. mode full → continue.
    3. per-session approved_always hit → continue (honoured in every mode).
    4. mode auto + always_allow hit → continue (equality fast-path or *-glob).
    5. fall through to policy::check_permissions: allow → continue · deny → deny · needs_approval / unparseable reply → hold · transport failure / timeout → deny (gate_unavailable — fail closed, never an unattended hold).
  • The decision plane — human/console-only RPCs: approval::resolve applies a decision to a held call; set_mode, add_always_allow / remove_always_allow, approve_always, get_settings, clear_settings manage per-session settings.
  • The pending inboxapproval::list_pending (tenancy filters + cursor) and approval::get_pending read the ephemeral index of held calls.

Modes: per-session manual / auto / full, defaulting to the deployment's default_mode.

Resolve flow: resolve { decision: "allow" } calls harness::function::resolve with action: "execute" to release the call back through the dispatch pipeline; decision: "deny" calls it with action: "deliver", is_error: true and a rendered DenialEnvelope the model can read and adapt to. Resolve ordering is harness-first, then delete, then emit, so a crash never loses a decision (it leaks one record for the sweep to collect). An unknown { session_id, function_call_id } returns { resolved: false }, not an error — duplicate decisions race benignly.

Settings: lazily seeded. Reads (including the gate hot path) never write; the first mutation materializes the record from the current deployment defaults and writes it once. The gate reads tolerantly (a state outage degrades to defaults, which never widen access); mutations read strictly.

Lifecycle / GC: a pending record exists only while a call is held — no resolved-approval history is kept. Every record has explicit deletion paths (resolve, harness::turn_completed, session::deleted) plus a cron approval::sweep backstop that expires records past expires_at. All deletions funnel through a single delete_with_gate helper that uses an atomic null-set to decide exactly-once emission.

Events: approval::pending_created fires (off the hot path) when a call is held, carrying a self-sufficient, redacted record for notification copy; approval::pending_resolved fires exactly once per record when it leaves the inbox, with outcome: allow | deny | timeout | aborted. Bindings filter on session_id and subset-equality metadata for multi-tenant fan-out.

Architecture

  • Standalone crate with its own [workspace]: cargo build produces the approval-gate binary; --manifest emits the registry-publish manifest; --config / --url configure the YAML wiring and engine socket.
  • Direct-iii sibling calls. Calls to other workers go through thin per-target wrapper modules — state.rs (state::get/set/delete/list), harness.rs (harness::function::resolve), session.rs (session::get) — each a small iii.trigger() wrapper. There is no bus/transport abstraction; handlers receive Deps { iii, sink, defaults, cfg } and call iii directly.
  • One function per file under src/functions/, registered through a single typed helper in functions/mod.rs that maps ApprovalError to the wire code: message shape. Pure logic (evaluation order, redaction, denial rendering, settings math) lives in dedicated modules (decision.rs, redact.rs, denial.rs, settings.rs, pending.rs).
  • State surface: two scopes — approval_pending/<sid>/<cid> (the inbox) and approval_settings/<sid> (per-session settings) — both with explicit deletion paths. / is the reserved key separator, so session_id / function_call_id are rejected if they contain it.
  • Boot order (main.rs): register the two trigger types → register the 14 functions → best-effort bind the hook / configuration / session::deleted / turn_completed / cron triggers → register the approval-gate configuration entry and read defaults once. Bindings are best-effort: on an engine without those trigger types the worker still boots, serves its RPCs, and logs trigger_type_not_found for the absent bindings (restart to re-bind after the sibling lands).
  • Configuration: deployment defaults (default_mode, always_allow_seed, pending_timeout_ms) live in the approval-gate configuration entry with reactive reload via approval::on_config_change; field-wise tolerant parsing never fails the gate open. config.yaml carries runtime wiring only (hook globs/budget, sweep cron expression, per-call timeouts).

Public contract

A consumer (notably the integration PR stacking on this) depends on:

Functions (approval::*):

  • Harness-driven: gate
  • Human/console: resolve, set_mode, add_always_allow, remove_always_allow, approve_always, get_settings, clear_settings, list_pending, get_pending
  • Internal trigger handlers (never call directly): on_config_change, on_session_deleted, on_turn_completed, sweep

Trigger types: approval::pending_created, approval::pending_resolved (binding config { session_id?, metadata? }, deny_unknown_fields).

Errors: code: message with approval/invalid_payload, approval/state_unavailable, approval/harness_unavailable.

Harness handoff (approval::resolveharness::function::resolve): allow ⇒ { session_id, turn_id, function_call_id, action: "execute" }; deny ⇒ { ..., action: "deliver", is_error: true, content, details }. The gate binds itself via the harness::hook::pre_dispatch trigger type. These harness surfaces are greenfield: the current harness does not implement them, so until the stacked integration PR lands the bindings are inert and the gate is exercised against in-process fakes (see Testing).

Testing

  • Engine-backed tests (approval-gate/tests/integration.rs): spawn a real iii engine via src/testkit/engine.rs with configuration + iii-state real, faking the not-yet-built siblings (policy::check_permissions, harness::function::resolve, session::get) in-process. They cover the hold→record→single-emit idempotency path, allow/deny resolution through the fake harness, sweep expiry with exactly-once timeout emission, reactive configuration reload, lazy settings seeding against real state, and full-stack human-only denial.
  • Handler unit suites sit beside each approval::* function and drive the same production code path through a real engine via testkit::engine. Pure-logic modules (decision, redact, denial, settings, types, …) are tested with no engine.
  • Self-skip: engine-backed suites resolve the engine from III_ENGINE_BIN or iii on PATH and self-skip (printing a notice) when neither is present, so they never fail in an engine-less CI lane — CI must provide an engine binary or that coverage silently drops.
  • Status: cargo check / cargo build clean; 94/94 tests pass. One integration test can flake under the parallel multi-thread engine harness (timeout); green when run single-threaded (--test-threads=1).

Reviewer notes

  • Start with architecture/integration.md (the consumer-facing function + trigger contract and the harness handoff) and architecture/internals.md (evaluation order, the delete/emit gate, lazy seeding, the engine facts the code relies on).
  • Then src/functions/gate.rs (control flow / fail-closed posture) and src/functions/resolve.rs (crash ordering, the execute/deliver envelope) — these define the load-bearing behavior.
  • Scrutinize: the exactly-once emission gate in src/pending.rs (delete_with_gate — relies on state::set storing a literal null tombstone, hence the two-step set-null-then-delete); the fail-closed mapping in policy.rs (transport failure/timeout ⇒ deny, unparseable ⇒ hold); the tolerant-vs-strict settings reads in settings.rs; and the human_only_function prefix check running before the settings snapshot in gate.rs.
  • Note the per-target iii wrappers (state.rs, harness.rs, session.rs) are intentionally thin and untyped at the boundary — the typed shapes live in types.rs.

Summary by CodeRabbit

  • New Features
    • Introduced the approval-gate worker with a fail-closed pre-dispatch hold/resolve workflow.
    • Added pending inbox APIs (list/get pending) plus pending created/resolved notifications.
    • Added per-session approval settings management, including mode and always-allow/approve-always, with cron-based expiry sweeping.
  • Documentation
    • Added end-user and architecture docs describing decision flow, triggers, and operational setup.
  • Tests
    • Added unit coverage and engine-backed integration tests for gate/resolve/sweep and configuration reload.
  • Chores
    • Updated CI to install the required engine before running Rust tests.

@vercel

vercel Bot commented Jun 15, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
workers Ready Ready Preview, Comment Jun 15, 2026 10:40pm

Request Review

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces the complete approval-gate Rust worker crate, with wire contracts, state and decision logic, event fan-out, approval handlers, binary wiring, engine-backed tests, and documentation. The PR also adds CI and release configuration for the new worker.

Changes

approval-gate Rust worker

Layer / File(s) Summary
Wire contracts and schema catalog
approval-gate/Cargo.toml, approval-gate/src/lib.rs, approval-gate/src/types.rs, approval-gate/src/error.rs, approval-gate/src/functions/mod.rs, approval-gate/tests/schemas.rs, approval-gate/tests/support/mod.rs, approval-gate/tests/golden/schemas/*
Defines the public wire types, stable error codes, crate module surface, function and trigger catalogs, and the schema snapshot tests and golden fixtures that verify the exported request, response, and trigger shapes.
State storage, settings, and decision primitives
approval-gate/src/state.rs, approval-gate/src/pending.rs, approval-gate/src/settings.rs, approval-gate/src/gate_config.rs, approval-gate/src/config.rs, approval-gate/src/decision.rs, approval-gate/src/denial.rs, approval-gate/src/redact.rs, approval-gate/src/policy.rs, approval-gate/src/session.rs, approval-gate/src/harness.rs
Implements thin state RPC wrappers, pending-record storage, per-session settings persistence and defaults, configuration parsing and loading, pure permission-evaluation helpers, denial rendering, JSON redaction, policy lookup, and session/harness service wrappers.
Trigger registration and event fan-out
approval-gate/src/events.rs
Defines binding configuration, trigger registries, engine registration and unregistration, and asynchronous fan-out for pending approval creation and resolution events.
Gate, resolve, sweep, and approval RPC handlers
approval-gate/src/functions/gate.rs, approval-gate/src/functions/resolve.rs, approval-gate/src/functions/purge.rs, approval-gate/src/functions/get_pending.rs, approval-gate/src/functions/list_pending.rs, approval-gate/src/functions/get_settings.rs, approval-gate/src/functions/set_mode.rs, approval-gate/src/functions/clear_settings.rs, approval-gate/src/functions/add_always_allow.rs, approval-gate/src/functions/remove_always_allow.rs, approval-gate/src/functions/approve_always.rs, approval-gate/src/functions/on_config_change.rs, approval-gate/src/functions/on_session_deleted.rs, approval-gate/src/functions/on_turn_completed.rs, approval-gate/src/functions/sweep.rs
Implements the pre-dispatch gate, pending-resolution flow, sweep backstop, and the remaining approval RPC handlers for settings reads and mutations, pending listing and retrieval, and cleanup triggers.
Binary entrypoint, manifest, and deployment wiring
approval-gate/src/main.rs, approval-gate/src/manifest.rs, approval-gate/build.rs, approval-gate/config.yaml, approval-gate/iii.worker.yaml, .github/workflows/ci.yml, .github/workflows/create-tag.yml, .github/workflows/release.yml, README.md
Implements the Rust binary entrypoint, generated manifest, compile-time target injection, worker configuration, and repository deployment wiring for the approval-gate worker.
Engine-backed testkit and integration suite
approval-gate/src/testkit/engine.rs, approval-gate/src/testkit/mod.rs, approval-gate/tests/integration.rs
Implements the engine-backed test bootstrap, shared stack helpers, and the integration tests that exercise gate, resolve, sweep, configuration reload, lazy settings seeding, and human-only denial against a real engine.
Architecture and operational documentation
approval-gate/README.md, approval-gate/architecture/README.md, approval-gate/architecture/integration.md, approval-gate/architecture/internals.md
Adds the worker README, architecture overview, integration contract, and internals document covering the approval flow, trigger contracts, state lifecycle, redaction rules, and testing assumptions.

Sequence Diagram(s)

sequenceDiagram
  participant Harness
  participant approval_gate as approval-gate
  participant policy_worker
  participant state_engine
  participant EventSink

  Harness->>approval_gate: approval::gate (HookInput)
  approval_gate->>approval_gate: validate_id / pre_policy_allow
  approval_gate->>state_engine: read settings
  approval_gate->>policy_worker: policy::check_permissions
  policy_worker-->>approval_gate: NeedsApproval
  approval_gate->>state_engine: pending::put
  approval_gate-->>Harness: HookOutput::Hold
  approval_gate--)EventSink: pending_created

  Console->>approval_gate: approval::resolve
  approval_gate->>state_engine: pending::get
  approval_gate->>Harness: harness::function::resolve
  approval_gate->>state_engine: pending::delete_with_gate
  approval_gate--)EventSink: pending_resolved
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

  • iii-hq/workers#156: Shares the same approval-gate decision, pending, and resolve flow and would likely affect the same contracts and handler wiring.

Poem

🐇 Hop hop, the gate is wide and bright,
Pending calls now pause in gentle light.
Redacted secrets tuck their tails away,
While humans choose the path: deny or stay.
A cron-kissed sweep keeps time from piling high,
And tiny bunnies wave the approvals by.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/approval-gate-worker

@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

skill-check — worker

0 verified, 21 skipped (no docs/).

Layer Result
structure
vale
ai
render

Four for four. Nicely done.

@ytallo

ytallo commented Jun 15, 2026

Copy link
Copy Markdown
Contributor Author

Companion harness integration PR stacks on this branch: #261 (adopts this worker, retires the in-process gate). Merge this one first.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (2)
approval-gate/README.md (2)

42-44: ⚡ Quick win

Confirm cron worker requirement and sweep binding config.

The README states the sweep needs iii-cron and references a cron trigger binding config, but does not specify the binding config key or defaults. Per the review stack (internals.md), the key is expression with a 6-field cron default "0 * * * * *". If cron is not added, held calls expire only after reaching pending_timeout_ms, which may exceed user expectations.

Consider adding: "The binding config key is expression (6-field cron); see config.yaml for the default."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/README.md` around lines 42 - 44, The README documentation about
the `iii-cron` worker requirement needs clarification on the binding
configuration details. Expand the existing text around the sweep and iii-cron
worker requirement to explicitly state that the binding config key is
`expression` with a default 6-field cron value of `"0 * * * * *"`, and add a
reference to config.yaml where this default is defined. This will help users
understand how to properly configure the cron trigger binding without leaving
them to infer the configuration from other documentation.

113-114: ⚡ Quick win

Clarify what config.yaml contains vs. what lives in the engine's configuration entry.

Line 113-114 state "config.yaml carries runtime wiring only (hook binding globs/budget, sweep cron expression, per-call timeouts)" but do not specify the structure or how to edit it. This could confuse operators who need to tweak timeouts or cron schedules.

Consider adding a note: "See config.yaml (comments in the file document each field). To change deployment defaults (mode, always_allow seed), use the console's Configuration screen to edit the approval-gate configuration entry; to change worker-level settings (hook timeout, sweep schedule), re-deploy with an edited config.yaml."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/README.md` around lines 113 - 114, The documentation at lines
113-114 in the README is unclear about the distinction between config.yaml
settings and engine configuration entry settings, leaving operators confused
about how to modify runtime parameters. Add a clarifying note after the existing
text that explains: (1) config.yaml contains comments documenting each field,
(2) deployment defaults (mode, always_allow seed) are edited via the console's
Configuration screen for the approval-gate entry, and (3) worker-level settings
(hook timeout, sweep schedule) require re-deploying with an edited config.yaml.
This note should guide operators on which approach to use for different types of
configuration changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@approval-gate/src/config.rs`:
- Around line 16-18: The `hook.on_error` configuration field accepts any string
value, which could allow typos or unsafe values to bypass validation and weaken
the fail-closed contract at runtime. At approval-gate/src/config.rs lines 33-35
(where on_error is parsed), add validation to constrain the accepted value to
only "fail_closed" and reject any other strings with a descriptive error during
config loading. The default_on_error function at lines 16-18 can remain as is
since it already provides the correct default. At lines 95-98 (where on_error is
used), verify that the validation at the parsing location is sufficient and no
additional runtime checks are needed, or if those lines represent another
parsing/validation point, apply the same "fail_closed" constraint enforcement
there.

In `@approval-gate/src/events.rs`:
- Around line 190-217: The code is discarding registration errors by using let _
= on both register_trigger_type calls for PENDING_CREATED and PENDING_RESOLVED,
and then unconditionally logging success. Instead of ignoring the Result from
register_trigger_type, properly handle the registration results by checking if
they succeeded before logging, or propagate any errors up the call stack. Each
RegisterTriggerType registration in the ApprovalTriggerHandler setup should have
its error checked, and the tracing::info log statement should only be reached if
all registrations succeeded.

In `@approval-gate/src/functions/gate.rs`:
- Around line 119-123: The duplicate-hold paths are returning the original
timeout duration instead of the remaining time until the persisted expiration.
In both the Ok(Some(existing)) match arm starting at line 119 and the similar
duplicate-hold path at lines 164-166, replace the calculation
(existing.expires_at - existing.pending_at).max(0) with a calculation that
computes the actual remaining time until expiration: subtract the current time
from existing.expires_at to get how much time is left before the hold expires,
ensuring the result does not go below zero with .max(0).

In `@approval-gate/src/functions/get_settings.rs`:
- Around line 29-33: The code has formatting inconsistencies that are blocking
CI checks for cargo fmt. Run cargo fmt --all to apply rustfmt normalization
across the entire codebase. This will automatically fix: (1) the import ordering
and layout in the test module at approval-gate/src/functions/get_settings.rs
lines 29-33, and (2) the imports and signature wrapping in the test helper area
at approval-gate/src/functions/list_pending.rs lines 103-112. After running the
formatter, commit the resulting formatting changes.

In `@approval-gate/src/functions/on_session_deleted.rs`:
- Around line 62-95: The test function
purges_settings_and_only_the_sessions_pending_records has formatting issues that
are causing CI to fail. Additionally, there are formatting issues in
on_turn_completed.rs. Run cargo fmt --all to automatically fix all formatting
issues in the Rust code, which will resolve the failing CI checks.
- Around line 22-30: The error handling in the on_session_deleted function
currently logs a warning when settings::clear fails but continues execution,
which silently acknowledges the session_deleted event despite incomplete
cleanup. Instead of logging and continuing when the settings::clear call fails,
propagate the error by returning it from the function so the trigger can be
retried. Keep the pending purge attempt intact but ensure the clear error is
returned rather than being suppressed.

In `@approval-gate/src/functions/resolve.rs`:
- Around line 45-67: When handling the ResolveDecision::Deny case in the resolve
function, the user_deny_envelope function applies a default reason of "Rejected
by operator." when no explicit reason is provided, but this default reason is
not being carried through to the resolved event emitted later. Extract the
effective deny reason (either from req.reason or the default used by
user_deny_envelope) and ensure that same reason string is passed to the resolved
event emission so that downstream consumers receive consistent deny reason
information in both the envelope content and the event payload.

In `@approval-gate/src/functions/sweep.rs`:
- Around line 38-44: The approval-gate/src/functions/sweep.rs file has rustfmt
formatting drift issues that are causing cargo fmt --check to fail in CI. Run
cargo fmt --all to automatically format the entire codebase according to Rust
formatting standards, which will fix the formatting violations in this file and
allow the lint and test workflows to pass.

In `@approval-gate/src/policy.rs`:
- Around line 26-31: The Rust code formatting does not comply with the project's
style standards enforced by rustfmt. Run cargo fmt --all to automatically format
all Rust code in the approval-gate workspace, which will fix the formatting
violations in policy.rs check function, session.rs, and functions/gate.rs and
resolve the CI failure.

In `@approval-gate/src/settings.rs`:
- Around line 129-133: The Rust code formatting does not comply with the
project's style standards as checked by cargo fmt. Run cargo fmt --all from the
project root to automatically format all Rust source files in the approval-gate
crate and its dependencies according to the configured formatting rules. This
will resolve all formatting failures reported by cargo fmt --check.

In `@approval-gate/src/state.rs`:
- Around line 53-65: The list function in the approval-gate/src/state.rs file
has formatting issues that do not conform to Rust's standard formatting rules
enforced by rustfmt. Run cargo fmt --all from the project root to automatically
reformat the entire codebase, including the list function and any other files
with formatting drift, to ensure the code passes the cargo fmt --check lint
requirement.

In `@approval-gate/src/testkit/engine.rs`:
- Around line 265-273: The code in the register_function call for
"policy::check_permissions" at approval-gate/src/testkit/engine.rs lines 265-273
(and similarly at lines 376-404) does not conform to Rust formatting standards,
particularly around closure and signature wrapping. Run cargo fmt --all to
automatically fix all formatting issues in the file, including the
closure/signature/call wrapping in these register_function calls and any other
formatting inconsistencies.

---

Nitpick comments:
In `@approval-gate/README.md`:
- Around line 42-44: The README documentation about the `iii-cron` worker
requirement needs clarification on the binding configuration details. Expand the
existing text around the sweep and iii-cron worker requirement to explicitly
state that the binding config key is `expression` with a default 6-field cron
value of `"0 * * * * *"`, and add a reference to config.yaml where this default
is defined. This will help users understand how to properly configure the cron
trigger binding without leaving them to infer the configuration from other
documentation.
- Around line 113-114: The documentation at lines 113-114 in the README is
unclear about the distinction between config.yaml settings and engine
configuration entry settings, leaving operators confused about how to modify
runtime parameters. Add a clarifying note after the existing text that explains:
(1) config.yaml contains comments documenting each field, (2) deployment
defaults (mode, always_allow seed) are edited via the console's Configuration
screen for the approval-gate entry, and (3) worker-level settings (hook timeout,
sweep schedule) require re-deploying with an edited config.yaml. This note
should guide operators on which approach to use for different types of
configuration changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 54f2f5d9-8b20-4b81-bfb1-fb19a127fd28

📥 Commits

Reviewing files that changed from the base of the PR and between 9af1674 and 1140eeb.

⛔ Files ignored due to path filters (1)
  • approval-gate/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (44)
  • approval-gate/Cargo.toml
  • approval-gate/README.md
  • approval-gate/architecture/README.md
  • approval-gate/architecture/integration.md
  • approval-gate/architecture/internals.md
  • approval-gate/build.rs
  • approval-gate/config.yaml
  • approval-gate/iii.worker.yaml
  • approval-gate/src/config.rs
  • approval-gate/src/decision.rs
  • approval-gate/src/denial.rs
  • approval-gate/src/error.rs
  • approval-gate/src/events.rs
  • approval-gate/src/functions/add_always_allow.rs
  • approval-gate/src/functions/approve_always.rs
  • approval-gate/src/functions/clear_settings.rs
  • approval-gate/src/functions/gate.rs
  • approval-gate/src/functions/get_pending.rs
  • approval-gate/src/functions/get_settings.rs
  • approval-gate/src/functions/list_pending.rs
  • approval-gate/src/functions/mod.rs
  • approval-gate/src/functions/on_config_change.rs
  • approval-gate/src/functions/on_session_deleted.rs
  • approval-gate/src/functions/on_turn_completed.rs
  • approval-gate/src/functions/purge.rs
  • approval-gate/src/functions/remove_always_allow.rs
  • approval-gate/src/functions/resolve.rs
  • approval-gate/src/functions/set_mode.rs
  • approval-gate/src/functions/sweep.rs
  • approval-gate/src/gate_config.rs
  • approval-gate/src/harness.rs
  • approval-gate/src/lib.rs
  • approval-gate/src/main.rs
  • approval-gate/src/manifest.rs
  • approval-gate/src/pending.rs
  • approval-gate/src/policy.rs
  • approval-gate/src/redact.rs
  • approval-gate/src/session.rs
  • approval-gate/src/settings.rs
  • approval-gate/src/state.rs
  • approval-gate/src/testkit/engine.rs
  • approval-gate/src/testkit/mod.rs
  • approval-gate/src/types.rs
  • approval-gate/tests/integration.rs

Comment on lines +16 to +18
fn default_on_error() -> String {
"fail_closed".to_string()
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constrain hook.on_error to fail-closed values.

Line 34 currently accepts any string, so typos or unsafe values can pass parsing and weaken the fail-closed contract at runtime. Enforce "fail_closed" during config load.

Suggested patch
-use anyhow::{Context, Result};
+use anyhow::{ensure, Context, Result};
@@
 pub fn load_config(path: &str) -> Result<WorkerConfig> {
     let raw = std::fs::read_to_string(path).with_context(|| format!("read config {path}"))?;
-    serde_yaml::from_str(&raw).with_context(|| format!("parse config {path}"))
+    let cfg: WorkerConfig =
+        serde_yaml::from_str(&raw).with_context(|| format!("parse config {path}"))?;
+    ensure!(
+        cfg.hook.on_error == "fail_closed",
+        "parse config {}: hook.on_error must be \"fail_closed\"",
+        path
+    );
+    Ok(cfg)
 }

Also applies to: 33-35, 95-98

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/src/config.rs` around lines 16 - 18, The `hook.on_error`
configuration field accepts any string value, which could allow typos or unsafe
values to bypass validation and weaken the fail-closed contract at runtime. At
approval-gate/src/config.rs lines 33-35 (where on_error is parsed), add
validation to constrain the accepted value to only "fail_closed" and reject any
other strings with a descriptive error during config loading. The
default_on_error function at lines 16-18 can remain as is since it already
provides the correct default. At lines 95-98 (where on_error is used), verify
that the validation at the parsing location is sufficient and no additional
runtime checks are needed, or if those lines represent another
parsing/validation point, apply the same "fail_closed" constraint enforcement
there.

Comment on lines +190 to +217
let _ = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_CREATED,
"A function call was held for human approval and its inbox record written. \
Payload: PendingApprovalRecord (redacted args, session context, expiry). \
Bind notification workers here.",
ApprovalTriggerHandler {
set: sets.created.clone(),
},
)
.trigger_request_format::<BindingConfig>(),
);
let _ = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_RESOLVED,
"A pending approval left the inbox (outcome: allow | deny | timeout | aborted). \
Emitted exactly once per record; lets UIs clear badges.",
ApprovalTriggerHandler {
set: sets.resolved.clone(),
},
)
.trigger_request_format::<BindingConfig>(),
);

tracing::info!(
trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED],
"registered trigger types"
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not swallow trigger-type registration failures.

Lines 190 and 202 discard registration errors, and Line 214 logs success regardless. If registration fails, subscribers cannot bind and the worker reports a misleading healthy state.

Suggested patch
 pub fn register_trigger_types(iii: &Arc<III>) -> TriggerSets {
     let sets = TriggerSets::new();
+    let mut all_ok = true;
 
-    let _ = iii.register_trigger_type(
+    if let Err(e) = iii.register_trigger_type(
         RegisterTriggerType::new(
             PENDING_CREATED,
@@
         )
         .trigger_request_format::<BindingConfig>(),
-    );
-    let _ = iii.register_trigger_type(
+    ) {
+        all_ok = false;
+        tracing::warn!(trigger_type = PENDING_CREATED, error = %e, "failed to register trigger type");
+    }
+    if let Err(e) = iii.register_trigger_type(
         RegisterTriggerType::new(
             PENDING_RESOLVED,
@@
         )
         .trigger_request_format::<BindingConfig>(),
-    );
+    ) {
+        all_ok = false;
+        tracing::warn!(trigger_type = PENDING_RESOLVED, error = %e, "failed to register trigger type");
+    }
 
-    tracing::info!(
-        trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED],
-        "registered trigger types"
-    );
+    if all_ok {
+        tracing::info!(trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED], "registered trigger types");
+    }
     sets
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let _ = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_CREATED,
"A function call was held for human approval and its inbox record written. \
Payload: PendingApprovalRecord (redacted args, session context, expiry). \
Bind notification workers here.",
ApprovalTriggerHandler {
set: sets.created.clone(),
},
)
.trigger_request_format::<BindingConfig>(),
);
let _ = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_RESOLVED,
"A pending approval left the inbox (outcome: allow | deny | timeout | aborted). \
Emitted exactly once per record; lets UIs clear badges.",
ApprovalTriggerHandler {
set: sets.resolved.clone(),
},
)
.trigger_request_format::<BindingConfig>(),
);
tracing::info!(
trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED],
"registered trigger types"
);
pub fn register_trigger_types(iii: &Arc<III>) -> TriggerSets {
let sets = TriggerSets::new();
let mut all_ok = true;
if let Err(e) = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_CREATED,
"A function call was held for human approval and its inbox record written. \
Payload: PendingApprovalRecord (redacted args, session context, expiry). \
Bind notification workers here.",
ApprovalTriggerHandler {
set: sets.created.clone(),
},
)
.trigger_request_format::<BindingConfig>(),
) {
all_ok = false;
tracing::warn!(trigger_type = PENDING_CREATED, error = %e, "failed to register trigger type");
}
if let Err(e) = iii.register_trigger_type(
RegisterTriggerType::new(
PENDING_RESOLVED,
"A pending approval left the inbox (outcome: allow | deny | timeout | aborted). \
Emitted exactly once per record; lets UIs clear badges.",
ApprovalTriggerHandler {
set: sets.resolved.clone(),
},
)
.trigger_request_format::<BindingConfig>(),
) {
all_ok = false;
tracing::warn!(trigger_type = PENDING_RESOLVED, error = %e, "failed to register trigger type");
}
if all_ok {
tracing::info!(trigger_types = ?[PENDING_CREATED, PENDING_RESOLVED], "registered trigger types");
}
sets
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/src/events.rs` around lines 190 - 217, The code is discarding
registration errors by using let _ = on both register_trigger_type calls for
PENDING_CREATED and PENDING_RESOLVED, and then unconditionally logging success.
Instead of ignoring the Result from register_trigger_type, properly handle the
registration results by checking if they succeeded before logging, or propagate
any errors up the call stack. Each RegisterTriggerType registration in the
ApprovalTriggerHandler setup should have its error checked, and the
tracing::info log statement should only be reached if all registrations
succeeded.

Comment thread approval-gate/src/functions/gate.rs
Comment thread approval-gate/src/functions/get_settings.rs
Comment on lines +22 to +30
if let Err(e) = settings::clear(
deps.iii.as_ref(),
&event.session_id,
deps.cfg.state_timeout_ms,
)
.await
{
tracing::warn!(session_id = %event.session_id, error = %e, "settings purge failed");
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t acknowledge session::deleted when settings purge fails.

This path logs and continues on settings::clear failure, which can leave per-session approval state behind after deletion. Keep the pending purge attempt, but return the clear error so the trigger can be retried instead of silently accepting partial cleanup.

Suggested fix
-    if let Err(e) = settings::clear(
-        deps.iii.as_ref(),
-        &event.session_id,
-        deps.cfg.state_timeout_ms,
-    )
-    .await
-    {
-        tracing::warn!(session_id = %event.session_id, error = %e, "settings purge failed");
-    }
+    let clear_res = settings::clear(
+        deps.iii.as_ref(),
+        &event.session_id,
+        deps.cfg.state_timeout_ms,
+    )
+    .await;
     let purged = purge::purge_matching(deps, |r| r.session_id == event.session_id).await;
+    if let Err(e) = clear_res {
+        tracing::warn!(session_id = %event.session_id, error = %e, "settings purge failed");
+        return Err(e);
+    }
     tracing::info!(session_id = %event.session_id, purged, "session deleted: approval records purged");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/src/functions/on_session_deleted.rs` around lines 22 - 30, The
error handling in the on_session_deleted function currently logs a warning when
settings::clear fails but continues execution, which silently acknowledges the
session_deleted event despite incomplete cleanup. Instead of logging and
continuing when the settings::clear call fails, propagate the error by returning
it from the function so the trigger can be retried. Keep the pending purge
attempt intact but ensure the clear error is returned rather than being
suppressed.

Comment thread approval-gate/src/functions/sweep.rs Outdated
Comment thread approval-gate/src/policy.rs Outdated
Comment on lines +26 to +31
pub async fn check(
iii: &III,
function_id: &str,
args: &Value,
timeout_ms: u64,
) -> PolicyOutcome {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Run rustfmt before merge (CI is currently red).

cargo fmt --all is required; rustfmt check is failing and blocking this PR (also reported for approval-gate/src/session.rs and approval-gate/src/functions/gate.rs).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/src/policy.rs` around lines 26 - 31, The Rust code formatting
does not comply with the project's style standards enforced by rustfmt. Run
cargo fmt --all to automatically format all Rust code in the approval-gate
workspace, which will fix the formatting violations in policy.rs check function,
session.rs, and functions/gate.rs and resolve the CI failure.

Source: Pipeline failures

Comment thread approval-gate/src/settings.rs Outdated
Comment thread approval-gate/src/state.rs Outdated
Comment thread approval-gate/src/testkit/engine.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
approval-gate/src/functions/gate.rs (1)

118-123: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return remaining hold timeout from persisted expiry, not full duration.

In duplicate-hold branches, Line 121 and Line 164 return full-duration values instead of the remaining time until expires_at. That can re-arm an aging/expired hold on redelivery. Use (record.expires_at - now_ms()).max(0) in both branches.

Suggested fix
         Ok(Some(existing)) => {
             return HookOutput::Hold {
-                pending_timeout_ms: (existing.expires_at - existing.pending_at).max(0),
+                pending_timeout_ms: (existing.expires_at - now_ms()).max(0),
             };
         }
@@
-        Ok(Some(_prior)) => HookOutput::Hold {
-            pending_timeout_ms: defaults.pending_timeout_ms,
-        },
+        Ok(Some(prior)) => HookOutput::Hold {
+            pending_timeout_ms: (prior.expires_at - now_ms()).max(0),
+        },

Also applies to: 163-165

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/src/functions/gate.rs` around lines 118 - 123, The
pending_timeout_ms calculation is returning the full duration between pending_at
and expires_at instead of the remaining time until expiry, which can re-arm
expired holds on redelivery. In the match branch starting at line 118 (the
Ok(Some(existing)) case) and the similar branch at line 163, replace the timeout
calculation to use the current time obtained from now_ms() and compute the
remaining duration as (expires_at - now_ms()).max(0) instead of (expires_at -
pending_at).max(0). This ensures the returned timeout represents only the time
remaining until actual expiry, not the original full duration.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@approval-gate/src/functions/gate.rs`:
- Around line 118-123: The pending_timeout_ms calculation is returning the full
duration between pending_at and expires_at instead of the remaining time until
expiry, which can re-arm expired holds on redelivery. In the match branch
starting at line 118 (the Ok(Some(existing)) case) and the similar branch at
line 163, replace the timeout calculation to use the current time obtained from
now_ms() and compute the remaining duration as (expires_at - now_ms()).max(0)
instead of (expires_at - pending_at).max(0). This ensures the returned timeout
represents only the time remaining until actual expiry, not the original full
duration.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 55f54017-321f-4a26-8800-2c5200b20eb0

📥 Commits

Reviewing files that changed from the base of the PR and between 1140eeb and 919e731.

📒 Files selected for processing (32)
  • approval-gate/README.md
  • approval-gate/architecture/internals.md
  • approval-gate/src/error.rs
  • approval-gate/src/events.rs
  • approval-gate/src/functions/add_always_allow.rs
  • approval-gate/src/functions/approve_always.rs
  • approval-gate/src/functions/clear_settings.rs
  • approval-gate/src/functions/gate.rs
  • approval-gate/src/functions/get_pending.rs
  • approval-gate/src/functions/get_settings.rs
  • approval-gate/src/functions/list_pending.rs
  • approval-gate/src/functions/mod.rs
  • approval-gate/src/functions/on_config_change.rs
  • approval-gate/src/functions/on_session_deleted.rs
  • approval-gate/src/functions/on_turn_completed.rs
  • approval-gate/src/functions/purge.rs
  • approval-gate/src/functions/remove_always_allow.rs
  • approval-gate/src/functions/resolve.rs
  • approval-gate/src/functions/set_mode.rs
  • approval-gate/src/functions/sweep.rs
  • approval-gate/src/gate_config.rs
  • approval-gate/src/harness.rs
  • approval-gate/src/lib.rs
  • approval-gate/src/main.rs
  • approval-gate/src/pending.rs
  • approval-gate/src/policy.rs
  • approval-gate/src/session.rs
  • approval-gate/src/settings.rs
  • approval-gate/src/state.rs
  • approval-gate/src/testkit/engine.rs
  • approval-gate/src/testkit/mod.rs
  • approval-gate/tests/integration.rs
✅ Files skipped from review due to trivial changes (2)
  • approval-gate/architecture/internals.md
  • approval-gate/README.md
🚧 Files skipped from review as they are similar to previous changes (26)
  • approval-gate/src/harness.rs
  • approval-gate/src/functions/add_always_allow.rs
  • approval-gate/src/functions/set_mode.rs
  • approval-gate/src/testkit/mod.rs
  • approval-gate/src/functions/on_config_change.rs
  • approval-gate/src/state.rs
  • approval-gate/src/lib.rs
  • approval-gate/src/functions/approve_always.rs
  • approval-gate/src/functions/clear_settings.rs
  • approval-gate/src/functions/remove_always_allow.rs
  • approval-gate/src/policy.rs
  • approval-gate/src/functions/mod.rs
  • approval-gate/src/functions/purge.rs
  • approval-gate/src/functions/on_session_deleted.rs
  • approval-gate/src/error.rs
  • approval-gate/src/session.rs
  • approval-gate/src/functions/resolve.rs
  • approval-gate/src/functions/on_turn_completed.rs
  • approval-gate/src/events.rs
  • approval-gate/src/gate_config.rs
  • approval-gate/src/functions/list_pending.rs
  • approval-gate/src/functions/sweep.rs
  • approval-gate/src/pending.rs
  • approval-gate/tests/integration.rs
  • approval-gate/src/testkit/engine.rs
  • approval-gate/src/settings.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 161-178: The iii engine version is inconsistent across CI jobs
that are supposed to mirror each other. The current "Install iii CLI + engine"
step sets VERSION to 0.19.2, but the interface-smoke step below uses VERSION
0.17.0. Update the VERSION environment variable in one of these locations to
match the other, ensuring both the engine-backed tests and interface-smoke tests
validate against the same iii engine version.

In `@approval-gate/src/functions/resolve.rs`:
- Around line 45-57: The expiry check at record.expires_at <= now_ms() is not
atomic with the subsequent harness::function::resolve call at lines 86-90,
creating a TOCTOU window where a record could expire between the check and the
harness invocation, allowing both the sweep and human-resolve to claim ownership
and call harness for the same record. To fix this, implement an atomic state
transition before any harness call is made: instead of just checking and
returning early, update the record state atomically (in the state store or
database) to claim ownership for the human resolver path before proceeding to
the harness call, ensuring that only one path (human or sweep) can successfully
proceed while the other encounters the claimed ownership state.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ac9031de-32b1-4c65-9566-5d77774cf25f

📥 Commits

Reviewing files that changed from the base of the PR and between 919e731 and b487eaa.

📒 Files selected for processing (3)
  • .github/workflows/ci.yml
  • approval-gate/src/events.rs
  • approval-gate/src/functions/resolve.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • approval-gate/src/events.rs

Comment thread .github/workflows/ci.yml Outdated
Comment on lines +45 to +57
// An expired record belongs exclusively to the timeout sweep, which
// collects every record with `expires_at <= now` (see `sweep::handle`).
// Resolving it here would race that sweep: both paths would call
// `harness::function::resolve` for the same call, and a late timeout
// could clobber this human decision. Treat a post-deadline resolve as a
// benign no-op — the sweep delivers (or already delivered) the timeout.
// The boundary matches the sweep exactly, so no record is owned by both.
if record.expires_at <= now_ms() {
return Ok(ResolveResponse {
resolved: false,
turn_resumed: None,
});
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Close the expiry TOCTOU window before calling the harness.

The expiry check at Line 52 is not atomic with the harness resolve at Lines 86-90. A request can pass the check just before deadline, then expire while in flight, letting sweep and human-resolve both call harness::function::resolve for the same record. That breaks the “no shared ownership” guarantee in this block.

Please move to an atomic ownership claim (human vs sweep) in state before either path calls harness, so only one resolver can proceed.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@approval-gate/src/functions/resolve.rs` around lines 45 - 57, The expiry
check at record.expires_at <= now_ms() is not atomic with the subsequent
harness::function::resolve call at lines 86-90, creating a TOCTOU window where a
record could expire between the check and the harness invocation, allowing both
the sweep and human-resolve to claim ownership and call harness for the same
record. To fix this, implement an atomic state transition before any harness
call is made: instead of just checking and returning early, update the record
state atomically (in the state store or database) to claim ownership for the
human resolver path before proceeding to the harness call, ensuring that only
one path (human or sweep) can successfully proceed while the other encounters
the claimed ownership state.

ytallo added 11 commits June 15, 2026 15:18
…06-agentic spec

The policy and decision surface for human-held function calls
(tech-specs/2026-06-agentic/approval-gate.md):

- approval::gate pre_dispatch hook: human-only defense, per-session
  mode/allow-list evaluation (ported exactly from the proven
  turn-orchestrator consultBefore semantics), policy::check_permissions
  fallback (unparseable reply degrades to hold; transport failure fails
  closed), synchronous pending-record write before hold (never hold
  blind), idempotent on function_call_id.
- approval::resolve: allow releases via harness::function::resolve
  action execute, deny delivers an is_error DenialEnvelope; no decision
  records persisted - resolve first, delete, then emit.
- Ephemeral pending inbox (approval_pending scope) with four deletion
  paths (resolve, turn terminal, session deleted, cron sweep on
  expires_at); deletion is the emit gate, so pending_resolved fires
  exactly once per record. state::set null is the atomic gate plus a
  follow-up state::delete to clean the engine's null tombstone.
- Per-session settings (approval_settings scope) with lazy seeding:
  reads never write; first mutation materializes from the deployment
  defaults (seed entries granted_by seed).
- Two custom trigger types (approval::pending_created /
  approval::pending_resolved) with session_id/metadata tenancy filters.
- approval-gate configuration entry (default_mode, always_allow_seed,
  pending_timeout_ms) with reactive reload; built-in defaults when the
  configuration worker is absent.
- Recursive argument redaction (secret keys, 256-codepoint clip, depth
  cap) so records are safe for notification channels.

All harness/session-manager trigger bindings are best-effort: the
greenfield harness contracts are not implemented yet, so the worker
boots standalone and the engine-backed integration suite fakes
policy::check_permissions, harness::function::resolve, and session::get.

118 unit tests (full prior-art permission matrix) + 6 engine-backed
integration scenarios; 94% line coverage.
…ation contract)

Follows docs/architecture/per-worker-architecture.md and the
session-manager/architecture reference layout: internals.md for
maintainers (gate control flow, emit-gate deletion mechanics, lazy
seeding, verified engine facts), integration.md for consumers (function
and trigger contract, harness handoff requirements, deployment notes,
agent-exposure guidance), plus the index README with system overview and
vocabulary. Links the folder from the consumer README.
…against a real engine

Replace the Bus trait seam (bus.rs + testkit/fake_bus.rs) with direct
iii.trigger() calls through thin per-target wrappers (harness.rs,
session.rs, state.rs), matching the llm-router convention. Rewrite the
unit and integration tests to run against a real spawned iii engine via
testkit/engine.rs instead of the in-memory FakeBus/MemoryState doubles.

The public RPC contract is unchanged: all approval::* function ids,
payloads, and the harness::function::resolve envelope are identical.
Only internal signatures move (Deps.iii: Arc<III>, Emitter::new).

Engine-backed tests self-skip when no iii binary is available
(III_ENGINE_BIN); CI must provide one or coverage drops.
Engine-backed tests spawn a real iii engine and self-skip when no
binary is on PATH, so the approval-gate behavior suite (fail-closed,
exactly-once emission, sweep) passed in CI without actually running.
Install the iii CLI before `cargo test` so those tests execute. Pinned
to 0.19.2 to match the worker's iii-sdk; workers with only pure-unit
tests are unaffected.
A human resolve arriving after a hold's deadline raced the timeout
sweep: both paths called harness::function::resolve for the same call,
and a late timeout could clobber the human decision. Reject a resolve
once the record has expired (expires_at <= now), matching the sweep's
boundary exactly, so an expired record is owned solely by the sweep.

Updates two test seeds that used a past-epoch expires_at to a live
deadline, and adds a test that an expired record is left for the sweep.
Engine-backed tests each spawn their own iii engine; running them in
parallel on a 2-core CI runner starves the engines until state ops hit
their timeout. Run with --test-threads=1 so one engine runs at a time.
Pure-unit-only workers are unaffected (their tests are fast either way).
The prior change ran globally in the rust matrix, so it installed iii
for every changed worker — which made other workers' engine-gated
integration tests (console, iii-directory, session-manager) attempt a
real engine dial and fail. Gate both the install and --test-threads=1
on the approval-gate worker; every other worker keeps its prior command
and behavior.
Per docs/sops/new-worker.md §3 and §6: add the Modules table row, the
create-tag.yml worker option, and the release.yml `approval-gate/v*`
tag pattern. Without these the worker has no README entry, can't be
picked in the manual release dropdown, and pushing an
`approval-gate/vX.Y.Z` tag would trigger nothing.
…od build

The engine-backed tests live inline in src/, so testkit must stay in the
crate — but it should not compile into the production lib/binary. Gate
`mod testkit` on `any(test, feature = "testkit")` and add the feature;
a self dev-dependency enables it for the integration tests so `cargo
test` still works without `--all-features`.
Snapshot the request/response schemas of the 14 approval::* functions
and the 2 trigger payloads under tests/golden/schemas/*.json, generated
with SchemaSettings::draft07() — the same generator iii-sdk uses at
registration, so the goldens pin exactly what the engine publishes. A
catalog()/trigger_catalog() is the single source of truth; function ids
and descriptions are now shared consts used by both register_all and the
catalog. Regenerate with UPDATE_GOLDENS=1.
install.iii.dev's release index can briefly 404 an existing engine
version (observed right after a new release published), which failed the
job even though 0.19.2 is valid. Retry the install up to 3x; the final
`iii --version` still hard-fails if all attempts miss.
ytallo added 2 commits June 15, 2026 15:19
The install script resolves the engine release via the GitHub API; with
no token it shares the 60/hr unauthenticated limit, so under CI load it
fails with "rate limit hit" (and earlier surfaced as a misleading
"release tag not found"). Pass the Actions GITHUB_TOKEN (5000/hr) to both
the rust-job engine install and the pre-existing interface-smoke install,
which flaked across workers for the same reason. The rust-step retry stays
as a genuine-network-blip guard.
Restore .github/workflows/ci.yml to main. The engine-backed tests will
self-skip in CI again (no iii on PATH); they still run locally / via
`cargo test` with an engine available, and the pure unit + golden-schema
tests still run in CI. Engine-test CI can be handled outside this worker PR.
…ng SOP

Conform the worker's wire surface to the new-worker / binary-worker SOP
(§7 Naming: multi-word segments are kebab-case, never snake_case), matching
context-manager (context::count-tokens, context::on-config-change).

Renamed function ids: approval::{list_pending, get_pending, set_mode,
add_always_allow, remove_always_allow, approve_always, get_settings,
clear_settings, on_config_change, on_session_deleted, on_turn_completed}
and trigger ids approval::{pending_created, pending_resolved} to their
kebab equivalents. Updated the harness trigger bindings the worker reads
(harness::hook::pre-dispatch, harness::turn-completed).

Golden wire-schema files renamed and regenerated; gate/resolve/sweep are
single-word and unchanged. The hook payload discriminant value
point:"pre_dispatch" is data, not an id, and is left as-is.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant