feat(shell): v0.4.0 — SDK bump, central config + hot-reload, LLM-friendly typed API, hardening + DX by andersonleal · Pull Request #238 · iii-hq/workers

andersonleal · 2026-06-09T21:15:40Z

Summary

Shell worker v0.3.7 → v0.4.0 — a single breaking release. Bumps the SDK, moves the worker onto the central configuration worker with hot-reload, publishes typed JSON schemas for an LLM-friendly API, and adds a round of pre-landing hardening plus DX features driven by a real agent session.

Two commits: the v0.4.0 base, then the pre-landing hardening + DX additions.

SDK & observability

iii-sdk 0.16.0-next.2 → 0.19.1-next.1; add iii-observability
per-call OpenTelemetry (shell.calls, shell.call.duration_ms, shell.jobs.running gauge, shell.exec.output_truncated) + tracing spans

Central configuration

database-style central config via the configuration worker with serialized hot-reload (fetch → build → swap under one lock); fail-closed boot reconcile; payload-untrusted re-fetch; shell::config-status denied to agents
removed manifest.rs; restored ./config.yaml default; added config.collect.yaml for CI

LLM-friendly typed API

published request/response JSON schemas for all functions (engine tool listing shows field types)
S-codes surfaced as the wire error.code (agents branch on error.code instead of parsing messages); unified S-code table
per-call cwd + gated env for exec/exec_bg; structured, sandbox-safe fs::* responses

Pre-landing hardening (adversarial review)

PID-safe kill: host bg jobs terminate via a per-job notify channel owned by the drain task, not a bare-pid SIGKILL — closes a PID-reuse window that could kill an unrelated process when the worker runs as root
process-group spawn + group-kill on kill/timeout/shutdown (no orphaned grandchildren); bounded drain join; atomic host-bg finalize
client-side RPC timeout on sandbox bg forwards (a hung engine can't leak a concurrency slot)
fs: read cap + binary-skip in sed, byte-bounded grep line reads, recursive chmod/rm/ls/mv moved off the async executor
reject exec command paths that resolve inside the writable fs jail (closes an allowlist-basename → fs::write → exec host RCE)
broadened per-call env denylist (loader + interpreter startup-file keys); fixed the rejection message that named HOME/PATH as settable

LLM-DX API additions (from a real agent session)

inline shell::fs::write: content accepts a plain string (host target) — no streaming channel to construct; sandbox still requires a ContentRef
multi-file batch write: files: [{ path, content, mode?, parents? }] in one call, with per-file results
stdin on shell::exec / shell::exec_bg (host-only): pipe to tee/patch/filters without a shell heredoc
separate max_bg_timeout_ms (default 0 = unbounded), distinct from the foreground max_timeout_ms — long installs/builds/dev servers are no longer killed at the 30s foreground cap

Test plan

cargo build --all-targets — clean
cargo clippy --all-targets — clean
cargo fmt --all --check — clean
cargo test — 540 passing, 0 failing
adversarial verification of the concurrency core + the new write/stdin/bg-timeout features — SOUND
live-verified earlier on the running engine: cwd/env present in the published shell::exec schema; S-code returned as the top-level wire error.code
reviewer: exercise the sandbox round-trip for the new write/exec shapes against a live engine

Leaves 6 unrelated */Cargo.lock files (other workers) untouched.

Summary by CodeRabbit

New Features
- Configuration hot-reload with a config-status endpoint and fail-closed boot.
- Per-call exec overrides: cwd, env, and stdin (host-only; sandbox rejects).
- Background-job hard-cap timeout (configurable max_bg_timeout_ms).
- Inline single-file and multi-file batch filesystem writes; richer fs responses (path, entries_changed, per-file results).
- Exec/FS telemetry and runtime metrics.
Chores
- Release bumped to 0.4.0.
- fs.allow_special_bits toggle and tightened command/env confinement.

…ndly typed API, hardening (v0.4.0) BREAKING CHANGE: bumps iii-sdk to 0.19.1-next.1 and restructures the shell worker around the central `configuration` worker with hot-reload. 0.3.7 -> 0.4.0. SDK & observability - iii-sdk 0.16.0-next.2 -> 0.19.1-next.1; add iii-observability 0.19.1-next.1 - per-call OpenTelemetry: shell.calls, shell.call.duration_ms, shell.jobs.running gauge, shell.exec.output_truncated; tracing spans Central configuration - database-style central config via the `configuration` worker with serialized hot-reload (fetch -> build -> swap under a single lock) - fail-closed boot reconcile; re-fetch authoritative config (no payload trust); validate seed before persist; deny shell::on-config-change for agents - remove manifest.rs; restore ./config.yaml --config default; config.collect.yaml LLM-friendly typed API - publish request/response JSON schemas (typed ExecRequest/ExecBgRequest, empty ListRequest/ConfigStatusRequest) so the engine tool listing shows field types - surface S-codes as the wire error.code (From<FsError/ExecError> for IIIError::Remote) so agents can branch on error.code; unify the S-code table - per-call cwd + gated env for exec/exec_bg (jail-confined cwd, allowed_env gating) - structured sandbox-safe fs responses; discoverable error-code hints in descriptions Hardening - job lifecycle: reaper eviction, host-bg timeout, finalize-once, kill running host jobs by pid (on demand + on shutdown), reap rejected child - fs host: setuid/setgid guard, regex size caps, grep/sed walk off the async executor - deny HOME + glibc exec-vector env keys (GCONV_PATH, LD_*/DYLD_*, IFS, PATH ...) per call 522 tests passing; build/clippy/fmt clean.

…te, stdin, bg timeout) Follow-on work on top of the v0.4.0 squash, adversarially verified (build/clippy/fmt clean, 540 tests passing). Pre-landing review hardening - kill: terminate host bg jobs via a per-job notify channel owned by the drain task instead of a bare-pid SIGKILL — closes a PID-reuse window that could SIGKILL an unrelated process when the worker runs as root - spawn children in their own process group; group-kill on kill/timeout/shutdown so grandchildren don't orphan; bounded drain join so a held pipe can't wedge a job - atomic host-bg finalize (status + exit_code + output published together) - client-side RPC timeout on sandbox bg forwards (a hung engine can't leak a slot) - fs: enforce a read cap + binary-skip in sed, byte-bounded grep line reads, treat 0 caps as defaults, move recursive chmod/rm/ls/mv off the async executor - reject exec command paths that resolve inside the writable fs jail (closes an allowlist-basename -> fs::write -> exec host RCE) - add interpreter startup-file keys (BASH_ENV/ENV/PYTHONSTARTUP/PERL5OPT/RUBYOPT/ NODE_OPTIONS) to the per-call env denylist; stop the env-rejection message from naming HOME/PATH as "settable" LLM-DX API additions (from a real agent session that hit the friction) - shell::fs::write accepts inline string content (host target) in addition to a streaming ContentRef; sandbox still requires a ContentRef (inline -> S210) - shell::fs::write batch form: files: [{path, content, mode?, parents?}] in one call, with per-file results - shell::exec / shell::exec_bg gain an optional host-only stdin (pipe to tee/ patch/filters without a shell heredoc) - separate max_bg_timeout_ms (default 0 = unbounded) for host bg jobs, distinct from the foreground max_timeout_ms — long installs/builds/dev servers are no longer killed at the 30s foreground cap Docs (README / ARCHITECTURE / SKILL / config.yaml / config.collect.yaml) updated for all of the above.

vercel · 2026-06-09T21:15:47Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
workers	Ready	Preview, Comment	Jun 9, 2026 10:20pm

github-actions · 2026-06-09T21:15:56Z

skill-check — worker

0 verified, 14 skipped (no docs/).

Layer	Result
structure	✓
vale	✓
ai	✓
render	✓

Four for four. Nicely done.

coderabbitai · 2026-06-09T21:15:58Z

📝 Walkthrough

Walkthrough

Centralizes shell configuration with hot-reload, enforces per-call exec policy (cwd/env/stdin), unifies S-code error mapping, reworks host background-job lifecycle, redesigns filesystem write/streaming and sandbox contracts, adds telemetry, and updates documentation and tests for v0.4.0.

Changes

Shell Worker v0.4.0 Complete Refactor

Layer / File(s)	Summary
Release surface & operator docs `iii-permissions.yaml`, `shell/ARCHITECTURE.md`, `shell/README.md`, `shell/skills/SKILL.md`, `shell/Cargo.toml`, `shell/config*.yaml`	Permissions denylist extended; CLI/docs updated for config-driven startup, new fields (`max_bg_timeout_ms`, `fs.allow_special_bits`), function contract changes (`entries_changed`, `shell::config-status`, write inline vs stream).
Configuration hot-reload & runtime wiring `shell/src/config.rs`, `shell/src/configuration.rs`, `shell/src/main.rs`	ShellConfig gains JsonSchema and serialization helpers; configuration.rs builds prepared runtime, register/fetch/reload logic, serialized reloads, register trigger `shell::on-config-change`, and boot reconcile; main.rs builds AppState and registers handlers after fail-closed reconcile.
S-code mapping & typed error lifting `shell/src/scode.rs`, `shell/src/exec/error.rs`, `shell/src/fs/error.rs`	Unified S-code canonicalization and recovery; ExecError/FsError lift into `iii_sdk::IIIError::Remote` preserving the original S-code and message; local stringified helpers removed.
Exec overrides & backend changes `shell/src/exec/policy.rs`, `shell/src/exec/mod.rs`, `shell/src/exec/backend.rs`, `shell/src/exec/host.rs`, `shell/src/exec/sandbox.rs`, `shell/src/functions/exec*.rs`, `shell/src/functions/types.rs`	Adds `ExecOverrides` and `build_overrides` validation (cwd confinement, allowed_env + dangerous-key denylist, stdin); threads overrides into ExecBackend::run; host applies overrides; sandbox rejects non-empty overrides; handlers updated to use typed IIIError.
Background-job lifecycle & kill flow `shell/src/jobs.rs`, `shell/src/functions/exec_bg.rs`, `shell/src/functions/kill.rs`, `shell/src/functions/status.rs`	JobHandle tracks `host_pid`, try_reserve_and_insert returns a RAII RunningJobGuard, kill-signal registry added, `spawn_host_job` accepts overrides, host hard-cap `max_bg_timeout_ms` enforced, join_drain for output, and typed status/kill errors introduced.
Filesystem wire & backend redesign `shell/src/fs/mod.rs`, `shell/src/fs/host.rs`, `shell/src/fs/sandbox.rs`, `shell/src/functions/fs_*.rs`	Wire: `WriteContent` (Inline/Stream), `WriteRequest::into_specs` for single vs batch; enriched responses (`entries_changed` alias for `updated`, path/flags, per-file results). Host backend: confine_path, spawn_blocking offload, sed/grep caps, special-bits checks, inline vs streamed write limits. Sandbox write rejects inline content and forwards stream `ContentRef`.
Handlers & error mapping `shell/src/functions/*`	Handlers now return typed `iii_sdk::IIIError`/`ExecError`, removed err_to_string helpers, adapt fs write batching and exec override building, and updated tests.
Telemetry & registration refactor `shell/src/telemetry.rs`, `shell/src/main.rs`, `shell/src/lib.rs`	New OTel metrics and record_call wrapper; main initializes telemetry and uses AppState to source live runtime per-call; fs handler registration centralized.
Tests & E2E harness updates `shell/tests/`, `shell/tests/e2e/workers/harness/src/`	Extensive unit and e2e updates: assert new response fields, typed S-codes, inline/batch write cases, exec stdin cases, bg job timeout cases, and harness expectError matching on wire `code`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

iii-hq/workers#228: Overlapping host fs confinement/path handling changes.
iii-hq/workers#192: Related iii-sdk/registration and main wiring changes.
iii-hq/workers#80: Prior shell exec/config refinements this PR extends.

Suggested reviewers

sergiofilhowz
ytallo

"🐰 I hopped through configs late at night,
tightened paths and made S-codes right,
per-call cages and writes that stream,
hot-reloads humming a careful dream,
tests green, docs tidy — hop, delight!"

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/shell-sdk-bump-config-llm-api

coderabbitai

Actionable comments posted: 18

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@shell/config.yaml`:
- Line 2: Current config sets max_bg_timeout_ms: 0 which allows background jobs
to run indefinitely and can exhaust job slots; change the default to a finite
cap or implement a fallback to max_timeout_ms when max_bg_timeout_ms is
missing/zero. Update the configuration value for max_bg_timeout_ms to a sensible
non-zero default (e.g., same as max_timeout_ms) or update the job scheduler code
that reads max_bg_timeout_ms to treat 0/undefined as "use max_timeout_ms" and
enforce that timeout when scheduling background jobs; reference the config key
max_bg_timeout_ms, the fallback key max_timeout_ms, and the concurrency limiter
max_concurrent_jobs so the change prevents background jobs from holding slots
forever.

In `@shell/README.md`:
- Around line 158-166: The upgrade notes duplicate the "shell::fs::chmod
response field renamed `updated` → `entries_changed`" bullet; edit the README.md
upgrade list to remove the repeated instance so the chmod rename appears only
once (keep the first occurrence and delete the later duplicate referring to
shell::fs::chmod and `entries_changed`), ensuring the rest of the bullets and
ordering remain unchanged.
- Line 105: The README link for shell::exec points to the wrong fragment
`#per-call-cwd-and-env-host-target`; update the link so it matches the actual
section heading slug (which includes "stdin") or change the section title to
match the existing fragment. Concretely, in the shell::exec table entry replace
the fragment with the correct anchor (e.g.,
`#per-call-cwd-and-env-host-target-stdin` if the heading reads "Per-call `cwd`
and `env` (host-target, stdin)") or rename the heading to remove "stdin" so the
existing `#per-call-cwd-and-env-host-target` fragment resolves.

In `@shell/skills/SKILL.md`:
- Around line 66-68: The documentation incorrectly implies that all
sandbox-targeted calls to shell::exec are rejected; update the SKILL.md text for
shell::exec to state that only host-only override fields (e.g., providing a
host-only stdin string) are rejected with error S210 on sandbox targets, while
plain sandbox executions are allowed; specifically edit the sentence mentioning
"optional host-only `stdin`" to clarify that the host-only override is rejected
on sandbox targets (S210) but sandbox exec itself remains valid, and ensure the
wording references shell::exec and S210 for clarity.

In `@shell/src/config.rs`:
- Around line 245-260: The code currently only rejects commands with '/' when
self.fs.host_root is Some, letting an unjailed fs (host_root None +
self.fs.allow_unjailed == true) execute arbitrary host paths; modify the
cmd.contains('/') handling to also return an Err when
self.fs.host_root.is_none() && self.fs.allow_unjailed is true. Concretely,
inside the same branch that checks cmd.contains('/'), add a guard using
self.fs.allow_unjailed and self.fs.host_root to reject any path-form command
(cmd) when the FS is explicitly unjailed, producing a similar error message
referencing cmd and that executing files written via shell::fs::write or
arbitrary host paths is not allowed.

In `@shell/src/configuration.rs`:
- Around line 184-188: Modify try_get_config_value to stop string-matching RPC
errors and instead match the structured IIIError::Remote variant from
trigger_with_retry: ensure trigger_with_retry returns/propagates an IIIError
(not a String) so inside try_get_config_value you can match Err(IIIError::Remote
{ code, .. }) if code == "NOT_FOUND" => Ok(None), map Ok(resp) =>
Ok(resp.get("value").cloned()), and for other Err(e) return Err(e.to_string())
(or propagate the IIIError if you prefer changing the function error type);
reference try_get_config_value, trigger_with_retry, IIIError::Remote, and the
"configuration::get" / CONFIG_ID call when making the change.

In `@shell/src/exec/policy.rs`:
- Around line 53-88: The current is_dangerous_env_key only checks
DANGEROUS_ENV_KEYS for exact matches which misses other loader-related names;
update is_dangerous_env_key to return true if the key is exactly in
DANGEROUS_ENV_KEYS OR if it starts with the loader-family prefixes (e.g.
key.starts_with("LD_") || key.starts_with("DYLD_")), so any LD_/DYLD_ variant is
rejected; keep the existing DANGEROUS_ENV_KEYS for non-family names and
optionally remove redundant LD_*/DYLD_* entries from that constant to avoid
duplication.

In `@shell/src/fs/host.rs`:
- Around line 1243-1350: The loop in sed currently follows symlinks when reading
(std::fs::metadata/read_to_string) but then writes to the symlink path with
std::fs::rename(&tmp, p), which replaces the link with a regular file; fix by
detecting and handling symlinks consistently: call std::fs::symlink_metadata(p)
and check file_type().is_symlink() for each anchored_path (from
lexical_operand_with) and either push an FsSedFileResult error (e.g., "operand
is a symlink; skipped") to reject symlink operands, or else canonicalize the
operand once (std::fs::canonicalize) and use that canonical path for the
metadata/read_to_string/temp_sibling/write/rename so the same inode is read and
rewritten; update the code paths around the
metadata/read_to_string/tmp/temp_sibling/rename sections (the loop over
anchored, the use of looks_binary, temp_sibling, and write/rename) to use the
chosen consistent approach.
- Around line 249-260: The current canonicalize_with_fallback error mapping
treats any dangling symlink as S215, blocking link-preserving operations; change
canonicalize_with_fallback (or add an overload like
canonicalize_with_fallback_allow_dangling_leaf) to accept a flag
"allow_dangling_leaf" and only map dangling-symlink errors to S215 when that
flag is false, otherwise return a lexical-fallback style result (or a distinct
error code) so callers can switch back to lexical resolution; update callers for
link-preserving handlers (rm, mv, chmod, sed or their handler functions) to call
the new variant with allow_dangling_leaf=true, leaving other callers using the
default fully-resolved behavior (allow_dangling_leaf=false) that still maps to
S215 via FsError::new("S215", ...) while non-dangling errors keep S210.
- Around line 859-908: Replace the racy precheck using dst_p.exists() and move
the overwrite/no-overwrite decision into the blocking rename unit: when
req.overwrite is false, call an atomic no-replace move (use libc::renameat2 with
RENAME_NOREPLACE on Linux) inside the tokio::task::spawn_blocking closure
instead of the separate exists check; if renameat2 returns EEXIST convert to
FsError::new("S213", ...) as before, on ENOSYS fall back to the existing
std::fs::rename / EXDEV copy+rename+unlink path, and on success set overwrote =
false (derive the overwrote flag from whether the operation replaced an existing
dst rather than the removed dst_existed precheck); keep using temp_sibling,
std::fs::copy, and FsError::from_io in the fallback and map errors consistently.

In `@shell/src/fs/mod.rs`:
- Around line 485-506: The batch-path handling in WriteRequest::into_specs
currently only rejects when path or content are set but silently ignores
top-level mode and parents; update into_specs (the method on WriteRequest) so
that when self.files.is_some() it checks for any of the single-file fields
(self.path, self.content, self.mode (non-default/empty), and self.parents) and
returns an FsError "S210" if any are present, with a message like "provide
either the single-file `path`+`content`+`mode`+`parents` or a `files` array, not
both"; ensure you reference the WriteRequest fields (path, content, mode,
parents, files) so callers are rejected when mixing forms rather than silently
dropping mode/parents.

In `@shell/src/functions/exec_bg.rs`:
- Around line 55-63: The S210 error message returned in the
!overrides.is_empty() check should mention stdin as a host-only override as
well; update the error string in the exec_bg.rs block that checks overrides (the
!overrides.is_empty() branch) so it references "cwd/env/stdin" (or "cwd, env, or
stdin") and explains they are host-only and not forwarded by the sandbox exec
protocol (keep the S210 code). Ensure the updated message still directs callers
to drop those overrides or use target: host and preserve the existing error flow
using the same return Err(...) in that branch.
- Around line 121-126: The Err branch in exec_bg (the Err((running, mut handle))
path) only calls ch.start_kill() which kills the direct child but can leave
forked descendants running; update this branch (after taking handle.child into
ch and before awaiting ch.wait()) to also tear down the whole process group for
the child PID (send a SIGKILL to the child's process group, e.g. kill(-pid,
SIGKILL) on Unix or equivalent) so rejected spawns cannot leak descendants; keep
the existing start_kill() and await ch.wait() but add the process-group kill
using the child's pid (referencing ch, handle, start_kill, wait, and the exec_bg
error path).

In `@shell/src/functions/fs_write.rs`:
- Around line 30-45: The loop uses backend.write(spec).await and accumulates
WriteFileResult into files and total, but on the first backend error it maps to
iii_sdk::IIIError and returns immediately, discarding partial results; update
the code so that when backend.write fails you return an error variant or
response that includes the accumulated files Vec<WriteFileResult> and total
bytes (e.g., extend IIIError or introduce a BatchWriteError containing files and
bytes_written) instead of dropping them; adjust the return type of the function
(or map_err path) to propagate the partial WriteResponse (files and
bytes_written) alongside the error so callers can see which files succeeded and
how many bytes were written.

In `@shell/src/functions/kill.rs`:
- Around line 25-39: Branch 1 currently calls child.start_kill() which only
kills the direct child; change it to use the same process-group kill path used
in the timeout/notify flow so descendant processes are also terminated. Locate
the branch that checks h.child.as_mut() and replace the start_kill() call with
the process-group kill helper used elsewhere in this module (the code path used
for timeout/notify), ensuring you still update h.record.status to
JobStatus::Killed, set finished_at_ms, and return the same KillResponse
structure.
- Around line 214-228: This test acquires HOST_SWEEP_TEST_GUARD but not
GAUGE_TEST_GUARD, allowing a race on the RUNNING_JOBS gauge when
spawn_host_job() reserves a slot; to fix, acquire GAUGE_TEST_GUARD before
HOST_SWEEP_TEST_GUARD at the top of this test (same ordering used in exec_bg.rs)
so the running-gauge guard serializes with jobs.rs assertions around
RUNNING_JOBS while calling spawn_host_job().

In `@shell/src/functions/types.rs`:
- Around line 270-285: ListRequest and ConfigStatusRequest are empty no-arg
request types that currently accept unknown JSON fields; add serde unknown-field
rejection to match the published empty-object schema by annotating both structs
(ListRequest and ConfigStatusRequest) with #[serde(deny_unknown_fields)] so
deserialization fails on extra fields, ensuring runtime behavior aligns with the
JsonSchema contract.

In `@shell/src/scode.rs`:
- Around line 75-92: scan_s_code currently only checks the byte before the match
and can accept "S211foo" or "S2119"; add a right-hand boundary check so the
3-digit S-code is not immediately followed by an ASCII alphanumeric or
underscore. In the loop in scan_s_code (using bytes and index i), after
confirming bytes[i..i+4] matches, compute a followed_by_word flag by testing
whether i + 4 < bytes.len() and if so whether bytes[i +
4].is_ascii_alphanumeric() || bytes[i + 4] == b'_', and only return
Some(&s[i..i+4]) when !preceded_by_word && !followed_by_word (treat
end-of-string as OK).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 513a10c3-22ad-409b-9b81-aaa780937836

📥 Commits

Reviewing files that changed from the base of the PR and between 1c75ee9 and 3e9f82a.

⛔ Files ignored due to path filters (1)

shell/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (51)

iii-permissions.yaml
shell/ARCHITECTURE.md
shell/Cargo.toml
shell/README.md
shell/config.collect.yaml
shell/config.yaml
shell/skills/SKILL.md
shell/src/config.rs
shell/src/configuration.rs
shell/src/exec/backend.rs
shell/src/exec/error.rs
shell/src/exec/host.rs
shell/src/exec/mod.rs
shell/src/exec/policy.rs
shell/src/exec/sandbox.rs
shell/src/exec_dispatch.rs
shell/src/fs/error.rs
shell/src/fs/host.rs
shell/src/fs/mod.rs
shell/src/fs/sandbox.rs
shell/src/functions/exec.rs
shell/src/functions/exec_bg.rs
shell/src/functions/fs_chmod.rs
shell/src/functions/fs_dispatch.rs
shell/src/functions/fs_grep.rs
shell/src/functions/fs_ls.rs
shell/src/functions/fs_mkdir.rs
shell/src/functions/fs_mv.rs
shell/src/functions/fs_read.rs
shell/src/functions/fs_rm.rs
shell/src/functions/fs_sed.rs
shell/src/functions/fs_stat.rs
shell/src/functions/fs_write.rs
shell/src/functions/kill.rs
shell/src/functions/status.rs
shell/src/functions/types.rs
shell/src/jobs.rs
shell/src/lib.rs
shell/src/main.rs
shell/src/manifest.rs
shell/src/scode.rs
shell/src/telemetry.rs
shell/tests/e2e/run-tests-jailed.sh
shell/tests/e2e/run-tests.sh
shell/tests/e2e/workers/harness/src/cases-fs-host.ts
shell/tests/e2e/workers/harness/src/cases-fs-sandbox.ts
shell/tests/function_handlers.rs
shell/tests/host_fs_branches.rs
shell/tests/jobs_lifecycle.rs
shell/tests/sandbox_dispatch.rs
shell/tests/sandbox_exec_dispatch.rs

💤 Files with no reviewable changes (2)

shell/src/manifest.rs
shell/src/functions/fs_dispatch.rs

… & harness S-code matching Adds live, over-the-engine e2e coverage for the four v0.4.0 features (inline + batch fs::write, exec/exec_bg stdin, separate max_bg_timeout_ms). Running it surfaced two real issues, fixed here: Fixes - fs::write: a single-element `files: [x]` batch wrongly collapsed to the flat single-file response (empty `files`). `WriteRequest::into_specs` now reports `is_batch`, so the `files` form always returns the per-file `files` array even for one entry — a caller that sends `files` always gets `files` back. - e2e harness: the v0.4.0 change that surfaces S-codes as the structured wire `error.code` (so agents branch on it) had silently broken ~43 harness assertions that substring-matched the S-code in the error MESSAGE. The SDK rejects with the raw `{ code, message, stacktrace }` body, so `expectError` now matches the pattern against `e.code` AND `e.message`. Updated the few cases that did their own message inspection, and repurposed pbreak_write_content_wrong_type (a string `content` is now valid inline). New e2e cases - cases-fs-write-inline.ts: inline string write + on-disk round-trip, multi-file batch with per-file results, and the host-only/ambiguity rejections (both forms, empty files, missing content, inline-on-sandbox -> S210). - cases-exec-stdin.ts: stdin piped to `cat` on exec/exec_bg, closed-stdin default, stdin-on-sandbox -> S210. - cases-jobs-bg-timeout.ts: bg job killed at max_bg_timeout_ms (with the hard-cap note, proving it survived past the foreground cap) + under-cap completion. - config.yaml / config-jailed.yaml: max_bg_timeout_ms: 6000 (> max_timeout_ms). Verified: full e2e suite 167/167 pass against the engine; 238 Rust lib tests pass; cargo fmt/clippy clean; harness tsc --noEmit clean.

CI fix: the jailed harness suite read S-codes from error.message, but v0.4.0 moved them to the structured error.code field — update both jailed cases to read e.code (only the jailed suite runs in CI). Security/correctness (CodeRabbit review): - config: reject command paths when fs is unjailed — closes an RCE where an agent could shell::fs::write /tmp/ls then exec command:"/tmp/ls" (basename allowlisted) and bypass the read-only allowlist. Unjailed mode has no writable boundary, so any path could run planted bytes; require bare PATH-resolved names. Jailed mode unchanged. - policy: reject LD_*/DYLD_* env keys by family prefix, not exact name. - exec_bg/kill: SIGKILL the whole process group on cap-rejection and on kill branch 1 (both still own the un-reaped child, so the pid is safe); prevents forked descendants leaking after killed:true. - fs::write: batch form now rejects top-level mode/parents instead of silently dropping them (mode/parents are now Option). - fs::sed: skip symlink operands (rename would replace the link with a regular file, destroying it); per-file error, link + target preserved. - scode: require a right-hand boundary so S2119/S211foo don't parse as S211. - configuration: case-insensitive NOT_FOUND match (real code is function_not_found). Docs: fix broken README fragment link, drop a duplicate chmod migration bullet, clarify SKILL.md sandbox exec semantics (only host-only overrides are rejected on a sandbox target, not all sandbox exec), name stdin in the sandbox override-rejection messages. Declined: finite max_bg_timeout_ms default (intentional/documented), dangling-symlink S215 relaxation and mv RENAME_NOREPLACE (heavy/risky, fail-closed), deny_unknown_fields on no-arg requests — the latter is actively harmful: the engine injects _caller_worker_id into every payload, so a strict struct rejects all calls. Added a regression case pinning that shell::list tolerates extra fields. E2e: +new cases for the unjailed RCE guard, batch-write mode/parents rejection, sed symlink skip, and the list extra-field contract. Default 171/171, jailed 2/2.

coderabbitai

♻️ Duplicate comments (1)

shell/src/fs/host.rs (1)
1259-1280: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Dangling symlink operands still fail before this skip path.

This only skips symlinks that survive validation. A broken link under host_root still gets rejected by the pre-validation loop at Line 1240, so one dangling operand aborts the entire sed call instead of producing the per-file "skipped" result this block is trying to return. If the intent is to make symlink operands non-fatal, sed needs the same kind of "allow dangling final component" confinement path before it reaches this branch.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shell/src/fs/host.rs` around lines 1259 - 1280, The pre-validation loop
currently rejects dangling symlink operands before the later symlink-skip block;
change the pre-validation to accept a final component that is a symlink by using
std::fs::symlink_metadata (or by calling symlink_metadata when metadata fails)
and treating Ok(lmd) where lmd.file_type().is_symlink() as a valid operand
rather than a fatal error, so that the later branch that pushes an
FsSedFileResult with "operand is a symlink; skipped..." (the block that checks
std::fs::symlink_metadata(p) and calls results.push(...)) can handle the
dangling link per-file instead of aborting the entire sed call; keep existing
checks that still enforce confinement to host_root.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@shell/src/fs/host.rs`:
- Around line 1259-1280: The pre-validation loop currently rejects dangling
symlink operands before the later symlink-skip block; change the pre-validation
to accept a final component that is a symlink by using std::fs::symlink_metadata
(or by calling symlink_metadata when metadata fails) and treating Ok(lmd) where
lmd.file_type().is_symlink() as a valid operand rather than a fatal error, so
that the later branch that pushes an FsSedFileResult with "operand is a symlink;
skipped..." (the block that checks std::fs::symlink_metadata(p) and calls
results.push(...)) can handle the dangling link per-file instead of aborting the
entire sed call; keep existing checks that still enforce confinement to
host_root.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a339eb7d-d300-4ba2-bc56-42241bf29a24

📥 Commits

Reviewing files that changed from the base of the PR and between 5a89d63 and 7c261b4.

📒 Files selected for processing (18)

shell/README.md
shell/skills/SKILL.md
shell/src/config.rs
shell/src/configuration.rs
shell/src/exec/policy.rs
shell/src/exec/sandbox.rs
shell/src/fs/host.rs
shell/src/fs/mod.rs
shell/src/functions/exec_bg.rs
shell/src/functions/kill.rs
shell/src/functions/types.rs
shell/src/main.rs
shell/src/scode.rs
shell/tests/e2e/workers/harness/src/cases-fs-protocol-break.ts
shell/tests/e2e/workers/harness/src/cases-fs-write-inline.ts
shell/tests/e2e/workers/harness/src/cases-jobs-break.ts
shell/tests/e2e/workers/harness/src/cases-safety.ts
shell/tests/e2e/workers/harness/src/cases-vuln-repro-jailed.ts

🚧 Files skipped from review as they are similar to previous changes (9)

shell/tests/e2e/workers/harness/src/cases-fs-write-inline.ts
shell/src/functions/types.rs
shell/src/functions/exec_bg.rs
shell/skills/SKILL.md
shell/README.md
shell/src/exec/policy.rs
shell/src/configuration.rs
shell/src/main.rs
shell/src/exec/sandbox.rs

andersonleal added 2 commits June 9, 2026 15:39

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

sergiofilhowz approved these changes Jun 9, 2026

View reviewed changes

vercel Bot deployed to Preview June 9, 2026 21:42 View deployment

vercel Bot deployed to Preview June 9, 2026 22:20 View deployment

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

andersonleal merged commit 9f48415 into main Jun 9, 2026
13 of 15 checks passed

andersonleal deleted the feat/shell-sdk-bump-config-llm-api branch June 9, 2026 22:47

coderabbitai Bot mentioned this pull request Jun 11, 2026

feat(coder): path-jailed code worker for iii agents (v0.5.0) + console views #244

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(shell): v0.4.0 — SDK bump, central config + hot-reload, LLM-friendly typed API, hardening + DX#238

feat(shell): v0.4.0 — SDK bump, central config + hot-reload, LLM-friendly typed API, hardening + DX#238
andersonleal merged 4 commits into
mainfrom
feat/shell-sdk-bump-config-llm-api

andersonleal commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

vercel Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andersonleal commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

SDK & observability

Central configuration

LLM-friendly typed API

Pre-landing hardening (adversarial review)

LLM-DX API additions (from a real agent session)

Test plan

Summary by CodeRabbit

Uh oh!

vercel Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

skill-check — worker

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

andersonleal commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

vercel Bot commented Jun 9, 2026 •

edited

Loading

github-actions Bot commented Jun 9, 2026 •

edited

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading