refactor(harness): unify state leases, atomic approval mutations, typed state layer by ytallo · Pull Request #235 · iii-hq/workers

ytallo · 2026-06-08T18:33:02Z

Summary

Cleanup of the harness state-operations layer for correctness, simplicity, and end-to-end typing. Three themes:

1. One single-writer lease primitive

The harness had two hand-rolled leases solving the same problem two different ways (session FSM serialization vs. compaction/prune). Both are now thin (scope, ttl) adapters over a shared runtime/lease.ts built on the engine's only atomic primitive — state::update (set-CAS): acquire writes a {nonce, ts} claim and inspects the prior value in one atomic step, so exactly one concurrent acquirer wins, and TTL-based crash recovery folds into the same op (no separate "steal" dance, no second timestamp scope).

Correctness fix: a transient state-store outage made tolerant stateUpdate return null, which the old session-lease read as prior-0 → every concurrent contender false-won the lease, duplicating side effects. The shared primitive treats a null envelope as "not acquired."
session-lease moves to nonce-based, owner-checked release (run-transition threads the nonce).
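
The win/lose decision and the null-envelope fix can be sketched as a pure function over the envelope that the atomic write returns. This is a minimal illustration rather than the code in runtime/lease.ts: the `{old_value, new_value}` envelope shape follows the walkthrough further down, while the names `wonLease`, `wonLeaseLegacy`, and `isClaimActive` are hypothetical.

```typescript
// Illustrative types only; everything besides the {nonce, ts} claim and the
// {old_value, new_value} envelope is an assumption made for this sketch.
type LeaseRecord = { nonce: string; ts: number };
type UpdateEnvelope = { old_value: LeaseRecord | null; new_value: LeaseRecord };

/** True when a prior claim exists and is still inside its TTL window. */
function isClaimActive(prior: LeaseRecord | null, nowMs: number, ttlMs: number): boolean {
  return prior !== null && nowMs - prior.ts < ttlMs;
}

/**
 * Decide one acquire attempt from the envelope returned by the atomic write.
 * A null envelope (state-store outage) means "not acquired"; an absent or
 * expired prior claim means this caller won, which is where TTL recovery folds in.
 */
function wonLease(envelope: UpdateEnvelope | null, nowMs: number, ttlMs: number): boolean {
  if (envelope === null) return false;
  return !isClaimActive(envelope.old_value, nowMs, ttlMs);
}

/**
 * The failure mode described above, reproduced for contrast: a null envelope
 * is read as "prior timestamp 0", so every contender believes the lease expired.
 */
function wonLeaseLegacy(envelope: UpdateEnvelope | null, nowMs: number, ttlMs: number): boolean {
  const priorTs = envelope?.old_value?.ts ?? 0;
  return nowMs - priorTs >= ttlMs;
}
```

During an outage `wonLease(null, now, ttl)` is false for every caller, whereas `wonLeaseLegacy(null, now, ttl)` is true for every caller, which is the duplicated-side-effects case.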

2. Atomic approval-gate settings mutations

set_mode / add_always_allow / approve_always / remove_always_allow did read-whole-record → modify → write-whole-record, so two concurrent mutations on one session lost-updated each other. They now issue field-scoped state::update ops (set/append on a single key), so disjoint-field mutations compose under the engine's per-key write-lock. Partial records that field-scoped writes can persist are backfilled on read. The store is routed through the shared createState wrapper (tolerant reads, strict writes), and clear_settings uses state::delete instead of a null tombstone.
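
The lost-update window and the field-scoped alternative can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the `Settings` field names, the mode values, and the `UpdateOp` shape are invented for the example and are not the engine's wire format.

```typescript
// Field names, mode values, and the op shape are assumptions for illustration.
type Settings = { mode: string; always_allow: string[] };
type UpdateOp =
  | { type: "set"; path: "mode"; value: string }
  | { type: "append"; path: "always_allow"; value: string };

/** Apply field-scoped ops to whatever is currently stored (models the per-key write-lock). */
function applyOps(current: Settings, ops: UpdateOp[]): Settings {
  const next: Settings = { mode: current.mode, always_allow: [...current.always_allow] };
  for (const op of ops) {
    if (op.type === "set") next.mode = op.value;
    else next.always_allow.push(op.value);
  }
  return next;
}

/** Old pattern: both writers read the same snapshot, then each writes a whole record. */
function wholeRecordWrites(): Settings {
  let stored: Settings = { mode: "ask", always_allow: [] };
  const seenByA = stored;
  const seenByB = stored;
  stored = { ...seenByA, mode: "auto" }; // writer A: set_mode
  stored = { ...seenByB, always_allow: [...seenByB.always_allow, "shell"] }; // writer B clobbers A
  return stored;
}

/** New pattern: each writer sends only the field it changes; the store applies ops in turn. */
function fieldScopedWrites(): Settings {
  let stored: Settings = { mode: "ask", always_allow: [] };
  stored = applyOps(stored, [{ type: "set", path: "mode", value: "auto" }]);
  stored = applyOps(stored, [{ type: "append", path: "always_allow", value: "shell" }]);
  return stored;
}
```

In the whole-record variant writer A's mode change is silently reverted by writer B's stale snapshot; in the field-scoped variant both changes survive regardless of which op lands first.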

3. Typed state layer (less `unknown`, fewer redundant checks)

runtime/state.ts helpers are generic (stateGet<T> / stateSet<T> / stateUpdate<T>); dropped the dead {value}-row unwrap and the unused stateListGroups.
State is cleared on deploy, so reads trust the type: dropped parseTurnStateRecord, parseModelArray, and parseRunRequest in favor of typed stateGet<T> (the non-null loadRunRequest keeps default-on-absent via defaultRunRequest(); loadRecord relies on stateGet returning null for absent keys).
Runtime validation now survives only where it does real work: write-side boundaries (isModel in models::reconcile), genuinely heterogeneous scopes (llm-budget budgets + spend-logs share one scope), and partial-write backfill (parseSettings).
Removed redundant indirection (scopedGet/scopedSet passthroughs, the never-used optional store param on createTurnStatePorts).
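
As a rough picture of what "typed stateGet<T> with default-on-absent" means, here is a minimal synchronous sketch. The real helpers take the SDK handle and are asynchronous; the `Store` type, the scope name, and the `RunRequest` fields below are hypothetical stand-ins.

```typescript
// Hypothetical in-memory store; the real helpers go through the SDK and are async.
type Store = Map<string, unknown>;

/** Typed read: trusts the stored shape and returns null for absent keys. */
function stateGet<T>(store: Store, scope: string, key: string): T | null {
  const raw = store.get(`${scope}/${key}`);
  return raw === undefined || raw === null ? null : (raw as T);
}

/** Typed write: the compiler checks the value against T at the call site. */
function stateSet<T>(store: Store, scope: string, key: string, value: T): void {
  store.set(`${scope}/${key}`, value);
}

// Field names are invented for the example, not the real record shape.
type RunRequest = { max_turns: number; tools: string[] };

function defaultRunRequest(): RunRequest {
  return { max_turns: 1, tools: [] };
}

/** Non-null loader: falls back to the default when nothing is stored. */
function loadRunRequest(store: Store, session_id: string): RunRequest {
  return stateGet<RunRequest>(store, "run_request", session_id) ?? defaultRunRequest();
}
```

The cast inside `stateGet` is the whole trade-off: it is only sound because state is cleared on deploy, so no record written under an older shape can be read back.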

unknown in the state layer dropped from 29 → 18 (every survivor is a genuine boundary).

Test plan

New tests/runtime/lease.test.ts (16 concurrency tests: exactly-one-winner under latency/outage, TTL steal, no-false-win) and tests/turn-orchestrator/session-lease.test.ts.
New approval-gate concurrency/atomicity tests (cross-field no-clobber, append composition, partial-record backfill) added to settings.test.ts.
Existing compaction lease.test.ts (incl. its concurrency/outage/wire-shape cases) passes against the delegated implementation.
parallel-approval.e2e ("no double-execution under parallel approval wakes") passes.
Full harness suite green in a clean checkout (no unrelated WIP): 1296 pass, 0 fail; tsc clean.

Notes

No production behavior change beyond the documented concurrency fixes; durable record shapes for turn_state/run_request are unchanged.
Bounded, documented residual races remain where the engine lacks a primitive (dedup-append / array-element-remove / a native lock) — flagged in code with // Known race: comments. Exposing those engine ops would close them entirely (follow-up, engine-side).

Summary by CodeRabbit

Release Notes

Bug Fixes
- Improved concurrent update handling to prevent lost updates when modifying approval settings simultaneously.
- Enhanced turn orchestration locking mechanism for safer serialization during operations.
Reliability
- Switched context compaction trigger from event streaming to queue-based mechanism for improved reliability.
Performance
- Optimized state persistence with atomic field-scoped updates instead of full record rewrites.

vercel · 2026-06-08T18:33:08Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
harness	Error		Jun 11, 2026 1:32pm
workers	Ready	Preview, Comment	Jun 11, 2026 1:32pm

github-actions · 2026-06-08T18:33:18Z

skill-check — worker

0 verified, 14 skipped (no docs/).

Layer	Result
structure	✓
vale	✓
ai	✓
render	✓

Four for four. Nicely done.

…ed state layer

Single-writer lease:
- Add shared runtime/lease.ts (nonce+ttl set-CAS over atomic state::update); session-lease and compaction-lease become thin (scope, ttl) adapters.
- Fix a false-win where a transient state-store outage (tolerant stateUpdate returning null) let every concurrent contender acquire the lease at once.

Approval-gate settings:
- Make mutations atomic and field-scoped (state::update set/append) to close the read-modify-write lost-update window; backfill partial records on read.
- Route the store through the shared createState wrapper; clear via state::delete instead of writing a null tombstone.

State layer typing:
- Make runtime/state.ts helpers generic (stateGet<T>/stateSet<T>/stateUpdate<T>); drop the dead {value}-row unwrap and the unused stateListGroups.
- State is cleared on deploy, so trust types on read: drop parseTurnStateRecord, parseModelArray, and parseRunRequest in favor of typed stateGet<T>. Keep runtime validation only at write boundaries (isModel) and shared scopes (llm-budget).
- Remove redundant indirection (scopedGet/Set passthroughs, optional ports store).

coderabbitai · 2026-06-08T19:24:20Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: daad793d-b04f-4e72-8d8b-e3094330f646

📥 Commits

Reviewing files that changed from the base of the PR and between 3c3785d and a0d7fec.

📒 Files selected for processing (6)

harness/src/context-compaction/handler-async.ts
harness/src/turn-orchestrator/events.ts
harness/src/turn-orchestrator/state-runtime/store.ts
harness/src/turn-orchestrator/state-runtime/turn-end.ts
harness/src/turn-orchestrator/state.ts
harness/tests/context-compaction/handler-async.test.ts

✅ Files skipped from review due to trivial changes (1)

harness/src/turn-orchestrator/state-runtime/turn-end.ts

🚧 Files skipped from review as they are similar to previous changes (4)

harness/src/turn-orchestrator/state.ts
harness/src/turn-orchestrator/events.ts
harness/src/turn-orchestrator/state-runtime/store.ts
harness/src/context-compaction/handler-async.ts

📝 Walkthrough

Walkthrough

This PR refactors state persistence and lease infrastructure across the harness: runtime state APIs now use generic typing, approval settings migrate to atomic field-scoped updates, turn-state parsing is removed in favor of typed defaults, shared lease primitives consolidate mutual exclusion, and turn-end compaction routing transitions from dedicated streams to queue-based enqueues with typed payloads.

Changes

State Persistence, Leasing, and Routing Refactor

Layer / File(s)	Summary
Typed generic state API and list semantics `harness/src/runtime/state.ts`, `harness/tests/runtime/state-list.test.ts`	`stateGet`, `stateSet`, `stateUpdate` now use generic `<T>` signatures returning `T | null` instead of `unknown`; `stateListGroups` removed; `createState().list` implementation treats trigger response as flat array. Tests rewritten to validate `createState().list` behavior against mock `ISdk`.
Shared atomic lease primitives `harness/src/runtime/lease.ts`, `harness/tests/runtime/lease.test.ts`	New module implements single-writer `{nonce, ts}` leases via atomic `state::update`, with `acquireLease`, `releaseLease`, and `acquireLeaseWithWait` providing TTL-based validity checks and nonce-guarded release. Tests verify nonce uniqueness, concurrent single-winner behavior, TTL expiry steal semantics, outage resilience, and exponential backoff polling.
Approval settings atomic field updates `harness/src/approval-gate/settings/store.ts`, `harness/src/approval-gate/settings/{add-always-allow,approve-always,remove-always-allow,set-mode}.ts`, `harness/tests/approval-gate/settings.test.ts`	Store adds `tolerantState`/`strictState` mode separation, `parseSettings(raw)` for backfill/validation, and `updateSettings(iii, session_id, ops)` for field-scoped atomic updates; all mutation handlers now issue `append`/`set` ops instead of full-record writes; `clearSettings` deletes without tombstones. Tests mock `state::delete` and `state::update` operations, verify concurrent mutation atomicity and backfill correctness.
Turn state and run-request parser removal `harness/src/turn-orchestrator/{run-request,state}.ts`, `harness/src/turn-orchestrator/state-runtime/store.ts`, `harness/tests/turn-orchestrator/{run-request,parse-turn-state-record}.test.ts`	`RunRequest` and `TurnStateRecord` are now plain TypeScript types; `parseRunRequest` and `parseTurnStateRecord` exports removed and replaced with `defaultRunRequest()` absent-record fallback. Turn store and state ports now use direct typed `stateGet`/`stateSet` I/O without parsing wrappers, extracting prior values from `old_value` and null-coalescing to defaults on absence.
Session-level lease adapter and serialized transition `harness/src/turn-orchestrator/state-runtime/session-lease.ts`, `harness/src/turn-orchestrator/run-transition.ts`, `harness/tests/turn-orchestrator/session-lease.test.ts`	`acquireSessionLease` and `releaseSessionLease` wrap shared lease primitives, returning/accepting nonce instead of boolean; `runTransition` stores the nonce and uses it on release for per-session mutual exclusion during serialized execution. Tests validate nonce-based grant/release, outage handling with `state::update` failures, and concurrent contender behavior.
Turn-end compaction refactor: payload, model resolution, routing `harness/src/context-compaction/{handler-async,register}.ts`, `harness/src/turn-orchestrator/{events,state-runtime/turn-end}.ts`, `harness/tests/context-compaction/{handler-async,registration,compaction-done-emit,integration/*}.test.ts`, `harness/tests/turn-orchestrator/events.test.ts`	Handler now receives typed `OnTurnEndPayload` via `parseOnTurnEnd`, extracts session_id/usage/provider/model with defaults; `resolveModel` short-circuits on threaded `model_limit`, otherwise scans messages to backfill provider/model before deriving limit. Turn-orchestrator events enqueue compaction wake to `context-compaction::on_turn_end` queue instead of mirroring to stream; registration wires internal handler function instead of stream trigger; manifest descriptions updated. Tests verify payload parsing, model resolution fallback, concurrent frame shapes, and registration behavior.
Models catalog state read simplification `harness/src/models-catalog/state.ts`, `harness/tests/models-catalog/state.test.ts`	State reads now trust stored `Model[]` directly, removing `parseModelArray` helper and re-parsing; `getProviderModels` and `listFromState` return typed values from `stateGet`/`stateListValues` without filtering. Tests updated to focus on `isModel` boundary validation.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Lease as acquireLease/releaseLease
  participant State as state::update
  Caller->>Lease: acquireLease(scope, key, ttlMs)
  Lease->>State: read current lease
  alt Lease exists and active
    Lease-->>Caller: null
  else Lease absent or expired
    Lease->>State: atomic set {nonce, ts}
    State-->>Lease: {old_value, new_value}
    Lease-->>Caller: nonce
  end
  Caller->>Lease: releaseLease(scope, key, nonce)
  Lease->>State: clear lease if nonce matches

sequenceDiagram
  participant TurnEnd as emit(turn_end)
  participant Events as events.ts
  participant Queue as iii.enqueue
  participant Compaction as context-compaction::on_turn_end
  TurnEnd->>Events: emit turn_end event
  Events->>Events: extract session_id, usage, provider, model
  Events->>Queue: enqueue(COMPACTION_ON_TURN_END, payload)
  Queue-->>Events: (best-effort, warn on error)
  Queue->>Compaction: payload
  Compaction->>Compaction: parseOnTurnEnd(payload)
  Compaction->>Compaction: resolveModel(provider, model, model_limit?)
  Compaction->>Compaction: handleAsync(parseOnTurnEnd result)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

iii-hq/workers#195: Directly uses the approval-gate settings field-scoped update infrastructure (updateSettings) to support new permission modes, allowlists, and approve-always mutations.
iii-hq/workers#198: Implements per-session transition serialization using the same nonce-based lease acquisition in run-transition.ts and session-lease.ts.
iii-hq/workers#205: Changes turn_end event routing and stream/queue behavior in emit(), overlapping with the turn-orchestrator events wiring refactored here.

Suggested reviewers

andersonleal

Poem

🐰 Leases and states now dance as one,
Atomic writes when parsing's done,
No more streams where queues belong,
Turn-end whispers, clear and strong,
Infrastructure sings a brighter song!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 26.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'refactor(harness): unify state leases, atomic approval mutations, typed state layer' directly and comprehensively describes the three main refactoring themes: (1) unifying state leases, (2) atomic approval mutations, and (3) a typed state layer. It accurately captures the primary changes across numerous files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

harness/src/approval-gate/settings/store.ts (1)
49-63: 💤 Low value

Consider handling null update result explicitly.

If strictState(iii).update returns null (e.g., atomic write failure due to engine outage), the function silently returns defaults via parseSettings(null) rather than signaling the failure. This differs from how acquireLease in runtime/lease.ts treats a null envelope as "write failed — don't treat as success."

For settings mutations this may be acceptable since a subsequent read will fetch the actual state, but the caller won't know the update didn't persist.
🔧 Optional: throw on null result to surface write failures
 export async function updateSettings(
   iii: ISdk,
   session_id: string,
   ops: UpdateOp[],
 ): Promise<ApprovalSettings> {
   const result = await strictState(iii).update<unknown>({
     scope: SETTINGS_STATE_SCOPE,
     key: session_id,
     ops,
   });
+  if (!result) {
+    throw new Error('approval-settings update failed: state engine returned null');
+  }
   if (result?.errors && result.errors.length > 0) {
     throw new Error(`approval-settings update rejected: ${JSON.stringify(result.errors)}`);
   }
   return parseSettings(result?.new_value ?? null);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/approval-gate/settings/store.ts` around lines 49 - 63, The
updateSettings function currently treats a null result from
strictState(iii).update as success by passing null into parseSettings; instead
explicitly detect when result is null (or result.new_value is undefined) and
throw a descriptive error to surface write failures. Locate updateSettings and
the call to strictState(iii).update, check for a null/undefined result before
inspecting result.errors, and throw (e.g., `new Error("approval-settings update
failed: write did not persist")`) so callers can distinguish a failed atomic
write from a successful update; keep the existing error handling for
result.errors unchanged.
harness/tests/models-catalog/state.test.ts (1)
9-20: ⚡ Quick win

Add an id-only negative case to lock the boundary contract.

Please add an assertion for expect(isModel({ id: 'm1' })).toBe(false) so the suite prevents regressions to overly-permissive write-side validation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/models-catalog/state.test.ts` around lines 9 - 20, Add a
negative assertion inside the same test ('isModel guards the write-side
boundary') to prevent an id-only object from passing validation: call
expect(isModel({ id: 'm1' })).toBe(false) alongside the existing negative cases
so the isModel guard rejects objects that only have an id and enforces the full
model shape.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@harness/src/context-compaction/lease.ts`:
- Around line 30-35: The function readLeaseTimestampSecs treats NaN/Infinity as
valid because it only checks typeof ts === 'number'; change the guard to require
a finite number (use Number.isFinite or Number.isFinite((v as Record<string,
unknown>).ts)) before computing Math.floor(ts / 1000) and otherwise return 0.
Update references to the ts extraction in readLeaseTimestampSecs so only finite
numeric ts values are converted to epoch seconds; all other values return 0.

In `@harness/src/models-catalog/state.ts`:
- Around line 16-18: isModel currently only checks for a string id and lets
partial objects through; update the isModel type guard to validate the full
persisted Model contract by explicitly checking each required property declared
on the Model interface (not just id) for presence and correct types (including
any nested objects/arrays and date/number formats the Model expects), return
false if any required property is missing or has the wrong type, and add/update
unit tests to cover invalid partial objects; refer to the Model interface and
the isModel function to locate where to add these explicit field/type checks.

In `@harness/src/runtime/lease.ts`:
- Around line 59-60: The release logic reads the lease via stateGet(iii, scope,
key) and then calls stateSet(iii, scope, key, null) if nonces match, which is a
TOCTOU race; change this to an atomic conditional delete so only the lease owner
with matching nonce can clear the key. Replace the two-step pattern with an
atomic compare-and-delete/compare-and-set operation (e.g. a
stateCompareAndDelete or stateCompareAndSet API) that checks stored?.nonce ===
nonce and sets the value to null in one atomic call; update callers using
LeaseRecord, stateGet, and stateSet references accordingly and fall back to a
transactional/lock-based update if an atomic primitive is not available.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bae5c6ae-d156-41e2-a1d6-a7cd4e321339

📥 Commits

Reviewing files that changed from the base of the PR and between 66c1701 and 6f10427.

📒 Files selected for processing (23)

harness/src/approval-gate/settings/add-always-allow.ts
harness/src/approval-gate/settings/approve-always.ts
harness/src/approval-gate/settings/remove-always-allow.ts
harness/src/approval-gate/settings/set-mode.ts
harness/src/approval-gate/settings/store.ts
harness/src/context-compaction/lease.ts
harness/src/models-catalog/state.ts
harness/src/runtime/lease.ts
harness/src/runtime/state.ts
harness/src/turn-orchestrator/run-request.ts
harness/src/turn-orchestrator/run-transition.ts
harness/src/turn-orchestrator/state-runtime/ports.ts
harness/src/turn-orchestrator/state-runtime/session-lease.ts
harness/src/turn-orchestrator/state-runtime/store.ts
harness/src/turn-orchestrator/state.ts
harness/tests/approval-gate/settings.test.ts
harness/tests/integration/parallel-approval-harness.ts
harness/tests/models-catalog/state.test.ts
harness/tests/runtime/lease.test.ts
harness/tests/runtime/state-list.test.ts
harness/tests/turn-orchestrator/parse-turn-state-record.test.ts
harness/tests/turn-orchestrator/run-request.test.ts
harness/tests/turn-orchestrator/session-lease.test.ts

💤 Files with no reviewable changes (1)

harness/tests/turn-orchestrator/parse-turn-state-record.test.ts

coderabbitai · 2026-06-08T19:34:53Z

+/** Epoch-secs of a `{nonce, ts}` lease claim; 0 for any other shape. */
 export function readLeaseTimestampSecs(v: unknown): number {
  if (!v || typeof v !== 'object') return 0;
  const ts = (v as Record<string, unknown>).ts;
-  if (typeof ts === 'number') return Math.floor(ts / 1000);
-  return 0;
-}
-
-function isLeaseActive(v: unknown, now_secs: number): boolean {
-  const ts_secs = readLeaseTimestampSecs(v);
-  return ts_secs > 0 && now_secs - ts_secs < LEASE_TTL_SECS;
+  return typeof ts === 'number' ? Math.floor(ts / 1000) : 0;
 }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard ts with finite-number validation.

Line 34 currently treats NaN/Infinity as valid timestamps, which violates the “0 for any other shape” contract and can propagate invalid time values.

Suggested patch

 export function readLeaseTimestampSecs(v: unknown): number {
   if (!v || typeof v !== 'object') return 0;
   const ts = (v as Record<string, unknown>).ts;
-  return typeof ts === 'number' ? Math.floor(ts / 1000) : 0;
+  return typeof ts === 'number' && Number.isFinite(ts) ? Math.floor(ts / 1000) : 0;
 }
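
To make the difference concrete, the sketch below puts the two guards side by side. `readLoose` and `readFinite` mirror the before and after versions of the function; `looksActive` is modeled on the `isLeaseActive` check removed in this diff, and the TTL value is a made-up number for the example.

```typescript
const LEASE_TTL_SECS = 60; // hypothetical value, chosen only for this example

/** Guard as it stands in the diff: any `number` passes, including NaN and Infinity. */
function readLoose(v: unknown): number {
  if (!v || typeof v !== 'object') return 0;
  const ts = (v as Record<string, unknown>).ts;
  return typeof ts === 'number' ? Math.floor(ts / 1000) : 0;
}

/** Guard with the suggested finite check: non-finite values fall back to 0. */
function readFinite(v: unknown): number {
  if (!v || typeof v !== 'object') return 0;
  const ts = (v as Record<string, unknown>).ts;
  return typeof ts === 'number' && Number.isFinite(ts) ? Math.floor(ts / 1000) : 0;
}

/** Activity check modeled on the removed isLeaseActive, taking epoch seconds directly. */
function looksActive(ts_secs: number, now_secs: number): boolean {
  return ts_secs > 0 && now_secs - ts_secs < LEASE_TTL_SECS;
}
```

With the loose guard a `ts` of `Infinity` comes back as `Infinity`, and a check shaped like `looksActive` would then report the lease as active no matter how much time passes; a `ts` of `NaN` comes back as `NaN`. The finite guard maps both to 0.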

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/** Epoch-secs of a `{nonce, ts}` lease claim; 0 for any other shape. */
export function readLeaseTimestampSecs(v: unknown): number {
  if (!v || typeof v !== 'object') return 0;
  const ts = (v as Record<string, unknown>).ts;
  return typeof ts === 'number' && Number.isFinite(ts) ? Math.floor(ts / 1000) : 0;
}


coderabbitai · 2026-06-08T19:34:53Z

 export function isModel(v: unknown): v is Model {
  return Boolean(v && typeof v === 'object' && typeof (v as Model).id === 'string');
 }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Strengthen isModel to validate the actual persisted Model contract.

Line 16 currently accepts any object with a string id, but this function is the write-side boundary while read-side parsing was removed. That lets partial objects be stored and later treated as full Model values.

Proposed fix

 export function isModel(v: unknown): v is Model {
-  return Boolean(v && typeof v === 'object' && typeof (v as Model).id === 'string');
+  if (!v || typeof v !== 'object') return false;
+  const m = v as Partial<Model>;
+  return (
+    typeof m.id === 'string' &&
+    typeof m.provider === 'string' &&
+    typeof m.api === 'string' &&
+    typeof m.display_name === 'string' &&
+    typeof m.context_window === 'number' &&
+    Number.isFinite(m.context_window)
+  );
 }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export function isModel(v: unknown): v is Model {
  if (!v || typeof v !== 'object') return false;
  const m = v as Partial<Model>;
  return (
    typeof m.id === 'string' &&
    typeof m.provider === 'string' &&
    typeof m.api === 'string' &&
    typeof m.display_name === 'string' &&
    typeof m.context_window === 'number' &&
    Number.isFinite(m.context_window)
  );
}


coderabbitai · 2026-06-08T19:34:53Z

+  const stored = await stateGet<LeaseRecord>(iii, scope, key);
+  if (stored?.nonce === nonce) await stateSet(iii, scope, key, null);


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Stale-owner release can clear a newer lease (TOCTOU).

Line 59 and Line 60 do a non-atomic get→set(null) owner check. If lease A is read, then contender B acquires before the set, A’s release can delete B’s active claim. That reopens the key and can allow overlapping workers.
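
The interleaving can be replayed deterministically against an in-memory map. This is only a model of the race, not the harness code: `compareAndDelete` is a hypothetical single-step primitive, and the PR notes say the engine does not currently expose an equivalent.

```typescript
type LeaseRecord = { nonce: string; ts: number };

/** Two-step release (get, then clear) with contender B landing in the gap. */
function twoStepRelease(): LeaseRecord | null {
  const store = new Map<string, LeaseRecord>();
  store.set("session-1", { nonce: "A", ts: 0 });

  const seenByA = store.get("session-1"); // step 1: A reads its own claim
  store.set("session-1", { nonce: "B", ts: 100 }); // B acquires in between (e.g. after A's TTL lapsed)
  if (seenByA?.nonce === "A") store.delete("session-1"); // step 2: A clears based on the stale read

  return store.get("session-1") ?? null; // B's claim is gone
}

/** Hypothetical atomic primitive: compare the nonce and delete in one step. */
function compareAndDelete(store: Map<string, LeaseRecord>, key: string, nonce: string): boolean {
  const current = store.get(key);
  if (current?.nonce !== nonce) return false;
  store.delete(key);
  return true;
}

/** Same interleaving, but A's release is a single owner-checked step. */
function singleStepRelease(): LeaseRecord | null {
  const store = new Map<string, LeaseRecord>();
  store.set("session-1", { nonce: "A", ts: 0 });
  store.set("session-1", { nonce: "B", ts: 100 }); // B acquires first
  compareAndDelete(store, "session-1", "A"); // A's release no longer matches, so it is a no-op
  return store.get("session-1") ?? null; // B's claim survives
}
```

In the two-step variant the key ends up empty while B still believes it holds the lease, so a third contender could acquire and overlap with B. The single-step variant leaves B's claim in place.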


…ads and improve event parsing

- Refactor `handleAsync` to accept a structured payload instead of a generic frame, enhancing clarity and type safety.
- Introduce `parseOnTurnEnd` to handle the new payload format, replacing the previous `extractEventPayload` function.
- Update related functions to align with the new payload structure, ensuring consistent handling of session IDs, usage, provider, and model.
- Modify YAML and main registration files to reflect changes in event handling and descriptions.
- Remove obsolete stream subscriptions, transitioning to a queue-based wake mechanism for compaction on turn end.
- Update tests to validate the new payload structure and ensure backward compatibility with existing functionality.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

harness/tests/context-compaction/handler-async.test.ts (1)

25-28: ⚡ Quick win

Add a malformed token-field case to lock parser behavior.

Please add a case where usage is an object but token fields are invalid types (e.g., { input: '100' }) so parser normalization/rejection stays protected.

✅ Test addition example

   it('returns null usage when usage is missing or malformed', () => {
     expect(parseOnTurnEnd({ session_id: 'sess-3' })?.usage).toBeNull();
     expect(parseOnTurnEnd({ session_id: 'sess-3', usage: 'nope' })?.usage).toBeNull();
+    expect(parseOnTurnEnd({ session_id: 'sess-3', usage: { input: '100' } })?.usage).toBeNull();
   });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/context-compaction/handler-async.test.ts` around lines 25 - 28,
Add another assertion in the existing test for parseOnTurnEnd to cover a
malformed token-field case: call parseOnTurnEnd with usage as an object where
token fields are invalid types (for example { input: '100' } or { prompt_tokens:
'10' }) and assert that the returned ?.usage is null; this ensures
parseOnTurnEnd's normalization/rejection logic (function parseOnTurnEnd) treats
non-numeric token fields as malformed and returns null for usage.
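
For orientation, a parser that would satisfy the assertions discussed in this comment might look like the sketch below. It is not the `parseOnTurnEnd` from handler-async.ts: the payload fields are taken from the test expectations in this thread, while the strict "both token fields must be finite numbers" rule and the null fallbacks for provider and model are assumptions.

```typescript
// Shapes inferred from the test expectations in this thread; treat them as assumptions.
type Usage = { input: number; output: number };
type OnTurnEndPayload = {
  session_id: string;
  usage: Usage | null;
  provider: string | null;
  model: string | null;
};

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

/** Returns a Usage only when both token fields are finite numbers; otherwise null. */
function parseUsage(raw: unknown): Usage | null {
  if (!isRecord(raw)) return null;
  const { input, output } = raw;
  if (typeof input !== "number" || !Number.isFinite(input)) return null;
  if (typeof output !== "number" || !Number.isFinite(output)) return null;
  return { input, output };
}

/** Returns null when the payload has no usable session_id; other fields degrade to null. */
function parseOnTurnEnd(raw: unknown): OnTurnEndPayload | null {
  if (!isRecord(raw)) return null;
  const { session_id, usage, provider, model } = raw;
  if (typeof session_id !== "string") return null;
  return {
    session_id,
    usage: parseUsage(usage),
    provider: typeof provider === "string" ? provider : null,
    model: typeof model === "string" ? model : null,
  };
}
```

Under these assumptions `{ input: '100' }` yields a null usage, which is the behavior the suggested assertion would pin down. A real parser could instead choose to coerce or default such fields, in which case the assertion would need to change with it.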

harness/tests/turn-orchestrator/events.test.ts (1)

7-12: ⚡ Quick win

Assert enqueue action in the turn_end wake test.

The current stub only captures function_id and payload, so this test cannot verify the queue contract (Enqueue on default). A regression to a non-enqueued trigger would still pass.

Proposed test hardening

-function buildSdk() {
-  const calls: Array<{ function_id: string; payload: Record<string, unknown> }> = [];
-  const trigger = vi.fn(async (req: { function_id: string; payload?: unknown }) => {
+function buildSdk() {
+  const calls: Array<{ function_id: string; payload: Record<string, unknown>; action?: unknown }> = [];
+  const trigger = vi.fn(async (req: { function_id: string; payload?: unknown; action?: unknown }) => {
     calls.push({
       function_id: req.function_id,
       payload: (req.payload ?? {}) as Record<string, unknown>,
+      action: req.action,
     });
     return {};
   });
   return { iii: { trigger } as unknown as ISdk, calls };
 }
@@
     const wake = calls.find((c) => c.function_id === 'context-compaction::on_turn_end');
     expect(wake).toBeDefined();
+    expect(wake?.action).toBeDefined();
     expect(wake?.payload).toMatchObject({
       session_id: SID,
       usage: { input: 100, output: 5 },
       provider: 'anthropic',
       model: 'claude-haiku-4-5',
     });

Also applies to: 57-64

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/turn-orchestrator/events.test.ts` around lines 7 - 12, The
test's trigger stub (calls array and trigger fn) only records function_id and
payload so it cannot assert that an Enqueue was issued to the default queue;
update the stub used in the "turn_end wake" test (and the similar one at lines
57-64) to capture the full enqueue contract by recording action type and queue
name (e.g., record an object with function_id, payload, action: "Enqueue",
queue: "default") so assertions can verify calls include { action: "Enqueue",
queue: "default" } for the expected function_id and payload; locate the trigger
stub referenced by calls and trigger, add these fields to it, and update the
test assertions accordingly.
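
The stub hardening requested above can be sketched without the test framework. This is a minimal illustration, not the harness's actual test helper: the `TriggerAction` shape (`{ type: 'enqueue', queue }`), `buildRecordingSdk`, and `wasEnqueuedOn` are hypothetical names, and the real SDK's action type may look different.

```typescript
// Hypothetical action shape; the real SDK's trigger action type may differ.
type TriggerAction = { type: 'enqueue'; queue: string } | { type: 'void' };

interface TriggerRequest {
  function_id: string;
  payload?: unknown;
  action?: TriggerAction;
}

interface RecordedCall {
  function_id: string;
  payload: Record<string, unknown>;
  action?: TriggerAction;
}

// Builds a fake SDK whose trigger() records every request, including the action.
function buildRecordingSdk() {
  const calls: RecordedCall[] = [];
  const trigger = async (req: TriggerRequest): Promise<Record<string, unknown>> => {
    // The push runs synchronously, before the returned promise settles.
    calls.push({
      function_id: req.function_id,
      payload: (req.payload ?? {}) as Record<string, unknown>,
      action: req.action,
    });
    return {};
  };
  return { iii: { trigger }, calls };
}

// True when some recorded call targeted functionId with an enqueue action on the given queue.
function wasEnqueuedOn(calls: RecordedCall[], functionId: string, queue: string): boolean {
  return calls.some((c) => {
    const action = c.action;
    return (
      c.function_id === functionId &&
      action !== undefined &&
      action.type === 'enqueue' &&
      action.queue === queue
    );
  });
}
```

With a recorder like this, a test can assert on the queue name and not only on the presence of an action, so a regression to a direct, non-enqueued trigger fails the test.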


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 59da39fb-5471-4681-8613-f76d79417f3e

📥 Commits

Reviewing files that changed from the base of the PR and between 6f10427 and 3c3785d.

📒 Files selected for processing (18)

harness/src/context-compaction/handler-async.ts
harness/src/context-compaction/iii.worker.yaml
harness/src/context-compaction/main.ts
harness/src/context-compaction/register.ts
harness/src/turn-orchestrator/events.ts
harness/src/turn-orchestrator/state-runtime/turn-end.ts
harness/src/turn-orchestrator/steering-check/ports.ts
harness/src/turn-orchestrator/steering-check/process.ts
harness/src/turn-orchestrator/steering-check/run.ts
harness/tests/context-compaction/compaction-done-emit.test.ts
harness/tests/context-compaction/handler-async.test.ts
harness/tests/context-compaction/integration/backward-compat.test.ts
harness/tests/context-compaction/integration/flow-async.test.ts
harness/tests/context-compaction/registration.test.ts
harness/tests/context-compaction/turn-end-subscription.test.ts
harness/tests/integration/parallel-approval-harness.ts
harness/tests/turn-orchestrator/events.test.ts
harness/tests/turn-orchestrator/steering-check-layer.test.ts

💤 Files with no reviewable changes (3)

harness/tests/context-compaction/turn-end-subscription.test.ts
harness/src/turn-orchestrator/steering-check/ports.ts
harness/tests/turn-orchestrator/steering-check-layer.test.ts

✅ Files skipped from review due to trivial changes (2)

harness/src/context-compaction/iii.worker.yaml
harness/src/context-compaction/main.ts

coderabbitai · 2026-06-08T23:00:16Z

+    usage: obj.usage && typeof obj.usage === 'object' ? (obj.usage as Usage) : null,
+    provider: typeof obj.provider === 'string' ? obj.provider : '',
+    model: typeof obj.model === 'string' ? obj.model : '',
+  };


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate and normalize usage fields before returning parsed payload.

Line 35 currently trusts any object as Usage; malformed token fields can produce wrong tokens_before math and overflow decisions in Lines 111–118.

💡 Suggested fix

 export function parseOnTurnEnd(payload: unknown): OnTurnEndPayload | null {
   if (!payload || typeof payload !== 'object') return null;
   const obj = payload as Record<string, unknown>;
   if (typeof obj.session_id !== 'string' || !obj.session_id) return null;
+  const normalizeUsage = (raw: unknown): Usage | null => {
+    if (!raw || typeof raw !== 'object' || Array.isArray(raw)) return null;
+    const u = raw as Record<string, unknown>;
+    const toToken = (v: unknown): number | undefined =>
+      typeof v === 'number' && Number.isFinite(v) && v >= 0 ? v : undefined;
+    const normalized: Usage = {};
+    const input = toToken(u.input);
+    const output = toToken(u.output);
+    const cacheRead = toToken(u.cache_read);
+    const cacheWrite = toToken(u.cache_write);
+    if (input !== undefined) normalized.input = input;
+    if (output !== undefined) normalized.output = output;
+    if (cacheRead !== undefined) normalized.cache_read = cacheRead;
+    if (cacheWrite !== undefined) normalized.cache_write = cacheWrite;
+    return Object.keys(normalized).length > 0 ? normalized : null;
+  };
   return {
     session_id: obj.session_id,
-    usage: obj.usage && typeof obj.usage === 'object' ? (obj.usage as Usage) : null,
+    usage: normalizeUsage(obj.usage),
     provider: typeof obj.provider === 'string' ? obj.provider : '',
     model: typeof obj.model === 'string' ? obj.model : '',
   };
 }

Also applies to: 111-118

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/context-compaction/handler-async.ts` around lines 35 - 38, The
returned payload currently trusts obj.usage as a valid Usage object; instead
validate and normalize its fields before returning: when typeof obj.usage ===
'object' && obj.usage !== null construct a normalized usage object (e.g., ensure
prompt_tokens, completion_tokens, total_tokens or tokens_before/tokens_after are
coerced with Number(...), default to 0 for NaN/undefined, clamp to
Number.MAX_SAFE_INTEGER to avoid overflow, and if total is missing compute total
= prompt+completion), otherwise set usage to null; update the current assignment
at the usage return site and ensure the later logic that reads
tokens_before/tokens_after (the code around tokens_before usage in lines
111–118) relies on these normalized numeric fields.
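
For reference, the field-by-field validation from the suggested fix can be written as a standalone function and run outside the harness. This is a sketch under stated assumptions: the `Usage` interface below only contains the four fields named in the suggestion (the real type may carry more), and `TOKEN_FIELDS` is a name introduced here for illustration.

```typescript
// Assumed shape: only the fields named in the suggested fix; the real Usage type may differ.
interface Usage {
  input?: number;
  output?: number;
  cache_read?: number;
  cache_write?: number;
}

// Illustrative constant listing the token fields that are validated.
const TOKEN_FIELDS = ['input', 'output', 'cache_read', 'cache_write'] as const;

function normalizeUsage(raw: unknown): Usage | null {
  // Reject non-objects, null, and arrays up front.
  if (!raw || typeof raw !== 'object' || Array.isArray(raw)) return null;
  const source = raw as Record<string, unknown>;
  const normalized: Usage = {};
  for (const field of TOKEN_FIELDS) {
    const value = source[field];
    // Keep only finite, non-negative numbers; strings such as '100' are dropped, not coerced.
    if (typeof value === 'number' && Number.isFinite(value) && value >= 0) {
      normalized[field] = value;
    }
  }
  // An object with no valid token field is treated as malformed.
  return Object.keys(normalized).length > 0 ? normalized : null;
}
```

Under this sketch, `normalizeUsage({ input: '100' })` and `normalizeUsage('nope')` both return `null`, which matches the test case requested for `parseOnTurnEnd`. A partially valid object such as `{ input: 100, output: 'x' }` keeps only `input`; rejecting the whole object in that case would be a stricter alternative.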

…e-operations

# Conflicts:
#	harness/src/context-compaction/handler-async.ts
#	harness/src/turn-orchestrator/state-runtime/store.ts
#	harness/src/turn-orchestrator/state-runtime/turn-end.ts
#	harness/src/turn-orchestrator/state.ts
#	harness/src/turn-orchestrator/steering-check/process.ts
#	harness/src/turn-orchestrator/steering-check/run.ts
#	harness/tests/context-compaction/handler-async.test.ts

ytallo force-pushed the refactor/harness-state-operations branch from e884d6e to 6f10427 Compare June 8, 2026 19:24

vercel Bot deployed to Preview June 8, 2026 19:24 View deployment

coderabbitai Bot reviewed Jun 8, 2026

vercel Bot deployed to Preview June 8, 2026 22:52 View deployment

coderabbitai Bot reviewed Jun 8, 2026

vercel Bot had a problem deploying to Preview – harness June 11, 2026 13:31 Failure

vercel Bot deployed to Preview – workers June 11, 2026 13:32 View deployment

andersonleal approved these changes Jun 11, 2026

ytallo merged commit 8b96bd6 into main Jun 11, 2026
13 of 14 checks passed

coderabbitai Bot mentioned this pull request Jun 15, 2026

feat: context manager #259

Merged

-export function isModel(v: unknown): v is Model {
-  return Boolean(v && typeof v === 'object' && typeof (v as Model).id === 'string');
-}
+export function isModel(v: unknown): v is Model {
+  if (!v || typeof v !== 'object') return false;
+  const m = v as Partial<Model>;
+  return (
+    typeof m.id === 'string' &&
+    typeof m.provider === 'string' &&
+    typeof m.api === 'string' &&
+    typeof m.display_name === 'string' &&
+    typeof m.context_window === 'number' &&
+    Number.isFinite(m.context_window)
+  );
+}

		const stored = await stateGet<LeaseRecord>(iii, scope, key);
		if (stored?.nonce === nonce) await stateSet(iii, scope, key, null);
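
The lease behaviour described in the PR summary (one atomic claim that also returns the prior value, TTL-based takeover in the same step, a null envelope treated as "not acquired", and nonce-checked release) can be modelled in memory. This is a rough sketch and not the harness's `runtime/lease.ts`: `InMemoryStore`, its `claim` and `releaseIfOwner` methods, the `old_value` envelope field, and the rule that a live claim is left untouched are all assumptions made for illustration.

```typescript
interface LeaseRecord {
  nonce: string;
  ts: number;
}

// Envelope returned by the stand-in store; null models a transient store outage.
type UpdateEnvelope = { old_value: LeaseRecord | null } | null;

// Hypothetical in-memory stand-in for the engine's atomic update primitive.
class InMemoryStore {
  private readonly rows = new Map<string, LeaseRecord>();
  available = true;

  // In one step: install `next` unless a live (non-expired) claim exists, and return the prior value.
  claim(key: string, next: LeaseRecord, ttlMs: number): UpdateEnvelope {
    if (!this.available) return null;
    const prior = this.rows.get(key) ?? null;
    const live = prior !== null && next.ts - prior.ts < ttlMs;
    if (!live) this.rows.set(key, next);
    return { old_value: prior };
  }

  // Delete the claim only when the stored nonce matches the caller's nonce.
  releaseIfOwner(key: string, nonce: string): boolean {
    if (!this.available) return false;
    const stored = this.rows.get(key);
    if (stored?.nonce !== nonce) return false;
    this.rows.delete(key);
    return true;
  }
}

// Returns the nonce when the lease was won, or null when it was not acquired.
function acquire(
  store: InMemoryStore,
  key: string,
  ttlMs: number,
  now: number,
  nonce: string,
): string | null {
  const envelope = store.claim(key, { nonce, ts: now }, ttlMs);
  // A null envelope means the store did not answer; never read that as "no prior holder".
  if (envelope === null) return null;
  const prior = envelope.old_value;
  const won = prior === null || now - prior.ts >= ttlMs;
  return won ? nonce : null;
}
```

The explicit `envelope === null` branch is the point of the correctness fix: an outage must not look like an empty prior value, otherwise every contender would believe it had won. The sketch performs the owner check and the delete in one call, whereas the two-line excerpt above reads the record and then writes it in separate steps.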
