Skip to content

refactor(harness): unify state leases, atomic approval mutations, typed state layer#235

Merged
ytallo merged 3 commits into
mainfrom
refactor/harness-state-operations
Jun 11, 2026
Merged

refactor(harness): unify state leases, atomic approval mutations, typed state layer#235
ytallo merged 3 commits into
mainfrom
refactor/harness-state-operations

Conversation

@ytallo

@ytallo ytallo commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Cleanup of the harness state-operations layer for correctness, simplicity, and end-to-end typing. Three themes:

1. One single-writer lease primitive

The harness had two hand-rolled leases solving the same problem two different ways (session FSM serialization vs. compaction/prune). Both are now thin (scope, ttl) adapters over a shared runtime/lease.ts built on the engine's only atomic primitive — state::update (set-CAS): acquire writes a {nonce, ts} claim and inspects the prior value in one atomic step, so exactly one concurrent acquirer wins, and TTL-based crash recovery folds into the same op (no separate "steal" dance, no second timestamp scope).

  • Correctness fix: a transient state-store outage made tolerant stateUpdate return null, which the old session-lease read as prior-0every concurrent contender false-won the lease, duplicating side effects. The shared primitive treats a null envelope as "not acquired."
  • session-lease moves to nonce-based, owner-checked release (run-transition threads the nonce).

2. Atomic approval-gate settings mutations

set_mode / add_always_allow / approve_always / remove_always_allow did read-whole-record → modify → write-whole-record, so two concurrent mutations on one session lost-updated each other. They now issue field-scoped state::update ops (set/append on a single key), so disjoint-field mutations compose under the engine's per-key write-lock. Partial records that field-scoped writes can persist are backfilled on read. The store is routed through the shared createState wrapper (tolerant reads, strict writes), and clear_settings uses state::delete instead of a null tombstone.

3. Typed state layer (less unknown, fewer redundant checks)

  • runtime/state.ts helpers are generic (stateGet<T> / stateSet<T> / stateUpdate<T>); dropped the dead {value}-row unwrap and the unused stateListGroups.
  • State is cleared on deploy, so reads trust the type: dropped parseTurnStateRecord, parseModelArray, and parseRunRequest in favor of typed stateGet<T> (the non-null loadRunRequest keeps default-on-absent via defaultRunRequest(); loadRecord relies on stateGet returning null for absent keys).
  • Runtime validation now survives only where it does real work: write-side boundaries (isModel in models::reconcile), genuinely heterogeneous scopes (llm-budget budgets + spend-logs share one scope), and partial-write backfill (parseSettings).
  • Removed redundant indirection (scopedGet/scopedSet passthroughs, the never-used optional store param on createTurnStatePorts).

unknown in the state layer dropped from 29 → 18 (every survivor is a genuine boundary).

Test plan

  • New tests/runtime/lease.test.ts (16 concurrency tests: exactly-one-winner under latency/outage, TTL steal, no-false-win) and tests/turn-orchestrator/session-lease.test.ts.
  • New approval-gate concurrency/atomicity tests (cross-field no-clobber, append composition, partial-record backfill) added to settings.test.ts.
  • Existing compaction lease.test.ts (incl. its concurrency/outage/wire-shape cases) passes against the delegated implementation.
  • parallel-approval.e2e ("no double-execution under parallel approval wakes") passes.
  • Full harness suite green in a clean checkout (no unrelated WIP): 1296 pass, 0 fail; tsc clean.

Notes

  • No production behavior change beyond the documented concurrency fixes; durable record shapes for turn_state/run_request are unchanged.
  • Bounded, documented residual races remain where the engine lacks a primitive (dedup-append / array-element-remove / a native lock) — flagged in code with // Known race: comments. Exposing those engine ops would close them entirely (follow-up, engine-side).

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved concurrent update handling to prevent lost updates when modifying approval settings simultaneously.
    • Enhanced turn orchestration locking mechanism for safer serialization during operations.
  • Reliability

    • Switched context compaction trigger from event streaming to queue-based mechanism for improved reliability.
  • Performance

    • Optimized state persistence with atomic field-scoped updates instead of full record rewrites.

@vercel

vercel Bot commented Jun 8, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
harness Error Error Jun 11, 2026 1:32pm
workers Ready Ready Preview, Comment Jun 11, 2026 1:32pm

Request Review

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

skill-check — worker

0 verified, 14 skipped (no docs/).

Layer Result
structure
vale
ai
render

Four for four. Nicely done.

…ed state layer

Single-writer lease:
- Add shared runtime/lease.ts (nonce+ttl set-CAS over atomic state::update);
  session-lease and compaction-lease become thin (scope, ttl) adapters.
- Fix a false-win where a transient state-store outage (tolerant stateUpdate
  returning null) let every concurrent contender acquire the lease at once.

Approval-gate settings:
- Make mutations atomic and field-scoped (state::update set/append) to close the
  read-modify-write lost-update window; backfill partial records on read.
- Route the store through the shared createState wrapper; clear via state::delete
  instead of writing a null tombstone.

State layer typing:
- Make runtime/state.ts helpers generic (stateGet<T>/stateSet<T>/stateUpdate<T>);
  drop the dead {value}-row unwrap and the unused stateListGroups.
- State is cleared on deploy, so trust types on read: drop parseTurnStateRecord,
  parseModelArray, and parseRunRequest in favor of typed stateGet<T>. Keep runtime
  validation only at write boundaries (isModel) and shared scopes (llm-budget).
- Remove redundant indirection (scopedGet/Set passthroughs, optional ports store).
@ytallo ytallo force-pushed the refactor/harness-state-operations branch from e884d6e to 6f10427 Compare June 8, 2026 19:24
@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: daad793d-b04f-4e72-8d8b-e3094330f646

📥 Commits

Reviewing files that changed from the base of the PR and between 3c3785d and a0d7fec.

📒 Files selected for processing (6)
  • harness/src/context-compaction/handler-async.ts
  • harness/src/turn-orchestrator/events.ts
  • harness/src/turn-orchestrator/state-runtime/store.ts
  • harness/src/turn-orchestrator/state-runtime/turn-end.ts
  • harness/src/turn-orchestrator/state.ts
  • harness/tests/context-compaction/handler-async.test.ts
✅ Files skipped from review due to trivial changes (1)
  • harness/src/turn-orchestrator/state-runtime/turn-end.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • harness/src/turn-orchestrator/state.ts
  • harness/src/turn-orchestrator/events.ts
  • harness/src/turn-orchestrator/state-runtime/store.ts
  • harness/src/context-compaction/handler-async.ts

📝 Walkthrough

Walkthrough

This PR refactors state persistence and lease infrastructure across the harness: runtime state APIs now use generic typing, approval settings migrate to atomic field-scoped updates, turn-state parsing is removed in favor of typed defaults, shared lease primitives consolidate mutual exclusion, and turn-end compaction routing transitions from dedicated streams to queue-based enqueues with typed payloads.

Changes

State Persistence, Leasing, and Routing Refactor

Layer / File(s) Summary
Typed generic state API and list semantics
harness/src/runtime/state.ts, harness/tests/runtime/state-list.test.ts
stateGet, stateSet, stateUpdate now use generic <T> signatures returning T | null instead of unknown; stateListGroups removed; createState().list implementation treats trigger response as flat array. Tests rewritten to validate createState().list behavior against mock ISdk.
Shared atomic lease primitives
harness/src/runtime/lease.ts, harness/tests/runtime/lease.test.ts
New module implements single-writer {nonce, ts} leases via atomic state::update, with acquireLease, releaseLease, and acquireLeaseWithWait providing TTL-based validity checks and nonce-guarded release. Tests verify nonce uniqueness, concurrent single-winner behavior, TTL expiry steal semantics, outage resilience, and exponential backoff polling.
Approval settings atomic field updates
harness/src/approval-gate/settings/store.ts, harness/src/approval-gate/settings/{add-always-allow,approve-always,remove-always-allow,set-mode}.ts, harness/tests/approval-gate/settings.test.ts
Store adds tolerantState/strictState mode separation, parseSettings(raw) for backfill/validation, and updateSettings(iii, session_id, ops) for field-scoped atomic updates; all mutation handlers now issue append/set ops instead of full-record writes; clearSettings deletes without tombstones. Tests mock state::delete and state::update operations, verify concurrent mutation atomicity and backfill correctness.
Turn state and run-request parser removal
harness/src/turn-orchestrator/{run-request,state}.ts, harness/src/turn-orchestrator/state-runtime/store.ts, harness/tests/turn-orchestrator/{run-request,parse-turn-state-record}.test.ts
RunRequest and TurnStateRecord are now plain TypeScript types; parseRunRequest and parseTurnStateRecord exports removed and replaced with defaultRunRequest() absent-record fallback. Turn store and state ports now use direct typed stateGet/stateSet I/O without parsing wrappers, extracting prior values from old_value and null-coalescing to defaults on absence.
Session-level lease adapter and serialized transition
harness/src/turn-orchestrator/state-runtime/session-lease.ts, harness/src/turn-orchestrator/run-transition.ts, harness/tests/turn-orchestrator/session-lease.test.ts
acquireSessionLease and releaseSessionLease wrap shared lease primitives, returning/accepting nonce instead of boolean; runTransition stores the nonce and uses it on release for per-session mutual exclusion during serialized execution. Tests validate nonce-based grant/release, outage handling with state::update failures, and concurrent contender behavior.
Turn-end compaction refactor: payload, model resolution, routing
harness/src/context-compaction/{handler-async,register}.ts, harness/src/turn-orchestrator/{events,state-runtime/turn-end}.ts, harness/tests/context-compaction/{handler-async,registration,compaction-done-emit,integration/*}.test.ts, harness/tests/turn-orchestrator/events.test.ts
Handler now receives typed OnTurnEndPayload via parseOnTurnEnd, extracts session_id/usage/provider/model with defaults; resolveModel short-circuits on threaded model_limit, otherwise scans messages to backfill provider/model before deriving limit. Turn-orchestrator events enqueue compaction wake to context-compaction::on_turn_end queue instead of mirroring to stream; registration wires internal handler function instead of stream trigger; manifest descriptions updated. Tests verify payload parsing, model resolution fallback, concurrent frame shapes, and registration behavior.
Models catalog state read simplification
harness/src/models-catalog/state.ts, harness/tests/models-catalog/state.test.ts
State reads now trust stored Model[] directly, removing parseModelArray helper and re-parsing; getProviderModels and listFromState return typed values from stateGet/stateListValues without filtering. Tests updated to focus on isModel boundary validation.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Lease as acquireLease/releaseLease
  participant State as state::update
  Caller->>Lease: acquireLease(scope, key, ttlMs)
  Lease->>State: read current lease
  alt Lease exists and active
    Lease-->>Caller: null
  else Lease absent or expired
    Lease->>State: atomic set {nonce, ts}
    State-->>Lease: {old_value, new_value}
    Lease-->>Caller: nonce
  end
  Caller->>Lease: releaseLease(scope, key, nonce)
  Lease->>State: clear lease if nonce matches
Loading
sequenceDiagram
  participant TurnEnd as emit(turn_end)
  participant Events as events.ts
  participant Queue as iii.enqueue
  participant Compaction as context-compaction::on_turn_end
  TurnEnd->>Events: emit turn_end event
  Events->>Events: extract session_id, usage, provider, model
  Events->>Queue: enqueue(COMPACTION_ON_TURN_END, payload)
  Queue-->>Events: (best-effort, warn on error)
  Queue->>Compaction: payload
  Compaction->>Compaction: parseOnTurnEnd(payload)
  Compaction->>Compaction: resolveModel(provider, model, model_limit?)
  Compaction->>Compaction: handleAsync(parseOnTurnEnd result)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • iii-hq/workers#195: Directly uses the approval-gate settings field-scoped update infrastructure (updateSettings) to support new permission modes, allowlists, and approve-always mutations.
  • iii-hq/workers#198: Implements per-session transition serialization using the same nonce-based lease acquisition in run-transition.ts and session-lease.ts.
  • iii-hq/workers#205: Changes turn_end event routing and stream/queue behavior in emit(), overlapping with the turn-orchestrator events wiring refactored here.

Suggested reviewers

  • andersonleal

Poem

🐰 Leases and states now dance as one,
Atomic writes when parsing's done,
No more streams where queues belong,
Turn-end whispers, clear and strong,
Infrastructure sings a brighter song!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'refactor(harness): unify state leases, atomic approval mutations, typed state layer' directly and comprehensively describes the three main refactoring themes: (1) unifying state leases, (2) atomic approval mutations, and (3) a typed state layer. It accurately captures the primary changes across numerous files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/harness-state-operations

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
harness/src/approval-gate/settings/store.ts (1)

49-63: 💤 Low value

Consider handling null update result explicitly.

If strictState(iii).update returns null (e.g., atomic write failure due to engine outage), the function silently returns defaults via parseSettings(null) rather than signaling the failure. This differs from how acquireLease in runtime/lease.ts treats a null envelope as "write failed — don't treat as success."

For settings mutations this may be acceptable since a subsequent read will fetch the actual state, but the caller won't know the update didn't persist.

🔧 Optional: throw on null result to surface write failures
 export async function updateSettings(
   iii: ISdk,
   session_id: string,
   ops: UpdateOp[],
 ): Promise<ApprovalSettings> {
   const result = await strictState(iii).update<unknown>({
     scope: SETTINGS_STATE_SCOPE,
     key: session_id,
     ops,
   });
+  if (!result) {
+    throw new Error('approval-settings update failed: state engine returned null');
+  }
   if (result?.errors && result.errors.length > 0) {
     throw new Error(`approval-settings update rejected: ${JSON.stringify(result.errors)}`);
   }
   return parseSettings(result?.new_value ?? null);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/approval-gate/settings/store.ts` around lines 49 - 63, The
updateSettings function currently treats a null result from
strictState(iii).update as success by passing null into parseSettings; instead
explicitly detect when result is null (or result.new_value is undefined) and
throw a descriptive error to surface write failures. Locate updateSettings and
the call to strictState(iii).update, check for a null/undefined result before
inspecting result.errors, and throw (e.g., `new Error("approval-settings update
failed: write did not persist")`) so callers can distinguish a failed atomic
write from a successful update; keep the existing error handling for
result.errors unchanged.
harness/tests/models-catalog/state.test.ts (1)

9-20: ⚡ Quick win

Add an id-only negative case to lock the boundary contract.

Please add an assertion for expect(isModel({ id: 'm1' })).toBe(false) so the suite prevents regressions to overly-permissive write-side validation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/models-catalog/state.test.ts` around lines 9 - 20, Add a
negative assertion inside the same test ('isModel guards the write-side
boundary') to prevent an id-only object from passing validation: call
expect(isModel({ id: 'm1' })).toBe(false) alongside the existing negative cases
so the isModel guard rejects objects that only have an id and enforces the full
model shape.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@harness/src/context-compaction/lease.ts`:
- Around line 30-35: The function readLeaseTimestampSecs treats NaN/Infinity as
valid because it only checks typeof ts === 'number'; change the guard to require
a finite number (use Number.isFinite or Number.isFinite((v as Record<string,
unknown>).ts)) before computing Math.floor(ts / 1000) and otherwise return 0.
Update references to the ts extraction in readLeaseTimestampSecs so only finite
numeric ts values are converted to epoch seconds; all other values return 0.

In `@harness/src/models-catalog/state.ts`:
- Around line 16-18: isModel currently only checks for a string id and lets
partial objects through; update the isModel type guard to validate the full
persisted Model contract by explicitly checking each required property declared
on the Model interface (not just id) for presence and correct types (including
any nested objects/arrays and date/number formats the Model expects), return
false if any required property is missing or has the wrong type, and add/update
unit tests to cover invalid partial objects; refer to the Model interface and
the isModel function to locate where to add these explicit field/type checks.

In `@harness/src/runtime/lease.ts`:
- Around line 59-60: The release logic reads the lease via stateGet(iii, scope,
key) and then calls stateSet(iii, scope, key, null) if nonces match, which is a
TOCTOU race; change this to an atomic conditional delete so only the lease owner
with matching nonce can clear the key. Replace the two-step pattern with an
atomic compare-and-delete/compare-and-set operation (e.g. a
stateCompareAndDelete or stateCompareAndSet API) that checks stored?.nonce ===
nonce and sets the value to null in one atomic call; update callers using
LeaseRecord, stateGet, and stateSet references accordingly and fall back to a
transactional/lock-based update if an atomic primitive is not available.

---

Nitpick comments:
In `@harness/src/approval-gate/settings/store.ts`:
- Around line 49-63: The updateSettings function currently treats a null result
from strictState(iii).update as success by passing null into parseSettings;
instead explicitly detect when result is null (or result.new_value is undefined)
and throw a descriptive error to surface write failures. Locate updateSettings
and the call to strictState(iii).update, check for a null/undefined result
before inspecting result.errors, and throw (e.g., `new Error("approval-settings
update failed: write did not persist")`) so callers can distinguish a failed
atomic write from a successful update; keep the existing error handling for
result.errors unchanged.

In `@harness/tests/models-catalog/state.test.ts`:
- Around line 9-20: Add a negative assertion inside the same test ('isModel
guards the write-side boundary') to prevent an id-only object from passing
validation: call expect(isModel({ id: 'm1' })).toBe(false) alongside the
existing negative cases so the isModel guard rejects objects that only have an
id and enforces the full model shape.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bae5c6ae-d156-41e2-a1d6-a7cd4e321339

📥 Commits

Reviewing files that changed from the base of the PR and between 66c1701 and 6f10427.

📒 Files selected for processing (23)
  • harness/src/approval-gate/settings/add-always-allow.ts
  • harness/src/approval-gate/settings/approve-always.ts
  • harness/src/approval-gate/settings/remove-always-allow.ts
  • harness/src/approval-gate/settings/set-mode.ts
  • harness/src/approval-gate/settings/store.ts
  • harness/src/context-compaction/lease.ts
  • harness/src/models-catalog/state.ts
  • harness/src/runtime/lease.ts
  • harness/src/runtime/state.ts
  • harness/src/turn-orchestrator/run-request.ts
  • harness/src/turn-orchestrator/run-transition.ts
  • harness/src/turn-orchestrator/state-runtime/ports.ts
  • harness/src/turn-orchestrator/state-runtime/session-lease.ts
  • harness/src/turn-orchestrator/state-runtime/store.ts
  • harness/src/turn-orchestrator/state.ts
  • harness/tests/approval-gate/settings.test.ts
  • harness/tests/integration/parallel-approval-harness.ts
  • harness/tests/models-catalog/state.test.ts
  • harness/tests/runtime/lease.test.ts
  • harness/tests/runtime/state-list.test.ts
  • harness/tests/turn-orchestrator/parse-turn-state-record.test.ts
  • harness/tests/turn-orchestrator/run-request.test.ts
  • harness/tests/turn-orchestrator/session-lease.test.ts
💤 Files with no reviewable changes (1)
  • harness/tests/turn-orchestrator/parse-turn-state-record.test.ts

Comment on lines +30 to 35
/** Epoch-secs of a `{nonce, ts}` lease claim; 0 for any other shape. */
export function readLeaseTimestampSecs(v: unknown): number {
if (!v || typeof v !== 'object') return 0;
const ts = (v as Record<string, unknown>).ts;
if (typeof ts === 'number') return Math.floor(ts / 1000);
return 0;
}

function isLeaseActive(v: unknown, now_secs: number): boolean {
const ts_secs = readLeaseTimestampSecs(v);
return ts_secs > 0 && now_secs - ts_secs < LEASE_TTL_SECS;
return typeof ts === 'number' ? Math.floor(ts / 1000) : 0;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard ts with finite-number validation.

Line 34 currently treats NaN/Infinity as valid timestamps, which violates the “0 for any other shape” contract and can propagate invalid time values.

Suggested patch
 export function readLeaseTimestampSecs(v: unknown): number {
   if (!v || typeof v !== 'object') return 0;
   const ts = (v as Record<string, unknown>).ts;
-  return typeof ts === 'number' ? Math.floor(ts / 1000) : 0;
+  return typeof ts === 'number' && Number.isFinite(ts) ? Math.floor(ts / 1000) : 0;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/** Epoch-secs of a `{nonce, ts}` lease claim; 0 for any other shape. */
export function readLeaseTimestampSecs(v: unknown): number {
if (!v || typeof v !== 'object') return 0;
const ts = (v as Record<string, unknown>).ts;
if (typeof ts === 'number') return Math.floor(ts / 1000);
return 0;
}
function isLeaseActive(v: unknown, now_secs: number): boolean {
const ts_secs = readLeaseTimestampSecs(v);
return ts_secs > 0 && now_secs - ts_secs < LEASE_TTL_SECS;
return typeof ts === 'number' ? Math.floor(ts / 1000) : 0;
}
/** Epoch-secs of a `{nonce, ts}` lease claim; 0 for any other shape. */
export function readLeaseTimestampSecs(v: unknown): number {
if (!v || typeof v !== 'object') return 0;
const ts = (v as Record<string, unknown>).ts;
return typeof ts === 'number' && Number.isFinite(ts) ? Math.floor(ts / 1000) : 0;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/context-compaction/lease.ts` around lines 30 - 35, The function
readLeaseTimestampSecs treats NaN/Infinity as valid because it only checks
typeof ts === 'number'; change the guard to require a finite number (use
Number.isFinite or Number.isFinite((v as Record<string, unknown>).ts)) before
computing Math.floor(ts / 1000) and otherwise return 0. Update references to the
ts extraction in readLeaseTimestampSecs so only finite numeric ts values are
converted to epoch seconds; all other values return 0.

Comment on lines 16 to 18
export function isModel(v: unknown): v is Model {
return Boolean(v && typeof v === 'object' && typeof (v as Model).id === 'string');
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Strengthen isModel to validate the actual persisted Model contract.

Line 16 currently accepts any object with a string id, but this function is the write-side boundary while read-side parsing was removed. That lets partial objects be stored and later treated as full Model values.

Proposed fix
 export function isModel(v: unknown): v is Model {
-  return Boolean(v && typeof v === 'object' && typeof (v as Model).id === 'string');
+  if (!v || typeof v !== 'object') return false;
+  const m = v as Partial<Model>;
+  return (
+    typeof m.id === 'string' &&
+    typeof m.provider === 'string' &&
+    typeof m.api === 'string' &&
+    typeof m.display_name === 'string' &&
+    typeof m.context_window === 'number' &&
+    Number.isFinite(m.context_window)
+  );
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export function isModel(v: unknown): v is Model {
return Boolean(v && typeof v === 'object' && typeof (v as Model).id === 'string');
}
export function isModel(v: unknown): v is Model {
if (!v || typeof v !== 'object') return false;
const m = v as Partial<Model>;
return (
typeof m.id === 'string' &&
typeof m.provider === 'string' &&
typeof m.api === 'string' &&
typeof m.display_name === 'string' &&
typeof m.context_window === 'number' &&
Number.isFinite(m.context_window)
);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/models-catalog/state.ts` around lines 16 - 18, isModel currently
only checks for a string id and lets partial objects through; update the isModel
type guard to validate the full persisted Model contract by explicitly checking
each required property declared on the Model interface (not just id) for
presence and correct types (including any nested objects/arrays and date/number
formats the Model expects), return false if any required property is missing or
has the wrong type, and add/update unit tests to cover invalid partial objects;
refer to the Model interface and the isModel function to locate where to add
these explicit field/type checks.

Comment on lines +59 to +60
const stored = await stateGet<LeaseRecord>(iii, scope, key);
if (stored?.nonce === nonce) await stateSet(iii, scope, key, null);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Stale-owner release can clear a newer lease (TOCTOU).

Line 59 and Line 60 do a non-atomic getset(null) owner check. If lease A is read, then contender B acquires before the set, A’s release can delete B’s active claim. That reopens the key and can allow overlapping workers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/runtime/lease.ts` around lines 59 - 60, The release logic reads
the lease via stateGet(iii, scope, key) and then calls stateSet(iii, scope, key,
null) if nonces match, which is a TOCTOU race; change this to an atomic
conditional delete so only the lease owner with matching nonce can clear the
key. Replace the two-step pattern with an atomic
compare-and-delete/compare-and-set operation (e.g. a stateCompareAndDelete or
stateCompareAndSet API) that checks stored?.nonce === nonce and sets the value
to null in one atomic call; update callers using LeaseRecord, stateGet, and
stateSet references accordingly and fall back to a transactional/lock-based
update if an atomic primitive is not available.

…ads and improve event parsing

- Refactor `handleAsync` to accept a structured payload instead of a generic frame, enhancing clarity and type safety.
- Introduce `parseOnTurnEnd` to handle the new payload format, replacing the previous `extractEventPayload` function.
- Update related functions to align with the new payload structure, ensuring consistent handling of session IDs, usage, provider, and model.
- Modify YAML and main registration files to reflect changes in event handling and descriptions.
- Remove obsolete stream subscriptions, transitioning to a queue-based wake mechanism for compaction on turn end.
- Update tests to validate the new payload structure and ensure backward compatibility with existing functionality.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
harness/tests/context-compaction/handler-async.test.ts (1)

25-28: ⚡ Quick win

Add a malformed token-field case to lock parser behavior.

Please add a case where usage is an object but token fields are invalid types (e.g., { input: '100' }) so parser normalization/rejection stays protected.

✅ Test addition example
   it('returns null usage when usage is missing or malformed', () => {
     expect(parseOnTurnEnd({ session_id: 'sess-3' })?.usage).toBeNull();
     expect(parseOnTurnEnd({ session_id: 'sess-3', usage: 'nope' })?.usage).toBeNull();
+    expect(parseOnTurnEnd({ session_id: 'sess-3', usage: { input: '100' } })?.usage).toBeNull();
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/context-compaction/handler-async.test.ts` around lines 25 - 28,
Add another assertion in the existing test for parseOnTurnEnd to cover a
malformed token-field case: call parseOnTurnEnd with usage as an object where
token fields are invalid types (for example { input: '100' } or { prompt_tokens:
'10' }) and assert that the returned ?.usage is null; this ensures
parseOnTurnEnd's normalization/rejection logic (function parseOnTurnEnd) treats
non-numeric token fields as malformed and returns null for usage.
harness/tests/turn-orchestrator/events.test.ts (1)

7-12: ⚡ Quick win

Assert enqueue action in the turn_end wake test.

The current stub only captures function_id and payload, so this test cannot verify the queue contract (Enqueue on default). A regression to non-enqueued trigger would still pass.

Proposed test hardening
-function buildSdk() {
-  const calls: Array<{ function_id: string; payload: Record<string, unknown> }> = [];
-  const trigger = vi.fn(async (req: { function_id: string; payload?: unknown }) => {
+function buildSdk() {
+  const calls: Array<{ function_id: string; payload: Record<string, unknown>; action?: unknown }> = [];
+  const trigger = vi.fn(async (req: { function_id: string; payload?: unknown; action?: unknown }) => {
     calls.push({
       function_id: req.function_id,
       payload: (req.payload ?? {}) as Record<string, unknown>,
+      action: req.action,
     });
     return {};
   });
   return { iii: { trigger } as unknown as ISdk, calls };
 }
@@
     const wake = calls.find((c) => c.function_id === 'context-compaction::on_turn_end');
     expect(wake).toBeDefined();
+    expect(wake?.action).toBeDefined();
     expect(wake?.payload).toMatchObject({
       session_id: SID,
       usage: { input: 100, output: 5 },
       provider: 'anthropic',
       model: 'claude-haiku-4-5',
     });

Also applies to: 57-64

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/turn-orchestrator/events.test.ts` around lines 7 - 12, The
test's trigger stub (calls array and trigger fn) only records function_id and
payload so it cannot assert that an Enqueue was issued to the default queue;
update the stub used in the "turn_end wake" test (and the similar one at lines
57-64) to capture the full enqueue contract by recording action type and queue
name (e.g., record an object with function_id, payload, action: "Enqueue",
queue: "default") so assertions can verify calls include { action: "Enqueue",
queue: "default" } for the expected function_id and payload; locate the trigger
stub referenced by calls and trigger to implement this additional fields and
update the test assertions accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@harness/src/context-compaction/handler-async.ts`:
- Around line 35-38: The returned payload currently trusts obj.usage as a valid
Usage object; instead validate and normalize its fields before returning: when
typeof obj.usage === 'object' && obj.usage !== null construct a normalized usage
object (e.g., ensure prompt_tokens, completion_tokens, total_tokens or
tokens_before/tokens_after are coerced with Number(...), default to 0 for
NaN/undefined, clamp to Number.MAX_SAFE_INTEGER to avoid overflow, and if total
is missing compute total = prompt+completion), otherwise set usage to null;
update the current assignment at the usage return site and ensure the later
logic that reads tokens_before/tokens_after (the code around tokens_before usage
in lines 111–118) relies on these normalized numeric fields.

---

Nitpick comments:
In `@harness/tests/context-compaction/handler-async.test.ts`:
- Around line 25-28: Add another assertion in the existing test for
parseOnTurnEnd to cover a malformed token-field case: call parseOnTurnEnd with
usage as an object where token fields are invalid types (for example { input:
'100' } or { prompt_tokens: '10' }) and assert that the returned ?.usage is
null; this ensures parseOnTurnEnd's normalization/rejection logic (function
parseOnTurnEnd) treats non-numeric token fields as malformed and returns null
for usage.

In `@harness/tests/turn-orchestrator/events.test.ts`:
- Around line 7-12: The test's trigger stub (calls array and trigger fn) only
records function_id and payload so it cannot assert that an Enqueue was issued
to the default queue; update the stub used in the "turn_end wake" test (and the
similar one at lines 57-64) to capture the full enqueue contract by recording
action type and queue name (e.g., record an object with function_id, payload,
action: "Enqueue", queue: "default") so assertions can verify calls include {
action: "Enqueue", queue: "default" } for the expected function_id and payload;
locate the trigger stub referenced by calls and trigger to implement this
additional fields and update the test assertions accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 59da39fb-5471-4681-8613-f76d79417f3e

📥 Commits

Reviewing files that changed from the base of the PR and between 6f10427 and 3c3785d.

📒 Files selected for processing (18)
  • harness/src/context-compaction/handler-async.ts
  • harness/src/context-compaction/iii.worker.yaml
  • harness/src/context-compaction/main.ts
  • harness/src/context-compaction/register.ts
  • harness/src/turn-orchestrator/events.ts
  • harness/src/turn-orchestrator/state-runtime/turn-end.ts
  • harness/src/turn-orchestrator/steering-check/ports.ts
  • harness/src/turn-orchestrator/steering-check/process.ts
  • harness/src/turn-orchestrator/steering-check/run.ts
  • harness/tests/context-compaction/compaction-done-emit.test.ts
  • harness/tests/context-compaction/handler-async.test.ts
  • harness/tests/context-compaction/integration/backward-compat.test.ts
  • harness/tests/context-compaction/integration/flow-async.test.ts
  • harness/tests/context-compaction/registration.test.ts
  • harness/tests/context-compaction/turn-end-subscription.test.ts
  • harness/tests/integration/parallel-approval-harness.ts
  • harness/tests/turn-orchestrator/events.test.ts
  • harness/tests/turn-orchestrator/steering-check-layer.test.ts
💤 Files with no reviewable changes (3)
  • harness/tests/context-compaction/turn-end-subscription.test.ts
  • harness/src/turn-orchestrator/steering-check/ports.ts
  • harness/tests/turn-orchestrator/steering-check-layer.test.ts
✅ Files skipped from review due to trivial changes (2)
  • harness/src/context-compaction/iii.worker.yaml
  • harness/src/context-compaction/main.ts

Comment on lines +35 to +38
usage: obj.usage && typeof obj.usage === 'object' ? (obj.usage as Usage) : null,
provider: typeof obj.provider === 'string' ? obj.provider : '',
model: typeof obj.model === 'string' ? obj.model : '',
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate and normalize usage fields before returning parsed payload.

Line 35 currently trusts any object as Usage; malformed token fields can produce wrong tokens_before math and overflow decisions in Lines 111–118.

💡 Suggested fix
 export function parseOnTurnEnd(payload: unknown): OnTurnEndPayload | null {
   if (!payload || typeof payload !== 'object') return null;
   const obj = payload as Record<string, unknown>;
   if (typeof obj.session_id !== 'string' || !obj.session_id) return null;
+  const normalizeUsage = (raw: unknown): Usage | null => {
+    if (!raw || typeof raw !== 'object' || Array.isArray(raw)) return null;
+    const u = raw as Record<string, unknown>;
+    const toToken = (v: unknown): number | undefined =>
+      typeof v === 'number' && Number.isFinite(v) && v >= 0 ? v : undefined;
+    const normalized: Usage = {};
+    const input = toToken(u.input);
+    const output = toToken(u.output);
+    const cacheRead = toToken(u.cache_read);
+    const cacheWrite = toToken(u.cache_write);
+    if (input !== undefined) normalized.input = input;
+    if (output !== undefined) normalized.output = output;
+    if (cacheRead !== undefined) normalized.cache_read = cacheRead;
+    if (cacheWrite !== undefined) normalized.cache_write = cacheWrite;
+    return Object.keys(normalized).length > 0 ? normalized : null;
+  };
   return {
     session_id: obj.session_id,
-    usage: obj.usage && typeof obj.usage === 'object' ? (obj.usage as Usage) : null,
+    usage: normalizeUsage(obj.usage),
     provider: typeof obj.provider === 'string' ? obj.provider : '',
     model: typeof obj.model === 'string' ? obj.model : '',
   };
 }

Also applies to: 111-118

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/context-compaction/handler-async.ts` around lines 35 - 38, The
returned payload currently trusts obj.usage as a valid Usage object; instead
validate and normalize its fields before returning: when typeof obj.usage ===
'object' && obj.usage !== null construct a normalized usage object (e.g., ensure
prompt_tokens, completion_tokens, total_tokens or tokens_before/tokens_after are
coerced with Number(...), default to 0 for NaN/undefined, clamp to
Number.MAX_SAFE_INTEGER to avoid overflow, and if total is missing compute total
= prompt+completion), otherwise set usage to null; update the current assignment
at the usage return site and ensure the later logic that reads
tokens_before/tokens_after (the code around tokens_before usage in lines
111–118) relies on these normalized numeric fields.

…e-operations

# Conflicts:
#	harness/src/context-compaction/handler-async.ts
#	harness/src/turn-orchestrator/state-runtime/store.ts
#	harness/src/turn-orchestrator/state-runtime/turn-end.ts
#	harness/src/turn-orchestrator/state.ts
#	harness/src/turn-orchestrator/steering-check/process.ts
#	harness/src/turn-orchestrator/steering-check/run.ts
#	harness/tests/context-compaction/handler-async.test.ts
@ytallo ytallo merged commit 8b96bd6 into main Jun 11, 2026
13 of 14 checks passed
@coderabbitai coderabbitai Bot mentioned this pull request Jun 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants