Skip to content

[Story 2.2] Serialize reopenParentFromDelegation steps 4–5 #365

Description

@edelauna

Part of #357 (Epic 2: Task Lifecycle Fixes).

Depends on: Nothing — can start immediately. Uses atomicReadAndUpdate from Story 2.1, but the implementation is independent.

Context

The current reopenParentFromDelegation writes parent and child status in separate, non-atomic steps:

  1. Step 3 — write parent status: "active" (with awaitingChildId still set as a crash-recovery sentinel)
  2. Step 4 — close child instance via removeClineFromStack
  3. Step 5 — write child status: "completed"
  4. Step 5B — write parent awaitingChildId: undefined to clear the sentinel

This multi-step saga with a sentinel is the wrong shape. It:

  • Introduces an intermediate persisted state (active + awaitingChildId set) that is not a valid steady state in the status machine
  • Couples reconcileDelegationState to the internal protocol of reopenParentFromDelegation (the reconciler must understand what the sentinel means)
  • Creates a window where parent is active but child is still active — visible to any concurrent reader between steps 3 and 5

The tech brief (#355) explicitly rejected sentinel/intent-state approaches in favour of the atomicReadAndUpdate pattern.

Correct Implementation

1. Add atomicUpdatePair to TaskHistoryStore

/**
 * Atomically update two related HistoryItems within a single lock acquisition.
 * Both updaters run synchronously (no I/O, no lock re-entry). Both writes are
 * committed before the lock releases — no concurrent writer can observe an
 * intermediate state.
 */
public atomicUpdatePair(
  firstId: string,
  secondId: string,
  firstUpdater: (current: HistoryItem) => HistoryItem,
  secondUpdater: (current: HistoryItem) => HistoryItem,
): Promise<HistoryItem[]> {
  return this.withLock(async () => {
    const first = this.cache.get(firstId)
    if (!first) throw new Error(`[TaskHistoryStore] atomicUpdatePair: ${firstId} not found`)
    const second = this.cache.get(secondId)
    if (!second) throw new Error(`[TaskHistoryStore] atomicUpdatePair: ${secondId} not found`)

    const updatedFirst = firstUpdater(structuredClone(first))
    const updatedSecond = secondUpdater(structuredClone(second))

    await this.upsertCore(updatedFirst)
    await this.upsertCore(updatedSecond)

    return this.getAll()
  })
}

2. Rewrite reopenParentFromDelegation steps 3–5B

Replace the current sentinel approach (steps 3, 5, 5B) with a single atomicUpdatePair call after removeClineFromStack (step 4):

// 4) Close child instance (must happen before marking child completed —
//    removeClineFromStack → abortTask → saveClineMessages writes status: "active"
//    which would overwrite a "completed" status set earlier)
const current = this.getCurrentTask()
if (current?.taskId === childTaskId) {
  await this.removeClineFromStack({ skipDelegationRepair: true })
}

// 3+5) Atomically mark child completed and parent active in one lock acquisition.
//      No intermediate state is ever persisted — no sentinel needed.
const childIds = Array.from(new Set([...(historyItem.childIds ?? []), childTaskId]))
await this.taskHistoryStore.atomicUpdatePair(
  childTaskId,
  parentTaskId,
  (child) => {
    assertValidTransition(child.status, "completed")
    return { ...child, status: "completed" }
  },
  (parent) => {
    if (parent.status !== "active") assertValidTransition(parent.status, "active")
    return {
      ...parent,
      status: "active",
      completedByChildId: childTaskId,
      completionResultSummary,
      awaitingChildId: undefined,
      childIds,
    }
  },
)

3. Remove the sentinel recovery case from reconcileDelegationState

The active + awaitingChildId recovery case added in PR #692 exists solely to handle the sentinel. Once atomicUpdatePair is in place, no valid handoff ever leaves a parent active with awaitingChildId set — that intermediate state is no longer persisted. Remove that case.

Order matters: child before parent

The child must be written first (or atomically with the parent). Writing parent first (as the current sentinel approach does) means a crash window where parent is active with no awaitingChildId — the parent looks complete but the child is still active with no way to report back. Writing child first (or both atomically) means a crash leaves child active and parent delegated — which reconcileDelegationState already handles correctly (delegated parent + active child = leave as-is, resumable).

Files

  • src/core/task-persistence/TaskHistoryStore.ts — add atomicUpdatePair
  • src/core/webview/ClineProvider.ts — replace steps 3/5/5B in reopenParentFromDelegation
  • src/core/task-persistence/__tests__/TaskHistoryStore.reconciliation.spec.ts — remove the active + awaitingChildId sentinel recovery tests added in PR feat(task-lifecycle): task status transition guard and startup delegation reconciliation #692; add atomicUpdatePair unit tests
  • src/__tests__/history-resume-delegation.spec.ts — update handoff tests; add partial-failure test

Tests

  • atomicUpdatePair unit tests: assert both records are updated atomically; assert a concurrent upsert queued during the pair write sees the completed state of both records after the lock releases.
  • Partial failure: mock upsertCore to throw on the second write; assert neither record is in an inconsistent final state (first write already landed — document this known limitation, or add a compensating write in the catch).
  • Handoff correctness: after reopenParentFromDelegation, assert child.status === "completed" AND parent.status === "active" AND parent.awaitingChildId === undefined — never one without the others.
  • All 11 existing tests in history-resume-delegation.spec.ts continue to pass.

Acceptance Criteria

  • No observable window where child.status !== "completed" while parent.status === "active" (or vice versa) after reopenParentFromDelegation returns.
  • reconcileDelegationState has no knowledge of reopenParentFromDelegation internals — the sentinel recovery case is gone.
  • Partial failure within atomicUpdatePair (second upsertCore throws after first succeeds) surfaces the error; behaviour is documented.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions