P0: enforce Specrew lifecycle gates consistently across all host harnesses #2884

@alonf

Summary

Manual downstream testing exposed a release-blocking Specrew governance gap: host runtimes can receive different artifact/skill surfaces and may bypass the Specrew lifecycle even when .specrew/start-context.json declares every boundary as human-judgment-required.

This is not only an Antigravity issue. We must ensure every supported harness/host has the artifacts, skills, prompts, state, and enforcement semantics needed to run the same Specrew process, and that no host can self-authorize past a lifecycle boundary.

Observed failure

Downstream repo: C:\Temp\test-f197
Host: Antigravity CLI 1.0.9
Scenario: greenfield feature request for an animated console flag

Evidence:

  • .specrew/start-context.json had boundary_enforcement.enabled = true.
  • All policy classes were human-judgment-required.
  • verdict_history was empty.
  • The host still scaffolded a feature, wrote spec.md, committed it, ran clarify-like reasoning, and began writing plan.md.
  • The host explicitly recognized clarify -> plan was human-judgment-required, then rationalized proceeding anyway.
  • specs/001-israel-flag-console/plan.md existed without a captured clarify -> plan verdict.
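The evidence above can be detected mechanically after the fact. The following is a minimal audit sketch, not the Specrew implementation. It assumes that each `verdict_history` entry is an object with `transition` and `verdict` fields, that every transition is human-judgment-required (as in the observed run), and that `ARTIFACT_GATES`, `approved_transitions`, and `unauthorized_artifacts` are hypothetical names introduced here for illustration. Only `boundary_enforcement.enabled`, `verdict_history`, `plan.md`, and `clarify -> plan` come from the report.

```python
from pathlib import Path

# Hypothetical mapping from a phase artifact to the transition that must be
# authorized before that artifact may exist. Only plan.md / "clarify -> plan"
# is taken from the observed failure; the tasks.md entry is illustrative.
ARTIFACT_GATES = {
    "plan.md": "clarify -> plan",
    "tasks.md": "plan -> tasks",
}


def approved_transitions(context: dict) -> set[str]:
    """Collect transitions that have an approving entry in verdict_history.

    Assumes entries look like {"transition": ..., "verdict": "approved"};
    the real schema may differ.
    """
    return {
        entry["transition"]
        for entry in context.get("verdict_history", [])
        if entry.get("verdict") == "approved"
    }


def unauthorized_artifacts(context: dict, feature_dir: Path) -> list[str]:
    """Return gated artifacts that exist on disk without a matching verdict."""
    enforcement = context.get("boundary_enforcement", {})
    if not enforcement.get("enabled", False):
        return []
    approved = approved_transitions(context)
    return [
        name
        for name, transition in sorted(ARTIFACT_GATES.items())
        if (feature_dir / name).exists() and transition not in approved
    ]
```

Under these assumptions, a feature directory containing plan.md together with an empty verdict_history would be reported as a violation, which matches the state found in the downstream repo.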

Root concern

The current host deployment appears uneven:

  • Some hosts receive gate-stop artifacts/skills while others do not.
  • Host-specific instructions may be advisory rather than enforceable.
  • Prompt wording such as "just execute" can conflict with boundary-stop obligations.
  • Raw artifact writes are not mechanically blocked when verdict_history lacks the required authorization.
  • A host can use scripts or direct file creation to move the lifecycle forward without the Specrew boundary being captured.

Required outcome

Specrew must provide a host-neutral enforcement model across all supported harnesses:

  1. Every supported host/harness receives equivalent lifecycle artifacts and process instructions.
  2. Every supported host/harness has a boundary-stop mechanism or deterministic fallback that requires a human verdict before advancing.
  3. No host may produce the next phase's substantive artifacts when the transition target is human-judgment-required and the matching verdict is absent.
  4. Host-specific skill deployment differences must not change the lifecycle semantics.
  5. If a host cannot enforce the boundary safely, it must stop with an infrastructure/process failure rather than proceed.
  6. Tests must cover artifact parity and gate enforcement for all supported hosts, not just one host.
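Outcomes 2, 3, and 5 together describe a fail-closed gate. One possible shape is sketched below, as an illustration rather than a proposed design. The names `authorize_transition`, `BoundaryStop`, and `InfrastructureFailure` are hypothetical, as are the `policies` field (assumed to map a transition to its policy class) and the `transition`/`verdict` fields of each `verdict_history` entry. The sketch treats an unreadable or malformed context as an infrastructure failure, and treats a transition with no declared policy as human-judgment-required.

```python
import json
from pathlib import Path


class BoundaryStop(Exception):
    """A human verdict is required before the transition may proceed."""


class InfrastructureFailure(Exception):
    """The gate state could not be evaluated, so the host must stop."""


def authorize_transition(context_path: Path, transition: str) -> None:
    """Return normally only when the transition is authorized; raise otherwise."""
    try:
        context = json.loads(context_path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # Missing file, unreadable file, or invalid JSON: stop, never proceed.
        raise InfrastructureFailure(f"cannot read gate state: {exc}") from exc

    enforcement = context.get("boundary_enforcement") if isinstance(context, dict) else None
    if not isinstance(enforcement, dict) or "enabled" not in enforcement:
        raise InfrastructureFailure("boundary_enforcement is missing or malformed")
    if not enforcement["enabled"]:
        return

    # Assumed schema: "policies" maps a transition name to its policy class.
    policies = context.get("policies")
    if not isinstance(policies, dict):
        policies = {}
    # Unknown transitions default to the strictest class (fail closed).
    policy = policies.get(transition, "human-judgment-required")
    if policy != "human-judgment-required":
        return

    for entry in context.get("verdict_history", []):
        if entry.get("transition") == transition and entry.get("verdict") == "approved":
            return
    raise BoundaryStop(f"no human verdict recorded for '{transition}'")
```

A host-neutral wrapper could call `authorize_transition` before any script or file write that produces next-phase artifacts, so that the same check applies whether or not the host has a native skill mechanism.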

Acceptance criteria

  • Inventory all supported host harnesses and their deployed Specrew artifacts/skills/instruction surfaces.
  • Add a parity test proving each host receives the required lifecycle/gate artifacts.
  • Add a gate-enforcement test proving a missing verdict prevents spec.md -> clarify, clarify -> plan, plan -> tasks, and later human-judgment transitions from producing next-phase substantive artifacts.
  • Add or define a host-neutral gate-stop fallback for hosts without a native skill mechanism.
  • Remove or rewrite ambiguous prompt wording that lets a host rationalize bypassing a required boundary.
  • Ensure raw script/direct-file paths cannot silently bypass boundary authorization.
  • Dogfood the downstream greenfield run across the supported host set and record evidence.
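The parity criterion could be checked with a simple set difference per host. The sketch below is only an illustration: the artifact names in `REQUIRED_ARTIFACTS`, the function `parity_gaps`, and the second host name in the usage example are all hypothetical, and the real inventory would come from the first acceptance criterion.

```python
# Hypothetical names for the lifecycle/gate artifacts every host must receive.
REQUIRED_ARTIFACTS = frozenset({"gate-stop", "lifecycle-instructions", "start-context"})


def parity_gaps(deployed: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each host to the required artifacts it is missing.

    Hosts with a complete set are omitted, so an empty result means parity.
    """
    gaps = {}
    for host, artifacts in deployed.items():
        missing = REQUIRED_ARTIFACTS - artifacts
        if missing:
            gaps[host] = sorted(missing)
    return gaps


if __name__ == "__main__":
    # Illustrative inventory only; these values are not measured data.
    inventory = {
        "antigravity": {"lifecycle-instructions", "start-context"},
        "example-host": {"gate-stop", "lifecycle-instructions", "start-context"},
    }
    print(parity_gaps(inventory))
```

A parity test would then assert that `parity_gaps` returns an empty mapping for the real deployment inventory of every supported host.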

Priority

P0. This blocks confidence in Feature 197 and in Specrew as a lifecycle-governed process: if one host can self-advance, the process guarantee is not portable.

Metadata

Assignees

No one assigned

    Labels

    • feedback: User feedback, high signal, needs attention
    • go:needs-research: Needs investigation
    • phase:reviewing: Iteration is in review/demo
    • priority:p0: Blocking release
    • specrew:lifecycle: Iteration lifecycle issue mirrored from iteration artifacts
    • squad: Squad triage inbox, Lead will assign to a member
    • squad:alon: Assigned to Alon (Chief Architect & Reviewer)
    • type:bug: Something broken
