A deterministic, role-based orchestration system that turns unstructured AI work into reproducible, auditable, multi-agent pipelines. Every piece of state lives on the filesystem. Every output is schema-validated. Every decision is traceable.
Problem. AI agents lose work. Conversations get compacted. Context windows overflow. Terminal scrollback disappears. There is no persistent record of what was decided, who did it, or why.
Solution. This system replaces ad-hoc AI usage with a structured execution pipeline. Tickets enter the system. A deterministic scheduler picks the next task. A fixed sequence of role stages (Analyst, Architect, Dev, QA, Review) executes the work. Each stage produces schema-validated artifacts. Gates prevent advancement until quality checks pass. Everything is written to disk.
Typical outcomes:
- Every ticket has a complete audit trail from intake through review.
- Role boundaries are enforced: a QA agent cannot write code, an Analyst cannot modify architecture.
- Schema validation catches structural errors before they propagate.
- Work survives agent restarts, context compaction, and session loss.
- The entire system runs on Node.js built-ins with zero npm dependencies.
Who it is for. Teams and individuals building multi-agent AI systems who need determinism, auditability, and reproducibility. Useful for anyone who has lost work to a crashed agent session or an overflowed context window.
What this is, stated honestly:
- Local-only execution. Every tool runs against a workspace on the local filesystem. There is no hosted service, no managed runner, no scheduled execution beyond the optional cron-friendly drive loop.
- Filesystem-as-API by design. No database, no queue, no HTTP service surface except the optional read-only dashboard bound to `127.0.0.1`. State is files.
- Not a vector store or semantic RAG. Context retrieval for task packs is filename-, schema-, and graph-based. There is no embedding store.
- Not LangChain, LangGraph, or MCP. The orchestrator drives an agent adapter (the Claude CLI) directly via subprocess. No agent-framework dependency.
- The orchestrator dispatches; the LLM does not pick tools. Stage transitions and tool dispatch are deterministic. The LLM authors artifacts; it does not decide what runs next.
- No token or cost telemetry yet. Stage durations and an append-only audit log are written; per-call token accounting is not.
- Single-machine. Distributed execution is out of scope at the current maturity level.
Run validation at any point with `bash tools/test-all.sh`. Zero failures expected.

For the full design (state machine, determinism model, failure modes), see `ARCHITECTURE.md` (overview) and `docs/ARCHITECTURE_DETAILED.md` (full).
| Concept | Description |
|---|---|
| Project | A workspace initialized with .claw/ containing metadata, agent roles, and a backlog of tasks. |
| Backlog Item | A unit of work (.claw/backlog/<id>.json) with status, priority, owner role, dependencies, and an optional parent epic. |
| Run | A single pipeline execution (.claw/runs/<timestamp>_<ticket>/) with a state machine from intake to done. |
| Stage | A step in the pipeline owned by one role. Each stage requires specific artifacts validated against JSON schemas. |
| Picker | A deterministic scheduler that selects the next eligible task using stable sort rules (priority bucket > status > priority > ID). |
| Driver | A one-shot executor that creates a run for the picked task, invokes the autonomous runner, and writes results. |
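The picker's ordering rule (priority bucket > status > priority > ID) can be sketched as a comparator. This is a minimal illustration, not the project's picker: the field names (`bucket`, `status`, `priority`, `id`), the two rank tables, and the assumption that a lower priority number sorts first are all hypothetical.

```javascript
// Hypothetical rank tables; the real picker's buckets and statuses may differ.
const BUCKET_RANK = { high: 0, normal: 1, low: 2 };
const STATUS_RANK = { in_progress: 0, ready: 1 };

// Compare two backlog items by bucket, then status, then numeric priority,
// then ID, so the result never depends on the order of the input array.
function compareItems(a, b) {
  return (
    BUCKET_RANK[a.bucket] - BUCKET_RANK[b.bucket] ||
    STATUS_RANK[a.status] - STATUS_RANK[b.status] ||
    a.priority - b.priority ||
    (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)
  );
}

// Return the next item to work on, or null when nothing is eligible.
// Items whose status is not in STATUS_RANK (for example "done") are skipped.
function pickNext(items) {
  const eligible = items.filter((item) => item.status in STATUS_RANK);
  return eligible.slice().sort(compareItems)[0] ?? null;
}
```

Because the ID is the final tie-breaker, two items can never compare as equal, which is what makes the pick reproducible for a given filesystem state.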

    Ticket --> Backlog Item --> Task Pack --> Run Creation --> Pipeline Stages --> Done
                                                                      |
                                                         intake -> task-pack-generated
                                                         analyze -> plan
                                                         implement -> validate
                                                         review -> done

| Stage | Role | Required Artifact | Schema |
|---|---|---|---|
| analyze | Analyst | `10-pm-brief.json` | `pm-brief.schema.json` |
| plan | Architect | `20-arch-design.json` | `arch-design.schema.json` |
| implement | Dev | `40-dev-patch.diff`, `41-dev-notes.json` | `dev-notes.schema.json` |
| validate | QA | `50-qa-report.json` | `qa-report.schema.json` |
| review | Review | `60-review-report.json` | `review-report.schema.json` |
A stage advances only when all required artifacts exist and pass schema validation. Roles are enforced at runtime: a Dev agent cannot produce an Analyst brief.
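The gate rule above can be sketched as a function that lists what still blocks a stage. The artifact names come from the table; the `gateProblems` name, the run-directory layout, and the caller-supplied `validate` callback are assumptions for illustration rather than the engine's actual API.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Required artifacts per stage, taken from the stage table in this README.
const REQUIRED_ARTIFACTS = {
  analyze: ['10-pm-brief.json'],
  plan: ['20-arch-design.json'],
  implement: ['40-dev-patch.diff', '41-dev-notes.json'],
  validate: ['50-qa-report.json'],
  review: ['60-review-report.json'],
};

// Return the problems blocking a stage; an empty array means the gate passes.
// `validate(name, data)` is a hypothetical callback that returns an array of
// error strings for one parsed JSON artifact.
function gateProblems(runDir, stage, validate) {
  const problems = [];
  for (const name of REQUIRED_ARTIFACTS[stage] ?? []) {
    const file = path.join(runDir, name);
    if (!fs.existsSync(file)) {
      problems.push(`missing: ${name}`);
      continue;
    }
    if (!name.endsWith('.json')) continue; // the diff only has to exist
    let parsed;
    try {
      parsed = JSON.parse(fs.readFileSync(file, 'utf8'));
    } catch {
      problems.push(`${name}: invalid JSON`);
      continue;
    }
    for (const error of validate(name, parsed)) problems.push(`${name}: ${error}`);
  }
  return problems;
}
```

Returning a list instead of a boolean keeps the reason for a blocked stage available for logging.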
| Level | Name | Status |
|---|---|---|
| Level 1 | Manual AI usage | Past |
| Level 2 | Structured multi-agent execution | Done |
| Level 3 | Autonomous org with memory | Done |
| Level 4 | Self-improving AI organization | Done |
| Level 5 | Closed-loop adaptive execution | Done |
| Level 6 | Template-enriched agent execution | Done |
| Level 7 | Last-mile delivery | Current |
The system provides deterministic multi-agent pipeline execution with full schema enforcement, role boundaries, and a structured work graph.
Phase 1 (Agent Identity & Control) — Complete.
- Persistent agent state (`.claw/agents/<id>/state.json`) with role, workload counters, and last-active tracking.
- Runtime role-to-stage enforcement. A role must match the stage before it can produce artifacts.
- `responsible_agent` tracked per run and per stage transition.
- Agent workload visible in the project dashboard.
- Picker rejects backlog items without an assigned `owner_role`.
- Cross-role leakage prevention tested end-to-end.
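Role-to-stage enforcement amounts to a lookup plus a refusal. The sketch below uses the stage and role names from this README; the `assertRoleMayWrite` name, the agent object shape, and the error messages are hypothetical.

```javascript
// Stage-to-role mapping from the pipeline table in this README.
const STAGE_ROLE = {
  analyze: 'Analyst',
  plan: 'Architect',
  implement: 'Dev',
  validate: 'QA',
  review: 'Review',
};

// Throw unless the agent's role owns the stage it is trying to write to.
// `agent` is assumed to look like { id, role }.
function assertRoleMayWrite(agent, stage) {
  const required = STAGE_ROLE[stage];
  if (!required) throw new Error(`unknown stage: ${stage}`);
  if (agent.role !== required) {
    throw new Error(`role mismatch: ${agent.role} cannot write ${stage} (requires ${required})`);
  }
  return true;
}
```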
Phase 2 (Structured Work Graph) — Complete.
- Epic-to-child hierarchy via `parent_id` on backlog items.
- DAG validation with cycle detection (Kahn's algorithm).
- Graph-aware picker: skips tasks with unsatisfied dependencies and children of blocked epics.
- Epic completion rule: epics cannot be marked done while children are incomplete.
- Write-time epic completion guard (`backlog-update-status.js`).
- Preflight graph validation in the project driver.
- Dashboard dependency chain visualization.
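Cycle detection with Kahn's algorithm can be sketched as follows. The item shape `{ id, depends_on }` and the `validateGraph` name are assumptions for this example; the project's validator may use different field names and report formats.

```javascript
// Kahn's algorithm over backlog items shaped like { id, depends_on: [ids] }.
// The `depends_on` field name is an assumption for this sketch.
function validateGraph(items) {
  const indegree = new Map(items.map((item) => [item.id, 0]));
  const dependents = new Map(items.map((item) => [item.id, []]));
  for (const item of items) {
    for (const dep of item.depends_on ?? []) {
      if (!indegree.has(dep)) continue; // dangling reference: a separate check
      indegree.set(item.id, indegree.get(item.id) + 1);
      dependents.get(dep).push(item.id);
    }
  }
  // Start from nodes with no unmet dependencies, sorted for a stable order.
  const queue = [...indegree].filter(([, n]) => n === 0).map(([id]) => id).sort();
  const order = [];
  while (queue.length > 0) {
    const id = queue.shift();
    order.push(id);
    for (const next of dependents.get(id)) {
      indegree.set(next, indegree.get(next) - 1);
      if (indegree.get(next) === 0) queue.push(next);
    }
  }
  // Nodes never emitted still have an unmet edge: they sit on or behind a cycle.
  const stuck = items.map((item) => item.id).filter((id) => !order.includes(id)).sort();
  return { valid: stuck.length === 0, order, stuck };
}
```

The algorithm repeatedly removes nodes whose dependencies are all satisfied; anything left over cannot be scheduled, which is exactly the condition a preflight check needs to catch.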
Hardening & Ops — Complete.
- `additionalProperties: false` enforced on all schemas (output, input, reference).
- GitHub Actions CI gate (`bash tools/test-all.sh`) on every push and PR.
- Preflight graph validation before creating or driving runs.
- Cron-friendly drive loop wrapper with safe stop.
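To make the effect of `additionalProperties: false` concrete, here is a minimal check for a flat object schema. It is not a full JSON Schema validator and is not the project's validator; the function names and the sample schema in the usage note are hypothetical.

```javascript
// List keys present in `value` that the schema does not declare.
// Only meaningful when the schema sets additionalProperties to false.
function undeclaredFields(schema, value) {
  if (schema.additionalProperties !== false) return [];
  const declared = new Set(Object.keys(schema.properties ?? {}));
  return Object.keys(value).filter((key) => !declared.has(key));
}

// List required keys that are absent from `value`.
function missingFields(schema, value) {
  return (schema.required ?? []).filter((key) => !(key in value));
}
```

With a schema such as `{ properties: { summary: {} }, required: ['summary'], additionalProperties: false }`, an artifact carrying an extra `notes` key is reported by `undeclaredFields` instead of passing through silently.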
Phase 3 (Knowledge & Artifact Layer) — Complete.
| Milestone | Description | Status |
|---|---|---|
| Artifact classification | Tag each artifact with a semantic type (decision, design, implementation, test-result, research-finding) | DONE |
| Global artifact index | Searchable index of all artifacts across projects and runs | DONE |
| Research workflow | Dedicated workflow for research tasks with structured findings schema | DONE |
| Agent memory | Append-only memory layer at `.claw/agents/<id>/memory/`, schema-validated | DONE |
| Cross-run knowledge | Task pack generator references artifacts from prior runs when building context | DONE |
Phase 4 (Self-Improving AI Organization) — Complete.
| Milestone | Description | Status |
|---|---|---|
| Run analytics engine | Per-run metrics and project-level aggregates from completed runs | DONE |
| Self-evaluation loops | Agents assess the quality of their own outputs against historical baselines | DONE |
| Workflow optimization | System proposes pipeline improvements based on execution patterns | DONE |
| Autonomous ticket creation | System identifies gaps and creates tickets without human intervention | DONE |
| Adaptive role allocation | Agent assignment optimized based on workload and historical performance | DONE |
Phase 5 (Closed-Loop Adaptive Execution) — Complete.
| Milestone | Description | Status |
|---|---|---|
| Post-run lifecycle hooks | Self-eval + gap scan auto-triggered after every completed run | DONE |
| Adaptive agent prompt | Agent memory and workflow suggestions injected into Claude Code prompt | DONE |
| Agent assignment actuation | `recommended_agent` flows from picker through driver to runner and prompt | DONE |
| Adaptive JS drive loop | Adaptive sleep, post-run hooks, project filtering, graceful stop conditions | DONE |
| Dashboard Phase 5 fields | Last evaluation, hooks status, loop status in dashboard summary | DONE |
Phase 6 (Template-Enriched Agent Execution) — Complete.
| Milestone | Description | Status |
|---|---|---|
| Template-enriched prompt | `buildAdapterPrompt()` reads stage task files, includes GOAL/STEPS/OUTPUT instructions | DONE |
| Prior artifact context | `buildArtifactContext()` reads and injects prior artifact content into prompt | DONE |
| Validation retry loop | `buildRetryPrompt()` provides error feedback, adapter retries up to `maxRetries` times | DONE |
| Auto-patch application | Post-implement hook applies `40-dev-patch.diff` via `applyDevPatch()` (dry-run first) | DONE |
| Dashboard Phase 6 fields | Feature flags in dashboard summary | DONE |
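The validation retry loop described in the table can be sketched as below. The `produce` and `validate` callbacks are hypothetical stand-ins for the adapter call and the schema check, the feedback wording is invented, and the sketch is synchronous for brevity; the real `buildRetryPrompt()` signature is not shown in this README and is not reproduced here.

```javascript
// Call `produce(prompt)` until `validate(output)` returns no errors, feeding
// the errors back into the next prompt. One initial attempt plus up to
// `maxRetries` retries are made.
function runWithRetries(basePrompt, produce, validate, maxRetries) {
  let prompt = basePrompt;
  let errors = [];
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    const output = produce(prompt);
    errors = validate(output);
    if (errors.length === 0) return { ok: true, attempts: attempt + 1, output };
    // Hypothetical feedback format; the real retry prompt may differ.
    prompt = `${basePrompt}\n\nPrevious attempt failed validation:\n- ${errors.join('\n- ')}`;
  }
  return { ok: false, attempts: maxRetries + 1, errors };
}
```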
Phase 7 (Last-Mile Delivery) — Complete.
| Milestone | Description | Status |
|---|---|---|
| `artifactList` scope fix | Hoisted `artifactList` to function scope in `claudeCodeAdapter` | DONE |
| Post-patch test execution | `discoverTestCommand()` + `runPostPatchTests()` in `post-patch-verify.js` | DONE |
| Auto-commit | `autoCommit()` creates structured git commit (opt-in, never pushes) | DONE |
| Backlog auto-completion | Run reaching `done` auto-transitions backlog item via `updateBacklogStatus` | DONE |
| `recommendAgent` fallback | Picker fallback to `recommendAgent()` from `agent-performance.js` | DONE |
| Claude adapter test coverage | Mock-based tests for all `claudeCodeAdapter` paths | DONE |
| Dashboard Phase 7 fields | Feature flags in dashboard summary | DONE |
- Deterministic pipeline with 8-stage state machine (intake through done)
- 5 role stages (Analyst, Architect, Dev, QA, Review) with enforced boundaries
- Schema-validated artifacts with `additionalProperties: false` on all schemas
- Deterministic project scheduler with priority buckets and stable sort
- One-shot and loop project drivers with preflight graph validation
- Epic-child hierarchy with DAG validation and cycle detection
- Write-time epic completion guard
- Task pack generation (deterministic, no LLM calls)
- Autonomous multi-agent runner with scaffold, Claude CLI, and draft-file adapters
- Append-only audit logging per run
- Agent identity with persistent state and workload tracking
- Runtime role enforcement (cross-role leakage prevention)
- Project dashboard with dependency chains and agent workload
- Ticket persistence with anti-truncation guards
- Stop/resume mechanism for autonomous runs
- Stall detection (30-minute threshold on audit log)
- HTTP dashboard on localhost:18790
- External workspace model (pure engine, `.claw/` in target repos)
- Workspace bootstrap and patch application tools
- Pluggable capability system (capability registry, manifest-driven stage injection)
- Goal-driven mission layer (deterministic intent + stack detection, capability activation)
- Pluggable capabilities: UX audit, security audit, performance audit, research
- Artifact classification with semantic types and global artifact index
- Research workflow with structured findings schema
- Agent memory (append-only, schema-validated, cross-run)
- Cross-run knowledge retention in task pack generation
- Run analytics engine (per-run metrics + project-level aggregates)
- Self-evaluation with quality score, deviation analysis, and memory persistence
- Workflow suggestion engine (4 detection rules with evidence and confidence)
- Gap scanner with auto-create backlog items (idempotent)
- Agent performance profiling with recommended_agent in picker
- Dashboard Phase 4 summary (performance, suggestions, gaps)
- Post-run lifecycle hooks (auto self-eval + gap scan after every run)
- Adaptive agent prompt (memory + workflow suggestions injected into Claude Code prompt)
- Agent assignment actuation (recommended_agent flows pick → drive → runner → prompt)
- Adaptive JS drive loop (adaptive sleep, hooks, project filter, graceful stop)
- Dashboard Phase 5 summary (last evaluation, hooks status, loop status)
- Template-enriched agent prompt (stage task files, prior artifact context, retry feedback)
- Validation retry loop (maxRetries with error feedback to agent)
- Auto-patch application (post-implement dry-run + apply, non-fatal)
- Dashboard Phase 6 feature flags
- Post-patch test execution (discover + run project tests after patch application)
- Auto-commit (opt-in structured git commit after successful tests, never pushes)
- Backlog auto-completion (run → done transitions backlog item to done)
- `recommendAgent` fallback (picker uses agent-performance when no recommendation)
- Dashboard Phase 7 feature flags
- GitHub Actions CI (run `bash tools/test-all.sh` for current counts)
- Zero external npm dependencies
.
├── ARCHITECTURE.md # Concise architecture overview (links to detailed)
├── LICENSE # MIT
├── docs/
│ ├── ARCHITECTURE_DETAILED.md # System design, state machine, determinism model, evolution roadmap
│ ├── GOVERNANCE.md # Golden rules: schema enforcement, testing, no external deps
│ ├── ux-spec-autonomous-runner.md # CLI contract for autonomous runner
│ ├── product-diagram.md # System overview diagram
│ ├── drift-report.md # Domain leakage verification report
│ └── phase{3,4,5,6}-plan.md # Phase plan documents
├── skills/
│ ├── dev-pipeline/
│ │ ├── SKILL.md # Comprehensive operational reference
│ │ ├── scripts/ # Core engine (run `bash tools/doc-stats.sh`)
│ │ │ ├── dev-pipeline.js # Core pipeline engine (state machine, schema validation, stage gates)
│ │ │ ├── capability-registry.js # Capability loader, stage injection, template/schema resolution
│ │ │ ├── goal-selector.js # Deterministic intent + stack → capability mapping
│ │ │ ├── create-mission.js # CLI: goal → mission + capabilities.json
│ │ │ ├── autonomous-runner.js # Multi-agent autonomous execution loop (Phase 6: template-enriched)
│ │ │ ├── adapter-prompt-builder.js # Template-enriched prompt assembly (Phase 6)
│ │ │ ├── project-next-pick.js # Deterministic task picker
│ │ │ ├── project-next-drive.js # One-shot project driver with preflight validation
│ │ │ ├── validate-backlog-graph.js # DAG validator (cycles, parents, epic completion)
│ │ │ └── ... # Index, dashboard, task-pack, ticket-store, agent-state
│ │ ├── schemas/ # 27 JSON Schema files (input + output schemas)
│ │ ├── references/ # 7 artifact schemas (pm-brief, arch-design, dev-notes, etc.)
│ │ └── tests/ # Test suites (run `bash tools/test-all.sh`)
│ └── capabilities/ # Pluggable capability extensions
│ ├── ux_audit/ # UX audit stage (after analyze)
│ ├── security_audit/ # Security audit stage (after analyze)
│ ├── performance_audit/ # Performance audit stage (after analyze)
│ └── research/ # Research workflow (after analyze)
├── tools/ # shell wrappers (the public CLI surface)
│ ├── dp.sh # Main CLI entry point
│ ├── create-mission.sh # Goal → capability activation
│ ├── test-all.sh # Master test gate
│ ├── project-next-drive.sh # One-shot project driver
│ ├── project-drive-loop.sh # Cron-friendly loop wrapper
│ ├── backlog-update-status.sh # Status transition with guards
│ ├── init-workspace.sh # Bootstrap .claw/ in a target repo
│ ├── apply-dev-patch.sh # Apply dev patch to workspace
│ ├── _workspace.sh # Shared --workspace flag parser
│ └── ... # run-next-*, project-*, dashboard-*, ticket-*
├── openclaw/ # OpenClaw integration docs and example config
├── templates/ # Core role-specific task pack templates
├── .github/workflows/test.yml # CI: runs test-all.sh on push and PR
├── SOUL.md # Workspace agent contract: principles
├── BOOTSTRAP.md # Workspace agent contract: first-run init
├── HEARTBEAT.md # Workspace agent contract: periodic-check marker
├── IDENTITY.md # Workspace agent contract: identity template
├── USER.md # Workspace agent contract: facts about the human
├── TOOLS.md # Workspace agent contract: local environment notes
├── SECURITY.md # Security boundaries and access controls
└── AGENTS.md # Workspace orientation for agents (industry standard)
The seven root markdown files (AGENTS.md, BOOTSTRAP.md, HEARTBEAT.md,
IDENTITY.md, SOUL.md, USER.md, TOOLS.md) are operating instructions for an
agent — for example Claude Code — that visits this workspace as a personal
assistant. They are intentionally kept at the workspace root because the
AGENTS.md contract instructs the visiting agent to read them by bare name.
This is a separate concern from the dev-pipeline orchestrator described in
this README. The orchestrator's roles, schemas, and execution surface live in
agents.json, skills/dev-pipeline/, and tools/. The orchestrator does not
read SOUL.md, IDENTITY.md, or USER.md. The two layers coexist in the same
repository but solve different problems.
- Node.js 20+ (LTS). No other runtime dependencies.
- No npm install. The system uses only Node.js built-in modules (`node:fs`, `node:path`, `node:os`, `node:crypto`).
- Bash for shell wrappers.

    bash tools/test-all.sh

Expected output: `TOTAL: N passed, 0 failed (M suites)` with zero failures.

    ./tools/init-workspace.sh --workspace /path/to/my-project \
      --project_id my-project --title "My Project"

Creates the `.claw/` directory structure with `project.json`, `agents.json`, and all required subdirectories.

    ./tools/project-dashboard.sh --workspace /path/to/my-project | jq

    ./tools/project-next-pick.sh --workspace /path/to/my-project | jq

    ./tools/project-next-drive.sh --workspace /path/to/my-project --dry_run | jq

Remove `--dry_run` to execute for real.

    ./tools/apply-dev-patch.sh --workspace /path/to/my-project \
      --run_folder .claw/runs/20260220_T-01 --dry_run

    ./tools/project-drive-loop.sh --workspace /path/to/my-project --sleep 5 --max 10

Stops on `.stop` file, max iterations, or no eligible work. Prints JSON summary on exit.
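The three stop conditions of the drive loop can be sketched as a small skeleton. The `driveOnce` callback, the result field names, and the stop-file location (`.claw/.stop` under the workspace) are assumptions for illustration; the sleep between iterations is omitted to keep the example synchronous.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Drive loop skeleton. `driveOnce` is a hypothetical callback that performs
// one pick-and-drive step and returns true when it found eligible work.
// A maxIterations of 0 means unlimited, matching LOOP_MAX_ITERATIONS.
function driveLoop(workspace, driveOnce, maxIterations) {
  const stopFile = path.join(workspace, '.claw', '.stop'); // assumed location
  let iterations = 0;
  while (maxIterations === 0 || iterations < maxIterations) {
    if (fs.existsSync(stopFile)) return { stopped: 'stop_file', iterations };
    if (!driveOnce()) return { stopped: 'no_eligible_work', iterations };
    iterations += 1;
  }
  return { stopped: 'max_iterations', iterations };
}
```

Checking the stop file before each step means a run in progress is never interrupted; the loop only declines to start the next one.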

    ./tools/create-mission.sh --workspace /path/to/my-project --goal "improve UX of checkout"

Detects intents (`ux`) and stack (`nextjs`), activates the `ux_audit` capability, and writes `.claw/capabilities.json` + `.claw/missions/<id>.json`.

    ./tools/validate-backlog-graph.sh --workspace /path/to/my-project my-project | jq

    ./tools/backlog-update-status.sh --workspace /path/to/my-project my-project TASK-01 done

Rejects epic-to-done transitions when children are incomplete.
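The epic completion guard reduces to one question: does the item being marked done still have open children? A minimal sketch, assuming items shaped like `{ id, status, parent_id }` (only `parent_id` is named in this README; the function name and return shape are hypothetical):

```javascript
// Decide whether a status change is allowed. An item with children that are
// not yet done cannot itself be moved to "done".
function canTransition(items, id, nextStatus) {
  if (nextStatus !== 'done') return { ok: true };
  const open = items
    .filter((item) => item.parent_id === id && item.status !== 'done')
    .map((item) => item.id);
  return open.length === 0
    ? { ok: true }
    : { ok: false, reason: `incomplete children: ${open.join(', ')}` };
}
```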
| Variable | Default | Description |
|---|---|---|
| `WORKSPACE_ROOT` | `~/dev/agent-work` | Root of the target project repo. All `.claw/` paths resolved relative to this. Can also be set via `--workspace` flag. |
| `DP_AUDIT_LOG` | `0` | Set to `1` to enable append-only audit logging. |
| `LOOP_SLEEP_SECONDS` | `5` | Seconds between drive loop iterations. |
| `LOOP_MAX_ITERATIONS` | `100` | Max iterations for drive loop (`0` = unlimited). |
All configuration is via environment variables. No config files to manage.
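Reading these variables with their documented defaults might look like the sketch below. The `loadConfig` name and the returned key names are illustrative, not part of the project's code.

```javascript
const os = require('node:os');
const path = require('node:path');

// Read the documented variables from an env object, applying the defaults
// from the table above. Unparseable numbers fall back to the default.
function loadConfig(env = process.env) {
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value ?? '', 10);
    return Number.isNaN(n) ? fallback : n;
  };
  return {
    workspaceRoot: env.WORKSPACE_ROOT ?? path.join(os.homedir(), 'dev', 'agent-work'),
    auditLog: env.DP_AUDIT_LOG === '1',
    loopSleepSeconds: toInt(env.LOOP_SLEEP_SECONDS, 5),
    loopMaxIterations: toInt(env.LOOP_MAX_ITERATIONS, 100),
  };
}
```

Passing the environment in as an argument keeps the function easy to test without mutating `process.env`.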
- Path confinement. Every file operation goes through `safePath()`, which rejects any path resolving outside `WORKSPACE_ROOT`.
- Schema enforcement. Every JSON output is validated against a schema with `additionalProperties: false`. Undeclared fields are rejected.
- Role enforcement. Agents can only produce artifacts for stages matching their assigned role.
- Read-only tools. Index, pick, dashboard, watch, and list tools never create, modify, or delete files.
- Write guards. The autonomous runner never overwrites existing artifacts. The backlog updater rejects invalid epic transitions.
- Graph validation. Dependency cycles and invalid parent references are detected before runs start.
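Path confinement of the kind described above can be sketched with `node:path` alone. This mirrors the documented behavior of `safePath()` but is not its implementation: the two-argument signature and the error message are assumptions, and symlinks are not resolved here.

```javascript
const path = require('node:path');

// Resolve `candidate` against `root` and reject anything that escapes it.
function safePath(root, candidate) {
  const base = path.resolve(root);
  const resolved = path.resolve(base, candidate);
  const rel = path.relative(base, resolved);
  // A relative path that climbs upward (or is absolute, as can happen across
  // Windows drives) points outside the workspace.
  if (rel === '..' || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel)) {
    throw new Error(`path escapes workspace: ${candidate}`);
  }
  return resolved;
}
```

Comparing via `path.relative` avoids the prefix-matching mistake where `/ws-other` would be accepted as being inside `/ws`.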
Given the same filesystem state:
- The picker always returns the same task.
- The index always returns the same project/run arrays in the same order.
- Task packs are generated identically (excluding timestamps).
- Schema validation produces the same result.
Timestamps and git HEAD are the only sources of non-determinism.
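One way to check the "identical excluding timestamps" property is to hash a canonical form of the output with volatile fields removed. The field names in `VOLATILE` and both function names are assumptions for this sketch; the project's actual task pack fields may differ.

```javascript
const crypto = require('node:crypto');

// Fields treated as non-deterministic. These names are hypothetical.
const VOLATILE = new Set(['created_at', 'updated_at', 'git_head']);

// Rebuild the value with sorted keys and without volatile fields so that
// key order and timestamps cannot influence the serialized form.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value === null || typeof value !== 'object') return value;
  return Object.fromEntries(
    Object.keys(value)
      .filter((key) => !VOLATILE.has(key))
      .sort()
      .map((key) => [key, canonicalize(value[key])]),
  );
}

// SHA-256 of the canonical JSON form.
function stableHash(value) {
  return crypto.createHash('sha256').update(JSON.stringify(canonicalize(value))).digest('hex');
}
```

Two task packs generated from the same filesystem state should then produce the same hash even if their timestamps differ.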
- Comprehensive test coverage (`bash tools/test-all.sh`) covering:
  - State machine transitions and gate enforcement
  - Schema validation round-trips for all artifact types
  - Picker determinism and graph-aware constraint enforcement
  - Autonomous runner safety (no directory creation, no artifact overwrite)
  - Role enforcement and cross-role leakage prevention
  - Dashboard computed fields and dependency chain enrichment
  - Epic completion guards (validation-time and write-time)
  - Real data validation against live schemas
  - Capability registry, goal-selector, and mission layer
  - End-to-end capability injection (UX, security, performance audits)
- `tools/test-all.sh` is the single gate. Zero failures required.
- GitHub Actions CI runs on every push and PR.
| Failure | Behavior |
|---|---|
| Corrupted JSON | Skipped by index/pick tools. run_next_safe returns action: "error". |
| Missing status.json | Run listed with has_status: false, not picked. |
| Stalled run | Detected when audit log untouched for 30+ minutes. Flagged in dashboard. |
| Schema violation | Artifact rejected. Stage cannot advance. |
| Dependency cycle | Detected by validator. Driver skips project with JSON warning. |
| Agent role mismatch | Artifact submission rejected with clear error. |
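Two of the behaviors in the table, skipping corrupted JSON and flagging stalled runs, can be sketched as small pure functions. The function names and return shapes are hypothetical; only the 30-minute threshold comes from this README.

```javascript
const STALL_THRESHOLD_MS = 30 * 60 * 1000; // 30 minutes, per the table above

// A run counts as stalled when its audit log has not been modified within
// the threshold. Passing `now` explicitly keeps the check testable.
function isStalled(auditLogMtimeMs, now = Date.now()) {
  return now - auditLogMtimeMs >= STALL_THRESHOLD_MS;
}

// Parse JSON without throwing, so a reader can skip a corrupted file
// instead of crashing.
function readJsonSafe(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}
```

In practice the modification time would come from something like `fs.statSync(auditLogPath).mtimeMs`, where `auditLogPath` is whatever location the run uses for its audit log.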
| Document | Purpose |
|---|---|
| `ARCHITECTURE.md` / `docs/ARCHITECTURE_DETAILED.md` | Concise overview at the root; full design, state machine, determinism model, and evolution roadmap in the detailed doc. Single source of truth for phase scope and stop conditions. |
| `docs/GOVERNANCE.md` | Golden rules: every change tied to a ticket, every JSON has a schema, every schema has tests, no external dependencies. |
| `skills/dev-pipeline/SKILL.md` | Operational reference for all tools, commands, schemas, and behaviors. |
| `.claw/tickets/<ticket_id>.md` | Individual ticket definitions with goals, steps, and acceptance criteria. |
- Create a ticket file in `.claw/tickets/` with frontmatter and required sections.
- Create a backlog item in `.claw/backlog/`.
- Reference the active phase. Changes outside the current phase are rejected.
- Implement. Run `bash tools/test-all.sh`. Zero failures required.
- Update `SKILL.md` if new tools or behaviors were added.
- Commit with a descriptive message.
- Only one phase may be active at a time.
- A phase is complete when every stop condition evaluates to true.
- Every ticket must reference its phase.
- `docs/ARCHITECTURE_DETAILED.md` is the single source of truth for phase scope. Conflicts between tickets and the architecture doc are resolved in favor of the architecture doc.
- Create a ticket file before starting work.
- Run `bash tools/test-all.sh` and confirm zero failures before committing.
- Keep commits focused: one logical change per commit.
- Use conventional commit prefixes: `feat:`, `fix:`, `chore:`, `refactor:`, `docs:`, `ci:`.
- Do not introduce external npm dependencies.
- Do not add `additionalProperties` to schemas without the `: false` constraint.
- Do not modify the state machine or role boundaries without a ticket referencing a specific phase.

    # 1. Run the full test suite
    bash tools/test-all.sh
    # 2. Initialize a workspace and validate its graph
    ./tools/init-workspace.sh --workspace /tmp/test --project_id test --title "Test"
    ./tools/validate-backlog-graph.sh --workspace /tmp/test test | jq '.valid'
    # 3. Confirm dashboard produces valid output
    ./tools/project-dashboard.sh --workspace /tmp/test | jq '.ok'
    # 4. Confirm git status is clean
    git status

| Risk | Mitigation |
|---|---|
| Determinism boundary. Timestamps and git HEAD introduce non-determinism. | Timestamps are informational only, never used for ordering decisions. |
| Agent hallucination. LLM-generated artifacts may contain incorrect content. | Schema validation catches structural errors. QA and Review stages provide content checks. |
| Cost and latency. Autonomous runner invokes Claude CLI per stage. | --dry_run mode for testing. Scaffold adapter for development without API calls. Max step/agent call limits. |
| Data privacy. All data stays on the local filesystem. | No outbound network calls from pipeline scripts. Dashboard binds to 127.0.0.1 only. Workspace permissions set to 700. |
| Single-machine limitation. No distributed execution. | By design for the current maturity level. Filesystem-as-API is the intentional constraint. |
| No rollback mechanism. Status transitions are one-way writes. | Append-only audit log provides full history. Artifacts are never overwritten. Safe reruns from current state. |
- Deterministic 8-stage pipeline with role boundaries
- Schema validation on all artifacts, tool outputs, input data, and reference schemas
- Persistent agent identity with workload tracking and role enforcement
- Structured work graph: epic hierarchy, DAG validation, cycle detection
- Graph-aware scheduler: dependency satisfaction, parent blocking, epic completion
- Write-time epic completion guard
- Preflight graph validation in project driver
- Task pack generation (deterministic, no LLM)
- Autonomous multi-agent runner with stop/resume and audit logging
- Project dashboard with dependency chains and agent workload
- Ticket persistence with anti-truncation guards
- External workspace model (pure engine, `.claw/` state in target repos)
- Workspace bootstrap (`init-workspace`) and patch application (`apply-dev-patch`)
- Pluggable capability system with manifest-driven stage injection
- Goal-driven mission layer (deterministic intent + stack detection)
- Built-in capabilities: UX audit, security audit, performance audit
- GitHub Actions CI (all tests green, zero failures)
- `additionalProperties: false` on all schemas (governance rule enforced)
- Cron-friendly drive loop with safe stop
- Zero external dependencies
- Phase 3: Artifact classification with semantic types
- Phase 3: Global artifact index across projects and runs
- Phase 3: Research workflow with structured findings
- Phase 3: Agent memory persistence across runs
- Phase 3: Cross-run knowledge retention in task packs
- Phase 4: Run analytics engine (per-run metrics + project aggregates)
- Phase 4: Self-evaluation (quality score, deviations, suggestions, memory write)
- Phase 4: Workflow suggestions (recurring QA failures, bottlenecks, rejection rates, quality trends)
- Phase 4: Gap scanner with auto-create backlog items
- Phase 4: Agent performance profiles with recommended_agent in picker
- Phase 5: Post-run lifecycle hooks (auto self-eval + gap scan)
- Phase 5: Adaptive agent prompt (memory + suggestions in Claude Code prompt)
- Phase 5: Agent assignment actuation (recommended_agent pick → drive → runner)
- Phase 5: Adaptive JS drive loop (sleep, hooks, project filter, stop conditions)
- Phase 5: Dashboard closed-loop status fields
- Phase 6: Template-enriched adapter prompt (stage task files as primary instructions)
- Phase 6: Prior artifact context injection (STAGE_ARTIFACT_DEPS, per-artifact truncation)
- Phase 6: Validation retry loop (maxRetries with error feedback)
- Phase 6: Post-implement auto-patch application (dry-run first, non-fatal)
- Phase 6: Dashboard feature flags
- Phase 7: `artifactList` scope fix in `claudeCodeAdapter`
- Phase 7: Post-patch test execution (`discoverTestCommand` + `runPostPatchTests`)
- Phase 7: Auto-commit after successful tests (opt-in, never pushes)
- Phase 7: Backlog auto-completion (run done → backlog item done)
- Phase 7: `recommendAgent` fallback in project driver
- Phase 7: Claude adapter mock-based test coverage
- Phase 7: Dashboard feature flags
MIT — see LICENSE.