henrio123/agent-work

AI Organisation OS

A deterministic, role-based orchestration system that turns unstructured AI work into reproducible, auditable, multi-agent pipelines. Every piece of state lives on the filesystem. Every output is schema-validated. Every decision is traceable.


What It Does

Problem. AI agents lose work. Conversations get compacted. Context windows overflow. Terminal scrollback disappears. There is no persistent record of what was decided, who did it, or why.

Solution. This system replaces ad-hoc AI usage with a structured execution pipeline. Tickets enter the system. A deterministic scheduler picks the next task. A fixed sequence of role stages (Analyst, Architect, Dev, QA, Review) executes the work. Each stage produces schema-validated artifacts. Gates prevent advancement until quality checks pass. Everything is written to disk.

Typical outcomes:

  • Every ticket has a complete audit trail from intake through review.
  • Role boundaries are enforced: a QA agent cannot write code, an Analyst cannot modify architecture.
  • Schema validation catches structural errors before they propagate.
  • Work survives agent restarts, context compaction, and session loss.
  • The entire system runs on Node.js built-ins with zero npm dependencies.

Who it is for. Teams and individuals building multi-agent AI systems who need determinism, auditability, and reproducibility. Useful for anyone who has lost work to a crashed agent session or an overflowed context window.


Scope and Limits

What this is, stated honestly:

  • Local-only execution. Every tool runs against a workspace on the local filesystem. There is no hosted service, no managed runner, no scheduled execution beyond the optional cron-friendly drive loop.
  • Filesystem-as-API by design. No database, no queue, no HTTP service surface except the optional read-only dashboard bound to 127.0.0.1. State is files.
  • Not a vector store or semantic RAG. Context retrieval for task packs is filename-, schema-, and graph-based. There is no embedding store.
  • Not LangChain, LangGraph, or MCP. The orchestrator drives an agent adapter (the Claude CLI) directly via subprocess. No agent-framework dependency.
  • The orchestrator dispatches; the LLM does not pick tools. Stage transitions and tool dispatch are deterministic. The LLM authors artifacts; it does not decide what runs next.
  • No token or cost telemetry yet. Stage durations and an append-only audit log are written; per-call token accounting is not.
  • Single-machine. Distributed execution is out of scope at the current maturity level.

Run validation at any point with bash tools/test-all.sh. Zero failures expected.

For the full design — state machine, determinism model, failure modes — see ARCHITECTURE.md (overview) and docs/ARCHITECTURE_DETAILED.md (full).


How It Works

Key Concepts

| Concept | Description |
| --- | --- |
| Project | A workspace initialized with .claw/ containing metadata, agent roles, and a backlog of tasks. |
| Backlog Item | A unit of work (.claw/backlog/<id>.json) with status, priority, owner role, dependencies, and an optional parent epic. |
| Run | A single pipeline execution (.claw/runs/<timestamp>_<ticket>/) with a state machine from intake to done. |
| Stage | A step in the pipeline owned by one role. Each stage requires specific artifacts validated against JSON schemas. |
| Picker | A deterministic scheduler that selects the next eligible task using stable sort rules (priority bucket > status > priority > ID). |
| Driver | A one-shot executor that creates a run for the picked task, invokes the autonomous runner, and writes results. |
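
The picker's ordering rule can be illustrated with a small comparator. This is a minimal sketch, not the code in project-next-pick.js: the bucket field, the rank tables, and the assumption that a lower priority number wins are all hypothetical. Only owner_role, status, priority, and the ID tie-break come from the description above.

```javascript
// Hypothetical rank tables: the real bucket and status names may differ.
const BUCKET_RANK = { high: 0, normal: 1, low: 2 };
const STATUS_RANK = { in_progress: 0, ready: 1 };

// Order two backlog items: priority bucket > status > priority > ID.
function compareItems(a, b) {
  return (
    BUCKET_RANK[a.bucket] - BUCKET_RANK[b.bucket] ||
    STATUS_RANK[a.status] - STATUS_RANK[b.status] ||
    a.priority - b.priority || // assumption: lower number = more urgent
    (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)
  );
}

// Return the next eligible item, or null when nothing is eligible.
function pickNext(items) {
  const eligible = items.filter(
    (item) => item.owner_role && item.status in STATUS_RANK
  );
  // Copy before sorting so the caller's array is never mutated.
  return [...eligible].sort(compareItems)[0] ?? null;
}

const backlog = [
  { id: "T-03", bucket: "normal", status: "ready", priority: 1, owner_role: "Dev" },
  { id: "T-02", bucket: "high", status: "ready", priority: 2, owner_role: "QA" },
  { id: "T-01", bucket: "high", status: "ready", priority: 2, owner_role: "Dev" },
  { id: "T-04", bucket: "high", status: "ready", priority: 1 }, // no owner_role: rejected
];

console.log(pickNext(backlog).id);
```

Because IDs are unique, the final tie-break makes the ordering total, so the same backlog always yields the same pick regardless of the order in which files were read.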

Pipeline Flow

Ticket  -->  Backlog Item  -->  Task Pack  -->  Run Creation  -->  Pipeline Stages  -->  Done
                                                                     |
                                                    intake -> task-pack-generated
                                                    analyze -> plan
                                                    implement -> validate
                                                    review -> done

Role Stages

| Stage | Role | Required Artifact | Schema |
| --- | --- | --- | --- |
| analyze | Analyst | 10-pm-brief.json | pm-brief.schema.json |
| plan | Architect | 20-arch-design.json | arch-design.schema.json |
| implement | Dev | 40-dev-patch.diff, 41-dev-notes.json | dev-notes.schema.json |
| validate | QA | 50-qa-report.json | qa-report.schema.json |
| review | Review | 60-review-report.json | review-report.schema.json |

A stage advances only when all required artifacts exist and pass schema validation. Roles are enforced at runtime: a Dev agent cannot produce an Analyst brief.
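
The gate rule can be sketched as a function that checks each required artifact for existence and validity. This is an illustration under stated assumptions: canAdvance and validate are hypothetical names, and the validator is a stand-in that only checks required keys and undeclared keys rather than full JSON Schema. The artifact file names come from the table above.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Required artifacts per stage, taken from the Role Stages table.
const REQUIRED = {
  analyze: ["10-pm-brief.json"],
  plan: ["20-arch-design.json"],
  implement: ["40-dev-patch.diff", "41-dev-notes.json"],
  validate: ["50-qa-report.json"],
  review: ["60-review-report.json"],
};

// Stand-in validator (assumption): checks required keys and rejects
// undeclared ones, mimicking additionalProperties: false.
function validate(doc, schema) {
  const errors = [];
  for (const key of schema.required) {
    if (!(key in doc)) errors.push(`missing: ${key}`);
  }
  for (const key of Object.keys(doc)) {
    if (!(key in schema.properties)) errors.push(`undeclared: ${key}`);
  }
  return errors;
}

// A stage may advance only if every required artifact exists and validates.
function canAdvance(runDir, stage, schemas) {
  const errors = [];
  for (const name of REQUIRED[stage]) {
    const file = path.join(runDir, name);
    if (!fs.existsSync(file)) {
      errors.push(`${name}: not found`);
      continue;
    }
    const schema = schemas[name];
    if (!schema) continue; // e.g. the .diff patch has no JSON schema
    try {
      const doc = JSON.parse(fs.readFileSync(file, "utf8"));
      errors.push(...validate(doc, schema).map((e) => `${name}: ${e}`));
    } catch {
      errors.push(`${name}: invalid JSON`);
    }
  }
  return { ok: errors.length === 0, errors };
}

// Demo in a throwaway directory with a hypothetical one-field schema.
const runDir = fs.mkdtempSync(path.join(os.tmpdir(), "run-"));
const briefPath = path.join(runDir, "10-pm-brief.json");
const schemas = {
  "10-pm-brief.json": { required: ["summary"], properties: { summary: {} } },
};
const before = canAdvance(runDir, "analyze", schemas);
fs.writeFileSync(briefPath, JSON.stringify({ summary: "ok", extra: 1 }));
const rejected = canAdvance(runDir, "analyze", schemas);
fs.writeFileSync(briefPath, JSON.stringify({ summary: "ok" }));
const after = canAdvance(runDir, "analyze", schemas);
console.log(before.ok, rejected.errors, after.ok);
```

The gate collects every error instead of stopping at the first one, which gives an agent a complete list to act on in a single retry.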


Maturity Model and Roadmap

Where We Are

| Level | Name | Status |
| --- | --- | --- |
| Level 1 | Manual AI usage | Past |
| Level 2 | Structured multi-agent execution | Done |
| Level 3 | Autonomous org with memory | Done |
| Level 4 | Self-improving AI organization | Done |
| Level 5 | Closed-loop adaptive execution | Done |
| Level 6 | Template-enriched agent execution | Done |
| Level 7 | Last-mile delivery | Current |

Levels 1–2 — What's Done

The system provides deterministic multi-agent pipeline execution with full schema enforcement, role boundaries, and a structured work graph.

Phase 1 (Agent Identity & Control) — Complete.

  • Persistent agent state (.claw/agents/<id>/state.json) with role, workload counters, and last-active tracking.
  • Runtime role-to-stage enforcement. A role must match the stage before it can produce artifacts.
  • responsible_agent tracked per run and per stage transition.
  • Agent workload visible in the project dashboard.
  • Picker rejects backlog items without an assigned owner_role.
  • Cross-role leakage prevention tested end-to-end.

Phase 2 (Structured Work Graph) — Complete.

  • Epic-to-child hierarchy via parent_id on backlog items.
  • DAG validation with cycle detection (Kahn's algorithm).
  • Graph-aware picker: skips tasks with unsatisfied dependencies and children of blocked epics.
  • Epic completion rule: epics cannot be marked done while children are incomplete.
  • Write-time epic completion guard (backlog-update-status.js).
  • Preflight graph validation in the project driver.
  • Dashboard dependency chain visualization.
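
Cycle detection with Kahn's algorithm can be sketched in a few lines. This is a minimal illustration, not the code in validate-backlog-graph.js: the depends_on field name and the shape of the return value are assumptions.

```javascript
// Each item lists the IDs it depends on (the depends_on name is an assumption).
function validateGraph(items) {
  const indegree = new Map(items.map((item) => [item.id, 0]));
  const dependents = new Map(items.map((item) => [item.id, []]));

  for (const item of items) {
    for (const dep of item.depends_on ?? []) {
      if (!indegree.has(dep)) continue; // unknown reference: a separate check
      indegree.set(item.id, indegree.get(item.id) + 1);
      dependents.get(dep).push(item.id);
    }
  }

  // Start from items with no unmet dependencies; sort for a stable order.
  const queue = [...indegree]
    .filter(([, count]) => count === 0)
    .map(([id]) => id)
    .sort();
  const order = [];
  while (queue.length > 0) {
    const id = queue.shift();
    order.push(id);
    for (const next of dependents.get(id)) {
      indegree.set(next, indegree.get(next) - 1);
      if (indegree.get(next) === 0) queue.push(next);
    }
  }

  // Anything never emitted sits on, or downstream of, a cycle.
  const stuck = items.map((item) => item.id).filter((id) => !order.includes(id));
  return { valid: stuck.length === 0, order, stuck };
}

const acyclic = validateGraph([
  { id: "A" },
  { id: "B", depends_on: ["A"] },
  { id: "C", depends_on: ["B"] },
]);
const cyclic = validateGraph([
  { id: "X", depends_on: ["Y"] },
  { id: "Y", depends_on: ["X"] },
  { id: "Z" },
]);
console.log(acyclic.valid, cyclic.valid, cyclic.stuck);
```

If the emitted order is shorter than the item list, the remaining items form or depend on a cycle, which is what a preflight check would report before any run is created.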

Hardening & Ops — Complete.

  • additionalProperties: false enforced on all schemas (output, input, reference).
  • GitHub Actions CI gate (bash tools/test-all.sh) on every push and PR.
  • Preflight graph validation before creating or driving runs.
  • Cron-friendly drive loop wrapper with safe stop.

Level 3 — What's Done

Phase 3 (Knowledge & Artifact Layer) — Complete.

| Milestone | Description | Status |
| --- | --- | --- |
| Artifact classification | Tag each artifact with a semantic type (decision, design, implementation, test-result, research-finding) | DONE |
| Global artifact index | Searchable index of all artifacts across projects and runs | DONE |
| Research workflow | Dedicated workflow for research tasks with structured findings schema | DONE |
| Agent memory | Append-only memory layer at .claw/agents/<id>/memory/, schema-validated | DONE |
| Cross-run knowledge | Task pack generator references artifacts from prior runs when building context | DONE |

Level 4 — What's Done

Phase 4 (Self-Improving AI Organization) — Complete.

| Milestone | Description | Status |
| --- | --- | --- |
| Run analytics engine | Per-run metrics and project-level aggregates from completed runs | DONE |
| Self-evaluation loops | Agents assess the quality of their own outputs against historical baselines | DONE |
| Workflow optimization | System proposes pipeline improvements based on execution patterns | DONE |
| Autonomous ticket creation | System identifies gaps and creates tickets without human intervention | DONE |
| Adaptive role allocation | Agent assignment optimized based on workload and historical performance | DONE |

Level 5 — What's Done

Phase 5 (Closed-Loop Adaptive Execution) — Complete.

| Milestone | Description | Status |
| --- | --- | --- |
| Post-run lifecycle hooks | Self-eval + gap scan auto-triggered after every completed run | DONE |
| Adaptive agent prompt | Agent memory and workflow suggestions injected into Claude Code prompt | DONE |
| Agent assignment actuation | recommended_agent flows from picker through driver to runner and prompt | DONE |
| Adaptive JS drive loop | Adaptive sleep, post-run hooks, project filtering, graceful stop conditions | DONE |
| Dashboard Phase 5 fields | Last evaluation, hooks status, loop status in dashboard summary | DONE |

Level 6 — What's Done

Phase 6 (Template-Enriched Agent Execution) — Complete.

| Milestone | Description | Status |
| --- | --- | --- |
| Template-enriched prompt | buildAdapterPrompt() reads stage task files, includes GOAL/STEPS/OUTPUT instructions | DONE |
| Prior artifact context | buildArtifactContext() reads and injects prior artifact content into prompt | DONE |
| Validation retry loop | buildRetryPrompt() provides error feedback, adapter retries up to maxRetries times | DONE |
| Auto-patch application | Post-implement hook applies 40-dev-patch.diff via applyDevPatch() (dry-run first) | DONE |
| Dashboard Phase 6 fields | Feature flags in dashboard summary | DONE |
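
The validation retry loop can be pictured as follows. This is a sketch under stated assumptions, not the project's implementation: runWithRetries, callAdapter, and validate are hypothetical names, the adapter is modelled as a synchronous callback for brevity, and the feedback wording is invented. Only the idea of retrying up to maxRetries times with error feedback comes from the table above.

```javascript
// callAdapter(prompt) -> artifact text; validate(text) -> array of error strings.
// Both are hypothetical callbacks standing in for the adapter and schema check.
function runWithRetries(basePrompt, callAdapter, validate, maxRetries) {
  let prompt = basePrompt;
  let errors = [];
  // One initial attempt plus up to maxRetries retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const output = callAdapter(prompt);
    errors = validate(output);
    if (errors.length === 0) {
      return { ok: true, attempts: attempt + 1, output };
    }
    // Feed the validation errors back so the next attempt can correct them.
    prompt = `${basePrompt}\n\nPrevious attempt failed validation:\n- ${errors.join("\n- ")}`;
  }
  return { ok: false, attempts: maxRetries + 1, errors };
}

// Fake adapter: returns a valid artifact only after seeing error feedback.
const fakeAdapter = (prompt) =>
  prompt.includes("failed validation") ? '{"summary":"ok"}' : "{}";
const requireSummary = (text) =>
  "summary" in JSON.parse(text) ? [] : ["missing: summary"];

const result = runWithRetries("Write the brief.", fakeAdapter, requireSummary, 2);
const exhausted = runWithRetries("Write the brief.", () => "{}", requireSummary, 1);
console.log(result.ok, result.attempts, exhausted.ok, exhausted.attempts);
```

Each retry prompt is rebuilt from the base prompt rather than appended to the previous one, so the prompt does not grow with every failed attempt.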

Level 7 — What's Done

Phase 7 (Last-Mile Delivery) — Complete.

| Milestone | Description | Status |
| --- | --- | --- |
| artifactList scope fix | Hoisted artifactList to function scope in claudeCodeAdapter | DONE |
| Post-patch test execution | discoverTestCommand() + runPostPatchTests() in post-patch-verify.js | DONE |
| Auto-commit | autoCommit() creates structured git commit (opt-in, never pushes) | DONE |
| Backlog auto-completion | Run reaching done auto-transitions backlog item via updateBacklogStatus | DONE |
| recommendAgent fallback | Picker fallback to recommendAgent() from agent-performance.js | DONE |
| Claude adapter test coverage | Mock-based tests for all claudeCodeAdapter paths | DONE |
| Dashboard Phase 7 fields | Feature flags in dashboard summary | DONE |

Features

Implemented

  • Deterministic pipeline with 8-stage state machine (intake through done)
  • 5 role stages (Analyst, Architect, Dev, QA, Review) with enforced boundaries
  • Schema-validated artifacts with additionalProperties: false on all schemas
  • Deterministic project scheduler with priority buckets and stable sort
  • One-shot and loop project drivers with preflight graph validation
  • Epic-child hierarchy with DAG validation and cycle detection
  • Write-time epic completion guard
  • Task pack generation (deterministic, no LLM calls)
  • Autonomous multi-agent runner with scaffold, Claude CLI, and draft-file adapters
  • Append-only audit logging per run
  • Agent identity with persistent state and workload tracking
  • Runtime role enforcement (cross-role leakage prevention)
  • Project dashboard with dependency chains and agent workload
  • Ticket persistence with anti-truncation guards
  • Stop/resume mechanism for autonomous runs
  • Stall detection (30-minute threshold on audit log)
  • HTTP dashboard on localhost:18790
  • External workspace model (pure engine, .claw/ in target repos)
  • Workspace bootstrap and patch application tools
  • Pluggable capability system (capability registry, manifest-driven stage injection)
  • Goal-driven mission layer (deterministic intent + stack detection, capability activation)
  • Pluggable capabilities: UX audit, security audit, performance audit, research
  • Artifact classification with semantic types and global artifact index
  • Research workflow with structured findings schema
  • Agent memory (append-only, schema-validated, cross-run)
  • Cross-run knowledge retention in task pack generation
  • Run analytics engine (per-run metrics + project-level aggregates)
  • Self-evaluation with quality score, deviation analysis, and memory persistence
  • Workflow suggestion engine (4 detection rules with evidence and confidence)
  • Gap scanner with auto-create backlog items (idempotent)
  • Agent performance profiling with recommended_agent in picker
  • Dashboard Phase 4 summary (performance, suggestions, gaps)
  • Post-run lifecycle hooks (auto self-eval + gap scan after every run)
  • Adaptive agent prompt (memory + workflow suggestions injected into Claude Code prompt)
  • Agent assignment actuation (recommended_agent flows pick → drive → runner → prompt)
  • Adaptive JS drive loop (adaptive sleep, hooks, project filter, graceful stop)
  • Dashboard Phase 5 summary (last evaluation, hooks status, loop status)
  • Template-enriched agent prompt (stage task files, prior artifact context, retry feedback)
  • Validation retry loop (maxRetries with error feedback to agent)
  • Auto-patch application (post-implement dry-run + apply, non-fatal)
  • Dashboard Phase 6 feature flags
  • Post-patch test execution (discover + run project tests after patch application)
  • Auto-commit (opt-in structured git commit after successful tests, never pushes)
  • Backlog auto-completion (run → done transitions backlog item to done)
  • recommendAgent fallback (picker uses agent-performance when no recommendation)
  • Dashboard Phase 7 feature flags
  • GitHub Actions CI (run bash tools/test-all.sh for current counts)
  • Zero external npm dependencies

Repo Map

.
├── ARCHITECTURE.md              # Concise architecture overview (links to detailed)
├── LICENSE                      # MIT
├── docs/
│   ├── ARCHITECTURE_DETAILED.md # System design, state machine, determinism model, evolution roadmap
│   ├── GOVERNANCE.md            # Golden rules: schema enforcement, testing, no external deps
│   ├── ux-spec-autonomous-runner.md  # CLI contract for autonomous runner
│   ├── product-diagram.md       # System overview diagram
│   ├── drift-report.md          # Domain leakage verification report
│   └── phase{3,4,5,6}-plan.md   # Phase plan documents
├── skills/
│   ├── dev-pipeline/
│   │   ├── SKILL.md                 # Comprehensive operational reference
│   │   ├── scripts/                 # Core engine (run `bash tools/doc-stats.sh`)
│   │   │   ├── dev-pipeline.js      # Core pipeline engine (state machine, schema validation, stage gates)
│   │   │   ├── capability-registry.js # Capability loader, stage injection, template/schema resolution
│   │   │   ├── goal-selector.js     # Deterministic intent + stack → capability mapping
│   │   │   ├── create-mission.js    # CLI: goal → mission + capabilities.json
│   │   │   ├── autonomous-runner.js # Multi-agent autonomous execution loop (Phase 6: template-enriched)
│   │   │   ├── adapter-prompt-builder.js # Template-enriched prompt assembly (Phase 6)
│   │   │   ├── project-next-pick.js # Deterministic task picker
│   │   │   ├── project-next-drive.js # One-shot project driver with preflight validation
│   │   │   ├── validate-backlog-graph.js  # DAG validator (cycles, parents, epic completion)
│   │   │   └── ...                  # Index, dashboard, task-pack, ticket-store, agent-state
│   │   ├── schemas/                 # 27 JSON Schema files (input + output schemas)
│   │   ├── references/              # 7 artifact schemas (pm-brief, arch-design, dev-notes, etc.)
│   │   └── tests/                   # Test suites (run `bash tools/test-all.sh`)
│   └── capabilities/               # Pluggable capability extensions
│       ├── ux_audit/                # UX audit stage (after analyze)
│       ├── security_audit/          # Security audit stage (after analyze)
│       ├── performance_audit/       # Performance audit stage (after analyze)
│       └── research/               # Research workflow (after analyze)
├── tools/                       # shell wrappers (the public CLI surface)
│   ├── dp.sh                    # Main CLI entry point
│   ├── create-mission.sh        # Goal → capability activation
│   ├── test-all.sh              # Master test gate
│   ├── project-next-drive.sh    # One-shot project driver
│   ├── project-drive-loop.sh    # Cron-friendly loop wrapper
│   ├── backlog-update-status.sh # Status transition with guards
│   ├── init-workspace.sh          # Bootstrap .claw/ in a target repo
│   ├── apply-dev-patch.sh         # Apply dev patch to workspace
│   ├── _workspace.sh              # Shared --workspace flag parser
│   └── ...                      # run-next-*, project-*, dashboard-*, ticket-*
├── openclaw/                    # OpenClaw integration docs and example config
├── templates/                   # Core role-specific task pack templates
├── .github/workflows/test.yml   # CI: runs test-all.sh on push and PR
├── SOUL.md                      # Workspace agent contract: principles
├── BOOTSTRAP.md                 # Workspace agent contract: first-run init
├── HEARTBEAT.md                 # Workspace agent contract: periodic-check marker
├── IDENTITY.md                  # Workspace agent contract: identity template
├── USER.md                      # Workspace agent contract: facts about the human
├── TOOLS.md                     # Workspace agent contract: local environment notes
├── SECURITY.md                  # Security boundaries and access controls
└── AGENTS.md                    # Workspace orientation for agents (industry standard)

Workspace Agent Contracts

The seven root markdown files (AGENTS.md, BOOTSTRAP.md, HEARTBEAT.md, IDENTITY.md, SOUL.md, USER.md, TOOLS.md) are operating instructions for an agent — for example Claude Code — that visits this workspace as a personal assistant. They are intentionally kept at the workspace root because the AGENTS.md contract instructs the visiting agent to read them by bare name.

This is a separate concern from the dev-pipeline orchestrator described in this README. The orchestrator's roles, schemas, and execution surface live in agents.json, skills/dev-pipeline/, and tools/. The orchestrator does not read SOUL.md, IDENTITY.md, or USER.md. The two layers coexist in the same repository but solve different problems.


Quickstart

Requirements

  • Node.js 20+ (LTS). No other runtime dependencies.
  • No npm install. The system uses only Node.js built-in modules (node:fs, node:path, node:os, node:crypto).
  • Bash for shell wrappers.

Run Tests

bash tools/test-all.sh

Expected output: a final summary line of the form TOTAL: N passed, 0 failed (M suites).

Initialize a Target Repo

./tools/init-workspace.sh --workspace /path/to/my-project \
  --project_id my-project --title "My Project"

Creates .claw/ directory structure with project.json, agents.json, and all required subdirectories.

Run the Project Dashboard

./tools/project-dashboard.sh --workspace /path/to/my-project | jq

Pick the Next Eligible Task

./tools/project-next-pick.sh --workspace /path/to/my-project | jq

Drive One Task (One-Shot)

./tools/project-next-drive.sh --workspace /path/to/my-project --dry_run | jq

Remove --dry_run to execute for real.

Apply a Dev Patch

./tools/apply-dev-patch.sh --workspace /path/to/my-project \
  --run_folder .claw/runs/20260220_T-01 --dry_run

Drive in a Loop (Cron-Friendly)

./tools/project-drive-loop.sh --workspace /path/to/my-project --sleep 5 --max 10

Stops on .stop file, max iterations, or no eligible work. Prints JSON summary on exit.
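
The three stop conditions can be sketched as a small loop. This is an illustration, not the code behind project-drive-loop.sh: driveLoop and driveOnce are hypothetical names, the location of the .stop file is an assumption, and the sleep between iterations is omitted to keep the example synchronous.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// driveOnce() is a hypothetical callback: it returns true if it drove a task
// and false when no eligible work remains.
function driveLoop({ stopFile, max, driveOnce }) {
  let iterations = 0;
  let reason = "max_iterations";
  // max = 0 means unlimited, matching LOOP_MAX_ITERATIONS.
  while (max === 0 || iterations < max) {
    if (fs.existsSync(stopFile)) {
      reason = "stop_file";
      break;
    }
    if (!driveOnce()) {
      reason = "no_eligible_work";
      break;
    }
    iterations++;
  }
  return { iterations, reason };
}

// Demo: a fake backlog of n tasks, in a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "loop-"));
const stopFile = path.join(dir, ".stop"); // hypothetical location
const tasks = (n) => () => n-- > 0;

const drained = driveLoop({ stopFile, max: 10, driveOnce: tasks(3) });
const capped = driveLoop({ stopFile, max: 2, driveOnce: tasks(5) });
fs.writeFileSync(stopFile, "");
const stopped = driveLoop({ stopFile, max: 10, driveOnce: tasks(5) });

// Print a JSON summary on exit, as the wrapper is described as doing.
console.log(JSON.stringify({ drained, capped, stopped }));
```

Checking the stop file before each iteration, rather than during one, means a run in progress is never interrupted halfway.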

Create a Mission (Goal-Driven Capability Activation)

./tools/create-mission.sh --workspace /path/to/my-project --goal "improve UX of checkout"

Detects intents (ux) and stack (nextjs), activates the ux_audit capability, and writes .claw/capabilities.json + .claw/missions/<id>.json.

Validate a Project's Backlog Graph

./tools/validate-backlog-graph.sh --workspace /path/to/my-project my-project | jq

Update Backlog Item Status (With Guards)

./tools/backlog-update-status.sh --workspace /path/to/my-project my-project TASK-01 done

Rejects epic-to-done transitions when children are incomplete.
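
The guard's rule can be sketched as a pure function over backlog items. This is a minimal illustration, assuming items are already loaded into memory. checkTransition and its return shape are hypothetical; parent_id and the done status come from the description above.

```javascript
// Reject a transition to "done" while any child (linked via parent_id) is not done.
function checkTransition(items, id, newStatus) {
  const item = items.find((candidate) => candidate.id === id);
  if (!item) return { ok: false, error: `unknown item: ${id}` };

  if (newStatus === "done") {
    const open = items
      .filter((child) => child.parent_id === id && child.status !== "done")
      .map((child) => child.id);
    if (open.length > 0) {
      return { ok: false, error: `incomplete children: ${open.join(", ")}` };
    }
  }
  return { ok: true };
}

const items = [
  { id: "EPIC-1", status: "in_progress" },
  { id: "TASK-01", parent_id: "EPIC-1", status: "done" },
  { id: "TASK-02", parent_id: "EPIC-1", status: "in_progress" },
];

const epicAttempt = checkTransition(items, "EPIC-1", "done");
const taskAttempt = checkTransition(items, "TASK-02", "done");
console.log(epicAttempt, taskAttempt);
```

An item with no children passes trivially, so the same check serves both epics and ordinary tasks.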


Configuration

| Variable | Default | Description |
| --- | --- | --- |
| WORKSPACE_ROOT | ~/dev/agent-work | Root of the target project repo. All .claw/ paths resolved relative to this. Can also be set via --workspace flag. |
| DP_AUDIT_LOG | 0 | Set to 1 to enable append-only audit logging. |
| LOOP_SLEEP_SECONDS | 5 | Seconds between drive loop iterations. |
| LOOP_MAX_ITERATIONS | 100 | Max iterations for drive loop (0 = unlimited). |

Configuration is via environment variables; WORKSPACE_ROOT can also be set with the --workspace flag. There are no config files to manage.
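
Reading these variables with their documented defaults could look like the sketch below. loadConfig and the returned property names are hypothetical; only the variable names and defaults come from the table above.

```javascript
const os = require("node:os");
const path = require("node:path");

// Parse an integer environment value, falling back when it is unset or invalid.
function toInt(value, fallback) {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Build a config object from environment variables (defaults per the table).
function loadConfig(env = process.env) {
  return {
    workspaceRoot: env.WORKSPACE_ROOT || path.join(os.homedir(), "dev", "agent-work"),
    auditLog: env.DP_AUDIT_LOG === "1",
    loopSleepSeconds: toInt(env.LOOP_SLEEP_SECONDS, 5),
    loopMaxIterations: toInt(env.LOOP_MAX_ITERATIONS, 100),
  };
}

const defaults = loadConfig({});
const custom = loadConfig({
  WORKSPACE_ROOT: "/path/to/my-project",
  DP_AUDIT_LOG: "1",
  LOOP_MAX_ITERATIONS: "0",
});
console.log(defaults, custom);
```

Passing the environment in as a parameter keeps the function testable without mutating process.env.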


Safety, Determinism, and Quality Controls

Guardrails

  • Path confinement. Every file operation goes through safePath() which rejects any path resolving outside WORKSPACE_ROOT.
  • Schema enforcement. Every JSON output is validated against a schema with additionalProperties: false. Undeclared fields are rejected.
  • Role enforcement. Agents can only produce artifacts for stages matching their assigned role.
  • Read-only tools. Index, pick, dashboard, watch, and list tools never create, modify, or delete files.
  • Write guards. The autonomous runner never overwrites existing artifacts. The backlog updater rejects invalid epic transitions.
  • Graph validation. Dependency cycles and invalid parent references are detected before runs start.
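
Path confinement of this kind is commonly implemented by resolving the candidate path and checking that it stays under the root. The sketch below shows one possible shape and is not the project's safePath(): resolveInside is a hypothetical name, and this version does not follow symlinks, which a stricter check could do with fs.realpathSync.

```javascript
const path = require("node:path");

// Resolve candidate against root and throw if the result leaves root.
function resolveInside(root, candidate) {
  const base = path.resolve(root);
  const target = path.resolve(base, candidate);
  const rel = path.relative(base, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes workspace: ${candidate}`);
  }
  return target;
}

const inside = resolveInside("/tmp/ws", ".claw/backlog/T-01.json");
let rejected = null;
try {
  resolveInside("/tmp/ws", "../outside.txt");
} catch (err) {
  rejected = err.message;
}
console.log(inside, rejected);
```

Comparing the relative path instead of using a string prefix check avoids accepting a sibling directory such as /tmp/ws-other when the root is /tmp/ws.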

Determinism

Given the same filesystem state:

  • The picker always returns the same task.
  • The index always returns the same project/run arrays in the same order.
  • Task packs are generated identically (excluding timestamps).
  • Schema validation produces the same result.

Timestamps and git HEAD are the only sources of non-determinism.

Testing Strategy

  • The test suite (bash tools/test-all.sh) covers:
    • State machine transitions and gate enforcement
    • Schema validation round-trips for all artifact types
    • Picker determinism and graph-aware constraint enforcement
    • Autonomous runner safety (no directory creation, no artifact overwrite)
    • Role enforcement and cross-role leakage prevention
    • Dashboard computed fields and dependency chain enrichment
    • Epic completion guards (validation-time and write-time)
    • Real data validation against live schemas
    • Capability registry, goal-selector, and mission layer
    • End-to-end capability injection (UX, security, performance audits)
  • tools/test-all.sh is the single gate. Zero failures required.
  • GitHub Actions CI runs on every push and PR.

Failure Modes

| Failure | Behavior |
| --- | --- |
| Corrupted JSON | Skipped by index/pick tools. run_next_safe returns action: "error". |
| Missing status.json | Run listed with has_status: false, not picked. |
| Stalled run | Detected when audit log untouched for 30+ minutes. Flagged in dashboard. |
| Schema violation | Artifact rejected. Stage cannot advance. |
| Dependency cycle | Detected by validator. Driver skips project with JSON warning. |
| Agent role mismatch | Artifact submission rejected with clear error. |
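
Stall detection based on the audit log's modification time can be sketched as below. This is an illustration under stated assumptions: isStalled and the audit.log file name are hypothetical, and treating a missing log as "not stalled" is a choice made for this sketch. Only the 30-minute threshold comes from the table above.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const STALL_MS = 30 * 60 * 1000; // 30-minute threshold

// Returns true when the audit log exists but has not been modified recently.
function isStalled(auditLogPath, now = Date.now()) {
  let stat;
  try {
    stat = fs.statSync(auditLogPath);
  } catch {
    return false; // no log yet: treated as "not stalled" in this sketch
  }
  return now - stat.mtimeMs >= STALL_MS;
}

// Demo in a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "stall-"));
const log = path.join(dir, "audit.log"); // hypothetical file name
fs.writeFileSync(log, "");
const fresh = isStalled(log);

// Backdate the modification time by 31 minutes to simulate a stalled run.
const old = new Date(Date.now() - 31 * 60 * 1000);
fs.utimesSync(log, old, old);
const stalled = isStalled(log);
console.log(fresh, stalled);
```

A check like this only reads file metadata, so it is compatible with the read-only guarantee stated for the dashboard tools.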

Governance

Where Decisions Live

| Document | Purpose |
| --- | --- |
| ARCHITECTURE.md / docs/ARCHITECTURE_DETAILED.md | Concise overview at the root; full design, state machine, determinism model, and evolution roadmap in the detailed doc. Single source of truth for phase scope and stop conditions. |
| docs/GOVERNANCE.md | Golden rules: every change tied to a ticket, every JSON has a schema, every schema has tests, no external dependencies. |
| skills/dev-pipeline/SKILL.md | Operational reference for all tools, commands, schemas, and behaviors. |
| .claw/tickets/<ticket_id>.md | Individual ticket definitions with goals, steps, and acceptance criteria. |

How Changes Are Proposed

  1. Create a ticket file in .claw/tickets/ with frontmatter and required sections.
  2. Create a backlog item in .claw/backlog/.
  3. Reference the active phase. Changes outside the current phase are rejected.
  4. Implement. Run bash tools/test-all.sh. Zero failures required.
  5. Update SKILL.md if new tools or behaviors were added.
  6. Commit with a descriptive message.

Evolution Governance

  • Only one phase may be active at a time.
  • A phase is complete when every stop condition evaluates to true.
  • Every ticket must reference its phase.
  • docs/ARCHITECTURE_DETAILED.md is the single source of truth for phase scope. Conflicts between tickets and the architecture doc are resolved in favor of the architecture doc.

Contributing

Branch and PR Rules

  1. Create a ticket file before starting work.
  2. Run bash tools/test-all.sh and confirm zero failures before committing.
  3. Keep commits focused: one logical change per commit.
  4. Use conventional commit prefixes: feat:, fix:, chore:, refactor:, docs:, ci:.
  5. Do not introduce external npm dependencies.
  6. Do not add or change a schema so that it lacks additionalProperties: false.
  7. Do not modify the state machine or role boundaries without a ticket referencing a specific phase.

Verification Checklist

# 1. Run the full test suite
bash tools/test-all.sh

# 2. Initialize a workspace and validate its graph
./tools/init-workspace.sh --workspace /tmp/test --project_id test --title "Test"
./tools/validate-backlog-graph.sh --workspace /tmp/test test | jq '.valid'

# 3. Confirm dashboard produces valid output
./tools/project-dashboard.sh --workspace /tmp/test | jq '.ok'

# 4. Confirm git status is clean
git status

Risks and Constraints

| Risk | Mitigation |
| --- | --- |
| Determinism boundary. Timestamps and git HEAD introduce non-determinism. | Timestamps are informational only, never used for ordering decisions. |
| Agent hallucination. LLM-generated artifacts may contain incorrect content. | Schema validation catches structural errors. QA and Review stages provide content checks. |
| Cost and latency. Autonomous runner invokes Claude CLI per stage. | --dry_run mode for testing. Scaffold adapter for development without API calls. Max step/agent call limits. |
| Data privacy. All data stays on the local filesystem. | No outbound network calls from pipeline scripts. Dashboard binds to 127.0.0.1 only. Workspace permissions set to 700. |
| Single-machine limitation. No distributed execution. | By design for the current maturity level. Filesystem-as-API is the intentional constraint. |
| No rollback mechanism. Status transitions are one-way writes. | Append-only audit log provides full history. Artifacts are never overwritten. Safe reruns from current state. |

Done

  • Deterministic 8-stage pipeline with role boundaries
  • Schema validation on all artifacts, tool outputs, input data, and reference schemas
  • Persistent agent identity with workload tracking and role enforcement
  • Structured work graph: epic hierarchy, DAG validation, cycle detection
  • Graph-aware scheduler: dependency satisfaction, parent blocking, epic completion
  • Write-time epic completion guard
  • Preflight graph validation in project driver
  • Task pack generation (deterministic, no LLM)
  • Autonomous multi-agent runner with stop/resume and audit logging
  • Project dashboard with dependency chains and agent workload
  • Ticket persistence with anti-truncation guards
  • External workspace model (pure engine, .claw/ state in target repos)
  • Workspace bootstrap (init-workspace) and patch application (apply-dev-patch)
  • Pluggable capability system with manifest-driven stage injection
  • Goal-driven mission layer (deterministic intent + stack detection)
  • Built-in capabilities: UX audit, security audit, performance audit
  • GitHub Actions CI (all tests green, zero failures)
  • additionalProperties: false on all schemas (governance rule enforced)
  • Cron-friendly drive loop with safe stop
  • Zero external dependencies
  • Phase 3: Artifact classification with semantic types
  • Phase 3: Global artifact index across projects and runs
  • Phase 3: Research workflow with structured findings
  • Phase 3: Agent memory persistence across runs
  • Phase 3: Cross-run knowledge retention in task packs
  • Phase 4: Run analytics engine (per-run metrics + project aggregates)
  • Phase 4: Self-evaluation (quality score, deviations, suggestions, memory write)
  • Phase 4: Workflow suggestions (recurring QA failures, bottlenecks, rejection rates, quality trends)
  • Phase 4: Gap scanner with auto-create backlog items
  • Phase 4: Agent performance profiles with recommended_agent in picker
  • Phase 5: Post-run lifecycle hooks (auto self-eval + gap scan)
  • Phase 5: Adaptive agent prompt (memory + suggestions in Claude Code prompt)
  • Phase 5: Agent assignment actuation (recommended_agent pick → drive → runner)
  • Phase 5: Adaptive JS drive loop (sleep, hooks, project filter, stop conditions)
  • Phase 5: Dashboard closed-loop status fields
  • Phase 6: Template-enriched adapter prompt (stage task files as primary instructions)
  • Phase 6: Prior artifact context injection (STAGE_ARTIFACT_DEPS, per-artifact truncation)
  • Phase 6: Validation retry loop (maxRetries with error feedback)
  • Phase 6: Post-implement auto-patch application (dry-run first, non-fatal)
  • Phase 6: Dashboard feature flags
  • Phase 7: artifactList scope fix in claudeCodeAdapter
  • Phase 7: Post-patch test execution (discoverTestCommand + runPostPatchTests)
  • Phase 7: Auto-commit after successful tests (opt-in, never pushes)
  • Phase 7: Backlog auto-completion (run done → backlog item done)
  • Phase 7: recommendAgent fallback in project driver
  • Phase 7: Claude adapter mock-based test coverage
  • Phase 7: Dashboard feature flags

License

MIT — see LICENSE.

About

Deterministic multi-agent orchestrator for software-development workflows. Schema-validated artifacts, retry-with-feedback, and local run analytics.
