Merged
8 changes: 8 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -17,3 +17,11 @@ MultiTool.yaml
/worker/**
wrangler.toml
wrangler.json

# Scratch prompt location.
/prompt.md

# storybloq local session state (roadmap/tickets ARE tracked)
.story/snapshots/
.story/sessions/
.story/status.json
3 changes: 3 additions & 0 deletions .story/.gitignore
@@ -0,0 +1,3 @@
snapshots/
status.json
sessions/
26 changes: 26 additions & 0 deletions .story/config.json
@@ -0,0 +1,26 @@
{
"features": {
"handovers": true,
"issues": true,
"reviews": true,
"roadmap": true,
"tickets": true
},
"language": "rust",
"project": "multitool",
"schemaVersion": 2,
"type": "cargo",
"version": 2,
"recipeOverrides": {
"stages": {
"TEST": {
"enabled": true,
"command": "cargo nextest run --workspace"
},
"BUILD": {
"enabled": true,
"command": "cargo build --workspace"
}
}
}
}
@@ -0,0 +1,68 @@
# Session handover — storybloq setup for MultiTool Checks

## What this project is

`.story/` was initialized for the **multitool** repo (Rust / cargo CLI, the `multi`
binary; mature v0.4.0 canary-deployment tool). The roadmap tracks the **MultiTool
Checks** feature — a new `multi check` subcommand that validates declared
non-functional ("ility") requirements by running AI-agent checks inside copy-on-write
sandboxes and collecting verdicts through an in-process `rmcp` MCP server. Active
branch: `robbie/mt-check`. The PRD lives at `prompt.md` (gitignored scratch file).

## How this roadmap was created

This storybloq roadmap is a **one-to-one mirror of the existing Linear project
"MultiTool Checks"** (team MULTI, project id 781c5b95-…). It was NOT independently
designed — Robbie had already authored the Linear project + tickets from `prompt.md`
and asked to mirror them.

- **Linear milestones → storybloq phases** (9): M0 · Subcommand skeleton, M1 ·
Discovery, M2 · Configuration & executor, M3 · Sandboxing, M4 · MCP result server,
M5 · Execution, M6 · Reporting & exit, Tests & docs, Future work (post-MVP).
- **Linear issues → storybloq tickets** (34, one-to-one): MULTI-1331..MULTI-1364.
Created in Linear-numeric order so T-001..T-034 line up with the issue order.
- **Each ticket's description carries its Linear ID, URL, and `gitBranchName`**
(`robbie/multi-XXXX`, verbatim from the Linear API). Pushing that branch to GitHub
auto-links the Linear issue. The branch is the linkage mechanism — storybloq has no
dedicated Linear field, so it lives in the description footer.
- **Epic preserved**: MULTI-1332 (the in-process MCP server) = **T-013**, a `feature`;
its 5 Linear sub-issues (MULTI-1344..1348) = T-014..T-018, set as storybloq
sub-tickets (`parentTicket: T-013`).

## Ticket-number map (storybloq ↔ Linear)

T-001→1331 · T-002→1333 · T-003→1334 · T-004→1335 · T-005→1336 · T-006→1337 ·
T-007→1338 · T-008→1339 · T-009→1340 · T-010→1341 · T-011→1342 · T-012→1343 ·
T-013→1332(epic) · T-014→1344 · T-015→1345 · T-016→1346 · T-017→1347 · T-018→1348 ·
T-019→1349 · T-020→1350 · T-021→1351 · T-022→1352 · T-023→1353 · T-024→1354 ·
T-025→1355 · T-026→1356 · T-027→1357 · T-028→1358 · T-029→1359 · T-030→1360 ·
T-031→1361 · T-032→1362 · T-033→1363 · T-034→1364.
(Stable map is the Linear ID in each ticket's description, not the T-number.)
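
The table above follows a simple arithmetic pattern, broken only by the epic (T-013 → 1332). As a minimal sketch, assuming a hypothetical helper named `linear_issue_for_ticket` (not part of storybloq or the repo), the mapping can be written as a lookup. The Linear ID in each ticket's description remains the stable source of truth; this only restates the table.

```rust
/// Hypothetical helper: map a storybloq ticket number (the NNN in T-NNN)
/// to its Linear issue number (the NNNN in MULTI-NNNN), per the table above.
/// Returns `None` for ticket numbers outside T-001..T-034.
fn linear_issue_for_ticket(ticket: u32) -> Option<u32> {
    match ticket {
        // T-001 is the first issue, MULTI-1331.
        1 => Some(1331),
        // T-002..T-012 skip over 1332 (the epic), so the offset is 1331.
        2..=12 => Some(ticket + 1331),
        // T-013 is the MCP epic, MULTI-1332.
        13 => Some(1332),
        // T-014..T-034 line up again with an offset of 1330.
        14..=34 => Some(ticket + 1330),
        _ => None,
    }
}
```

If more Linear issues are mirrored later, this pattern is not guaranteed to hold, so an explicit table keyed on the description footer would be the safer choice.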

## Type mapping convention

Linear labels don't map cleanly to storybloq's task/feature/chore, so: MVP
implementation issues → `task`; the MCP epic (T-013) and all 8 post-MVP "Future
work" items → `feature`; the two test issues + the docs issue → `chore`.

## Decisions captured this session

- **Markdown parser = `comrak`** (NOT pulldown-cmark), per Robbie. Baked into T-004's
description AND written back to Linear MULTI-1335 (its description previously left
comrak/pulldown-cmark as open "candidates"; now a firm Decision section).

## Config

- `recipeOverrides.stages`: TEST = `cargo nextest run --workspace`,
BUILD = `cargo build --workspace`. WRITE_TESTS and VERIFY are left disabled (this is a CLI with no dev server).
- `.gitignore`: added `.story/snapshots/`, `.story/sessions/`, `.story/status.json`
(roadmap + tickets are tracked).

## State / next steps

- All 34 tickets are `open`; nothing started. Implementation order follows the phases
(M0 → M6, then Tests & docs; Future work is post-MVP backlog).
- Natural first ticket: **T-001** (wire up `multi check` subcommand + phase skeleton),
which everything else builds on.
- If new Linear issues are added to the project later, mirror them the same way
(one ticket, Linear ID + branch in the description, correct phase).
76 changes: 76 additions & 0 deletions .story/roadmap.json
@@ -0,0 +1,76 @@
{
"blockers": [],
"date": "2026-06-23",
"phases": [
{
"description": "Wire `multi check` into the CLI and stand up the phase-orchestration skeleton (configuration → discovery → execution → reporting) with error/exit plumbing. Mirrors the existing `Run` command. Linear milestone: M0 · Subcommand skeleton.",
"id": "m0-subcommand-skeleton",
"label": "M0",
"name": "M0 · Subcommand skeleton",
"summary": "Wire `multi check` + phase-orchestration skeleton"
},
{
"description": "Discover `CHECKS.md` files, parse them to a Markdown AST (comrak), and extract the requirement/check model (anonymous checks + discovery-time validation). Produces the `Vec<Requirement>` consumed by execution. Linear milestone: M1 · Discovery.",
"id": "m1-discovery",
"label": "M1",
"name": "M1 · Discovery",
"summary": "Discover + parse CHECKS.md into Vec<Requirement>"
},
{
"description": "The hardcoded-but-injected configuration phase and the boxed agent-executor trait abstraction, plus the concrete `claude -p` executor. Linear milestone: M2 · Configuration & executor.",
"id": "m2-configuration-executor",
"label": "M2",
"name": "M2 · Configuration & executor",
"summary": "Injected config + boxed CheckExecutor + claude -p"
},
{
"description": "Per-check copy-on-write filesystem sandboxing. macOS-only for the MVP, behind a `cfg`-gated boxed trait so other OSes can be added later. Linear milestone: M3 · Sandboxing.",
"id": "m3-sandboxing",
"label": "M3",
"name": "M3 · Sandboxing",
"summary": "Per-check CoW sandbox (macOS, cfg-gated trait)"
},
{
"description": "The trustworthy result-reporting guardrail: one in-process `rmcp` server on a localhost port (dedicated tokio task, never a subprocess) exposing the `report-check-result` tool across N per-check endpoints. Built as a parent epic with sub-issues. Linear milestone: M4 · MCP result server.",
"id": "m4-mcp-result-server",
"label": "M4",
"name": "M4 · MCP result server",
"summary": "In-process rmcp result server, N per-check endpoints"
},
{
"description": "The parallel execution phase: for each check, create a sandbox, dispatch the agent against its MCP endpoint, reconcile the reported result, and aggregate checks into per-requirement verdicts via logical AND. Linear milestone: M5 · Execution.",
"id": "m5-execution",
"label": "M5",
"name": "M5 · Execution",
"summary": "Parallel execution + reconcile + AND aggregation"
},
{
"description": "Render results and set the process exit code: green/red requirement titles, failing checks in red with evidence, passing checks omitted; exit 0 iff all requirements satisfied, else 1. Linear milestone: M6 · Reporting & exit.",
"id": "m6-reporting-exit",
"label": "M6",
"name": "M6 · Reporting & exit",
"summary": "Colored report + exit code (0 pass / 1 fail)"
},
{
"description": "Cross-cutting test infrastructure (a fake executor so the pipeline can run without invoking `claude`), unit tests for AST extraction, an end-to-end pipeline test, and user-facing docs. Linear milestone: Tests & docs.",
"id": "tests-docs",
"label": "TESTS",
"name": "Tests & docs",
"summary": "Fake executor, AST unit tests, E2E test, docs"
},
{
"description": "Post-MVP work explicitly out of scope for the initial release, captured so nothing is lost. These are real tickets but carry no roadmap label. Linear milestone: Future work (post-MVP).",
"id": "future-work",
"label": "FUTURE",
"name": "Future work (post-MVP)",
"summary": "Post-MVP backlog (shell checks, more OSes, hooks…)"
},
{
"description": "Initial project setup.",
"id": "p0",
"label": "PHASE 0",
"name": "Setup"
}
],
"title": "multitool"
}
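
The M5 and M6 phase descriptions above state two rules: a requirement's verdict is the logical AND of its checks, and the process exits 0 only if every requirement is satisfied, otherwise 1. A minimal sketch of those two rules follows. The `CheckVerdict` type and both function names are hypothetical, chosen for illustration only; the roadmap does not specify them.

```rust
/// Hypothetical per-check verdict; `success` mirrors the boolean the agent
/// reports (true = satisfied).
struct CheckVerdict {
    success: bool,
}

/// A requirement is satisfied only if every one of its checks succeeded
/// (logical AND). Note that `all` is vacuously true for an empty slice;
/// discovery is expected to guarantee at least one check per requirement.
fn requirement_satisfied(checks: &[CheckVerdict]) -> bool {
    checks.iter().all(|check| check.success)
}

/// Exit code rule from the M6 description: 0 iff all requirements are
/// satisfied, else 1.
fn exit_code(requirements: &[Vec<CheckVerdict>]) -> i32 {
    if requirements.iter().all(|checks| requirement_satisfied(checks)) {
        0
    } else {
        1
    }
}
```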
12 changes: 12 additions & 0 deletions .story/tickets/T-001.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Introduce the `multi check` subcommand and the top-level orchestration skeleton the rest of the project fills in. Add a `CheckSubcommand` (clap `Args`) under `src/config/check/`, exported from `src/config/mod.rs`, starting with a working-directory arg (default `.`). Land the configuration → discovery → execution → reporting phases as stubs; mirrors the existing `Run` command (`src/cmd/run`, `src/config/run`).\n\n———\nMirrors Linear **MULTI-1331** · git branch `robbie/multi-1331`\nhttps://linear.app/wack-incorporated/issue/MULTI-1331/wire-up-the-multi-check-subcommand-and-phase-orchestration-skeleton",
"id": "T-001",
"order": 10,
"phase": "m0-subcommand-skeleton",
"status": "open",
"title": "Wire up the `multi check` subcommand and phase-orchestration skeleton",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-002.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Define the in-memory representation produced by discovery and consumed by execution: `Requirement { filepath: PathBuf, title: String, checks: Vec<Check> }` and `Check { title: String, prompt: String }`. Titles are NOT unique across the set; requirements/checks are grouped by declaring file. `checks` is guaranteed non-empty after M1 validation.\n\n———\nMirrors Linear **MULTI-1333** · git branch `robbie/multi-1333`\nhttps://linear.app/wack-incorporated/issue/MULTI-1333/define-the-requirement-and-check-domain-types",
"id": "T-002",
"order": 10,
"phase": "m1-discovery",
"status": "open",
"title": "Define the `Requirement` and `Check` domain types",
"type": "task"
}
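
The T-002 description spells out both types field by field, so they translate almost directly into Rust. In the sketch below the field names and types come from the ticket; the derives and `pub` visibility are assumptions added for illustration.

```rust
use std::path::PathBuf;

/// One check: a title and the Markdown prompt handed to the agent.
#[derive(Debug, Clone, PartialEq)]
pub struct Check {
    pub title: String,
    pub prompt: String,
}

/// One requirement, grouped by the file that declares it.
/// Titles are not unique across the set, so `filepath` is part of its identity.
/// `checks` is expected to be non-empty once M1 validation has run.
#[derive(Debug, Clone, PartialEq)]
pub struct Requirement {
    pub filepath: PathBuf,
    pub title: String,
    pub checks: Vec<Check>,
}
```

The non-empty guarantee is enforced by validation rather than by the type here; a dedicated non-empty collection type would be an alternative the ticket does not mention.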
12 changes: 12 additions & 0 deletions .story/tickets/T-003.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Recursively scan the working directory (and subdirectories) for files named exactly `CHECKS.md`, producing the list of files to parse. Use the `ignore` crate (already a dependency — the ripgrep walker) for a fast, parallel, gitignore-aware walk rooted at the working directory. Default to respecting `.gitignore`; document the decision.\n\n———\nMirrors Linear **MULTI-1334** · git branch `robbie/multi-1334`\nhttps://linear.app/wack-incorporated/issue/MULTI-1334/recursively-discover-checksmd-files-from-the-working-directory",
"id": "T-003",
"order": 20,
"phase": "m1-discovery",
"status": "open",
"title": "Recursively discover `CHECKS.md` files from the working directory",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-004.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Parse each discovered `CHECKS.md` into a Markdown AST that the extraction step walks. **Decision: use `comrak`** (CommonMark, AST node tree) — NOT pulldown-cmark — because its node tree is the easier fit for finding H1/H2 headers and capturing the text between them. Add `comrak` as a dependency (none present today).\n\n———\nMirrors Linear **MULTI-1335** · git branch `robbie/multi-1335`\nhttps://linear.app/wack-incorporated/issue/MULTI-1335/parse-checksmd-into-a-markdown-ast",
"id": "T-004",
"order": 30,
"phase": "m1-discovery",
"status": "open",
"title": "Parse `CHECKS.md` into a Markdown AST",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-005.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Walk a single file's AST and produce its `Requirement`s with their `Check`s. Requirement = an H1 whose text matches `^(Requirement|Req) ` (`Req` is an alias); the remainder of the line is the title. Check = an H2 whose text matches `^Check `, associated with the nearest preceding requirement; the Markdown beneath it is the prompt.\n\n———\nMirrors Linear **MULTI-1336** · git branch `robbie/multi-1336`\nhttps://linear.app/wack-incorporated/issue/MULTI-1336/extract-requirements-and-checks-by-walking-the-markdown-ast",
"id": "T-005",
"order": 40,
"phase": "m1-discovery",
"status": "open",
"title": "Extract requirements and checks by walking the Markdown AST",
"type": "task"
}
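
The heading rules in T-005 (`^(Requirement|Req) ` for H1, `^Check ` for H2) can be expressed without a regex. The sketch below operates on heading text that has already been pulled out of the AST; it does not touch `comrak` at all, and the function names are hypothetical. Whether the remaining title should be trimmed further is left open, so it is returned as-is.

```rust
/// If `h1_text` is a requirement heading, return the title that follows the
/// `Requirement ` or `Req ` prefix. The trailing space is part of the prefix,
/// so a heading such as "Requirements overview" does not match.
fn requirement_title(h1_text: &str) -> Option<&str> {
    h1_text
        .strip_prefix("Requirement ")
        .or_else(|| h1_text.strip_prefix("Req "))
}

/// If `h2_text` is a check heading, return the title that follows `Check `.
fn check_title(h2_text: &str) -> Option<&str> {
    h2_text.strip_prefix("Check ")
}
```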
12 changes: 12 additions & 0 deletions .story/tickets/T-006.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "When a requirement declares no `## Check`, its prose body becomes a single anonymous check that inherits the requirement's title. Produces one `Check { title: <requirement title>, prompt: <prose body> }`.\n\n———\nMirrors Linear **MULTI-1337** · git branch `robbie/multi-1337`\nhttps://linear.app/wack-incorporated/issue/MULTI-1337/infer-anonymous-checks-from-requirement-prose",
"id": "T-006",
"order": 50,
"phase": "m1-discovery",
"status": "open",
"title": "Infer anonymous checks from requirement prose",
"type": "task"
}
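
The anonymous-check rule in T-006 amounts to a small fallback. The sketch below assumes a local `Check` type with the two fields named in the ticket and a hypothetical function `checks_or_anonymous`. Two behaviours are assumptions rather than anything the ticket states: prose is ignored when explicit checks exist, and an empty result is returned when there is neither, leaving that case for validation to report.

```rust
#[derive(Debug, PartialEq)]
struct Check {
    title: String,
    prompt: String,
}

/// Return the explicit checks if there are any; otherwise promote the
/// requirement's prose body to a single anonymous check that inherits the
/// requirement's title.
fn checks_or_anonymous(
    requirement_title: &str,
    prose_body: &str,
    explicit: Vec<Check>,
) -> Vec<Check> {
    if explicit.is_empty() && !prose_body.trim().is_empty() {
        vec![Check {
            title: requirement_title.to_string(),
            prompt: prose_body.to_string(),
        }]
    } else {
        // Either explicit checks exist, or there is nothing to promote.
        explicit
    }
}
```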
12 changes: 12 additions & 0 deletions .story/tickets/T-007.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Validate the extracted model and assemble the final `Vec<Requirement>` discovery returns. Errors (not warnings): orphan check — a `## Check` with no preceding requirement in its file; checkless requirement — a requirement with no explicit checks AND no prose to promote to an anonymous check. Surface as miette diagnostics. Joins the parallel per-file parse results.\n\n———\nMirrors Linear **MULTI-1338** · git branch `robbie/multi-1338`\nhttps://linear.app/wack-incorporated/issue/MULTI-1338/validate-discovery-results-and-assemble-the-requirement-set",
"id": "T-007",
"order": 60,
"phase": "m1-discovery",
"status": "open",
"title": "Validate discovery results and assemble the requirement set",
"type": "task"
}
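
The two error conditions in T-007 can be detected in a single pass over one file's headings. The sketch below is an illustration under stated assumptions: the `Item`, `OpenRequirement`, and `DiscoveryError` types and the `validate_file` function are hypothetical, the input is a pre-extracted list of headings rather than a `comrak` AST, and the errors are plain enum values instead of the miette diagnostics the ticket calls for.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum DiscoveryError {
    /// A check heading with no preceding requirement in its file.
    OrphanCheck { filepath: PathBuf, check_title: String },
    /// A requirement with no explicit checks and no prose to promote.
    ChecklessRequirement { filepath: PathBuf, requirement_title: String },
}

/// Hypothetical flattened view of one file, in document order.
enum Item {
    Requirement { title: String, prose: String },
    Check { title: String },
}

/// Bookkeeping for the requirement currently being read.
struct OpenRequirement<'a> {
    title: &'a str,
    has_prose: bool,
    explicit_checks: usize,
}

/// Record a checkless-requirement error if the finished requirement has
/// neither explicit checks nor prose.
fn close(filepath: &Path, open: Option<OpenRequirement<'_>>, errors: &mut Vec<DiscoveryError>) {
    if let Some(req) = open {
        if req.explicit_checks == 0 && !req.has_prose {
            errors.push(DiscoveryError::ChecklessRequirement {
                filepath: filepath.to_path_buf(),
                requirement_title: req.title.to_string(),
            });
        }
    }
}

fn validate_file(filepath: &Path, items: &[Item]) -> Vec<DiscoveryError> {
    let mut errors = Vec::new();
    let mut open: Option<OpenRequirement<'_>> = None;
    for item in items {
        match item {
            Item::Requirement { title, prose } => {
                // A new requirement closes the previous one.
                close(filepath, open.take(), &mut errors);
                open = Some(OpenRequirement {
                    title: title.as_str(),
                    has_prose: !prose.trim().is_empty(),
                    explicit_checks: 0,
                });
            }
            Item::Check { title } => match open.as_mut() {
                Some(req) => req.explicit_checks += 1,
                None => errors.push(DiscoveryError::OrphanCheck {
                    filepath: filepath.to_path_buf(),
                    check_title: title.clone(),
                }),
            },
        }
    }
    // The last requirement in the file still needs closing.
    close(filepath, open, &mut errors);
    errors
}
```

Because each file is validated independently, results from the parallel per-file parses can be joined by concatenating the returned error lists.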
12 changes: 12 additions & 0 deletions .story/tickets/T-008.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Define the abstraction over \"run one check's agent\" — the seam that lets us swap `claude -p` for a Claude Code SDK (or other providers) later without touching execution. Per the spec it MUST be a boxed `#[async_trait]` trait object for dynamic dispatch, following the repo convention (`BoxedIngress`/`BoxedMonitor`/`BoxedPlatform`).\n\n———\nMirrors Linear **MULTI-1339** · git branch `robbie/multi-1339`\nhttps://linear.app/wack-incorporated/issue/MULTI-1339/define-the-boxed-agent-executor-trait-checkexecutor",
"id": "T-008",
"order": 10,
"phase": "m2-configuration-executor",
"status": "open",
"title": "Define the boxed agent-executor trait (`CheckExecutor`)",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-009.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "The concrete `CheckExecutor` for the MVP: shell out to the Claude Code CLI via `claude -p` with the assembled prompt/instructions, the `haiku` model family, and `--mcp-config` pointing at this check's dedicated endpoint (payload from M4). Run with the sandbox directory (M3) as the working directory; capture exit status/stderr into an `AgentOutcome`.\n\n———\nMirrors Linear **MULTI-1340** · git branch `robbie/multi-1340`\nhttps://linear.app/wack-incorporated/issue/MULTI-1340/implement-the-claude-p-claude-code-executor",
"id": "T-009",
"order": 20,
"phase": "m2-configuration-executor",
"status": "open",
"title": "Implement the `claude -p` Claude Code executor",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-010.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "The configuration phase: hardcoded for the MVP (no env/file loading) but its values are dependency-injected forward rather than read at point of use, so swapping providers later is a config change not a rewrite. Define a `Config` carrying provider/model-provider URL, model (default the `haiku` family), and effort; construct the concrete `BoxedExecutor` from it and inject into execution.\n\n———\nMirrors Linear **MULTI-1341** · git branch `robbie/multi-1341`\nhttps://linear.app/wack-incorporated/issue/MULTI-1341/implement-the-hardcoded-configuration-phase-and-dependency-injection",
"id": "T-010",
"order": 30,
"phase": "m2-configuration-executor",
"status": "open",
"title": "Implement the hardcoded configuration phase and dependency injection",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-011.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Each check runs inside a copy-on-write filesystem sandbox so an agent can read/modify the tree freely without corrupting the real working directory. Define a boxed `#[async_trait] Sandbox: Send + Sync` trait (mirroring `BoxedIngress` etc.) plus `cfg`-gated platform selection. The macOS implementation is a separate ticket.\n\n———\nMirrors Linear **MULTI-1342** · git branch `robbie/multi-1342`\nhttps://linear.app/wack-incorporated/issue/MULTI-1342/define-the-copy-on-write-sandbox-trait-with-cfg-gated-platform",
"id": "T-011",
"order": 10,
"phase": "m3-sandboxing",
"status": "open",
"title": "Define the copy-on-write sandbox trait with `cfg`-gated platform selection",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-012.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Implement the `Sandbox` trait on macOS using APFS copy-on-write cloning so each check gets a near-instant, space-efficient clone of the working tree. Prefer `libc::clonefile(2)` (`libc` already a dependency) for a direct `clonefile`/`clonefileat` call over `cp -c`. Place clones under a temp root; clean up on drop.\n\n———\nMirrors Linear **MULTI-1343** · git branch `robbie/multi-1343`\nhttps://linear.app/wack-incorporated/issue/MULTI-1343/implement-the-macos-apfs-copy-on-write-sandbox",
"id": "T-012",
"order": 20,
"phase": "m3-sandboxing",
"status": "open",
"title": "Implement the macOS APFS copy-on-write sandbox",
"type": "task"
}
12 changes: 12 additions & 0 deletions .story/tickets/T-013.json
@@ -0,0 +1,12 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Epic. The result-reporting guardrail. Because agents are nondeterministic, do NOT trust stdout or sentinel files — every agent reports its verdict by calling a single MCP tool, `report-check-result`, served by ONE in-process MCP server built with `rmcp`, bound to a localhost port, running on a dedicated tokio task within the CLI process (never a subprocess). Decomposed into the sub-tickets in this phase.\n\n———\nMirrors Linear **MULTI-1332** (parent epic) · git branch `robbie/multi-1332`\nhttps://linear.app/wack-incorporated/issue/MULTI-1332/in-process-mcp-result-reporting-server",
"id": "T-013",
"order": 10,
"phase": "m4-mcp-result-server",
"status": "open",
"title": "In-process MCP result-reporting server (epic)",
"type": "feature"
}
13 changes: 13 additions & 0 deletions .story/tickets/T-014.json
@@ -0,0 +1,13 @@
{
"blockedBy": [],
"completedDate": null,
"createdDate": "2026-06-23",
"description": "Define the single MCP tool every agent uses to report its verdict, and the Rust types behind it. `report-check-result`: `success: bool` (required — true = satisfied), `evidence?: String` (optional — explanation of how the agent concluded). Define the serde input schema and the corresponding server-side result type.\n\n———\nMirrors Linear **MULTI-1344** (sub-issue of MULTI-1332) · git branch `robbie/multi-1344`\nhttps://linear.app/wack-incorporated/issue/MULTI-1344/define-the-report-check-result-tool-contract-and-result-types",
"id": "T-014",
"order": 20,
"parentTicket": "T-013",
"phase": "m4-mcp-result-server",
"status": "open",
"title": "Define the `report-check-result` tool contract and result types",
"type": "task"
}