Preview Forge for Claude Code

Preview is all you need.

One line of idea → 26 AI-generated previews → pick with your eyes → frozen full-stack app.

The picture is the spec. SpecDD and TestDD only run on the mockup you approved. 144 Opus 4.7 agents · zero third-party services · two human clicks.

TDD drove code with tests. SpecDD drove code with specs. We put PreviewDD in front. Mockup-first, eyes-first decision-making — 144 Opus 4.7 agents turn one line of idea into a frozen full-stack app with only two human clicks.

The problem

You start any project without knowing what will get built. Specs go stale. Wireframes lie. By demo day, half the assumptions have turned out wrong.

Preview-Driven Development (PDD) flips it. Before any spec, before any code, the harness renders the project as 9 to 26 different mockups in parallel — each by a different Opus 4.7 persona pulling your idea in a different direction. You see what could be built. You select one.

Preview is all you need. The selection IS the spec.

Submission — Built with Opus 4.7 hackathon

Artifact	Link
🎥 Demo video	https://www.youtube.com/watch?v=_xHL8SZqfyI (2:59) — full walkthrough, problem statement through frozen app
💻 Repository	Two-Weeks-Team/PreviewForgeForClaudeCode
📝 Written summary (100–200 words)	See TL;DR below
📜 License	Apache-2.0 — fully open-source per hackathon rules
👥 Team	Two-Weeks-Team (≤2 members per rules)
🆕 New work only	Built from scratch during the hackathon window (Apr 21–28, 2026). See CHANGELOG.

TL;DR

Preview Forge turns one line of idea into a frozen, deployable full-stack app — by inverting the order of software development.

TDD drove code with tests. SpecDD drove code with specs. Preview Forge puts PreviewDD in front: before any spec or code is written, 26 Claude Opus 4.7 agents diverge into 26 single-file HTML mockups. You pick one with your eyes at Gate H1 (one human click). The picture becomes the contract every downstream agent honors.

The plugin runs 144 Opus 4.7 sub-agents organized into a 6-tier engineering organization (Ideation · Panels · Spec · Engineering · QA · Judges · Auditors), wired together by 15 /pf:* slash commands and a 4-layer cross-run memory (Reflexion pattern). SpecDD and TestDD then drive the build to a freeze threshold of ≥499/500. Two human clicks total — H1 (design pick), H2 (ship).

Built entirely on Anthropic-native primitives — Opus 4.7, Managed Agents, Memory Tool, Batch API, Files API, Context editing, Prompt caching, Claude Design. No third-party services in the plugin runtime. Apache-2.0 licensed.

Preview is all you need.

The 3-DD methodology

flowchart LR
    A["💡 One-line idea"] --> B["I1 Socratic interview<br/>4 required Q"]
    B --> C["① PreviewDD<br/>26 mockups diverge"]
    C --> D{{"🔒 Gate H1<br/>(human, 1 click)"}}
    D --> E["② SpecDD<br/>OpenAPI + nestia"]
    E --> F["③ TestDD<br/>Tests + Judges + Auditors"]
    F --> G{{"🚀 Gate H2<br/>(human, 1 click)"}}
    G --> H["📦 Frozen full-stack app"]

    style C fill:#d4a574,stroke:#7aa6c2,color:#000
    style E fill:#7aa6c2,stroke:#7aa6c2,color:#000
    style F fill:#84c984,stroke:#7aa6c2,color:#000

Cycle	Drives	Locked artifact
① Preview-Driven Development (PreviewDD) (new)	26 mockups before any spec	`chosen_preview.json` + `mockups/chosen.html`
② Spec-Driven Development (SpecDD)	OpenAPI drives implementation	`specs/openapi.yaml` + SHA-256 `.lock`
③ Test-Driven Development (TestDD)	Score ≥499/500 to freeze	`score/report.json` + `.frozen-hash`

All three cycles follow diverge → aggregate → lock. Two human gates, otherwise autonomous. Full v8.0 specification — 2,100+ lines, single HTML file.
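The "lock" step in the table above can be pictured as hashing the artifact and storing the digest beside it. The following is a minimal sketch under stated assumptions, not the plugin's actual implementation: the function names, the `.lock` file naming, and the JSON layout of the lock file are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def write_lock(artifact: Path) -> Path:
    """Hash an artifact with SHA-256 and store the digest in a sibling .lock file."""
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    # Hypothetical naming: openapi.yaml -> openapi.yaml.lock
    lock_path = artifact.with_name(artifact.name + ".lock")
    lock_path.write_text(json.dumps({"sha256": digest}) + "\n", encoding="utf-8")
    return lock_path


def is_unchanged(artifact: Path) -> bool:
    """Return True when the artifact still matches the digest recorded at lock time."""
    lock_path = artifact.with_name(artifact.name + ".lock")
    recorded = json.loads(lock_path.read_text(encoding="utf-8"))["sha256"]
    return hashlib.sha256(artifact.read_bytes()).hexdigest() == recorded
```

A downstream phase could call `is_unchanged(Path("specs/openapi.yaml"))` before building and refuse to continue if the locked spec was edited after the lock was written.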

From one prompt to a gallery — in 4 questions

You type one line. The plugin doesn't dispatch 26 advocates immediately — it asks 4 required questions (5–8 optional) to capture target persona, primary surface, killer feature, and must-have constraints. The answers compile to idea.spec.json — a structured ground truth every downstream agent honors.

"build a fun, cheerful lunch recommender for office workers"
        │
        ▼
┌──────────────────────────────────────────────────┐
│  I1 Idea Clarifier — 4 batched AskUserQuestion    │
│  • target_persona   • primary_surface             │
│  • killer_feature   • must_have_constraints       │
└──────────────────────────────────────────────────┘
        │
        ▼  idea.spec.json   (the picture's contract)
┌──────────────────────────────────────────────────┐
│  26 advocates diverge → gallery → you pick one    │
└──────────────────────────────────────────────────┘

Why it matters. Before v1.6, the same one-liner could mean "Slack bot" or "legal-deposition paralegal." Same words, different products. The Socratic interview makes divergence intentional creative reframing, not blind misalignment. Skip-interview is one click if you want the demo escape hatch.

Pixel-faithful delivery

The contract is the picture selected at Gate H1. Drift is detected by the Rule 9 idea-drift sentinel (hooks/idea-drift-detector.py) — block threshold 0.3, warn at 0.4. If the build wanders away from the approved mockup, the run pauses.
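To make the thresholds concrete, here is a rough sketch of a containment score over token sets. It is an illustration only: the tokenizer, the function names, and the direction of the comparison (lower containment means more drift, so a score below 0.3 blocks and a score below 0.4 warns) are assumptions, not a description of what hooks/idea-drift-detector.py actually does.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lowercase word tokens; a deliberately simple tokenizer for illustration."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def containment(reference: str, candidate: str) -> float:
    """Fraction of the reference's tokens that also appear in the candidate."""
    ref = tokens(reference)
    if not ref:
        return 1.0  # nothing to drift from
    return len(ref & tokens(candidate)) / len(ref)


def drift_verdict(score: float, block: float = 0.3, warn: float = 0.4) -> str:
    """Assumed direction: lower containment means more drift."""
    if score < block:
        return "block"
    if score < warn:
        return "warn"
    return "ok"
```

With the approved text "lunch recommender for office workers" as the reference, a spec that keeps all five words scores 1.0, while a spec about a legal paralegal tool that shares only "for" scores 0.2 and would be blocked under these assumptions.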

Layer-0 — the ten non-negotiable rules

The harness operates under ten contracts that no agent (including the supervisor) can override. They are enforced by Layer-0 hooks (PreToolUse / PostToolUse / Stop / SubagentStop). Layer-0 started at seven rules in v1.0.0 and has grown to ten as the harness shipped (most recent additions: Rule 9 idea-drift, Rule 10 English-only output).

1. Two human gates only: H1 (design pick) and H2 (ship). Everything else is autonomous.
2. Scope discipline: agents may not exceed their declared scope (department / file / phase).
3. Single source of truth per phase: each phase locks one artifact (idea.spec.json, chosen_preview.json, specs/openapi.yaml, score/report.json).
4. Adaptive thinking + xhigh effort wherever the action is one-shot and irreversible (freeze, deploy, schema lock).
5. Cost-regression sentinel: hooks/cost-regression.py pauses the run when token usage crosses the active profile's hard ceiling.
6. Two ways to ask: anchored gates the user knows are coming (H1, H2) and adaptive asks the harness fires on its own (Socratic, budget guard). Everything else: auto-decide.
7. Audit trail: every agent decision writes to the SQLite blackboard; runs are deterministically replayable from trace.jsonl.
8. All Opus 4.7: every agent fixed to claude-opus-4-7; no Sonnet or Haiku fallback for plugin runtime.
9. Idea-drift detection: hooks/idea-drift-detector.py blocks runs where SpecDD wanders away from the H1 selection (block 0.3, warn 0.4).
10. Output language English: every artifact in the repo is English. Korean and other languages are permitted only as visual subtitles in the captured video.

Full Layer-0 specification — gates, scope, drift, output policy.

Quick install

# 1. Add this marketplace
/plugin marketplace add Two-Weeks-Team/PreviewForgeForClaudeCode

# 2. Install the plugin
/plugin install pf@two-weeks-team

# 3. Reload
/reload-plugins

# 4. Initialize memory + workspace permissions (first time per workspace)
/pf:bootstrap

# 5. Run (profile defaults to `standard` as of v1.4.0)
/pf:new "your one-line idea"

# …or pick a profile explicitly:
/pf:new "demo-class idea"     --profile=standard   # default — ~60k tok · 2×5 eng · 9 previews · SQLite · no Docker
/pf:new "real project"        --profile=pro         # ~250k tok · 3×5 eng · 18 previews · Postgres + Docker
/pf:new "production launch"   --profile=max         # ~600k tok · 5×5 eng · 26 previews · full CI/CD

Profiles (v1.4+)

Profile	Previews	Eng teams	DB	Container	Panels	SCC iter	P95 ceiling	Use for
standard (default)	9	2×5 (BE+FE)	SQLite	❌ none	keyword-trigger	3	~60k tok / 25 min	Local MVP · demo · prototyping
pro	18	3×5 (+DB)	Postgres (dev-prod parity)	Docker + compose	keyword-trigger + escalation	4	~250k tok / 70 min	Real projects
max	26	5×5 (all)	Postgres	Docker + CI/CD	always-on	5	~600k tok / 160 min	Production · baselines

--previews=N overrides the count (bounded by max_user_expand = 26).
--no-cache bypasses the PreviewDD-level cache (7 days for standard/pro, never cached for max).
Standard = local-first: npm install && npm run db:push && npm run dev — no Docker, no Postgres setup. DB lives at ~/.preview-forge/<project>/dev.db (outside repo tree for security).
Upgrade path: standard → pro via bash scripts/graduate.sh pro (additive; keeps your code, adds Dockerfile/compose/Postgres datasource).
Full spec: plugins/preview-forge/profiles/.
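The profile table and the `--previews=N` bound can be modeled as plain data plus a clamp. This is a sketch for orientation only: the `Profile` class, the field names, and the choice to clamp an out-of-range value (rather than reject it) are assumptions; the numbers are taken from the table above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_USER_EXPAND = 26  # upper bound on --previews, per the profile notes


@dataclass(frozen=True)
class Profile:
    previews: int
    eng_teams: int
    database: str
    token_ceiling: int  # approximate P95 ceiling in tokens


PROFILES = {
    "standard": Profile(9, 2, "sqlite", 60_000),
    "pro": Profile(18, 3, "postgres", 250_000),
    "max": Profile(26, 5, "postgres", 600_000),
}


def resolve_previews(profile: str = "standard", override: Optional[int] = None) -> int:
    """Return the preview count for a run, honoring an optional --previews override."""
    if override is None:
        return PROFILES[profile].previews
    # Assumed behavior: clamp into [1, MAX_USER_EXPAND] instead of raising.
    return max(1, min(override, MAX_USER_EXPAND))
```

Under these assumptions, `resolve_previews("standard", 40)` yields 26 and `resolve_previews("max", 12)` yields 12.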

Profile escalation & cost-regression sentinel (v1.3+)

When you run standard but your idea mentions enterprise signals (Stripe, PII, HIPAA, SSO provider, SOC2, multi-tenant), the plugin recommends the right profile before PreviewDD burns tokens.

Evaluation precedence (highest wins):

Hard-require (Stripe / PII / HIPAA / auth-provider): any single hit forces upgrade. You cannot dismiss — false assurance is worse than friction. The min_distinct_categories=2 floor does NOT apply here.
Soft-suggest + category-floor (SOC2 / multi-tenant / B2B / scale): needs ≥2 distinct categories AND score ≥ threshold to ask via AskUserQuestion. Records your answer in ~/.preview-forge/escalation-history.json. If you decline, same signals won't re-prompt within 24h (anti-nagging).
Hint (weak signals, score < threshold but ≥ min-floor): shows "💡 Consider --profile=pro next time" in /pf:status, no interruption.

Categorical scoring (not raw keyword count) means "audit logging feature" in a generic marketing copy app won't false-positive.
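The precedence described above might look roughly like the following. Everything concrete here is hypothetical: the keyword-to-category maps, the threshold and floor values, the whole-word matching, and the returned labels are invented for illustration, and the real plugin may score signals differently.

```python
import re

# Hypothetical keyword-to-category maps; the plugin's real signal lists may differ.
HARD_REQUIRE = {"stripe": "payments", "pii": "privacy", "hipaa": "health", "sso": "auth-provider"}
SOFT_SUGGEST = {"soc2": "compliance", "multi-tenant": "tenancy", "b2b": "market", "scale": "scale"}


def evaluate(idea, threshold=2.0, min_floor=1.0, min_distinct_categories=2):
    """Return 'force-upgrade', 'ask', 'hint', or 'none' for a one-line idea."""
    # Match whole words so that "lesson" does not trigger the "sso" signal.
    words = set(re.findall(r"[a-z0-9-]+", idea.lower()))
    # 1. Hard-require: any single hit forces the upgrade; the category floor is skipped.
    if any(word in HARD_REQUIRE for word in words):
        return "force-upgrade"
    # 2. Soft-suggest: score distinct categories, not raw keyword hits.
    categories = {SOFT_SUGGEST[word] for word in words if word in SOFT_SUGGEST}
    score = float(len(categories))
    if len(categories) >= min_distinct_categories and score >= threshold:
        return "ask"
    # 3. Hint: weak signal, surfaced without interrupting the run.
    if score >= min_floor:
        return "hint"
    return "none"
```

Because the score counts categories, repeating one keyword several times in the idea text does not move it from "hint" to "ask".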

Cost regression + drift detection. hooks/idea-drift-detector.py catches the failure where Gate H1 picks product A but SpecDD/Engineering drift to product B. Containment coefficient over token sets (no external ML deps). Block threshold 0.3, warn at 0.4. The P0-B cost-regression sentinel (hooks/cost-regression.py) compares cost-snapshot.json against the active profile's P95/hard ceiling every 30s. Hard breach triggers auto-pause + AskUserQuestion handoff.
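A single pass of the cost check could be sketched as below. The `tokens_used` field name, the function name, and the "warn" state at the P95 ceiling are assumptions made for the example; the source only states that a hard breach pauses the run and hands off to the user.

```python
import json
from pathlib import Path


def check_cost(snapshot_path: Path, p95_ceiling: int, hard_ceiling: int) -> str:
    """Classify one cost snapshot against a profile's ceilings."""
    snapshot = json.loads(snapshot_path.read_text(encoding="utf-8"))
    used = snapshot["tokens_used"]  # hypothetical field name
    if used >= hard_ceiling:
        return "pause"  # hard breach: auto-pause and hand off to the user
    if used >= p95_ceiling:
        return "warn"  # assumed intermediate state, not confirmed by the source
    return "ok"
```

A sentinel loop would call this on cost-snapshot.json every 30 seconds and stop polling once it returns "pause".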

What's new — v1.6 / v1.7 / v1.14 (shipped through semver v1.10.0+)

Terminology: "v1.6 audit" / "v1.7 audit" are feature umbrella names (issue #28 family / #29–#37). Each PR ships under its own release-please semver tag — the v1.6 schema landed in semver v1.6.0, B-1/B-3/A-4 (Phase 9, PR #51) landed in v1.10.0, etc. See CHANGELOG.md.

v1.6 — Socratic interview as ground truth (LESSON 0.7 fix). Before v1.6, 26 Advocates dispatched directly from the one-liner — and the failure mode in LESSON 0.7 played out: a one-liner could mean different products to different agents. v1.6 adds I1 Idea Clarifier between /pf:new and the 26 advocates. Three batched AskUserQuestion modals (10–12 fields total) produce idea.spec.json — structured ground truth (target_persona, primary_surface, jobs_to_be_done, killer_feature, must_have_constraints, non_goals, …) that every advocate receives. The PreviewDD cache key now includes idea_spec_hash, so the same one-liner with different Socratic answers gets a fresh advocate set.
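One way to picture a cache key that includes `idea_spec_hash` is sketched below. The canonical-JSON hashing, the function names, and the other key components (one-liner and profile) are assumptions; only the fact that the key includes a hash of the spec comes from the text above.

```python
import hashlib
import json


def idea_spec_hash(spec: dict) -> str:
    """Hash a spec via canonical JSON so key order does not change the digest."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def preview_cache_key(one_liner: str, profile: str, spec: dict) -> str:
    """Combine the idea text, profile, and spec hash into one cache key (hypothetical layout)."""
    parts = [one_liner.strip(), profile, idea_spec_hash(spec)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

With this layout, the same one-liner answered with `primary_surface: "slack"` and with `primary_surface: "web"` produces two different keys, so the second run does not reuse the first run's advocates.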

v1.7 — 4 required questions, skip-interview, tiered fallback (Christensen + Kim-Mauborgne + Taleb). Hackathon demo feedback: 12 questions before seeing any output is too many. v1.7 trims:

B-1 — 4 required, 5–8 optional. Best path: 4 clicks to gallery. Fullest path: 12 questions for deep dive.
B-3 — Skip-interview button in Batch A. One click writes a 3-field stub and short-circuits to the v1.5.4 raw-idea path.
A-4 — _filled_ratio tiered fallback. The hard 0.5 gate is gone. ≥0.7 = high-confidence ground truth, 0.4–0.7 = hint, 0.2–0.4 = low-confidence, <0.2 = drop spec entirely.
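The A-4 tiers translate into a small lookup. This sketch assumes that `_filled_ratio` is the share of non-empty fields in the spec and that each band includes its lower bound; the function names and tier labels are hypothetical.

```python
def filled_ratio(spec: dict, fields: list) -> float:
    """Share of the expected fields that have a non-empty value (assumed definition)."""
    if not fields:
        return 0.0
    filled = sum(1 for name in fields if spec.get(name) not in (None, "", [], {}))
    return filled / len(fields)


def spec_confidence(ratio: float) -> str:
    """Map a filled ratio onto the four tiers; lower bounds are treated as inclusive."""
    if ratio >= 0.7:
        return "high-confidence"
    if ratio >= 0.4:
        return "hint"
    if ratio >= 0.2:
        return "low-confidence"
    return "drop"
```

For example, a spec with two of four expected fields filled has a ratio of 0.5 and would be passed to advocates as a hint rather than as ground truth.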

Why "gallery-first." The flow inverts the SaaS-onboarding default of "configure → preview." Instead: answer 4 questions → see 9 / 18 / 26 mockups → pick one. The picture is the spec. SpecDD and TestDD only run on the picture you approved. (Godin: lead with the artifact, not the form.)

v1.14 — post-gate automation. H1 now auto-advances to SpecDD once chosen_preview.json.lock and design-approved.json exist. H2 now auto-launches the local preview server after ship approval, and /pf:preview handles manual re-open, stop, and status.

Updating & downgrading

# Check installed version
claude plugin list | grep -A2 pf@two-weeks-team

# Pull the latest manifest + plugin contents from the marketplace
/plugin marketplace update two-weeks-team

# Upgrade the plugin to the newest listed version
/plugin update pf@two-weeks-team     # if you have this subcommand
#   — or, if update is not available in your Claude Code version —
/plugin uninstall pf@two-weeks-team
/plugin install pf@two-weeks-team

# Reload so hooks, agents, and commands refresh
/reload-plugins

After updating, run pf check (or /pf:bootstrap once, then pf check) to confirm your local ~/.claude/preview-forge/memory/ is still intact — the update does not overwrite your LESSONS.md.

Downgrading:

/plugin uninstall pf@two-weeks-team
/plugin install pf@two-weeks-team@1.0.0    # any past version tag

Every release is signed via GitHub Releases.

Slash Commands

Preview Forge ships 15 slash commands under the /pf:* namespace:

🚀 Run lifecycle

Command	Purpose
`/pf:bootstrap`	Initialize plugin memory + seed workspace Bash permissions — first time per workspace
`/pf:new <idea>`	Start a new run (PreviewDD cycle begins)
`/pf:status`	Current run state, agent progress, blackboard
`/pf:retry <agent|phase>`	Rerun a failed agent or stuck phase
`/pf:freeze`	Force Judges + Auditors evaluation (TestDD Stage 7)
`/pf:preview [run]`	Re-open, stop, or inspect the local preview server for a frozen run (auto-launched after H2)

🗳️ Decision gates

Command	Purpose
`/pf:design`	Gate H1 — Claude Design main / built-in Studio fallback
`/pf:panel`	Manually trigger 4-Panel (TP/BP/UP/RP) vote

📚 Assets & history

Command	Purpose
`/pf:gallery`	Browse / fork past runs
`/pf:replay <run>`	Deterministic replay from `trace.jsonl`
`/pf:seed`	Pre-verified demo idea bank (10)
`/pf:export <run>`	Package frozen run as tarball or Claude Code plugin

📊 Observability

Command	Purpose
`/pf:budget`	Cost dashboard — per-run / per-cycle / per-agent
`/pf:lessons`	Cross-run failure catalog (`LESSONS.md`)
`/pf:help`	Full 15-command reference + FAQ

Agent Organization

Preview Forge's 144 agents live in a 6-tier hierarchy + SQLite blackboard:

                        M1 Run Supervisor (Meta)
                               │
              ┌────────────────┼────────────────┐
              │                │                │
      M2 Cost Monitor     M3 Chief Eng PM   Software-Factory
       (tracking only)  (all dept leads)   Layer-0 Hooks
                               │
    ┌──────────┬───────────────┼────────────────┬─────────────┐
    │          │               │                │             │
 Ideation  4 Panels +       Spec Dept     5 Engineering     QA Dept +
  Dept      Mitigation       (9)          Teams (25)        SCC + Judges +
  (29)     Designer (45)                                    Auditors + Docs
                                                                (33)

Count: 3 Meta + 29 Ideation + 45 Panels + 9 Spec + 25 Engineering + 14 QA + 6 SCC + 5 Judges + 5 Auditors + 3 Docs = 144. All Opus 4.7, zero Sonnet/Haiku.
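The headcount can be checked mechanically. The dictionary below simply restates the numbers from the count line; the variable names are illustrative.

```python
# Agents per department, copied from the count line above.
HEADCOUNT = {
    "Meta": 3,
    "Ideation": 29,
    "Panels": 45,
    "Spec": 9,
    "Engineering": 25,
    "QA": 14,
    "SCC": 6,
    "Judges": 5,
    "Auditors": 5,
    "Docs": 3,
}

total_agents = sum(HEADCOUNT.values())

# The rightmost box in the org chart groups these five departments.
qa_column = sum(HEADCOUNT[name] for name in ("QA", "SCC", "Judges", "Auditors", "Docs"))

print(total_agents, qa_column)
```

The ten entries sum to the stated total of 144, and the five departments in the rightmost box of the chart sum to its label of 33.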

Requirements

Claude Code (latest) with Pro / Max / Team / Enterprise subscription. (No separate API key needed.)
Node.js 20 LTS + pnpm 9 (for scaffolded apps' build/test)
Docker 24+ (optional, for scaffolded apps' docker compose up verification)

What's inside the plugin

Area	Count	Summary
Agents	144	10 departments, 6 tiers, all Opus 4.7
Slash commands	15	`/pf:*` namespace
Hooks	7	`factory-policy`, `askuser-enforcement`, `auto-retro-trigger`, `idea-drift-detector`, `cost-regression`, `escalation-ledger`, `post-h1-signal`
Memory seed	3	`CLAUDE.md` + `PROGRESS.md` + `LESSONS.md` (with 3 bootstrap lessons)
Methodology	1	Layer-0 10 non-negotiable rules
Asset templates	12	Docker Compose, Caddyfile, nestia.config.ts, install.sh + 8 standard-profile build templates
JSON schemas	6	preview-card, panel-vote, score-report, pf-profile, idea-spec, spec-anchor-audit
Seed ideas	10	Pre-verified demo scenarios
CLI	1	`bin/pf`
Verification	1	`scripts/verify-plugin.sh`

Zero third-party services

Preview Forge's plugin runtime uses only Anthropic-native services:

Claude Code (Pro/Max) · Claude Opus 4.7 · Adaptive thinking · xhigh effort
Claude Managed Agents · Anthropic Memory Tool · Batch API · Files API · Citations
Context editing (context-management-2025-06-27) · Compaction (compact_20260112)
Prompt caching (1-hour TTL) · Fine-grained tool streaming · Task budgets (task-budgets-2026-03-13)
Claude Design (Gate H1 main) · Built-in Design Studio (Gate H1 fallback)

Not used in the plugin runtime or generated mockups: Figma, Google Fonts, external CDNs, hosted analytics services. All 26 mockups are single-file HTML with inline styles only.

Memory & cross-run learning

A 4-layer memory so mistakes don't repeat across runs:

memory/CLAUDE.md — session rules (read first every run)
memory/PROGRESS.md — run index (updated at run end)
memory/LESSONS.md — failure catalog (auto-appended by Auto-retro critic)
Anthropic Memory Tool (memory_20250818) — per-agent episodic memory (Reflexion pattern)

M1 Run Supervisor reads all four before every new run and pre-loads relevant lessons to every Department Lead.

Documentation

📘 Full v8.0 Specification — canonical, 2,100+ lines
📝 CHANGELOG — phase-by-phase build log
🛡️ Security Policy — reporting and scope
🤝 Contributing — LESSONS, new advocates, etc.
🪶 Layer-0 Rules — gates, scope, drift, and output policy

Verify install

git clone https://github.com/Two-Weeks-Team/PreviewForgeForClaudeCode
cd PreviewForgeForClaudeCode
bash scripts/verify-plugin.sh

License

Apache-2.0. See NOTICE for attribution.

Built with Claude Opus 4.7 · Powered by Claude Code Plugins · No third-party services in the plugin runtime · Apache-2.0

Preview Forge · Two-Weeks-Team

Preview is all you need.

