brainstorm — Project Intent

Goal

Eliminate single-model ideation bias by running structured multi-model deliberation loops (Diverge, Steer, Interrogate, Synthesize) across Claude, Codex, and Gemini. Rich personas, Socratic questioning, and human steering combine to produce idea sets that are more diverse and have been vetted for fatal flaws.

Core Capabilities

  • Round 1 (Diverge): run three models in parallel, each assigned one of three personas (Explorer, Operator, Contrarian) by topic-hash rotation; each model produces 4-6 idea cards with assumptions, questions for the user, and anti-goals
  • Steering Checkpoint: the Claude orchestrator clusters the Round 1 output into 5-8 branches; the user selects 2-3 branches or applies a bias preset (novelty / balanced / practical)
  • Round 2 (Socratic Interrogation): models receive only the branch summaries (sparse communication) and return fatal risks, questions that would change the ranking, and strengthening moves
  • Synthesis: deterministic convergence — Best Bets (support >= 2, 0 fatal objections), Wild Cards (support = 1, high novelty), Open Questions, Next Experiments
  • Non-interactive / --quick mode: skips the steering checkpoint and auto-selects the max-support, max-novelty, and max-disagreement branches; usable in CI or piped contexts
  • Bias presets: --bias novelty, --bias practical, --bias balanced for automated branch selection without user interaction
  • Provider selection: --providers flag to restrict to a subset of claude/codex/gemini
  • Multi-platform deployment: Claude Code (slash command + skill), Codex CLI, Gemini CLI via nex plugin system
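
The Synthesis rule above is simple enough to sketch directly. The following is a minimal illustration, not the project's implementation: the `Branch` fields, the 0-to-1 novelty scale, and the `NOVELTY_THRESHOLD` value are assumptions, since the document says "high novelty" without giving a number. Open Questions and Next Experiments are omitted because the document does not state how they are derived.

```python
from dataclasses import dataclass

# Assumption: "high novelty" is not quantified in the document; 0.7 is a placeholder.
NOVELTY_THRESHOLD = 0.7


@dataclass(frozen=True)
class Branch:
    """Hypothetical record for one clustered branch after Round 2."""
    name: str
    support: int           # number of models backing the branch
    fatal_objections: int  # fatal risks raised during Round 2
    novelty: float         # assumed 0..1 score


def classify(branch: Branch) -> str:
    """Map a branch to a synthesis bucket using the stated thresholds."""
    if branch.support >= 2 and branch.fatal_objections == 0:
        return "best_bet"
    if branch.support == 1 and branch.novelty >= NOVELTY_THRESHOLD:
        return "wild_card"
    # The document does not say how remaining branches are reported.
    return "unclassified"


def synthesize(branches):
    """Bucket branches; sorting by name keeps the result independent of input order."""
    buckets = {"best_bet": [], "wild_card": [], "unclassified": []}
    for branch in sorted(branches, key=lambda b: b.name):
        buckets[classify(branch)].append(branch.name)
    return buckets
```

Because the rule uses only fixed thresholds and a stable sort, the same set of branches always yields the same buckets, which is what "deterministic convergence" would require of any implementation.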

Non-Goals

  • Adaptive stopping based on convergence metrics (e.g., Simpson index — logged as telemetry only, never used for stopping in v1)
  • Separate critic swarm (distinct from the 3 deliberating models)
  • Six Thinking Hats persona preset (deferred to v2)
  • More than 2 external model rounds (fixed 2-round structure is intentional)
  • Dynamic persona authoring (personas are hardcoded: Explorer, Operator, Contrarian)
  • Session persistence and resume (each run is stateless)
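
Because the three personas are hardcoded, the topic-hash rotation named under Core Capabilities can be pictured as choosing one of three cyclic shifts. The sketch below is an assumption-laden illustration: the document does not specify the hash function, the provider order, or how the rotation behaves when fewer than three providers are selected. SHA-256 and the order `claude, codex, gemini` are placeholders.

```python
import hashlib

PERSONAS = ["Explorer", "Operator", "Contrarian"]
PROVIDERS = ["claude", "codex", "gemini"]  # assumed fixed order


def assign_personas(topic: str) -> dict:
    """Return a provider -> persona mapping that depends only on the topic."""
    # A cryptographic hash is used instead of built-in hash(), which is
    # randomized per process for strings and would break reproducibility.
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    shift = int.from_bytes(digest[:8], "big") % len(PERSONAS)
    rotated = PERSONAS[shift:] + PERSONAS[:shift]
    return dict(zip(PROVIDERS, rotated))
```

The same topic always yields the same assignment, while different topics spread the personas across providers so that no model is permanently tied to one role.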

Success Criteria

  • Heterogeneous model outputs produce measurably higher diversity than single-model runs (research baseline: ICLR 2025)
  • Rich personas yield a +4.76 diversity-score gain (d = 2.88) versus +0.62 without personas (research baseline)
  • Socratic Round 2 is present; removing it would degrade output quality by 11.31% (MARS, AAAI 2026 baseline)
  • Human steering checkpoint achieves novelty accuracy of >=89% when present (baseline: 13.79% without)
  • All 8 research findings (#1-#8 from design doc table) implemented and verifiable in v1
  • TODO: no integration test suite exists yet (the SKILL.md body is a TODO stub); the criteria above need confirmation once tests are written

Personas

  • AI Agent User (Developer / Knowledge Worker): invokes /brainstorm "topic" interactively in Claude Code, Codex CLI, or Gemini CLI for creative exploration of code, product, business, or strategy questions; expects a structured markdown report with ranked branches
  • Automation / CI Consumer: invokes /brainstorm --quick or passes a --bias preset in non-TTY contexts (pipelines, scripts); needs deterministic, unattended execution without a steering checkpoint
  • Multi-platform nex Plugin Installer: installs via nex install brainstorm or claude plugin add heurema/brainstorm; does not invoke the command directly but relies on the plugin system for lifecycle management
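
The split between the interactive user and the CI consumer comes down to one decision: whether the steering checkpoint runs. The sketch below models that decision with Python's `argparse`, purely as an illustration. The real tool is a slash command and skill rather than a Python CLI, and several details here are assumptions: the comma-separated `--providers` syntax, the helper names, and the rule that a `--bias` flag or a non-TTY stdin suppresses the checkpoint.

```python
import argparse
import sys

ALL_PROVIDERS = ("claude", "codex", "gemini")


def parse_providers(value: str) -> list:
    """Parse an assumed comma-separated provider list, e.g. 'claude,gemini'."""
    names = [n.strip() for n in value.split(",") if n.strip()]
    if not names or any(n not in ALL_PROVIDERS for n in names):
        raise argparse.ArgumentTypeError(
            f"expected a comma-separated subset of {ALL_PROVIDERS}, got {value!r}"
        )
    return names


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this document."""
    parser = argparse.ArgumentParser(prog="brainstorm")
    parser.add_argument("topic")
    parser.add_argument("--quick", action="store_true")
    parser.add_argument("--bias", choices=["novelty", "balanced", "practical"])
    parser.add_argument("--providers", type=parse_providers,
                        default=list(ALL_PROVIDERS))
    return parser


def needs_steering(args: argparse.Namespace, stdin_is_tty: bool) -> bool:
    """The checkpoint runs only when a human can answer and no preset was given."""
    return stdin_is_tty and not args.quick and args.bias is None


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    mode = "interactive" if needs_steering(parsed, sys.stdin.isatty()) else "unattended"
    print(f"{mode} run with providers: {', '.join(parsed.providers)}")
```

Passing the TTY state in as a parameter, rather than reading it inside `needs_steering`, keeps the decision testable in pipelines where stdin is never a terminal.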