P1.3: surface Healthcare.gov as second case by MrBinnacle · Pull Request #13 · MrBinnacle/azimuth-testbed

MrBinnacle · 2026-05-27T16:10:47Z

What

Lifts the testbed from single-case Boeing to the n=2 framing locked on the launch board.

New `Healthcare.gov Pre-launch` prompt variant. Prompt text lifted verbatim from the canonical case study's pre-October-2013 input.
New `Case: Healthcare.gov` preamble in the left panel below the existing Boeing preamble. Same Decision / Outcome / What-to-look-for shape. Surfaces calibration: 5/6 recall, 0 false positives, 1 disclosed miss.
Orientation banner updated from singular Boeing to plural "two known-outcome failures."
README section renamed from "The Boeing methodology runs" to "Methodology calibration — two cases", with a Boeing subsection and a new Healthcare.gov subsection.
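The calibration line in the Healthcare.gov preamble (5/6 recall, 0 false positives, 1 disclosed miss) is simple arithmetic over three counts. A minimal sketch of that arithmetic follows. The helper name `summarizeCalibration` and its field names are hypothetical and are not taken from `testbed/App.jsx`; only the counts come from this PR.

```javascript
// Hypothetical helper (not from the testbed source): derive the
// calibration figures quoted in the preamble from raw counts.
function summarizeCalibration({ documented, surfaced, falsePositives }) {
  if (documented <= 0 || surfaced < 0 || surfaced > documented) {
    throw new RangeError("expected 0 <= surfaced <= documented, documented > 0");
  }
  return {
    label: `${surfaced}/${documented}`, // wording used in the preamble
    recall: surfaced / documented,      // share of documented causes surfaced
    missed: documented - surfaced,      // disclosed misses
    falsePositives,
  };
}

// Counts stated in this PR for the Healthcare.gov case.
const healthcare = summarizeCalibration({
  documented: 6,
  surfaced: 5,
  falsePositives: 0,
});

console.log(healthcare.label, healthcare.recall.toFixed(3), healthcare.missed);
// prints: 5/6 0.833 1
```

Keeping the miss count next to the recall figure makes it harder to quote the score without the disclosed miss.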

Preview

https://deploy-preview-p13--azimuth-testbed.netlify.app

Cold-read verified in deployed bundle:

`Case: Healthcare.gov` → 1
`Healthcare.gov Pre-launch` (variant in prompt list) → 1
`5/6 documented` (calibration score) → 1
`two known-outcome failures` (orientation banner) → 1
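The cold-read check above amounts to counting how often each marker string occurs in the built bundle and expecting exactly one hit per marker. A rough sketch of such a check follows. The function names and the sample bundle string are illustrative assumptions; the PR does not say which tool produced the counts, and only the four marker strings are taken from the list above.

```javascript
// Count non-overlapping occurrences of a marker string in a text blob.
function countOccurrences(text, marker) {
  if (marker.length === 0) {
    throw new Error("marker must be non-empty");
  }
  let count = 0;
  let index = text.indexOf(marker);
  while (index !== -1) {
    count += 1;
    index = text.indexOf(marker, index + marker.length);
  }
  return count;
}

// Marker strings listed in the PR description.
const EXPECTED_MARKERS = [
  "Case: Healthcare.gov",
  "Healthcare.gov Pre-launch",
  "5/6 documented",
  "two known-outcome failures",
];

// Hypothetical check: every marker should appear exactly once.
function verifyBundle(bundleText) {
  return EXPECTED_MARKERS.map((marker) => {
    const count = countOccurrences(bundleText, marker);
    return { marker, count, ok: count === 1 };
  });
}

// Stand-in for the real bundle contents (assumption for illustration).
const sampleBundle =
  'title:"Case: Healthcare.gov",variant:"Healthcare.gov Pre-launch",' +
  'score:"5/6 documented",banner:"two known-outcome failures"';

const report = verifyBundle(sampleBundle);
console.log(report.every((row) => row.ok));
```

In practice the bundle text would be read from the build output or fetched from the deploy preview rather than hard-coded.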

Claims-ledger compliance

"5/6 documented causes surfaced; 1 miss disclosed" matches the ledger's allowed wording for the Healthcare calibration row.
No "caught Healthcare.gov" or "predicted Healthcare.gov." The README spells out why over-claiming would be a credibility tax.
"Calibration exhibit, not a benchmark or validation" framing preserved on both surfaces.
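The wording rules above could also be checked mechanically against the README and UI copy. The sketch below is an assumption about how such a check might look, not a description of how the claims ledger is actually enforced. The two forbidden phrases are the ones named in this PR; `findForbiddenPhrases` is a hypothetical helper.

```javascript
// Phrases this PR says must not appear in user-facing copy.
const FORBIDDEN_PHRASES = [
  "caught Healthcare.gov",
  "predicted Healthcare.gov",
];

// Hypothetical helper: return every forbidden phrase found in the copy,
// matching case-insensitively.
function findForbiddenPhrases(copy) {
  const lowered = copy.toLowerCase();
  return FORBIDDEN_PHRASES.filter((phrase) =>
    lowered.includes(phrase.toLowerCase())
  );
}

// Wording the ledger allows, per the PR description.
const compliant = "5/6 documented causes surfaced; 1 miss disclosed";
// An over-claim of the kind the README warns against.
const overclaim = "The skill caught Healthcare.gov before launch.";

console.log(findForbiddenPhrases(compliant)); // prints: []
console.log(findForbiddenPhrases(overclaim)); // prints: [ 'caught Healthcare.gov' ]
```

A substring check like this only catches the exact phrases listed, so it complements a human cold-read rather than replacing it.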

Verified

`cd testbed && npm run build` → green (192.63 KB bundle, 566 ms)
Healthcare prompt text is the canonical input from the shipped case study; running on Opus 4.5 should reproduce DELAY PENDING EVIDENCE with the documented Critical Risks.

Do not merge until

Cold-read on preview confirms two preambles stack cleanly on both wide and narrow viewports.
Matthew approves. Merge to `main` + manual `netlify deploy --prod` ships to production (Git→Netlify still not wired per E2).

🤖 Generated with Claude Code

Lifts the testbed from a single-case (Boeing) demo to the n=2 framing locked in the launch board. The Healthcare case study already exists in MrBinnacle/azimuth/examples/case-study-healthcare-gov.md; this PR makes it runnable from the testbed and references it in the README.

Changes

- testbed/App.jsx PROMPT_VARIANTS: adds "Healthcare.gov Pre-launch" variant. Prompt text lifted verbatim from the case study's "Input (as presented, circa September 2013)" section so the verdict the user gets matches the documented run.
- testbed/App.jsx left-panel preamble: adds a "Case: Healthcare.gov" preamble below the existing Boeing preamble. Same Decision / Outcome / What-to-look-for shape. Surfaces calibration score (5/6 recall, 0 false positives, 1 disclosed miss).
- testbed/App.jsx orientation banner: updates the "what you're looking at" card from singular "Boeing's 2011 decision" to plural "two known-outcome failures — Boeing's 2011 decision and the October 2013 launch of Healthcare.gov". Updates the "this testbed is a calibration exhibit on a known-outcome failure" wording to "two known-outcome failures".
- README.md: section title renamed from "The Boeing methodology runs" to "Methodology calibration — two cases" with a Boeing subsection and a new Healthcare.gov subsection. Healthcare subsection states the 5/6 recall, 0 false positives, 1 miss explicitly and links to the case study file in the skill repo.

Claims-ledger compliance

- "5/6 documented causes surfaced; 1 miss disclosed" wording matches the ledger's allowed form for the Healthcare.gov calibration row.
- No "caught Healthcare.gov" or "predicted Healthcare.gov" wording. The README spells out explicitly why over-claiming would be a credibility tax.
- "calibration exhibit, not a benchmark or validation" framing preserved on both surfaces.

Verified

- npm run build -> green (192.63 KB bundle)
- The Healthcare prompt is the canonical input from the shipped case study; running it on Opus 4.5 should reproduce DELAY PENDING EVIDENCE with the documented Critical Risks.

Do not merge until

- Cold-read of the deploy preview confirms two preambles render cleanly in the left panel without overflow on the 1200px+ layout and the < 640px stacked layout.
- Matthew approves (merge to main + manual netlify deploy --prod ships to production).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

MrBinnacle · 2026-06-05T19:54:43Z

Held pending publication of the Substack article. Disposition will be revisited after the launch sequence completes. Healthcare.gov as a second case does not gate the Substack article (which references only Boeing), but it is in scope for the post-launch fast-follow window. Not abandoned.

- .gitignore: add .day1-state.json, graphify-out/, drafts/ (paste-and-ship surfaces, not versioned product)
- ci.yml: add weekly Monday 12:00 UTC cron trigger so npm audit runs against the locked deps without waiting for a push. Closes the false-negative trap where commit 58332db flipped CI red because a Critical vitest alert was published between pushes and the audit step only fires on push by default.

Session-close housekeeping companions (no source changes, not in this commit):

- Filed issue #19 (React 19 + plugin-react 6 + Vite 8 coordinated trio merge)
- Filed issue #20 (CI Node 20 → 22 bump before 2026-06-16)
- Closed PRs #1, #2, #4, #5 with @dependabot ignore-major directives + cross-ref to umbrella issue #19
- Posted held-pending-Substack-publish disposition on PRs #13, #14

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

MrBinnacle wants to merge 1 commit into `main` from `launch/surface-healthcare-case`.
