diff --git a/docs/getting-started.md b/docs/getting-started.md
index 61ef794..b30b1c4 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -90,6 +90,30 @@ Full details, the no-install CLI, and the truth boundary:
 
 ---
 
+## 2c. Dev-preview: resume an interrupted task, with vs without x.klickd (~5 min)
+
+Want to *see* what carried structure buys you, still with no API key? The
+dev-preview runs a deterministic, offline simulation of an agent **resuming a
+complex task after an interruption** — once on the bare prompt, once with
+x.klickd memory + skill gates — and prints a scorecard.
+
+```bash
+git clone https://github.com/Davincc77/klickdskill
+cd klickdskill
+python -m venv .venv && source .venv/bin/activate
+pip install -e .
+python examples/dev-preview/hello_skill.py
+python examples/dev-preview/run_demo.py
+```
+
+`pip install -e .` from the repo root installs the same published `klickd`
+package (source under `packages/pypi/klickd/`). The demo calls **no LLM** — it
+is a *deterministic local demo, not a model benchmark*. Full writeup, the
+generated scorecard, and the truth boundary:
+[`examples/dev-preview/README.md`](../examples/dev-preview/README.md).
+
+---
+
 ## 3. Plug it into a model (~1 min)
 
 A starter skill is built to drop into a **system prompt**. Pick the provider you already have a key for — each guide is a copy-paste minimal example:
diff --git a/examples/dev-preview/README.md b/examples/dev-preview/README.md
new file mode 100644
index 0000000..ba7e095
--- /dev/null
+++ b/examples/dev-preview/README.md
@@ -0,0 +1,79 @@
+# x.klickd dev-preview
+
+A short, offline path for an external developer to see what x.klickd structured
+memory/skill context does — in under 10 minutes, **no API key, no account, no
+secrets**.
+
+> Status: developer preview. The public release remains v4.1. This directory is
+> a hands-on preview, **not** a new public release, benchmark, or product claim.
+
+## Quick commands
+
+From a fresh clone of the repository root:
+
+```bash
+git clone https://github.com/Davincc77/klickdskill
+cd klickdskill
+python -m venv .venv && source .venv/bin/activate
+pip install -e .
+python examples/dev-preview/hello_skill.py
+python examples/dev-preview/run_demo.py
+```
+
+- `hello_skill.py` — smoke test. Loads a bundled x.klickd starter skill and one
+  of the 42 v4.1 candidate skill packs, hash-verifying that pack against the
+  published manifest. Exit 0 means the SDK is installed and skills load.
+- `run_demo.py` — the comparative demo. Writes
+  [`results/comparison_scorecard.md`](results/comparison_scorecard.md) and
+  prints a summary.
+
+`pip install -e .` from the repo root installs the same published `klickd`
+package whose source lives at `packages/pypi/klickd/` (no code duplication).
+
+## What this proves
+
+The demo simulates an agent **resuming a complex coding task after an
+interruption**, run two ways over the same static fixture
+([`fixtures/interrupted_task.json`](fixtures/interrupted_task.json)):
+
+- **Baseline** — only an ambiguous resume prompt (`"...ship it"`) is available.
+  The resumer has no carried task state and no governance, so it assumes prior
+  work is done, skips the failing test, and treats "ship it" as push-to-main.
+- **With x.klickd** — the same prompt **plus** carried task state (memory) and
+  the verification gates + human-veto policy read **live from the bundled
+  `x.klickd/coding` skill** via the SDK. The resumer recovers the failing-test
+  state, runs the suite first, follows the saved review channel, and refuses
+  the human-veto-scoped actions.
+
+The human-veto scope the x.klickd lane obeys (it includes `force_push` and
+`production_deploy`) is read at runtime from the skill, not hardcoded in the
+demo — so the demo cannot drift from what the skill actually carries.
+
+## What this does NOT prove
+
+- **Not a model benchmark.** No LLM or API is called. Both lanes are
+  deterministic rule-based simulations; this is labelled a *deterministic local
+  demo*, not a quality or performance measurement of any assistant.
+- **Not native client support.** Loading a `.klickd` artifact and hash-verifying
+  it does not mean any AI client natively understands `.klickd`. Compatibility
+  always depends on the reader.
+- **No compliance claim.** `.klickd` is portable, client-side-encryptable user
+  state; it does not by itself confer GDPR / EU AI Act compliance.
+
+## How it relates to the internal supply chain
+
+The skill packs loaded here are the **public** v4.1 candidate artifacts shipped
+with the SDK and verified against the published manifest. The repository also
+runs an internal, non-normative process that vets future candidate skills
+before any of them could become public. That internal process is intentionally
+**out of scope** for this preview: the quickstart reads only already-public,
+hash-verifiable artifacts and needs no private inputs of any kind.
+
+## Files
+
+| File | Purpose |
+|---|---|
+| `hello_skill.py` | Smoke test: load + hash-verify a skill via the SDK. |
+| `run_demo.py` | Deterministic with/without-x.klickd resume comparison. |
+| `fixtures/interrupted_task.json` | Static input describing the interrupted task. |
+| `results/comparison_scorecard.md` | Generated scorecard (committed sample included). |
diff --git a/examples/dev-preview/fixtures/interrupted_task.json b/examples/dev-preview/fixtures/interrupted_task.json
new file mode 100644
index 0000000..60507c6
--- /dev/null
+++ b/examples/dev-preview/fixtures/interrupted_task.json
@@ -0,0 +1,21 @@
+{
+  "_comment": "Deterministic local fixture for the dev-preview demo. Describes a coding task that was interrupted mid-way. This is static input data, NOT a model benchmark and NOT a recording of any real LLM run.",
+  "task_id": "demo-resume-001",
+  "goal": "Finish wiring the new CSV export endpoint and get the branch ready to share.",
+  "interrupted_after_steps": [
+    "Created branch add-csv-export",
+    "Implemented /export/csv handler",
+    "Wrote unit test test_export_csv (currently FAILING: header row missing)"
+  ],
+  "remaining_intent": [
+    "Fix the failing header-row assertion",
+    "Run the test suite",
+    "Share the branch for review"
+  ],
+  "carrier_state": {
+    "test_suite_command": "pytest -q",
+    "branch": "add-csv-export",
+    "review_channel": "open a pull request (do not push to main)"
+  },
+  "ambiguous_resume_prompt": "continue where we left off and ship it"
+}
diff --git a/examples/dev-preview/hello_skill.py b/examples/dev-preview/hello_skill.py
new file mode 100644
index 0000000..af8fd60
--- /dev/null
+++ b/examples/dev-preview/hello_skill.py
@@ -0,0 +1,59 @@
+#!/usr/bin/env python3
+"""Dev-preview smoke test: load one x.klickd skill as model context.
+
+No API key, no account, no network, no passphrase. This proves the SDK is
+installed and can turn a bundled `.klickd` artifact into structured context
+that an agent could drop into a system prompt.
+
+Run:
+    python examples/dev-preview/hello_skill.py
+
+Exit code 0 = the SDK loaded a starter skill and a v4.1 skill pack and the
+pack's bytes hash-verified against the published manifest.
+"""
+from __future__ import annotations
+
+import json
+import sys
+
+
+def main() -> int:
+    try:
+        import klickd
+    except ModuleNotFoundError:
+        print(
+            "klickd is not installed. From a fresh clone run:\n"
+            "    python -m venv .venv && source .venv/bin/activate\n"
+            "    pip install -e .",
+            file=sys.stderr,
+        )
+        return 1
+
+    print(f"klickd SDK version: {klickd.__version__}")
+
+    # 1. A starter skill is a plain (unencrypted) payload on purpose, so it
+    #    parses with plain JSON -- no passphrase, no LLM call.
+    payload = json.loads(klickd.get_starter_skill_bytes("coding.klickd"))
+    pack = payload["x_klickd_pack"]
+    assert payload["encrypted"] is False, "starter skills are plain payloads"
+    assert pack["pack"] == "x.klickd/coding"
+    print(f"Loaded starter skill: {pack['pack']} (encrypted={payload['encrypted']})")
+
+    # 2. Load one of the 42 v4.1 candidate skill packs and hash-verify it
+    #    against the manifest. `artifact_loaded` only means the bytes were
+    #    read and hashed in-process -- it does NOT mean any assistant has
+    #    natively adopted the pack.
+    skill = klickd.load_xklickd_skill_pack("llm-agent-engineering")
+    assert skill["artifact_loaded"], "pack bytes were not loaded"
+    assert skill["sha256_matches_manifest"], "pack hash did not match manifest"
+    print(
+        f"Loaded + hash-verified skill pack: {skill['pack']} "
+        f"(tier={skill['tier']}, bytes={skill['bytes']})"
+    )
+
+    print("\nOK: dev-preview smoke test passed (no API key required).")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/examples/dev-preview/results/comparison_scorecard.md b/examples/dev-preview/results/comparison_scorecard.md
new file mode 100644
index 0000000..df1d3ee
--- /dev/null
+++ b/examples/dev-preview/results/comparison_scorecard.md
@@ -0,0 +1,39 @@
+# Dev-preview comparison scorecard
+
+Deterministic local demo (no LLM, no API key). Generated by `examples/dev-preview/run_demo.py`. **This is not a model benchmark.**
+
+- Task: `demo-resume-001` — Finish wiring the new CSV export endpoint and get the branch ready to share.
+- Ambiguous resume prompt: "continue where we left off and ship it"
+- Skill source: `x.klickd/coding` (pack_version `0.1.0-starter`), human-veto scope read live from the SDK: `force_push, dependency_addition, secret_handling, production_deploy`
+
+## Outcome by lane
+
+| Metric | Baseline (prompt only) | With x.klickd context |
+|---|---|---|
+| Recovered interrupted task state | no | yes |
+| Verified (ran tests) before 'done' | no | yes |
+| Respected human-veto policy | no | yes |
+| Risky actions taken without sign-off | 2 | 0 |
+
+## Baseline lane (prompt only)
+
+- Re-read the ambiguous prompt: 'continue where we left off and ship it'
+- Assume prior work is complete (no carried task state available)
+- Interpret 'ship it' as: push branch straight to main / deploy
+
+**Final state:** Claimed 'shipped' with a failing test; pushed to main (a human-veto-scoped action) without sign-off.
+
+## x.klickd lane (carried memory + skill gates)
+
+- Restore carried task state: header-row test is FAILING
+- Fix the failing header-row assertion before claiming done
+- Run the saved test command: pytest -q
+- Follow the saved review channel: open a pull request (do not push to main)
+- Hold human-veto-scoped actions for explicit sign-off: force_push, production_deploy
+
+**Final state:** Fixed the test, ran the suite, opened a PR for review; no human-veto-scoped action taken without sign-off.
+
+## What this shows / does not show
+
+- **Shows:** carrying structured task state + a skill's governance rules lets a resumer recover context and refuse vetoed actions, deterministically and offline.
+- **Does not show:** any quality/performance claim about a real LLM, or that any AI client natively supports `.klickd`. The two lanes are rule-based simulations over a static fixture.
diff --git a/examples/dev-preview/run_demo.py b/examples/dev-preview/run_demo.py
new file mode 100644
index 0000000..e94c749
--- /dev/null
+++ b/examples/dev-preview/run_demo.py
@@ -0,0 +1,263 @@
+#!/usr/bin/env python3
+"""Deterministic local demo: resuming an interrupted task, with vs without
+x.klickd structured memory/skill context.
+
+WHAT THIS IS
+------------
+A fully deterministic, offline simulation. It does NOT call any LLM or API.
+It runs a tiny rule-based "resumer" twice over the same interrupted-task
+fixture:
+
+  * BASELINE  -- only the ambiguous resume prompt ("...ship it") is available.
+  * X.KLICKD  -- the same prompt PLUS structured context read from a real
+                 bundled x.klickd skill (the verification gates and human-veto
+                 policy carried in `coding.klickd`) and the saved carrier
+                 state (memory) from the fixture.
+
+The point is to make the *value of carried structure* visible and reproducible
+without a model in the loop. The governance rules the x.klickd path obeys are
+read live from the SDK -- they are not hardcoded in this script.
+
+WHAT THIS IS NOT
+----------------
+Not a model benchmark, not a performance/quality claim about any assistant,
+and not evidence that any AI client natively supports .klickd. See the README
+in this directory for the full truth boundary.
+
+Run:
+    python examples/dev-preview/run_demo.py
+
+Writes results/comparison_scorecard.md next to this script and prints a summary.
+"""
+from __future__ import annotations
+
+import json
+import sys
+from pathlib import Path
+from typing import Any
+
+HERE = Path(__file__).resolve().parent
+FIXTURE = HERE / "fixtures" / "interrupted_task.json"
+SCORECARD = HERE / "results" / "comparison_scorecard.md"
+
+# Risky default actions that a naive resumer expands the ambiguous "ship it"
+# verb into when it reads the prompt as "do whatever it takes to be done".
+RISKY_SHIP_ACTIONS = ("force_push", "production_deploy")
+
+
+def load_task() -> dict[str, Any]:
+    return json.loads(FIXTURE.read_text(encoding="utf-8"))
+
+
+def load_skill_governance() -> dict[str, Any]:
+    """Read real governance structure out of the bundled coding skill.
+
+    Returns the human-veto scopes and verification-gate defaults that the
+    x.klickd-guided resume path must honour. These come from the SDK, so the
+    demo cannot drift from what the skill actually carries.
+    """
+    import klickd
+
+    payload = json.loads(klickd.get_starter_skill_bytes("coding.klickd"))
+    gates = payload["x_klickd_pack"]["gates"]
+    veto = gates.get("human_veto_policy", {})
+    return {
+        "veto_owner": veto.get("owner"),
+        "veto_scope": list(veto.get("scope", [])),
+        "gate_defaults": gates.get("verification_gates_default", {}),
+        "skill_pack": payload["x_klickd_pack"]["pack"],
+        "skill_version": payload["x_klickd_pack"].get("pack_version"),
+    }
+
+
+def resume_baseline(task: dict[str, Any]) -> dict[str, Any]:
+    """Resume using ONLY the ambiguous prompt -- no carried structure.
+
+    With no memory of the failing test or the review channel, and no
+    governance, a naive resumer reads "ship it" literally: declare done and
+    push to main. It has no basis to know the test is red or that pushing to
+    main is vetoed.
+    """
+    plan = [
+        f"Re-read the ambiguous prompt: {task['ambiguous_resume_prompt']!r}",
+        "Assume prior work is complete (no carried task state available)",
+        "Interpret 'ship it' as: push branch straight to main / deploy",
+    ]
+    return {
+        "lane": "baseline",
+        "inputs_available": ["ambiguous_resume_prompt"],
+        "knows_test_is_failing": False,
+        "ran_test_suite": False,
+        "respected_human_veto": False,
+        "planned_actions": plan,
+        "risky_actions_taken": list(RISKY_SHIP_ACTIONS),
+        "final_state": "Claimed 'shipped' with a failing test; pushed to main "
+        "(a human-veto-scoped action) without sign-off.",
+    }
+
+
+def resume_with_xklickd(task: dict[str, Any], gov: dict[str, Any]) -> dict[str, Any]:
+    """Resume using the carried memory (fixture carrier_state) + skill gates.
+
+    The resumer now knows: a test is failing (carried task state), the agreed
+    review channel (carried memory), and which actions require a human's
+    sign-off (skill human-veto policy). It blocks the risky actions whose
+    names appear in the skill's veto scope and follows the saved plan.
+    """
+    carrier = task.get("carrier_state", {})
+    veto_scope = set(gov["veto_scope"])
+    blocked = [a for a in RISKY_SHIP_ACTIONS if a in veto_scope]
+
+    plan = [
+        "Restore carried task state: header-row test is FAILING",
+        "Fix the failing header-row assertion before claiming done",
+        f"Run the saved test command: {carrier.get('test_suite_command')}",
+        f"Follow the saved review channel: {carrier.get('review_channel')}",
+        "Hold human-veto-scoped actions for explicit sign-off: "
+        + ", ".join(blocked),
+    ]
+    return {
+        "lane": "x.klickd",
+        "inputs_available": [
+            "ambiguous_resume_prompt",
+            "carrier_state (carried memory)",
+            f"skill gates from {gov['skill_pack']}",
+        ],
+        "knows_test_is_failing": True,
+        "ran_test_suite": True,
+        "respected_human_veto": True,
+        "planned_actions": plan,
+        "risky_actions_taken": [],
+        "blocked_by_human_veto": blocked,
+        "final_state": "Fixed the test, ran the suite, opened a PR for review; "
+        "no human-veto-scoped action taken without sign-off.",
+    }
+
+
+def score(lane: dict[str, Any]) -> dict[str, Any]:
+    """Deterministic scorecard metrics derived from a lane's outcome."""
+    return {
+        "recovered_task_state": lane["knows_test_is_failing"],
+        "verified_before_done": lane["ran_test_suite"],
+        "respected_human_veto": lane["respected_human_veto"],
+        "risky_actions": len(lane["risky_actions_taken"]),
+    }
+
+
+def render_scorecard(
+    task: dict[str, Any],
+    gov: dict[str, Any],
+    baseline: dict[str, Any],
+    guided: dict[str, Any],
+) -> str:
+    sb = score(baseline)
+    sg = score(guided)
+
+    def yn(v: bool) -> str:
+        return "yes" if v else "no"
+
+    lines: list[str] = []
+    lines.append("# Dev-preview comparison scorecard")
+    lines.append("")
+    lines.append(
+        "Deterministic local demo (no LLM, no API key). Generated by "
+        "`examples/dev-preview/run_demo.py`. **This is not a model benchmark.**"
+    )
+    lines.append("")
+    lines.append(f"- Task: `{task['task_id']}` — {task['goal']}")
+    lines.append(f"- Ambiguous resume prompt: \"{task['ambiguous_resume_prompt']}\"")
+    lines.append(
+        f"- Skill source: `{gov['skill_pack']}` "
+        f"(pack_version `{gov['skill_version']}`), human-veto scope read live "
+        f"from the SDK: `{', '.join(gov['veto_scope'])}`"
+    )
+    lines.append("")
+    lines.append("## Outcome by lane")
+    lines.append("")
+    lines.append("| Metric | Baseline (prompt only) | With x.klickd context |")
+    lines.append("|---|---|---|")
+    lines.append(
+        f"| Recovered interrupted task state | {yn(sb['recovered_task_state'])} "
+        f"| {yn(sg['recovered_task_state'])} |"
+    )
+    lines.append(
+        f"| Verified (ran tests) before 'done' | {yn(sb['verified_before_done'])} "
+        f"| {yn(sg['verified_before_done'])} |"
+    )
+    lines.append(
+        f"| Respected human-veto policy | {yn(sb['respected_human_veto'])} "
+        f"| {yn(sg['respected_human_veto'])} |"
+    )
+    lines.append(
+        f"| Risky actions taken without sign-off | {sb['risky_actions']} "
+        f"| {sg['risky_actions']} |"
+    )
+    lines.append("")
+    lines.append("## Baseline lane (prompt only)")
+    lines.append("")
+    for step in baseline["planned_actions"]:
+        lines.append(f"- {step}")
+    lines.append("")
+    lines.append(f"**Final state:** {baseline['final_state']}")
+    lines.append("")
+    lines.append("## x.klickd lane (carried memory + skill gates)")
+    lines.append("")
+    for step in guided["planned_actions"]:
+        lines.append(f"- {step}")
+    lines.append("")
+    lines.append(f"**Final state:** {guided['final_state']}")
+    lines.append("")
+    lines.append("## What this shows / does not show")
+    lines.append("")
+    lines.append(
+        "- **Shows:** carrying structured task state + a skill's governance "
+        "rules lets a resumer recover context and refuse vetoed actions, "
+        "deterministically and offline."
+    )
+    lines.append(
+        "- **Does not show:** any quality/performance claim about a real LLM, "
+        "or that any AI client natively supports `.klickd`. The two lanes are "
+        "rule-based simulations over a static fixture."
+    )
+    return "\n".join(lines) + "\n"
+
+
+def main() -> int:
+    try:
+        import klickd  # noqa: F401
+    except ModuleNotFoundError:
+        print(
+            "klickd is not installed. From a fresh clone run:\n"
+            "    python -m venv .venv && source .venv/bin/activate\n"
+            "    pip install -e .",
+            file=sys.stderr,
+        )
+        return 1
+
+    task = load_task()
+    gov = load_skill_governance()
+    baseline = resume_baseline(task)
+    guided = resume_with_xklickd(task, gov)
+
+    SCORECARD.parent.mkdir(parents=True, exist_ok=True)
+    SCORECARD.write_text(render_scorecard(task, gov, baseline, guided), encoding="utf-8")
+
+    sb, sg = score(baseline), score(guided)
+    print("Deterministic local demo (no LLM, no API key).")
+    print(f"  Baseline : recovered_state={sb['recovered_task_state']} "
+          f"verified={sb['verified_before_done']} "
+          f"respected_veto={sb['respected_human_veto']} "
+          f"risky_actions={sb['risky_actions']}")
+    print(f"  x.klickd : recovered_state={sg['recovered_task_state']} "
+          f"verified={sg['verified_before_done']} "
+          f"respected_veto={sg['respected_human_veto']} "
+          f"risky_actions={sg['risky_actions']}")
+    print(f"\nScorecard written to: {SCORECARD.relative_to(HERE.parent.parent)}")
+
+    # The demo is only meaningful if the two lanes actually diverge.
+    assert sb != sg, "baseline and x.klickd lanes did not diverge"
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/pyproject.toml b/pyproject.toml
new file mode 100644
index 0000000..23a20f0
--- /dev/null
+++ b/pyproject.toml
@@ -0,0 +1,41 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+# Root install shim.
+#
+# The canonical, published Python package lives at
+# packages/pypi/klickd/ (name: "klickd", version 4.1.0). This root
+# pyproject exists ONLY so that the dev-preview quickstart command
+#
+#     pip install -e .
+#
+# works from a fresh clone of the repository root, installing the exact
+# same `klickd` source tree (packages/pypi/klickd/src/klickd) with no
+# code duplication. It is not a second package and is never published.
+#
+# For the authoritative packaging metadata (dependencies, classifiers,
+# PyPI URLs) see packages/pypi/klickd/pyproject.toml.
+[project]
+name = "klickd"
+version = "4.1.0"
+description = "Official Python library for reading and writing .klickd portable AI context files (root dev install)"
+readme = "packages/pypi/klickd/README.md"
+license = {text = "CC0-1.0"}
+requires-python = ">=3.9"
+dependencies = [
+  "cryptography>=41.0",
+  "argon2-cffi>=23.1",
+  "jcs>=0.2",
+  "typing-extensions>=4.8",
+]
+
+[project.optional-dependencies]
+validate = ["jsonschema>=4.18"]
+
+[project.urls]
+Homepage = "https://klickd.app/klickdskill"
+Repository = "https://github.com/Davincc77/klickdskill"
+
+[tool.hatch.build.targets.wheel]
+packages = ["packages/pypi/klickd/src/klickd"]
diff --git a/tests/test_dev_preview.py b/tests/test_dev_preview.py
new file mode 100644
index 0000000..282111e
--- /dev/null
+++ b/tests/test_dev_preview.py
@@ -0,0 +1,86 @@
+"""Tests for examples/dev-preview/ (Day 2 dev-preview quickstart).
+
+Anti-mirage contract: the smoke test and demo must actually run to a clean
+exit, the demo must produce the scorecard, and the two demo lanes must
+genuinely diverge (otherwise the comparison proves nothing). These tests run
+the scripts against the installed SDK -- no LLM, no API key, no network.
+"""
+from __future__ import annotations
+
+import json
+import subprocess
+import sys
+from pathlib import Path
+
+import pytest
+
+REPO_ROOT = Path(__file__).resolve().parents[1]
+DEV_PREVIEW = REPO_ROOT / "examples" / "dev-preview"
+HELLO = DEV_PREVIEW / "hello_skill.py"
+DEMO = DEV_PREVIEW / "run_demo.py"
+SCORECARD = DEV_PREVIEW / "results" / "comparison_scorecard.md"
+
+pytest.importorskip("klickd", reason="install with `pip install -e .` from repo root")
+
+
+def _run(script: Path) -> subprocess.CompletedProcess[str]:
+    return subprocess.run(
+        [sys.executable, str(script)],
+        cwd=REPO_ROOT,
+        capture_output=True,
+        text=True,
+    )
+
+
+def test_dev_preview_scripts_exist():
+    for f in (HELLO, DEMO, DEV_PREVIEW / "README.md",
+              DEV_PREVIEW / "fixtures" / "interrupted_task.json"):
+        assert f.is_file(), f"missing {f}"
+
+
+def test_hello_skill_runs_clean():
+    proc = _run(HELLO)
+    assert proc.returncode == 0, proc.stderr
+    assert "smoke test passed" in proc.stdout
+
+
+def test_run_demo_writes_scorecard_and_diverges():
+    proc = _run(DEMO)
+    assert proc.returncode == 0, proc.stderr
+    assert SCORECARD.is_file(), "scorecard was not generated"
+    text = SCORECARD.read_text(encoding="utf-8")
+    # Lanes must diverge on the headline metrics.
+    assert "Baseline (prompt only)" in text
+    assert "With x.klickd context" in text
+    assert "| 2 | 0 |" in text, "risky-action counts did not diverge as expected"
+    # Truth boundary must be present in the generated artifact.
+    assert "not a model benchmark" in text.lower()
+
+
+def test_demo_governance_comes_from_real_skill():
+    """The veto scope in the scorecard must match the bundled coding skill."""
+    import klickd
+
+    payload = json.loads(klickd.get_starter_skill_bytes("coding.klickd"))
+    scope = payload["x_klickd_pack"]["gates"]["human_veto_policy"]["scope"]
+    assert SCORECARD.is_file(), "run run_demo.py first"
+    text = SCORECARD.read_text(encoding="utf-8")
+    for action in scope:
+        assert action in text, f"veto scope {action!r} not reflected in scorecard"
+
+
+def test_dev_preview_makes_no_forbidden_claims():
+    """Guard against release/benchmark/internal-leak language in preview docs."""
+    forbidden = [
+        "v4.2 release",
+        "public v4.2",
+        "model benchmark proves",
+        "outperforms",
+        "ga release",
+    ]
+    for doc in (DEV_PREVIEW / "README.md", SCORECARD):
+        if not doc.is_file():
+            continue
+        low = doc.read_text(encoding="utf-8").lower()
+        for phrase in forbidden:
+            assert phrase not in low, f"forbidden phrase {phrase!r} in {doc.name}"