Release 1.0.0 — final (drop the rc) #9
…oc; C-8 preserved
Dogfood-#2 governance honesty (convention C-10), branch-local; merge/release gated on the filigree-first propagation. Capability confinement (proposed C-8) preserved throughout: operator signing keys stay out of agent reach, nothing is auto-provisioned/relocated, no MCP tool enables a cell or self-grants authority.
N3 (weft-df8d2ef454, C-10(c)): legis no longer ships dark and quiet:
- mcp.py _recovery_for: INVALID_CELL_SPEC names LEGIS_WARDLINE_CELL / LEGIS_WARDLINE_CELL_BY_SEVERITY (covers all WardlineRoutingError kinds, incl. those str(exc) misses); CELL_NOT_ENABLED split into the keyless simple tier (policy/cells.toml / LEGIS_POLICY_CELLS / LEGIS_DEV_DEFAULT_CELLS) and the complex tier (LEGIS_HMAC_KEY, operator out-of-band + relaunch). Subsumes Le1.
- doctor.py: two report-only checks (check_policy_cells, check_wardline_routing) naming the enablement path when unwired: presence-only, no repair param, write nothing, never render a key value. Fail-closed preserved (no auto-open).
N4 (weft-a7a92a40dd, C-10(d)): honest dirty-tree skip:
- WardlineDirtyTreeError.to_payload() is the single source both transports (mcp.py scan_route + api/app.py) serialize: structured reason/posture/cause/remediation, routed==[] (governs nothing). No scan_route call argument added; the LEGIS_WARDLINE_ALLOW_DIRTY dirty-snapshot opt-in stays an env-only operator switch.
C3 (weft-f506e5f845): charter now documents that legis's OWN audit records carry a self-asserted agent_id/operator_id (launch-bound + HMAC-tamper-evident, not authenticated); verified_author:null maps to those fields.
Guards: test_c8_no_agent_reachable_enablement_or_signing_surface (no enable/sign tool; scan_route schema locked) + doctor checks write-nothing/render-no-key test.
762 passed; ruff + mypy clean; coverage 92.30%; per-package floors hold; policy-boundary-check PASS; SEI oracle PASS.
Designed + adversarially red-teamed (C-8 verdict: safe) and implementation-reviewed via multi-agent workflows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntime Acceptance branch 1 of N3 (weft-df8d2ef454) — "a fresh stdio launch CAN reach a configured non-secret surface" — was only proven via injected-engine unit tests; the CHANGELOG and ticket comments assert "chill/coached reachable keyless" as fact. Add a test that exercises the REAL launch path: build_runtime() with no LEGIS_HMAC_KEY + the LEGIS_DEV_DEFAULT_CELLS=1 chill posture, then override_submit -> ACCEPTED_SELF via the lazy keyless _engine. A future change making _engine require a key now fails here instead of silently falsifying the promise. (Scan-route axis already pinned by test_scan_route_uses_server_owned_cell.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…48eb2)
Wardline renamed the per-finding output key `suppressed` -> `suppression_state` across all surfaces incl. the SIGNED legis scan artifact, changing the canonical signed bytes and breaking the Wardline->legis hop (wardline's opt-in legis_e2e oracle red by design). legis adopts the new key.
- ingest: WardlineFinding.from_wire reads `suppression_state`; the dataclass field, error message, and active_defects branches follow. Values unchanged (active/waived/suppressed/baselined/judged); the `Suppressed` enum (value vocabulary) and SUPPRESSION_PROOF_KEYS are untouched.
- clean break: a finding carrying only the legacy `suppressed` key reads as `active` and OVER-gates. This is fail-safe (never silently drops a real defect), pinned by test_legacy_suppressed_key_is_ignored_clean_break.
- NO signing/canonical change: legis's signer already reproduces Wardline's rekeyed golden byte-for-byte. Added the legis-side cross-impl golden MIRROR legis was missing: sign(_GOLDEN_FIELDS, _GOLDEN_KEY) == hmac-sha256:v2:2b2cf09… over `suppression_state`, so the hop self-verifies on both ends.
- intake fixtures: ~40 `suppressed` test fixtures across tests/wardline, tests/api, tests/mcp, tests/store renamed to `suppression_state` (a sweep flagged these to avoid vacuously-green suppression-path assertions).
Acceptance: legis 767 tests green; golden byte-agreement pinned; the live signed hop verifies: wardline's `-m legis_e2e` test_legis_accepts_signed_artifact PASSES against the reinstalled legis (real build_legis_artifact -> signed suppression_state artifact -> legis verifies + routes). Branch-only; ship via the filigree-gated rc4->main merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
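The clean-break read described above can be sketched as follows. This is a minimal illustration, not legis's `WardlineFinding`: the `Finding` class, the `rule_id` field, and the choice to raise on an unknown value are assumptions made for the example; only the key name, the value vocabulary, and the "legacy key reads as active" behaviour come from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping

# Value vocabulary quoted in the commit message.
_STATES = frozenset({"active", "waived", "suppressed", "baselined", "judged"})


@dataclass(frozen=True)
class Finding:
    """Hypothetical, simplified finding shape for illustration."""

    rule_id: str
    suppression_state: str

    @classmethod
    def from_wire(cls, raw: Mapping[str, Any]) -> "Finding":
        # Only the new key is consulted. A payload carrying just the legacy
        # `suppressed` key falls through to "active", so it over-gates
        # rather than silently dropping a defect.
        state = raw.get("suppression_state", "active")
        if state not in _STATES:
            raise ValueError(f"unknown suppression_state: {state!r}")
        return cls(rule_id=str(raw["rule_id"]), suppression_state=state)


def active_defects(findings: list[Finding]) -> list[Finding]:
    """Return the findings that still gate (state is 'active')."""
    return [f for f in findings if f.suppression_state == "active"]
```

The fail-safe direction is the design point: ignoring the legacy key can only make the gate stricter, never looser.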
… resolution check_policy_cells claimed to "mirror mcp._load_policy_cell_registry" but the root fallback differs: the resolver uses os.getcwd() when LEGIS_SOURCE_ROOT is unset, while doctor uses its passed-in root. The env precedence is faithfully mirrored; the root resolution is a deliberate difference (they coincide when doctor runs from the server's launch CWD). Tighten the docstring to say so. Docstring-only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…oute root (opp #6)
scan_route returned `{outcome: ROUTED, routed:[...]}` with no top-level posture field, so an agent relaying "governance passed" could not tell a keyless dev-grade pass (unverified/dirty) from a CI-signed `verified` pass: the posture was only buried in each routed record's provenance, and absent entirely when nothing routed. Same vacuous-green fidelity gap as wardline W2.
- `route_wardline_scan` now returns `RoutedScan(routed, artifact_status)` instead of a bare list, surfacing the scan-level `artifact_status` that `verify_wardline_artifact` already computes
- both surfaces echo it at the response root: the MCP `scan_route` tool and the HTTP `/scan-route` adapter (identical contract)
- new MCP test asserts a keyless unsigned scan echoes `artifact_status: "unverified"` at the top level; the exact-shape routing test gains the field
Closes gap-analysis opp #6.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
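A rough sketch of the return-shape change, assuming a plain dataclass and a dict response. `RoutedScan` and `artifact_status` are named in the commit message; the field types and the `scan_route_response` helper are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class RoutedScan:
    """Scan result carrying the scan-level posture next to the routed records."""

    routed: list[dict[str, Any]]
    artifact_status: str  # e.g. "verified" or "unverified"


def scan_route_response(scan: RoutedScan) -> dict[str, Any]:
    # Hypothetical adapter: echo the posture at the response root so it is
    # present even when `routed` is empty.
    return {
        "outcome": "ROUTED",  # shape quoted in the commit message
        "routed": scan.routed,
        "artifact_status": scan.artifact_status,
    }
```

Because the posture lives on the result object and not on each record, a scan that routes nothing still reports whether its artifact was verified.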
…filigree-scope check (N1)
Close two release-1.0 risk-audit gaps:
POLICY-1: a pinned, running evidence test could be disabled after the fact with @pytest.mark.skip / skipif / xfail. The fingerprint is blind to decorators (Q-L5 parity), so the drift check is byte-identical and cannot see the disablement. Add a highest-priority disabled-evidence judgement in the shared evaluate_test_evidence so both the runtime gate and the static boundary scanner reject it identically (new POLICY_BOUNDARY_TEST_DISABLED). Marker match is terminal-name based, so it catches the import-alias form (`from pytest import mark; @mark.skip`) whose only tell lives outside the function source the fingerprint sees.
N1: add report-only check_filigree_binding_scope to doctor: an unscoped federation-write binding in .mcp.json (/api/weft/… etc.) is fail-closed with HTTP 400 by a filigree server-mode daemon, so scans silently non-emit. Warn (not error; harmless against single-project/stdio) and name the offending URL + the scoped form to use.
/governance/lineage-integrity computed status as "unverified" if unavailable else "verified", ignoring integrity.divergences. A confirmed external tamper (divergence list populated) reported status="verified" — a false green at the top-level posture while the same payload carried the divergence. Three-way precedence: any divergence -> "diverged" (most severe, confirmed tamper) over "unverified" (can't check) over "verified". The existing divergence test pinned the divergences list but pointedly omitted the status assertion; pin status="diverged" so the false green cannot regress. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
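The three-way precedence reads naturally as a short function. This is a sketch only; the function name and parameter types are hypothetical, while the three status strings and their ordering come from the commit message.

```python
from __future__ import annotations


def lineage_status(unavailable: bool, divergences: list[object]) -> str:
    """Pick the top-level posture, most severe condition first."""
    if divergences:
        # Confirmed tamper outranks everything else.
        return "diverged"
    if unavailable:
        # Could not check: never report this as an all-clear.
        return "unverified"
    return "verified"
```

The pre-fix behaviour corresponds to skipping the first branch, which is why a populated divergence list could still surface as "verified".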
… anchor (AUD-1)
An attacker with DB-file write access could delete an audit record and re-chain the survivors undetectably: the hash chain is plain SHA (keyless, recomputable) and the HMAC bound record *content* but never its chain *position*, so every surviving signature still verified and the chain stayed internally consistent. service/governance.py already documented that whole-trail verify catches mutation but not deletion.
Two complementary, isolated mechanisms now close it:
* seq-binding (v3) + contiguity: interior delete and reorder. verify_integrity gains an expected-seq counter (a re-chained gap is now a tamper), and protected + sign-off verdicts sign at v3, folding the chain seq into the HMAC. A renumber-to-hide-a-deletion then fails to verify at the new position. seq is taken from the column at verify time, never a payload field. Resolved the sign-before-seq ordering with a store-mediated append_signed: the store reserves seq + prev_hash under its BEGIN IMMEDIATE lock and hands them to a signer callback, so the bound seq is provably the row's seq with no race. The store stays key-agnostic (the callback closes over the gate's key).
* HeadAnchor (opt-in): tail-truncation, the one thing seq-binding structurally cannot catch (a truncated head is legitimately last). A small HMAC-signed sidecar remembers the last (seq, chain_hash); a missing anchor on an anchored store fails closed. Wired as optional gate/verifier params, off by default: conceded-capability hardening that does not touch the 1.0 core.
The shared sign()/verify() primitive keeps its v2 default, so the cross-tool Wardline artifact contract and the binding ledger are byte-for-byte untouched. Binding ledger stays v2 (separate, homogeneous store) but is covered by the new contiguity check; renumber-within that store is a documented residual, as is the inherent renumber-vulnerability of an all-unsigned (chill/coached) run.
Tests: three attack PoCs, each isolating one mechanism (interior-delete-gap → contiguity; delete-and-renumber → v3 seq-HMAC; tail-truncate → anchor), plus HeadAnchor unit coverage (forged/missing/reappend/no-op) and a v3 signing pin. Full suite 793 passed, 2 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
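To make the two interior-tamper mechanisms concrete, here is a small sketch of seq-bound signing plus a contiguity walk. It assumes a JSON canonical form, a `hmac-sha256:v3:` prefix (by analogy with the v2 prefix quoted elsewhere in this PR), and a trail that starts at seq 1; none of these are confirmed details of legis's signer.

```python
from __future__ import annotations

import hashlib
import hmac
import json


def sign_v3(key: bytes, fields: dict, seq: int) -> str:
    """Sign record content together with its chain position (assumed format)."""
    payload = json.dumps(
        {"fields": fields, "seq": seq}, sort_keys=True, separators=(",", ":")
    ).encode()
    return "hmac-sha256:v3:" + hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_trail(key: bytes, rows: list[tuple[int, dict, str]]) -> list[str]:
    """Walk (seq, fields, signature) rows; report gaps and bad signatures."""
    problems: list[str] = []
    expected = 1  # assumption: the first record has seq 1
    for seq, fields, signature in rows:
        if seq != expected:
            problems.append(f"seq gap: expected {expected}, got {seq}")
        # seq comes from the row (the column), never from the payload.
        if not hmac.compare_digest(signature, sign_v3(key, fields, seq)):
            problems.append(f"bad signature at seq {seq}")
        expected = seq + 1
    return problems
```

Deleting an interior row trips the contiguity counter. Deleting it and renumbering the survivors keeps the counter quiet but breaks the signature, because the HMAC was computed over the original position.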
…ail-loss (AUD-3) The audit store ran synchronous=NORMAL under WAL. NORMAL only fsyncs the WAL at a checkpoint, so a committed-but-not-yet-checkpointed append is lost on a power-cut while the database stays consistent. The survivors form a contiguous, fully-signed hash chain — a valid-looking SHORTENED trail indistinguishable from "nothing more was ever written". For an audit-integrity store that silent tail-loss is precisely the harm. Set synchronous=FULL: each commit is fsynced, so a committed governance record survives power loss; throughput is the correct thing to trade here. The floor is intentionally not configurable — an audit store's durability must not be lowerable back to the bug. SQLite's default wal_autocheckpoint still bounds WAL growth, so no separate checkpoint lifecycle is needed. This is the prevention half of the shortened-trail problem; AUD-1's out-of-band head anchor is the detection half (it flags a trail that shrank below its recorded head, whether by malice or by lost-tail). Pinned by reading PRAGMA synchronous (==2 FULL) on a listener connection, mirroring the existing WAL/busy_timeout pragma tests. Full suite 795 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
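The pragma pin can be reproduced with the standard `sqlite3` module. This is a sketch under the assumption of a file-backed database (WAL is not available for in-memory databases); `open_audit_db` is a hypothetical helper, not the project's store.

```python
import os
import sqlite3
import tempfile


def open_audit_db(path: str) -> sqlite3.Connection:
    """Open a WAL database whose commits are fsynced (synchronous=FULL)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # synchronous is a per-connection setting, so every connection that
    # writes audit records has to set it.
    conn.execute("PRAGMA synchronous=FULL")
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = open_audit_db(os.path.join(tmp, "audit.db"))
    mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    level = conn.execute("PRAGMA synchronous").fetchone()[0]
    conn.close()
```

Reading the pragma back returns an integer level, and SQLite documents `2` as FULL, which matches the `==2` assertion the commit message describes.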
…ed limit (AUD-1 red-team) An adversarial review of the AUD-1 anchor (5 red-team lanes, executed PoCs) refuted every interior-delete / reorder / renumber / version-downgrade / seq-soundness attack and confirmed the Wardline v2 contract is byte-for-byte intact (201-test regression sweep green). It found one genuine residual: the anchor's HMAC stops forgery but not REPLAY. The anchor is a single mutable sidecar, so a snapshotting attacker can save a genuinely-signed early anchor (head=1), let the trail grow, truncate the DB back to seq=1, and restore the saved anchor — it verifies (real signature, consistent seq + chain_hash) and the rollback goes undetected. This is inherent to local same-filesystem storage: nothing on disk is beyond a file-write attacker's rollback, so no purely-local check (counter, timestamp, extra copy) closes it — that would be honesty theatre. The fix is a deployment property: store the anchor on append-only/WORM or remote storage, or run an external monitor on the anchored head's monotonicity. The prior docstring over-claimed it detects "a rollback to an earlier consistent prefix" — false under replay. Corrected to state precisely what it catches (forgery; truncation by a late/non-snapshotting attacker) and the replay limitation + its real mitigation. Pinned the boundary with an executable known-limitation test so the over-claim cannot silently drift back. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
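The replay boundary is easy to demonstrate with a toy anchor. This sketch assumes a JSON-serialised `(seq, chain_hash)` pair under HMAC-SHA256; the function names and anchor layout are hypothetical and stand in for the real `HeadAnchor`.

```python
from __future__ import annotations

import hashlib
import hmac
import json


def sign_anchor(key: bytes, seq: int, chain_hash: str) -> dict:
    """Produce a signed record of the current head (assumed layout)."""
    body = json.dumps({"seq": seq, "chain_hash": chain_hash}, sort_keys=True)
    mac = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"seq": seq, "chain_hash": chain_hash, "mac": mac}


def check_head(key: bytes, anchor: dict, head_seq: int, head_hash: str) -> bool:
    """True when the anchor is genuine AND matches the store's current head."""
    expected = sign_anchor(key, anchor["seq"], anchor["chain_hash"])["mac"]
    if not hmac.compare_digest(anchor["mac"], expected):
        return False  # forged or edited anchor
    return (anchor["seq"], anchor["chain_hash"]) == (head_seq, head_hash)
```

A truncation checked against the latest anchor fails, and an edited anchor fails, but an older genuine anchor restored alongside a matching truncated trail passes. That last case is the replay the commit message says no purely local check can close.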
…st-marker-only (INSTALL-1)
The injector deliberately tolerates a split brain: when a second legis instruction block sits beyond a sibling tool's block, it cannot canonicalise across the foreign block, so it rewrites the first block fresh, warns, and leaves the stale second copy in place (foreign-safety wins over own-dedup). The doctor's freshness probe, though, read the token off the FIRST marker only (_MARKER_TOKEN_RE.search → first match), so a fresh first block masked a stale second block and the doctor reported "healthy" on exactly the conflicting-guidance state it exists to catch.
Freshness now requires EXACTLY ONE legis block at the current token, via a new foreign-aware walk (_own_open_marker_tokens) that reuses the injector's own fence-tracking: a legis marker quoted inside a sibling block is not counted, so the probe never miscounts a documented example as a real block. check_instruction_block surfaces a split brain (>1 block) with an actionable hand-resolution message and, since the injector cannot collapse it, does not falsely claim repair fixed it.
This is the same honesty discipline as GOV-1/POLICY-1: a gate must not report green on the condition it exists to detect. RED test pinned the false-"ok" first; both CLAUDE.md and AGENTS.md get the fix via the shared check. Full suite 797 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HttpLoomweaveIdentity.capability() probed GET /api/v1/_capabilities with an
explicit signed=False, so the request went out unsigned even when an HMAC key
was provisioned — the lone unsigned exception among the SEI routes, and the very
one that establishes whether legis trusts the provider as SEI-capable. On a
keyed deployment that left the trust-establishing handshake unauthenticated,
spoofable to capability=supported.
Sign it like every other route (the default path already no-ops signing when no
key is set, so loopback/trusted deployments are unchanged). Removed the per-call
`signed` knob from _request entirely: an unsigned opt-out is exactly the
affordance that caused this, and no other caller used it — so it cannot
reintroduce the gap. Wire confidentiality against an on-path response rewrite
remains TLS's job, which _validate_base_url already enforces for any non-loopback
(keyed) host.
RED-pinned the unsigned probe ({} headers when keyed) before the fix; added a
companion test that the keyless probe stays bare. Full suite 799 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… guard (JUDGE-1)
In the coached cell a model ACCEPTED maps straight to accepted=True, and the
agent-controlled rationale (and, on the degraded-to-locator branch, the entity
locator) flowed into the judge prompt with no length bound — so a prompt-stuffing
payload could bury the instruction or smuggle an injection into the model.
LLMJudge.evaluate now bounds the SERIALIZED request — {policy, entity, rationale}
exactly as build_prompt embeds it — at MAX_JUDGE_REQUEST_CHARS (8192) before the
model is consulted; over-cap is rejected as BLOCKED by a deterministic guard that
never calls the model (stamped with a self-documenting sentinel model id, not an
LLM identity). Measuring the serialized request (not the raw rationale) bounds
every agent-settable field in one check — rationale, entity locator, and the
ensure_ascii unicode-expansion variant (each non-ASCII char → 6-char \uXXXX, so a
raw-char cap would be 6x loose). Reject, never truncate: truncation would mutate
the rationale that is recorded and (protected cell) signed, and could pass a
front-loaded injection. The full over-cap rationale is still written to the
BLOCKED record, so the attempt stays attributable.
build_prompt's serialization (the structural-escape defense — a forged sibling
{"verdict":"ACCEPTED"} survives only as an escaped string value) is now pinned by
a round-trip test covering rationale AND entity injection (JUDGE-2). The module
docstring documents the residual honestly: a SEMANTIC injection that persuades
the model is a model-robustness property, not a code fail-open — mitigated by
attribution and, in the protected cell, by Q-H3's deterministic validator.
TDD: RED-pinned both stuffing vectors (rationale + entity reaching an accepting
model) and the model-never-consulted property before the guard; added an
in-cap boundary test so a thorough justification is not falsely blocked. Full
suite 803 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
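A compact sketch of the serialized-request cap follows. `MAX_JUDGE_REQUEST_CHARS` and the `{policy, entity, rationale}` shape come from the commit message; the helper names, the callable `model` parameter, and returning the verdict as a bare string are assumptions for illustration.

```python
from __future__ import annotations

import json
from typing import Callable

MAX_JUDGE_REQUEST_CHARS = 8192


def serialize_request(policy: str, entity: str, rationale: str) -> str:
    # ensure_ascii=True expands each non-ASCII BMP character to a 6-char
    # \uXXXX escape, so the cap is measured on what the prompt will embed.
    return json.dumps(
        {"policy": policy, "entity": entity, "rationale": rationale},
        ensure_ascii=True,
    )


def evaluate(
    policy: str, entity: str, rationale: str, model: Callable[[str], str]
) -> str:
    request = serialize_request(policy, entity, rationale)
    if len(request) > MAX_JUDGE_REQUEST_CHARS:
        # Deterministic guard: reject (never truncate) without calling the model.
        return "BLOCKED"
    return model(request)
```

Measuring the serialized form means one check covers every agent-settable field, and JSON escaping keeps an injected `"verdict"` fragment inside the rationale string instead of becoming a sibling key.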
…-2, CRYPTO-THRESHOLD-001)
Closes the last three low/post-1.0 items from docs/release-1.0-risk-audit.md.
- POLICY-2 (this session): remove the exemption-rescue mechanism outright. PolicyGrammar had a VIOLATION->CLEAR exemption-rescue branch wired to an agent-writable YAML loader (ExemptionAllowlist.from_file) with zero src consumers, the latent bypass trap the finding names. Full removal: delete policy/exemptions.py + tests/policy/test_exemptions.py, drop the exemptions ctor param / _exemptions / rescue branch from grammar.py, and remove the 3 rescue-branch tests. New regression guard test_grammar_has_no_exemption_rescue_mechanism pins that no exemption seam can be re-introduced by accident. This supersedes the earlier conservative document-only closure of legis-e512e97bfc (see ticket history): documenting around the loader left the trap in the tree.
- AUTH-1 (doc): app.py comment telegraphs that LEGIS_ALLOW_UNSCOPED_API_TOKENS=1 grants unscoped tokens operator authority (not renamed: the var already fits the LEGIS_ALLOW_<bad-thing> family; audit remedy was "rename OR document").
- CRYPTO-THRESHOLD-001 (doc): README scopes the "cryptographic layer" to intra-suite HMAC tamper-evidence with a self-asserted actor, not third-party cryptographic proof; names RFC-8785 as the upgrade path.
Full suite green (792 passed, 2 skipped), ruff clean on changed files.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve the 6 standing lint errors (default ruff E4/E7/E9/F ruleset): - test_doctor.py: 5x E402 (module-level imports placed under mid-file section headers) — consolidated into the top import block; section comments kept. - test_install.py: 1x F401 — dropped the unused `_legis_mcp_entry` import. No behaviour change. Full suite green (792 passed, 2 skipped), ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Second adversarial pre-ship review (docs/release-1.0-pre-ship-review.md)
re-attacked the prior audit's self-verified fixes. Crypto-threshold held;
these gaps it surfaced are now closed, each independently re-verified.
- JUDGE-3 (protected-cell fail-open): the Q-H3 advisory-downgrade was gated on
exact-match `protected_policies`, which diverges from the glob-capable cell
routing — a protected-cell policy outside the set (incl. any glob route and
the empty-set default) had its model ACCEPTED signed authoritative. The cell
is now fail-closed UNCONDITIONALLY: it clears only on a validator-confirmed
ACCEPTED. Independent re-attack then caught a second variant — a fooled model
emitting the operator-only OVERRIDDEN_BY_OPERATOR (which _record_signed also
counts as accepted) cleared the gate even for a declared protected policy.
Closed at two layers: the judge JSON parser now restricts verdicts to
{ACCEPTED, BLOCKED}, and submit() downgrades the whole accepted-set.
Behavior change: with no validator wired (default prod), protected overrides
now require operator sign-off. Regression tests at parser and gate levels.
- GOV-2: /governance/identity-gaps now returns a {status, gaps} envelope
("unavailable" vs "checked") so a can't-check state is not a false all-clear,
matching the GOV-1 fix on the sibling lineage-integrity endpoint.
- F1: TrailVerifier docstring corrected — no longer claims modify-to-unsigned is
caught; the modify-to-unsigned / tail-truncation residuals of the conceded
raw-file-write tier are documented honestly (code hardening tracked post-1.0).
- POLICY-1: aliased-marker (`skipper = pytest.mark.skip; @skipper`) and
fixture-skip vectors documented as residuals in _disabling_marker (zero live
@policy_boundary sites; name-heuristic hardening tracked post-1.0).
- ID-SEI-1: LEGIS_ALLOW_INSECURE_REMOTE_HTTP now warns on a remote-plaintext
bypass (loomweave + filigree clients); documented in README + federation doc.
- ID-SEI-2: resolver `alive` is now strict-bool; a non-bool truthy value
degrades fail-closed instead of promoting to a stable SEI identity.
- README "Known security limitations" section + CHANGELOG entries.
Suite 801 passed / 2 skipped; ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
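Two of the fail-closed parsing checks above (the judge verdict allowlist from JUDGE-3 and the strict-bool `alive` from ID-SEI-2) can be sketched in a few lines. The function names are hypothetical, and mapping every malformed or disallowed reply to `BLOCKED` is an assumption about how the real parser fails closed.

```python
from __future__ import annotations

import json

# The only verdicts a model-backed judge may emit; operator-only verdicts
# such as OVERRIDDEN_BY_OPERATOR are deliberately absent.
JUDGE_VERDICTS = frozenset({"ACCEPTED", "BLOCKED"})


def parse_judge_verdict(raw: str) -> str:
    """Return an allowed verdict, or BLOCKED for anything else."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "BLOCKED"
    verdict = data.get("verdict") if isinstance(data, dict) else None
    if isinstance(verdict, str) and verdict in JUDGE_VERDICTS:
        return verdict
    return "BLOCKED"


def is_alive(value: object) -> bool:
    """Strict-bool check: only the literal True counts as alive."""
    return value is True
```

Both helpers refuse to promote a truthy-looking value: an unexpected verdict or a non-bool `alive` degrades to the closed state instead of passing.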
doctor:
- `--fix` is now the canonical repair flag; `--repair` stays a working alias (argparse dest `fix`), so no script breaks.
- DoctorCheck gains a `repairable` bit; text view tags each problem `[fixed]` / `[auto-fixable]` / `[operator]` with footers that point auto-fixable items at `legis doctor --fix` and tell the operator that `[operator]` items need out-of-band config + a relaunch. JSON checks carry `repairable` additively.
- `install.filigree_scope` is gated on filigree actually being installed (file-existence probe, no filigree import): the unscoped-binding warning only fail-closes against a server-mode filigree daemon, so it is noise when filigree is absent. When it fires, the message names it operator-owned (the `--filigree-url` is operator-pinned in wardline's `.mcp.json`) and stays repairable=False.
tidy for 1.0 (version held at rc4 per the live-e2e gate):
- README + doctor docstring use the canonical `--fix` spelling.
- CHANGELOG [Unreleased] records the above.
- .gitignore ignores `.claude/*.lock` (transient scheduled-tasks lock).
- removed stray build artifacts (.coverage, coverage.json).
Full suite green (813 passed, 2 skipped), ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
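The flag aliasing maps directly onto standard `argparse` behaviour: several option strings can share one destination. A minimal sketch, with a hypothetical `build_parser` helper and help text:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="legis doctor")
    # Both spellings set the same attribute, so existing scripts that pass
    # --repair keep working while --fix is the documented form.
    parser.add_argument(
        "--fix",
        "--repair",
        dest="fix",
        action="store_true",
        help="apply auto-fixable repairs",
    )
    return parser
```

Usage: `build_parser().parse_args(["--repair"]).fix` and `build_parser().parse_args(["--fix"]).fix` are both `True`, and the attribute defaults to `False` when neither flag is given.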
The README covers the *why* (the 2×2 concept) and the legis-workflow skill covers the *agent-call* surface, but there was no human-operator guide for "how do I configure this" and "what am I seeing when an agent does X". Adds docs/guide/: - configuration.md — the operator's governance-control reference: reconciles "zero human config" (the agent's experience) with the operator's two acts (choose the cell, hold the key); per-cell cost/buys table; the fail-closed routing default + resolution order; full LEGIS_* / OPENROUTER_* env-var reference grouped by purpose; and a separate, warning-carrying "dev-only / escape hatches" section for the LEGIS_UNSAFE_* / LEGIS_ALLOW_* flags. - reading-legis-output.md — organized by "where it surfaces / what it means / do I act": keeps the recorded Verdict (ACCEPTED/BLOCKED/OVERRIDDEN_BY_OPERATOR) distinct from the override_submit outcome envelope (ACCEPTED_SELF / ACCEPTED_BY_JUDGE / BLOCKED / ESCALATED_PENDING / NEED_INPUTS); covers scan outcomes, artifact/identity/lineage statuses, the override-rate gate, CI exit codes, doctor tags, and flags the only signals that need a human in real time. - README.md (index) + links from the top-level README. Every flag/enum/command cited was verified against source (e.g. dropped a spurious OPENROUTER_BASE_URL row that was a grep artifact of the DEFAULT_OPENROUTER_BASE_URL constant, not a real env var). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reference tables answer "what does signal Y mean / do I act"; a single compact narrative (agent hits a coached policy → BLOCKED → revise → ACCEPTED_BY_JUDGE → async review, with the structured ESCALATED_PENDING contrast) converts the reference into the mental model behind the user's literal question, "what am I seeing when an agent does X". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gree's install predicate
Two corrections to the doctor checks landed in 84a8047:
- **Split-brain instruction block is not auto-fixable.** `--fix` returns before the repair branch for the >1-block split-brain case (the injector won't splice across a sibling tool's block), so tagging it `repairable=True` rendered a false `[auto-fixable]` signal that re-creates the very --fix loop the design eliminates. Now `repairable=False` → `[operator]`, matching the check's own "resolve it by hand" message. (Corrects the tag shipped in 84a8047.)
- **`_filigree_installed` now mirrors filigree's real install predicate.** It was an AND requiring `.filigree.conf` AND a `config.json`; filigree's `find_filigree_anchor` (core.py:1046-1064) treats a project as installed if ANY of three markers is present: `.filigree.conf` (file), `.weft/filigree/` (dir), or `.filigree/` (dir). It is never an AND, and the store/legacy checks are `.is_dir()`, not a `config.json` `.is_file()`. The old AND would return "not installed" for confless / legacy / conf-only installs and SILENTLY DROP a real unscoped-binding warning where filigree genuinely is installed, the false-green the governance honesty discipline forbids.
Tests updated to cover conf-only, confless-weft, and confless-legacy installs (the last is the live federation-legacy-path case). Full suite green (815 passed, 2 skipped), ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
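The corrected predicate is a plain any-of check over the three markers the commit message lists. A sketch using `pathlib`; the function name is a stand-in for `_filigree_installed`, and the marker paths are taken as relative to a project root.

```python
from pathlib import Path


def filigree_installed(root: Path) -> bool:
    """True if ANY of the three install markers is present under `root`."""
    return (
        (root / ".filigree.conf").is_file()  # conf-only install
        or (root / ".weft" / "filigree").is_dir()  # confless weft store
        or (root / ".filigree").is_dir()  # confless legacy store
    )
```

The earlier AND form would have reported "not installed" for each of the three single-marker layouts, which is exactly when the dependent warning was silently dropped.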
…te cell trap, envelope next_action
- LEG-1: add the policy_list tool (routing table + each cell's honest enabled state, computed via a shared explain_cell so it can never disagree with policy_explain) and an additive matched_rule field on policy_explain (a configured policy reports its rule pattern; an unconfigured/hallucinated name reports null). cell_for now delegates to a new rule_for() so routing and discovery cannot drift.
- LEG-2: the error envelope already carries next_action/recoverable for every code (_recovery_for); reconcile the SKILL.md error table to it verbatim and add one drift-lock test asserting every emitted code yields a non-empty next_action. No new abstraction.
- LEG-3: scan_route's server-owned rejection now names the rejected request-side arg(s) (cell/severity_map/fail_on) while retaining the literal 'server-owned' substring; the cell/severity_map/fail_on schema descriptions state the LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING gating.
Additive only; no routing/enablement/tiering semantics changed. ruff + mypy clean; full suite 825 passed, 2 skipped (+10 tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
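The "routing and discovery cannot drift" property in LEG-1 comes from having one lookup that both paths call. A rough sketch, assuming glob-style patterns matched in order via `fnmatch` and a caller-supplied default cell; the `Rule` class, the `explain` helper, and first-match-wins ordering are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    pattern: str  # glob over policy names
    cell: str  # e.g. "chill", "coached", "protected"


def rule_for(rules: list[Rule], policy: str) -> Rule | None:
    """Return the first rule whose pattern matches, or None."""
    for rule in rules:
        if fnmatchcase(policy, rule.pattern):
            return rule
    return None


def cell_for(rules: list[Rule], policy: str, default: str) -> str:
    # Routing delegates to the same lookup that discovery uses.
    rule = rule_for(rules, policy)
    return rule.cell if rule is not None else default


def explain(rules: list[Rule], policy: str, default: str) -> dict:
    rule = rule_for(rules, policy)
    return {
        "policy": policy,
        "cell": cell_for(rules, policy, default),
        "matched_rule": rule.pattern if rule is not None else None,
    }
```

With a single `rule_for`, an unconfigured or hallucinated policy name shows `matched_rule: None` in the explanation while still routing to the default cell.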
Version 1.0.0rc4 -> 1.0.0 across pyproject, legis.__version__ (feeds the MCP serverInfo, /health, and `legis --version`), and uv.lock. CHANGELOG [Unreleased] -> [1.0.0] (2026-06-09) with refreshed compare links. 1.0 release-prep hygiene (same pass): - README points to the now-public adversarial threat model — the risk audit and the independent pre-ship review, attack recipes and all — framed as the "forced me to do the right thing" discipline it is. - Dropped the rc1 "Known limitations" list from the changelog: the MCP item was superseded at rc2; the live sibling-gated items moved to the Filigree tracker (outstanding work belongs in the tracker, not the log). No code behavior change — version strings + docs only. Full suite green (825 passed, 2 skipped; ruff + mypy clean). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pull request overview
This PR cuts the 1.0.0 final release (from 1.0.0rc4) and includes the release-prep hardening and documentation that accompanied the adversarial reviews described in the PR metadata. It also updates several governance/attestation invariants (audit-trail tamper evidence, Wardline schema interop, MCP/operator surfaces) and expands tests/docs to pin the intended fail-closed behaviors.
Changes:
- Bump versioning and release notes to 1.0.0 across package metadata and changelog.
- Strengthen governance/audit integrity and posture reporting (v3 signatures with `chain_seq`, head anchor support, WAL durability pragma, structured skip payloads, more explicit routing errors).
- Update Wardline ingest to the `suppression_state` wire key and add new/expanded tests and operator/agent-facing documentation.
Reviewed changes
Copilot reviewed 67 out of 69 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| uv.lock | Update editable package version to 1.0.0. |
| tests/wardline/test_policy.py | Update test fixtures to use suppression_state. |
| tests/wardline/test_ingest.py | Update ingest tests for suppression_state and add golden/skip payload tests. |
| tests/wardline/test_governor.py | Update governor tests for suppression_state. |
| tests/wardline/test_coached_routing.py | Update coached routing tests for suppression_state. |
| tests/test_install.py | Remove unused import in MCP registration test. |
| tests/store/test_head_anchor.py | Add tests for the out-of-band head anchor behavior and limitations. |
| tests/store/test_batch_read_free_invariant.py | Update finding fixtures to use suppression_state. |
| tests/store/test_audit_store.py | Add/expand tests for synchronous=FULL and contiguity integrity checks. |
| tests/service/test_wardline.py | Add tests pinning improved Wardline routing error messages. |
| tests/service/test_governance.py | Update signing fields tests to include seq binding. |
| tests/service/test_explain.py | Pin matched_rule reporting in explain payloads. |
| tests/policy/test_honesty_gate.py | Add tests ensuring disabled evidence tests are rejected (POLICY-1). |
| tests/policy/test_grammar.py | Pin removal of exemptions seam and update related expectations. |
| tests/policy/test_exemptions.py | Remove exemptions tests (feature removed). |
| tests/policy/test_evidence.py | Add evaluator tests for disabled-marker detection (skip/xfail/skipif). |
| tests/policy/test_boundary_scan.py | Add end-to-end scan tests for disabled evidence tests. |
| tests/mcp/test_server.py | Add policy_list tool coverage, posture echoing, and next_action invariants. |
| tests/identity/test_resolver.py | Add test ensuring non-bool alive does not promote stable identity. |
| tests/identity/test_loomweave_client.py | Add tests for signed capability probe and insecure-HTTP warnings. |
| tests/filigree/test_client.py | Add tests for insecure-HTTP warning and enforcement behavior. |
| tests/enforcement/test_trail_verify.py | Add tests for seq-binding and anchored tail-truncation detection. |
| tests/enforcement/test_signoff.py | Update expected signature prefix to v3 for sign-offs. |
| tests/enforcement/test_signing.py | Add v3 signing/verification primitive tests. |
| tests/enforcement/test_regressions.py | Remove exemptions regression test (feature removed). |
| tests/enforcement/test_protected_submit.py | Update protected submit tests for fail-closed behavior + v3 binding. |
| tests/enforcement/test_protected_override.py | Update operator override signature expectations to v3. |
| tests/enforcement/test_protected_extensions.py | Update signature verification reconstruction to use seq. |
| tests/enforcement/test_judge.py | Add tests for operator-only verdict rejection and prompt-size cap. |
| tests/api/test_sei_api.py | Update API tests for new identity-gaps envelope and protected validator wiring. |
| tests/api/test_complex_api.py | Update API tests for fail-closed protected behavior + identity-gaps envelope. |
| tests/api/test_combinations_api.py | Update API tests to use suppression_state and pin structured dirty-skip fields. |
| src/legis/wardline/ingest.py | Implement suppression_state, structured dirty-tree skip payload, and updated active-defects logic. |
| src/legis/store/protocol.py | Extend store protocol with append_signed and head query support. |
| src/legis/store/head_anchor.py | Add new HeadAnchor implementation for tail-truncation detection. |
| src/legis/store/audit_store.py | Add append_signed, contiguity checks, synchronous=FULL, and head query helper. |
| src/legis/service/wardline.py | Improve routing error messages and return scan-level posture in routing result. |
| src/legis/service/explain.py | Add matched_rule and refactor explain plumbing (explain_cell). |
| src/legis/policy/grammar.py | Remove exemptions seam from policy grammar. |
| src/legis/policy/exemptions.py | Remove exemptions implementation (feature removed). |
| src/legis/policy/evidence.py | Add disabling-marker detection and return disabled evidence results. |
| src/legis/policy/cells.py | Add rule_for and expose rule list for routing introspection. |
| src/legis/policy/boundary_scan.py | Map disabled evidence outcome to a dedicated rule id. |
| src/legis/mcp.py | Add policy_list, improve scan_route output posture, and enrich recovery hints. |
| src/legis/install.py | Add split-brain detection helper for multiple legis instruction blocks. |
| src/legis/identity/resolver.py | Fail-closed alive handling requiring strict boolean True. |
| src/legis/identity/loomweave_client.py | Sign capability probe when keyed and warn on insecure remote HTTP. |
| src/legis/filigree/client.py | Warn on insecure remote HTTP when bypass flag is set. |
| src/legis/enforcement/signoff.py | Bind sign-off signatures to seq (v3) and optionally advance head anchor. |
| src/legis/enforcement/signing.py | Add v3 signature prefix support and verification dispatch. |
| src/legis/enforcement/protected.py | Add v3 seq-binding, head-anchor checking, and protected fail-closed logic. |
| src/legis/enforcement/judge.py | Add prompt-size cap guard and restrict allowed judge verdicts. |
| src/legis/data/skills/legis-workflow/SKILL.md | Document policy_list and updated error recovery hints. |
| src/legis/cli.py | Add canonical legis doctor --fix flag (keep --repair alias). |
| src/legis/api/app.py | Improve identity-gaps honesty envelope, lineage-integrity status, and Wardline responses. |
| src/legis/__init__.py | Bump __version__ to 1.0.0. |
| README.md | Update release status and add security limitation + operator docs sections. |
| pyproject.toml | Bump project version to 1.0.0. |
| docs/release-1.0-risk-audit.md | Add published pre-release adversarial audit doc. |
| docs/release-1.0-pre-ship-review.md | Add published second-pass adversarial review doc. |
| docs/guide/README.md | Add operator guide index. |
| docs/guide/reading-legis-output.md | Add operator guide for interpreting outcomes/verdicts/statuses. |
| docs/guide/configuration.md | Add operator configuration guide and env var reference. |
| docs/federation/sei-conformance.md | Document TLS custody seal dependency and insecure-HTTP bypass implications. |
| docs/design/legis-charter.md | Expand charter note about self-asserted actor identity in records and federation writes. |
| CHANGELOG.md | Add 1.0.0 entry summarizing security/honesty closures and surface changes. |
| .gitignore | Ignore Claude Code transient lock files. |
    anchored_seq = body.get("head_seq")
    anchored_chain = body.get("head_chain_hash")
    if not sig or anchored_seq is None or anchored_chain is None:
        raise AnchorError(f"head anchor {self._path} is structurally malformed")
    if not verify(_anchor_fields(anchored_seq, anchored_chain), sig, self._key):
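For context on the hunk above, here is a self-contained sketch of how a signed head anchor could detect tail truncation. It reuses the names visible in the diff (`AnchorError`, `_anchor_fields`, `verify`, `head_seq`, `head_chain_hash`). Everything else is an assumption for illustration: the `sig` key, the JSON field encoding, HMAC-SHA256, and the `make_anchor` / `check_anchor` helpers.

```python
import hashlib
import hmac
import json


class AnchorError(Exception):
    """Raised when the anchor is malformed, forged, or ahead of the store."""


def _anchor_fields(seq: int, chain_hash: str) -> bytes:
    # Assumed canonical encoding; the real field layout is not shown in the diff.
    return json.dumps(
        {"head_seq": seq, "head_chain_hash": chain_hash}, sort_keys=True
    ).encode()


def sign(message: bytes, key: bytes) -> str:
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, sig: str, key: bytes) -> bool:
    # Constant-time comparison of the recomputed and presented signatures.
    return hmac.compare_digest(sign(message, key), sig)


def make_anchor(seq: int, chain_hash: str, key: bytes) -> dict:
    """Build a signed anchor body for the current store head (hypothetical helper)."""
    return {
        "head_seq": seq,
        "head_chain_hash": chain_hash,
        "sig": sign(_anchor_fields(seq, chain_hash), key),
    }


def check_anchor(body: dict, store_seq: int, store_chain_hash: str, key: bytes) -> None:
    """Raise AnchorError unless the store head is at or past the anchored head."""
    sig = body.get("sig")
    anchored_seq = body.get("head_seq")
    anchored_chain = body.get("head_chain_hash")
    if not sig or anchored_seq is None or anchored_chain is None:
        raise AnchorError("head anchor is structurally malformed")
    if not verify(_anchor_fields(anchored_seq, anchored_chain), sig, key):
        raise AnchorError("head anchor signature does not verify")
    if store_seq < anchored_seq:
        raise AnchorError("store tail is behind the anchor (truncation or rewind)")
    if store_seq == anchored_seq and store_chain_hash != anchored_chain:
        raise AnchorError("store head hash differs from the anchored hash")
```

An attacker who can also restore an older, validly signed anchor file would pass this check, which is one way to read the anchor-replay caveat named in the README text below.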
- **The coached cell is a model-robustness wall, not a cryptographic one.** A blocked agent clears the coached gate by convincing the LLM judge; a *malicious prompt injection* that persuades the model will likewise clear it. Structural injection (forging a verdict key) is closed and any transport/parse failure is fail-closed to `BLOCKED`, but the coached cell has no defense-in-depth against a model that is genuinely fooled. For verdicts that must not rest on the model's word, use the **protected** cell, where a judge `ACCEPTED` is advisory only and is downgraded to require operator sign-off (unless a deterministic, non-LLM validator confirms it).
- **Tamper-evidence assumes the signing key is out of the attacker's reach, and is not absolute against raw DB-file writes.** v3 signing binds each record's chain position, so in-place edits, reordering, and renumbering are detected. A holder of raw write access to the governance `.db` can still *delete* a record and re-chain, or rewrite a record's policy to a non-protected value and strip its protected markers ("modify-to-unsigned"), or truncate the tail; these are residuals of the conceded raw-file-write threat tier. The opt-in `HeadAnchor` mitigates truncation/rewind (with a documented anchor-replay caveat). Keep the governance store on storage only the operator controls.
- **Durability tier.** The audit store runs `synchronous=FULL`, but a power loss can still drop the most recent un-checkpointed appends; the trail stays internally consistent (a shortened-but-valid tail) and does not corrupt.
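The tamper-evidence bullet above says v3 signing binds each record's chain position. A minimal sketch of that property, assuming an HMAC-SHA256 over the sequence number and payload; the `v3:` prefix format, the record shape, and the helpers `sign_record`, `make_record`, and `first_bad_seq` are hypothetical and do not reflect the actual legis signing layout:

```python
from __future__ import annotations

import hashlib
import hmac

PREFIX = "v3:"  # assumed prefix format, for illustration only


def sign_record(seq: int, payload: str, key: bytes) -> str:
    """Sign the payload together with its chain position."""
    message = f"{seq}\x1f{payload}".encode()
    return PREFIX + hmac.new(key, message, hashlib.sha256).hexdigest()


def make_record(seq: int, payload: str, key: bytes) -> dict:
    return {"seq": seq, "payload": payload, "sig": sign_record(seq, payload, key)}


def first_bad_seq(records: list[dict], key: bytes) -> int | None:
    """Return the first position that fails verification, or None if all pass."""
    for expected_seq, record in enumerate(records, start=1):
        if record["seq"] != expected_seq:
            return expected_seq  # gap or reordering
        expected_sig = sign_record(record["seq"], record["payload"], key)
        if not hmac.compare_digest(expected_sig, record["sig"]):
            return expected_seq  # edited payload or renumbered record
    return None
```

Under these assumptions, swapping two records, renumbering a survivor after a deletion, or editing a payload each fail at the affected position for anyone without the key. A trail cut short at the tail still verifies, which is the residual the opt-in `HeadAnchor` is described as mitigating.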
Brings `main` from the rc4-release state (PRs #7/#8) to 1.0.0 final. 22 commits; merge is clean (`origin/main` is an ancestor of `rc4`, 0 conflicts). No behavior change in the release commit itself: it is the version cut plus release-prep docs.

What's in it

- **Version cut** (`64208dd`): `1.0.0rc4 → 1.0.0` across `pyproject`, `legis.__version__` (MCP serverInfo / `/health` / `legis --version`), and `uv.lock`; CHANGELOG `[Unreleased] → [1.0.0]`.
- **Security / honesty:** two adversarial review passes, all findings closed:
  - (`01382d5`): JUDGE-3 protected cell now fail-closed unconditionally; GOV-2 identity-gaps no longer reports a false all-clear; F1 TrailVerifier docstring corrected.
  - (`5076170`, `b36939d`, `98c9f5c`, `0a9cfe9`, `acdbff0`+`691e838`+`cf42727`, `41e0b20`, `0dabc8b`): GOV-1 lineage divergence surfaced at the posture root; POLICY-1 disabled-evidence-test detection; AUD-1 delete-and-rechain forgery closed (v3 seq-binding + head anchor); AUD-3 `synchronous=FULL`; INSTALL-1 split-brain detection; ID-3 signed SEI capability probe; JUDGE-1 prompt-stuffing cap; AUTH-1 / POLICY-2 / CRYPTO-THRESHOLD lows.
- **The full adversarial threat model ships public:** `docs/release-1.0-risk-audit.md` + `docs/release-1.0-pre-ship-review.md` (reproduced attack recipes and all), linked from the README. A "forced me to do the right thing" discipline, not a hardened security boundary; residual tiers (raw DB-file write, model-robustness, response-integrity-rests-on-TLS) named honestly.
- **Operator surface:** `legis doctor --fix` (canonical flag) with `[auto-fixable]` / `[operator]` repairability tagging + filigree-install-gated scope check (`84a8047`, `a11378e`); operator config + output-interpretation guides (`d5a7580`, `b975567`).
- **Agent MCP surface:** dogfood LEG-1/2/3 closed: `policy_list` discoverability, `matched_rule`, scan_route cell-trap message, envelope `next_action` (`f5f5a8b`); scan-level artifact posture echoed at the scan_route root (`18c3a11`).
- **Federation contracts:** adopted Wardline's `suppression_state` key (`fbdf949`, W3); honest unconfigured-governance seams N3/N4 + C-8 key confinement preserved (`f921562`).

Verification

- `CHANGELOG.md` `[1.0.0]` for the authoritative notes.

Not done in this PR (release follow-ups, operator's call)

- `git tag v1.0.0` (the changelog compare link assumes it).
- `post-1.0`), not here.

🤖 Generated with Claude Code