Release 1.0.0 — final (drop the rc) by tachyon-beep · Pull Request #9 · foundryside-dev/legis

tachyon-beep · 2026-06-09T11:09:12Z

Brings main from the rc4-release state (PRs #7/#8) to 1.0.0 final. 22 commits; merge is clean (origin/main is an ancestor of rc4, 0 conflicts). No behavior change in the release commit itself — it is the version cut + release-prep docs.

What's in it

Version cut (64208dd) — 1.0.0rc4 → 1.0.0 across pyproject, legis.__version__ (MCP serverInfo / /health / legis --version), uv.lock; CHANGELOG [Unreleased] → [1.0.0].

Security / honesty — two adversarial review passes, all findings closed:

Second pre-ship review (01382d5): JUDGE-3 protected cell now fail-closed unconditionally; GOV-2 identity-gaps no longer reports a false all-clear; F1 TrailVerifier docstring corrected.
First risk audit (5076170, b36939d, 98c9f5c, 0a9cfe9, acdbff0+691e838+cf42727, 41e0b20, 0dabc8b): GOV-1 lineage divergence surfaced at the posture root; POLICY-1 disabled-evidence-test detection; AUD-1 delete-and-rechain forgery closed (v3 seq-binding + head anchor); AUD-3 synchronous=FULL; INSTALL-1 split-brain detection; ID-3 signed SEI capability probe; JUDGE-1 prompt-stuffing cap; AUTH-1 / POLICY-2 / CRYPTO-THRESHOLD lows.

The full adversarial threat model ships public — docs/release-1.0-risk-audit.md + docs/release-1.0-pre-ship-review.md (reproduced attack recipes and all), linked from the README. A "forced me to do the right thing" discipline, not a hardened security boundary; residual tiers (raw DB-file write, model-robustness, response-integrity-rests-on-TLS) named honestly.

Operator surface: legis doctor --fix (canonical flag) with [auto-fixable]/[operator] repairability tagging + filigree-install-gated scope check (84a8047, a11378e); operator config + output-interpretation guides (d5a7580, b975567).

Agent MCP surface: dogfood LEG-1/2/3 closed — policy_list discoverability, matched_rule, scan_route cell-trap message, envelope next_action (f5f5a8b); scan-level artifact posture echoed at the scan_route root (18c3a11).

Federation contracts: adopted Wardline's suppression_state key (fbdf949, W3); honest unconfigured-governance seams N3/N4 + C-8 key confinement preserved (f921562).

Verification

Full suite 825 passed, 2 skipped; ruff + mypy clean.
See CHANGELOG.md [1.0.0] for the authoritative notes.

Not done in this PR (release follow-ups, operator's call)

git tag v1.0.0 (the changelog compare link assumes it).
PyPI publish.
Post-1.0 backlog + conceptual extensions are tracked in Filigree (label post-1.0), not here.

🤖 Generated with Claude Code

…oc; C-8 preserved
Dogfood-#2 governance honesty (convention C-10), branch-local — merge/release gated on the filigree-first propagation. Capability confinement (proposed C-8) preserved throughout: operator signing keys stay out of agent reach, nothing is auto-provisioned/relocated, no MCP tool enables a cell or self-grants authority.
N3 (weft-df8d2ef454, C-10(c)) — legis no longer ships dark and quiet:
- mcp.py _recovery_for: INVALID_CELL_SPEC names LEGIS_WARDLINE_CELL / LEGIS_WARDLINE_CELL_BY_SEVERITY (covers all WardlineRoutingError kinds, incl. those str(exc) misses); CELL_NOT_ENABLED split into the keyless simple tier (policy/cells.toml / LEGIS_POLICY_CELLS / LEGIS_DEV_DEFAULT_CELLS) and the complex tier (LEGIS_HMAC_KEY, operator out-of-band + relaunch). Subsumes Le1.
- doctor.py: two report-only checks (check_policy_cells, check_wardline_routing) naming the enablement path when unwired — presence-only, no repair param, write nothing, never render a key value. Fail-closed preserved (no auto-open).
N4 (weft-a7a92a40dd, C-10(d)) — honest dirty-tree skip:
- WardlineDirtyTreeError.to_payload() is the single source both transports (mcp.py scan_route + api/app.py) serialize: structured reason/posture/cause/remediation, routed==[] (governs nothing). No scan_route call argument added; the LEGIS_WARDLINE_ALLOW_DIRTY dirty-snapshot opt-in stays an env-only operator switch.
C3 (weft-f506e5f845) — charter now documents that legis's OWN audit records carry a self-asserted agent_id/operator_id (launch-bound + HMAC-tamper-evident, not authenticated); verified_author:null maps to those fields.
Guards: test_c8_no_agent_reachable_enablement_or_signing_surface (no enable/sign tool; scan_route schema locked) + doctor checks write-nothing/render-no-key test.
762 passed; ruff + mypy clean; coverage 92.30%; per-package floors hold; policy-boundary-check PASS; SEI oracle PASS.
Designed + adversarially red-teamed (C-8 verdict: safe) and implementation-reviewed via multi-agent workflows.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ntime Acceptance branch 1 of N3 (weft-df8d2ef454) — "a fresh stdio launch CAN reach a configured non-secret surface" — was only proven via injected-engine unit tests; the CHANGELOG and ticket comments assert "chill/coached reachable keyless" as fact. Add a test that exercises the REAL launch path: build_runtime() with no LEGIS_HMAC_KEY + the LEGIS_DEV_DEFAULT_CELLS=1 chill posture, then override_submit -> ACCEPTED_SELF via the lazy keyless _engine. A future change making _engine require a key now fails here instead of silently falsifying the promise. (Scan-route axis already pinned by test_scan_route_uses_server_owned_cell.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…48eb2)
Wardline renamed the per-finding output key `suppressed` -> `suppression_state` across all surfaces incl. the SIGNED legis scan artifact, changing the canonical signed bytes and breaking the Wardline->legis hop (wardline's opt-in legis_e2e oracle red by design). legis adopts the new key.
- ingest: WardlineFinding.from_wire reads `suppression_state`; the dataclass field, error message, and active_defects branches follow. Values unchanged (active/waived/suppressed/baselined/judged); the `Suppressed` enum (value vocabulary) and SUPPRESSION_PROOF_KEYS are untouched.
- clean break: a finding carrying only the legacy `suppressed` key reads as `active` and OVER-gates — fail-safe (never silently drops a real defect), pinned by test_legacy_suppressed_key_is_ignored_clean_break.
- NO signing/canonical change: legis's signer already reproduces Wardline's rekeyed golden byte-for-byte. Added the legis-side cross-impl golden MIRROR legis was missing: sign(_GOLDEN_FIELDS, _GOLDEN_KEY) == hmac-sha256:v2:2b2cf09… over `suppression_state`, so the hop self-verifies on both ends.
- intake fixtures: ~40 `suppressed` test fixtures across tests/wardline, tests/api, tests/mcp, tests/store renamed to `suppression_state` (a sweep flagged these to avoid vacuously-green suppression-path assertions).
Acceptance: legis 767 tests green; golden byte-agreement pinned; the live signed hop verifies — wardline's `-m legis_e2e` test_legis_accepts_signed_artifact PASSES against the reinstalled legis (real build_legis_artifact -> signed suppression_state artifact -> legis verifies + routes). Branch-only; ship via the filigree-gated rc4->main merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
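
The clean-break behavior can be sketched as follows. This is a minimal illustration, not the real `WardlineFinding.from_wire`: the `Finding` class, the `rule_id` field, and `finding_from_wire` are hypothetical stand-ins, and only the key name and value vocabulary come from the commit message.

```python
from dataclasses import dataclass

# Value vocabulary as listed in the commit message.
SUPPRESSION_STATES = {"active", "waived", "suppressed", "baselined", "judged"}


@dataclass(frozen=True)
class Finding:  # hypothetical stand-in for WardlineFinding
    rule_id: str  # hypothetical field, for illustration only
    suppression_state: str


def finding_from_wire(payload: dict) -> Finding:
    # Only the new key is consulted. A payload that carries just the legacy
    # `suppressed` key therefore reads as "active" and over-gates.
    state = payload.get("suppression_state", "active")
    if state not in SUPPRESSION_STATES:
        raise ValueError(f"unknown suppression_state: {state!r}")
    return Finding(rule_id=payload["rule_id"], suppression_state=state)
```

Defaulting to `active` is the fail-safe direction: an unread legacy key can cause extra gating, but it cannot hide a real defect.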

… resolution check_policy_cells claimed to "mirror mcp._load_policy_cell_registry" but the root fallback differs: the resolver uses os.getcwd() when LEGIS_SOURCE_ROOT is unset, while doctor uses its passed-in root. The env precedence is faithfully mirrored; the root resolution is a deliberate difference (they coincide when doctor runs from the server's launch CWD). Tighten the docstring to say so. Docstring-only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…oute root (opp #6) scan_route returned `{outcome: ROUTED, routed:[...]}` with no top-level posture field, so an agent relaying "governance passed" could not tell a keyless dev-grade pass (unverified/dirty) from a CI-signed `verified` pass — the posture was only buried in each routed record's provenance, and absent entirely when nothing routed. Same vacuous-green fidelity gap as wardline W2. - `route_wardline_scan` now returns `RoutedScan(routed, artifact_status)` instead of a bare list, surfacing the scan-level `artifact_status` that `verify_wardline_artifact` already computes - both surfaces echo it at the response root: the MCP `scan_route` tool and the HTTP `/scan-route` adapter (identical contract) - new MCP test asserts a keyless unsigned scan echoes `artifact_status: "unverified"` at the top level; the exact-shape routing test gains the field Closes gap-analysis opp #6. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
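
A minimal sketch of the return-shape change, assuming a `NamedTuple` stands in for the real `RoutedScan`. The helper `scan_route_response` is hypothetical and the response shape is simplified from the one quoted in the commit message.

```python
from typing import Any, NamedTuple


class RoutedScan(NamedTuple):
    """Scan result carrying the scan-level posture next to the routed records."""

    routed: list[dict[str, Any]]
    artifact_status: str  # e.g. "verified" or "unverified"


def scan_route_response(scan: RoutedScan) -> dict[str, Any]:
    # Hypothetical helper: echo the posture at the response root so it is
    # present even when nothing routed.
    return {
        "outcome": "ROUTED",
        "routed": scan.routed,
        "artifact_status": scan.artifact_status,
    }
```

With the posture at the root, a caller can tell a keyless dev-grade pass from a signed one without inspecting each routed record's provenance.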

…filigree-scope check (N1)
Close two release-1.0 risk-audit gaps:
POLICY-1 — a pinned, running evidence test could be disabled after the fact with @pytest.mark.skip / skipif / xfail. The fingerprint is blind to decorators (Q-L5 parity), so the drift check is byte-identical and cannot see the disablement. Add a highest-priority disabled-evidence judgement in the shared evaluate_test_evidence so both the runtime gate and the static boundary scanner reject it identically (new POLICY_BOUNDARY_TEST_DISABLED). Marker match is terminal-name based, so it catches the import-alias form (`from pytest import mark; @mark.skip`) whose only tell lives outside the function source the fingerprint sees.
N1 — add report-only check_filigree_binding_scope to doctor: an unscoped federation-write binding in .mcp.json (/api/weft/… etc.) is fail-closed with HTTP 400 by a filigree server-mode daemon, so scans silently non-emit. Warn (not error — harmless against single-project/stdio) and name the offending URL + the scoped form to use.

/governance/lineage-integrity computed status as "unverified" if unavailable else "verified", ignoring integrity.divergences. A confirmed external tamper (divergence list populated) reported status="verified" — a false green at the top-level posture while the same payload carried the divergence. Three-way precedence: any divergence -> "diverged" (most severe, confirmed tamper) over "unverified" (can't check) over "verified". The existing divergence test pinned the divergences list but pointedly omitted the status assertion; pin status="diverged" so the false green cannot regress. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
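
The three-way precedence can be sketched as a small pure function. The name `lineage_status` and its parameters are hypothetical; only the three status strings and their ordering come from the commit message.

```python
def lineage_status(unavailable: bool, divergences: list) -> str:
    # Most severe first: a confirmed divergence outranks "could not check",
    # which in turn outranks a clean pass.
    if divergences:
        return "diverged"
    if unavailable:
        return "unverified"
    return "verified"
```

Checking `divergences` before `unavailable` is the point of the fix: the old two-way expression never looked at the divergence list at all.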

… anchor (AUD-1)
An attacker with DB-file write access could delete an audit record and re-chain the survivors undetectably: the hash chain is plain SHA (keyless, recomputable) and the HMAC bound record *content* but never its chain *position*, so every surviving signature still verified and the chain stayed internally consistent. service/governance.py already documented that whole-trail verify catches mutation but not deletion.
Two complementary, isolated mechanisms now close it:
* seq-binding (v3) + contiguity — interior delete and reorder. verify_integrity gains an expected-seq counter (a re-chained gap is now a tamper), and protected + sign-off verdicts sign at v3, folding the chain seq into the HMAC. A renumber-to-hide-a-deletion then fails to verify at the new position. seq is taken from the column at verify time, never a payload field. Resolved the sign-before-seq ordering with a store-mediated append_signed: the store reserves seq + prev_hash under its BEGIN IMMEDIATE lock and hands them to a signer callback, so the bound seq is provably the row's seq with no race. The store stays key-agnostic (the callback closes over the gate's key).
* HeadAnchor (opt-in) — tail-truncation, the one thing seq-binding structurally cannot catch (a truncated head is legitimately last). A small HMAC-signed sidecar remembers the last (seq, chain_hash); a missing anchor on an anchored store fails closed. Wired as optional gate/verifier params, off by default — conceded-capability hardening that does not touch the 1.0 core.
The shared sign()/verify() primitive keeps its v2 default, so the cross-tool Wardline artifact contract and the binding ledger are byte-for-byte untouched. Binding ledger stays v2 (separate, homogeneous store) but is covered by the new contiguity check; renumber-within that store is a documented residual, as is the inherent renumber-vulnerability of an all-unsigned (chill/coached) run.
Tests: three attack PoCs, each isolating one mechanism (interior-delete-gap → contiguity; delete-and-renumber → v3 seq-HMAC; tail-truncate → anchor), plus HeadAnchor unit coverage (forged/missing/reappend/no-op) and a v3 signing pin. Full suite 793 passed, 2 skipped.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
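
The two interior-tamper mechanisms can be illustrated with a toy trail. This is a sketch under stated assumptions: the canonical message layout, the `hmac-sha256:v3:` prefix (assumed by analogy with the v2 prefix quoted elsewhere in this PR), a seq that starts at 1, and the names `sign_v3` and `verify_trail` are all hypothetical rather than the legis implementation.

```python
import hashlib
import hmac
import json


def sign_v3(key: bytes, content: dict, seq: int) -> str:
    # Fold the chain position into the signed message (assumed layout).
    msg = json.dumps({"content": content, "seq": seq}, sort_keys=True).encode()
    return "hmac-sha256:v3:" + hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_trail(key: bytes, rows: list) -> list:
    """rows: (seq, content, signature) tuples in stored order; returns problems."""
    problems = []
    expected = 1  # assumption: seq starts at 1 and is contiguous
    for seq, content, signature in rows:
        if seq != expected:
            problems.append(f"gap: expected seq {expected}, found {seq}")
        # seq comes from the row position column, never from the payload.
        if not hmac.compare_digest(signature, sign_v3(key, content, seq)):
            problems.append(f"bad signature at seq {seq}")
        expected = seq + 1
    return problems
```

An interior delete leaves a gap that the contiguity counter reports. Renumbering the survivors to hide the gap moves a record to a seq its signature was not computed over, so the HMAC check fails instead. Neither check sees a truncated tail, which is the case the anchor covers.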

…ail-loss (AUD-3) The audit store ran synchronous=NORMAL under WAL. NORMAL only fsyncs the WAL at a checkpoint, so a committed-but-not-yet-checkpointed append is lost on a power-cut while the database stays consistent. The survivors form a contiguous, fully-signed hash chain — a valid-looking SHORTENED trail indistinguishable from "nothing more was ever written". For an audit-integrity store that silent tail-loss is precisely the harm. Set synchronous=FULL: each commit is fsynced, so a committed governance record survives power loss; throughput is the correct thing to trade here. The floor is intentionally not configurable — an audit store's durability must not be lowerable back to the bug. SQLite's default wal_autocheckpoint still bounds WAL growth, so no separate checkpoint lifecycle is needed. This is the prevention half of the shortened-trail problem; AUD-1's out-of-band head anchor is the detection half (it flags a trail that shrank below its recorded head, whether by malice or by lost-tail). Pinned by reading PRAGMA synchronous (==2 FULL) on a listener connection, mirroring the existing WAL/busy_timeout pragma tests. Full suite 795 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
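
A minimal sketch of the durability floor and its pin, using only the standard `sqlite3` module. The function names are hypothetical, and the real store sets more pragmas (the commit mentions busy_timeout) than are shown here.

```python
import sqlite3


def open_audit_store(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # FULL syncs the WAL on every commit, so a committed record survives
    # power loss. The level is hard-coded rather than configurable.
    conn.execute("PRAGMA synchronous=FULL")
    return conn


def synchronous_level(conn: sqlite3.Connection) -> int:
    # SQLite reports 0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA.
    return conn.execute("PRAGMA synchronous").fetchone()[0]
```

A test in the style the commit describes would open the store on a file path and check that `synchronous_level(conn)` equals 2.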

…ed limit (AUD-1 red-team) An adversarial review of the AUD-1 anchor (5 red-team lanes, executed PoCs) refuted every interior-delete / reorder / renumber / version-downgrade / seq-soundness attack and confirmed the Wardline v2 contract is byte-for-byte intact (201-test regression sweep green). It found one genuine residual: the anchor's HMAC stops forgery but not REPLAY. The anchor is a single mutable sidecar, so a snapshotting attacker can save a genuinely-signed early anchor (head=1), let the trail grow, truncate the DB back to seq=1, and restore the saved anchor — it verifies (real signature, consistent seq + chain_hash) and the rollback goes undetected. This is inherent to local same-filesystem storage: nothing on disk is beyond a file-write attacker's rollback, so no purely-local check (counter, timestamp, extra copy) closes it — that would be honesty theatre. The fix is a deployment property: store the anchor on append-only/WORM or remote storage, or run an external monitor on the anchored head's monotonicity. The prior docstring over-claimed it detects "a rollback to an earlier consistent prefix" — false under replay. Corrected to state precisely what it catches (forgery; truncation by a late/non-snapshotting attacker) and the replay limitation + its real mitigation. Pinned the boundary with an executable known-limitation test so the over-claim cannot silently drift back. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
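
The forgery-versus-replay distinction can be shown with a toy anchor. This is an illustrative sketch only: `sign_anchor`, `anchor_accepts`, the dictionary layout, and the list-of-hashes trail are hypothetical, not the HeadAnchor implementation.

```python
import hashlib
import hmac
import json


def sign_anchor(key: bytes, seq: int, chain_hash: str) -> dict:
    msg = json.dumps({"seq": seq, "chain_hash": chain_hash}, sort_keys=True).encode()
    sig = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return {"seq": seq, "chain_hash": chain_hash, "sig": sig}


def anchor_accepts(key: bytes, anchor: dict, trail: list) -> bool:
    """trail: chain hashes in order; index i holds the hash for seq i + 1."""
    expected = sign_anchor(key, anchor["seq"], anchor["chain_hash"])["sig"]
    if not hmac.compare_digest(anchor["sig"], expected):
        return False  # forged or edited anchor
    if len(trail) < anchor["seq"]:
        return False  # trail shrank below the anchored head
    return trail[anchor["seq"] - 1] == anchor["chain_hash"]


key = b"demo-key"  # illustrative key
trail = ["h1", "h2", "h3"]
early_anchor = sign_anchor(key, 1, "h1")  # snapshot saved by the attacker
late_anchor = sign_anchor(key, 3, "h3")   # the anchor after the trail grew
truncated = trail[:1]

truncation_detected = not anchor_accepts(key, late_anchor, truncated)
replay_undetected = anchor_accepts(key, early_anchor, truncated)
```

Both flags come out true: the current anchor catches the truncation, but restoring the genuinely signed early anchor makes the rolled-back trail verify. That is why the mitigation named above is a storage or monitoring property rather than another local check.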

…st-marker-only (INSTALL-1) The injector deliberately tolerates a split brain: when a second legis instruction block sits beyond a sibling tool's block, it cannot canonicalise across the foreign block, so it rewrites the first block fresh, warns, and leaves the stale second copy in place (foreign-safety wins over own-dedup). The doctor's freshness probe, though, read the token off the FIRST marker only (_MARKER_TOKEN_RE.search → first match) — so a fresh first block masked a stale second block and the doctor reported "healthy" on exactly the conflicting-guidance state it exists to catch. Freshness now requires EXACTLY ONE legis block at the current token, via a new foreign-aware walk (_own_open_marker_tokens) that reuses the injector's own fence-tracking — a legis marker quoted inside a sibling block is not counted, so the probe never miscounts a documented example as a real block. check_instruction_block surfaces a split brain (>1 block) with an actionable hand-resolution message and, since the injector cannot collapse it, does not falsely claim repair fixed it. This is the same honesty discipline as GOV-1/POLICY-1: a gate must not report green on the condition it exists to detect. RED test pinned the false-"ok" first; both CLAUDE.md and AGENTS.md get the fix via the shared check. Full suite 797 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

HttpLoomweaveIdentity.capability() probed GET /api/v1/_capabilities with an explicit signed=False, so the request went out unsigned even when an HMAC key was provisioned — the lone unsigned exception among the SEI routes, and the very one that establishes whether legis trusts the provider as SEI-capable. On a keyed deployment that left the trust-establishing handshake unauthenticated, spoofable to capability=supported. Sign it like every other route (the default path already no-ops signing when no key is set, so loopback/trusted deployments are unchanged). Removed the per-call `signed` knob from _request entirely: an unsigned opt-out is exactly the affordance that caused this, and no other caller used it — so it cannot reintroduce the gap. Wire confidentiality against an on-path response rewrite remains TLS's job, which _validate_base_url already enforces for any non-loopback (keyed) host. RED-pinned the unsigned probe ({} headers when keyed) before the fix; added a companion test that the keyless probe stays bare. Full suite 799 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… guard (JUDGE-1) In the coached cell a model ACCEPTED maps straight to accepted=True, and the agent-controlled rationale (and, on the degraded-to-locator branch, the entity locator) flowed into the judge prompt with no length bound — so a prompt-stuffing payload could bury the instruction or smuggle an injection into the model. LLMJudge.evaluate now bounds the SERIALIZED request — {policy, entity, rationale} exactly as build_prompt embeds it — at MAX_JUDGE_REQUEST_CHARS (8192) before the model is consulted; over-cap is rejected as BLOCKED by a deterministic guard that never calls the model (stamped with a self-documenting sentinel model id, not an LLM identity). Measuring the serialized request (not the raw rationale) bounds every agent-settable field in one check — rationale, entity locator, and the ensure_ascii unicode-expansion variant (each non-ASCII char → 6-char \uXXXX, so a raw-char cap would be 6x loose). Reject, never truncate: truncation would mutate the rationale that is recorded and (protected cell) signed, and could pass a front-loaded injection. The full over-cap rationale is still written to the BLOCKED record, so the attempt stays attributable. build_prompt's serialization (the structural-escape defense — a forged sibling {"verdict":"ACCEPTED"} survives only as an escaped string value) is now pinned by a round-trip test covering rationale AND entity injection (JUDGE-2). The module docstring documents the residual honestly: a SEMANTIC injection that persuades the model is a model-robustness property, not a code fail-open — mitigated by attribution and, in the protected cell, by Q-H3's deterministic validator. TDD: RED-pinned both stuffing vectors (rationale + entity reaching an accepting model) and the model-never-consulted property before the guard; added an in-cap boundary test so a thorough justification is not falsely blocked. Full suite 803 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
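
Why the cap is measured on the serialized request can be sketched with `json.dumps`. The constant name and value come from the commit message; `serialized_request` and `stuffing_guard` are hypothetical, and the real `build_prompt` embedding may differ in detail.

```python
import json
from typing import Optional

MAX_JUDGE_REQUEST_CHARS = 8192


def serialized_request(policy: str, entity: str, rationale: str) -> str:
    # ensure_ascii=True expands each non-ASCII BMP character to a 6-char \uXXXX.
    return json.dumps(
        {"policy": policy, "entity": entity, "rationale": rationale},
        ensure_ascii=True,
    )


def stuffing_guard(policy: str, entity: str, rationale: str) -> Optional[str]:
    """Return "BLOCKED" for an over-cap request, or None to proceed to the model."""
    if len(serialized_request(policy, entity, rationale)) > MAX_JUDGE_REQUEST_CHARS:
        return "BLOCKED"  # rejected without truncating or consulting the model
    return None
```

A rationale of 2000 accented characters is well under 8192 raw characters but serializes to roughly 12000, so a raw-length cap would let it through while this check rejects it. An oversized entity locator is caught by the same measurement.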

…-2, CRYPTO-THRESHOLD-001)
Closes the last three low/post-1.0 items from docs/release-1.0-risk-audit.md.
POLICY-2 (this session) — remove the exemption-rescue mechanism outright. PolicyGrammar had a VIOLATION->CLEAR exemption-rescue branch wired to an agent-writable YAML loader (ExemptionAllowlist.from_file) with zero src consumers — the latent bypass trap the finding names. Full removal: delete policy/exemptions.py + tests/policy/test_exemptions.py, drop the exemptions ctor param / _exemptions / rescue branch from grammar.py, and remove the 3 rescue-branch tests. New regression guard test_grammar_has_no_exemption_rescue_mechanism pins that no exemption seam can be re-introduced by accident. This supersedes the earlier conservative document-only closure of legis-e512e97bfc (see ticket history): documenting around the loader left the trap in the tree.
AUTH-1 (doc) — app.py comment telegraphs that LEGIS_ALLOW_UNSCOPED_API_TOKENS=1 grants unscoped tokens operator authority (not renamed: the var already fits the LEGIS_ALLOW_<bad-thing> family; audit remedy was "rename OR document").
CRYPTO-THRESHOLD-001 (doc) — README scopes the "cryptographic layer" to intra-suite HMAC tamper-evidence with a self-asserted actor, not third-party cryptographic proof; names RFC-8785 as the upgrade path.
Full suite green (792 passed, 2 skipped), ruff clean on changed files.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Resolve the 6 standing lint errors (default ruff E4/E7/E9/F ruleset):
- test_doctor.py: 5x E402 (module-level imports placed under mid-file section headers) — consolidated into the top import block; section comments kept.
- test_install.py: 1x F401 — dropped the unused `_legis_mcp_entry` import.
No behaviour change. Full suite green (792 passed, 2 skipped), ruff clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Second adversarial pre-ship review (docs/release-1.0-pre-ship-review.md) re-attacked the prior audit's self-verified fixes. Crypto-threshold held; these gaps it surfaced are now closed, each independently re-verified.
- JUDGE-3 (protected-cell fail-open): the Q-H3 advisory-downgrade was gated on exact-match `protected_policies`, which diverges from the glob-capable cell routing — a protected-cell policy outside the set (incl. any glob route and the empty-set default) had its model ACCEPTED signed authoritative. The cell is now fail-closed UNCONDITIONALLY: it clears only on a validator-confirmed ACCEPTED. Independent re-attack then caught a second variant — a fooled model emitting the operator-only OVERRIDDEN_BY_OPERATOR (which _record_signed also counts as accepted) cleared the gate even for a declared protected policy. Closed at two layers: the judge JSON parser now restricts verdicts to {ACCEPTED, BLOCKED}, and submit() downgrades the whole accepted-set. Behavior change: with no validator wired (default prod), protected overrides now require operator sign-off. Regression tests at parser and gate levels.
- GOV-2: /governance/identity-gaps now returns a {status, gaps} envelope ("unavailable" vs "checked") so a can't-check state is not a false all-clear, matching the GOV-1 fix on the sibling lineage-integrity endpoint.
- F1: TrailVerifier docstring corrected — no longer claims modify-to-unsigned is caught; the modify-to-unsigned / tail-truncation residuals of the conceded raw-file-write tier are documented honestly (code hardening tracked post-1.0).
- POLICY-1: aliased-marker (`skipper = pytest.mark.skip; @skipper`) and fixture-skip vectors documented as residuals in _disabling_marker (zero live @policy_boundary sites; name-heuristic hardening tracked post-1.0).
- ID-SEI-1: LEGIS_ALLOW_INSECURE_REMOTE_HTTP now warns on a remote-plaintext bypass (loomweave + filigree clients); documented in README + federation doc.
- ID-SEI-2: resolver `alive` is now strict-bool; a non-bool truthy value degrades fail-closed instead of promoting to a stable SEI identity.
- README "Known security limitations" section + CHANGELOG entries.
Suite 801 passed / 2 skipped; ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
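
Two of the fail-closed rules above (the parser-level verdict restriction and the strict-bool `alive` check) can be sketched in a few lines. Both function names are hypothetical, and mapping unparseable or out-of-set output to `BLOCKED` is an assumption about how the real parser fails closed.

```python
import json

MODEL_VERDICTS = {"ACCEPTED", "BLOCKED"}


def parse_model_verdict(raw: str) -> str:
    # Anything the model is not allowed to say, including the operator-only
    # OVERRIDDEN_BY_OPERATOR, is treated as BLOCKED (assumed behavior).
    try:
        verdict = json.loads(raw).get("verdict")
    except (ValueError, TypeError, AttributeError):
        return "BLOCKED"
    if isinstance(verdict, str) and verdict in MODEL_VERDICTS:
        return verdict
    return "BLOCKED"


def is_alive(value: object) -> bool:
    # Strict bool: truthy non-bool values such as "true" or 1 do not count.
    return value is True
```

The identity comparison `value is True` is what makes the second check strict; `bool(value)` would promote any non-empty string.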

doctor:
- `--fix` is now the canonical repair flag; `--repair` stays a working alias (argparse dest `fix`), so no script breaks.
- DoctorCheck gains a `repairable` bit; text view tags each problem `[fixed]` / `[auto-fixable]` / `[operator]` with footers that point auto-fixable items at `legis doctor --fix` and tell the operator that `[operator]` items need out-of-band config + a relaunch. JSON checks carry `repairable` additively.
- `install.filigree_scope` is gated on filigree actually being installed (file-existence probe, no filigree import): the unscoped-binding warning only fail-closes against a server-mode filigree daemon, so it is noise when filigree is absent. When it fires, the message names it operator-owned (the `--filigree-url` is operator-pinned in wardline's `.mcp.json`) and stays repairable=False.
tidy for 1.0 (version held at rc4 per the live-e2e gate):
- README + doctor docstring use the canonical `--fix` spelling.
- CHANGELOG [Unreleased] records the above.
- .gitignore ignores `.claude/*.lock` (transient scheduled-tasks lock).
- removed stray build artifacts (.coverage, coverage.json).
Full suite green (813 passed, 2 skipped), ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
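
The flag alias and the tagging rule can be sketched with `argparse`. The parser layout and the helper `problem_tag` are hypothetical; only the two flag spellings, the shared `fix` destination, and the three tag strings come from the commit message.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="legis")
    sub = parser.add_subparsers(dest="command")
    doctor = sub.add_parser("doctor")
    # Both spellings write to the same destination, so old scripts keep working.
    doctor.add_argument("--fix", "--repair", dest="fix", action="store_true")
    return parser


def problem_tag(fixed: bool, repairable: bool) -> str:
    # Hypothetical helper mapping a failed check to its text-view tag.
    if fixed:
        return "[fixed]"
    return "[auto-fixable]" if repairable else "[operator]"
```

Listing `--fix` first means argparse shows it first in the help output while `--repair` remains accepted.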

The README covers the *why* (the 2×2 concept) and the legis-workflow skill covers the *agent-call* surface, but there was no human-operator guide for "how do I configure this" and "what am I seeing when an agent does X". Adds docs/guide/:
- configuration.md — the operator's governance-control reference: reconciles "zero human config" (the agent's experience) with the operator's two acts (choose the cell, hold the key); per-cell cost/buys table; the fail-closed routing default + resolution order; full LEGIS_* / OPENROUTER_* env-var reference grouped by purpose; and a separate, warning-carrying "dev-only / escape hatches" section for the LEGIS_UNSAFE_* / LEGIS_ALLOW_* flags.
- reading-legis-output.md — organized by "where it surfaces / what it means / do I act": keeps the recorded Verdict (ACCEPTED/BLOCKED/OVERRIDDEN_BY_OPERATOR) distinct from the override_submit outcome envelope (ACCEPTED_SELF / ACCEPTED_BY_JUDGE / BLOCKED / ESCALATED_PENDING / NEED_INPUTS); covers scan outcomes, artifact/identity/lineage statuses, the override-rate gate, CI exit codes, doctor tags, and flags the only signals that need a human in real time.
- README.md (index) + links from the top-level README.
Every flag/enum/command cited was verified against source (e.g. dropped a spurious OPENROUTER_BASE_URL row that was a grep artifact of the DEFAULT_OPENROUTER_BASE_URL constant, not a real env var).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The reference tables answer "what does signal Y mean / do I act"; a single compact narrative (agent hits a coached policy → BLOCKED → revise → ACCEPTED_BY_JUDGE → async review, with the structured ESCALATED_PENDING contrast) converts the reference into the mental model behind the user's literal question, "what am I seeing when an agent does X". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…gree's install predicate
Two corrections to the doctor checks landed in 84a8047:
- **Split-brain instruction block is not auto-fixable.** `--fix` returns before the repair branch for the >1-block split-brain case (the injector won't splice across a sibling tool's block), so tagging it `repairable=True` rendered a false `[auto-fixable]` signal that re-creates the very --fix loop the design eliminates. Now `repairable=False` → `[operator]`, matching the check's own "resolve it by hand" message. (Corrects the tag shipped in 84a8047.)
- **`_filigree_installed` now mirrors filigree's real install predicate.** It was an AND requiring `.filigree.conf` AND a `config.json`; filigree's `find_filigree_anchor` (core.py:1046-1064) treats a project as installed if ANY of three markers is present: `.filigree.conf` (file), `.weft/filigree/` (dir), or `.filigree/` (dir) — never AND, and the store/legacy checks are `.is_dir()`, not a `config.json` `.is_file()`. The old AND would return "not installed" for confless / legacy / conf-only installs and SILENTLY DROP a real unscoped-binding warning where filigree genuinely is installed — the false-green the governance honesty discipline forbids.
Tests updated to cover conf-only, confless-weft, and confless-legacy installs (the last is the live federation-legacy-path case). Full suite green (815 passed, 2 skipped), ruff + mypy clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
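
The corrected ANY-of-three predicate can be sketched with `pathlib`. The function name `filigree_installed` is a hypothetical stand-in for `_filigree_installed`; the three markers and their file-versus-directory kinds are taken from the commit message.

```python
from pathlib import Path


def filigree_installed(root: Path) -> bool:
    # Installed if ANY marker is present; no filigree import is needed.
    return (
        (root / ".filigree.conf").is_file()
        or (root / ".weft" / "filigree").is_dir()
        or (root / ".filigree").is_dir()
    )
```

Under the old AND form, a project with only `.filigree/` or only `.filigree.conf` would have read as "not installed", which is the case that dropped the warning.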

…te cell trap, envelope next_action
LEG-1: add the policy_list tool (routing table + each cell's honest enabled state, computed via a shared explain_cell so it can never disagree with policy_explain) and an additive matched_rule field on policy_explain (a configured policy reports its rule pattern; an unconfigured/hallucinated name reports null). cell_for now delegates to a new rule_for() so routing and discovery cannot drift.
LEG-2: the error envelope already carries next_action/recoverable for every code (_recovery_for); reconcile the SKILL.md error table to it verbatim and add one drift-lock test asserting every emitted code yields a non-empty next_action. No new abstraction.
LEG-3: scan_route's server-owned rejection now names the rejected request-side arg(s) (cell/severity_map/fail_on) while retaining the literal 'server-owned' substring; the cell/severity_map/fail_on schema descriptions state the LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING gating.
Additive only; no routing/enablement/tiering semantics changed. ruff + mypy clean; full suite 825 passed, 2 skipped (+10 tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Version 1.0.0rc4 -> 1.0.0 across pyproject, legis.__version__ (feeds the MCP serverInfo, /health, and `legis --version`), and uv.lock. CHANGELOG [Unreleased] -> [1.0.0] (2026-06-09) with refreshed compare links.
1.0 release-prep hygiene (same pass):
- README points to the now-public adversarial threat model — the risk audit and the independent pre-ship review, attack recipes and all — framed as the "forced me to do the right thing" discipline it is.
- Dropped the rc1 "Known limitations" list from the changelog: the MCP item was superseded at rc2; the live sibling-gated items moved to the Filigree tracker (outstanding work belongs in the tracker, not the log).
No code behavior change — version strings + docs only. Full suite green (825 passed, 2 skipped; ruff + mypy clean).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR cuts the 1.0.0 final release (from 1.0.0rc4) and includes the release-prep hardening and documentation that accompanied the adversarial reviews described in the PR metadata. It also updates several governance/attestation invariants (audit-trail tamper evidence, Wardline schema interop, MCP/operator surfaces) and expands tests/docs to pin the intended fail-closed behaviors.

Changes:

Bump versioning and release notes to 1.0.0 across package metadata and changelog.
Strengthen governance/audit integrity and posture reporting (v3 signatures with chain_seq, head anchor support, WAL durability pragma, structured skip payloads, more explicit routing errors).
Update Wardline ingest to the suppression_state wire key and add new/expanded tests and operator/agent-facing documentation.

Reviewed changes

Copilot reviewed 67 out of 69 changed files in this pull request and generated 2 comments.

File	Description
uv.lock	Update editable package version to 1.0.0.
tests/wardline/test_policy.py	Update test fixtures to use `suppression_state`.
tests/wardline/test_ingest.py	Update ingest tests for `suppression_state` and add golden/skip payload tests.
tests/wardline/test_governor.py	Update governor tests for `suppression_state`.
tests/wardline/test_coached_routing.py	Update coached routing tests for `suppression_state`.
tests/test_install.py	Remove unused import in MCP registration test.
tests/store/test_head_anchor.py	Add tests for the out-of-band head anchor behavior and limitations.
tests/store/test_batch_read_free_invariant.py	Update finding fixtures to use `suppression_state`.
tests/store/test_audit_store.py	Add/expand tests for `synchronous=FULL` and contiguity integrity checks.
tests/service/test_wardline.py	Add tests pinning improved Wardline routing error messages.
tests/service/test_governance.py	Update signing fields tests to include `seq` binding.
tests/service/test_explain.py	Pin `matched_rule` reporting in explain payloads.
tests/policy/test_honesty_gate.py	Add tests ensuring disabled evidence tests are rejected (POLICY-1).
tests/policy/test_grammar.py	Pin removal of exemptions seam and update related expectations.
tests/policy/test_exemptions.py	Remove exemptions tests (feature removed).
tests/policy/test_evidence.py	Add evaluator tests for disabled-marker detection (skip/xfail/skipif).
tests/policy/test_boundary_scan.py	Add end-to-end scan tests for disabled evidence tests.
tests/mcp/test_server.py	Add `policy_list` tool coverage, posture echoing, and next_action invariants.
tests/identity/test_resolver.py	Add test ensuring non-bool `alive` does not promote stable identity.
tests/identity/test_loomweave_client.py	Add tests for signed capability probe and insecure-HTTP warnings.
tests/filigree/test_client.py	Add tests for insecure-HTTP warning and enforcement behavior.
tests/enforcement/test_trail_verify.py	Add tests for seq-binding and anchored tail-truncation detection.
tests/enforcement/test_signoff.py	Update expected signature prefix to v3 for sign-offs.
tests/enforcement/test_signing.py	Add v3 signing/verification primitive tests.
tests/enforcement/test_regressions.py	Remove exemptions regression test (feature removed).
tests/enforcement/test_protected_submit.py	Update protected submit tests for fail-closed behavior + v3 binding.
tests/enforcement/test_protected_override.py	Update operator override signature expectations to v3.
tests/enforcement/test_protected_extensions.py	Update signature verification reconstruction to use `seq`.
tests/enforcement/test_judge.py	Add tests for operator-only verdict rejection and prompt-size cap.
tests/api/test_sei_api.py	Update API tests for new identity-gaps envelope and protected validator wiring.
tests/api/test_complex_api.py	Update API tests for fail-closed protected behavior + identity-gaps envelope.
tests/api/test_combinations_api.py	Update API tests to use `suppression_state` and pin structured dirty-skip fields.
src/legis/wardline/ingest.py	Implement `suppression_state`, structured dirty-tree skip payload, and updated active-defects logic.
src/legis/store/protocol.py	Extend store protocol with `append_signed` and head query support.
src/legis/store/head_anchor.py	Add new HeadAnchor implementation for tail-truncation detection.
src/legis/store/audit_store.py	Add `append_signed`, contiguity checks, `synchronous=FULL`, and head query helper.
src/legis/service/wardline.py	Improve routing error messages and return scan-level posture in routing result.
src/legis/service/explain.py	Add `matched_rule` and refactor explain plumbing (`explain_cell`).
src/legis/policy/grammar.py	Remove exemptions seam from policy grammar.
src/legis/policy/exemptions.py	Remove exemptions implementation (feature removed).
src/legis/policy/evidence.py	Add disabling-marker detection and return `disabled` evidence results.
src/legis/policy/cells.py	Add `rule_for` and expose rule list for routing introspection.
src/legis/policy/boundary_scan.py	Map `disabled` evidence outcome to a dedicated rule id.
src/legis/mcp.py	Add `policy_list`, improve `scan_route` output posture, and enrich recovery hints.
src/legis/install.py	Add split-brain detection helper for multiple legis instruction blocks.
src/legis/identity/resolver.py	Fail-closed `alive` handling requiring strict boolean True.
src/legis/identity/loomweave_client.py	Sign capability probe when keyed and warn on insecure remote HTTP.
src/legis/filigree/client.py	Warn on insecure remote HTTP when bypass flag is set.
src/legis/enforcement/signoff.py	Bind sign-off signatures to seq (v3) and optionally advance head anchor.
src/legis/enforcement/signing.py	Add v3 signature prefix support and verification dispatch.
src/legis/enforcement/protected.py	Add v3 seq-binding, head-anchor checking, and protected fail-closed logic.
src/legis/enforcement/judge.py	Add prompt-size cap guard and restrict allowed judge verdicts.
src/legis/data/skills/legis-workflow/SKILL.md	Document `policy_list` and updated error recovery hints.
src/legis/cli.py	Add canonical `legis doctor --fix` flag (keep `--repair` alias).
src/legis/api/app.py	Improve identity-gaps honesty envelope, lineage-integrity status, and Wardline responses.
src/legis/__init__.py	Bump `__version__` to 1.0.0.
README.md	Update release status and add security limitation + operator docs sections.
pyproject.toml	Bump project version to 1.0.0.
docs/release-1.0-risk-audit.md	Add published pre-release adversarial audit doc.
docs/release-1.0-pre-ship-review.md	Add published second-pass adversarial review doc.
docs/guide/README.md	Add operator guide index.
docs/guide/reading-legis-output.md	Add operator guide for interpreting outcomes/verdicts/statuses.
docs/guide/configuration.md	Add operator configuration guide and env var reference.
docs/federation/sei-conformance.md	Document TLS custody seal dependency and insecure-HTTP bypass implications.
docs/design/legis-charter.md	Expand charter note about self-asserted actor identity in records and federation writes.
CHANGELOG.md	Add 1.0.0 entry summarizing security/honesty closures and surface changes.
.gitignore	Ignore Claude Code transient lock files.


+        anchored_seq = body.get("head_seq")
+        anchored_chain = body.get("head_chain_hash")
+        if not sig or anchored_seq is None or anchored_chain is None:
+            raise AnchorError(f"head anchor {self._path} is structurally malformed")
+        if not verify(_anchor_fields(anchored_seq, anchored_chain), sig, self._key):
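The excerpt above checks the anchor's structure before verifying its signature. A self-contained sketch of that flow follows, under loud assumptions: HMAC-SHA256 as the signature scheme, sorted-key JSON as the canonical field encoding, and `"sig"` as the signature key are all guesses, since the diff does not show them. `write_anchor` and `check_store_against_anchor` are hypothetical helpers; only `AnchorError`, `verify`, `_anchor_fields`, `head_seq`, and `head_chain_hash` are names taken from the excerpt.

```python
import hashlib
import hmac
import json


class AnchorError(Exception):
    """Raised when the head anchor is malformed or fails verification."""


def _anchor_fields(head_seq: int, head_chain_hash: str) -> bytes:
    # Assumed canonical encoding; the real one is not shown in the diff.
    fields = {"head_seq": head_seq, "head_chain_hash": head_chain_hash}
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode()


def sign(fields: bytes, key: bytes) -> str:
    return hmac.new(key, fields, hashlib.sha256).hexdigest()


def verify(fields: bytes, sig: str, key: bytes) -> bool:
    return hmac.compare_digest(sign(fields, key), sig)


def write_anchor(head_seq: int, head_chain_hash: str, key: bytes) -> dict:
    """Hypothetical: build the anchor body an operator would store out of band."""
    sig = sign(_anchor_fields(head_seq, head_chain_hash), key)
    return {"head_seq": head_seq, "head_chain_hash": head_chain_hash, "sig": sig}


def check_anchor(body: dict, key: bytes) -> tuple:
    """Structural check first, then signature check, mirroring the excerpt."""
    sig = body.get("sig")
    anchored_seq = body.get("head_seq")
    anchored_chain = body.get("head_chain_hash")
    if not sig or anchored_seq is None or anchored_chain is None:
        raise AnchorError("head anchor is structurally malformed")
    if not verify(_anchor_fields(anchored_seq, anchored_chain), sig, key):
        raise AnchorError("head anchor signature does not verify")
    return anchored_seq, anchored_chain


def check_store_against_anchor(store_seq: int, store_chain: str,
                               body: dict, key: bytes) -> None:
    """Hypothetical: fail if the store head is behind or diverges from the anchor."""
    anchored_seq, anchored_chain = check_anchor(body, key)
    if store_seq < anchored_seq:
        raise AnchorError("store head is behind the anchor (truncation or rewind)")
    if store_seq == anchored_seq and store_chain != anchored_chain:
        raise AnchorError("store head diverges from the anchored chain hash")
```

This sketch does not address anchor replay: restoring an older, validly signed anchor alongside a truncated store would still pass these checks.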


+
+- **The coached cell is a model-robustness wall, not a cryptographic one.** A blocked agent clears the coached gate by convincing the LLM judge; a *malicious prompt injection* that persuades the model will likewise clear it. Structural injection (forging a verdict key) is closed and any transport/parse failure is fail-closed to `BLOCKED`, but the coached cell has no defense-in-depth against a model that is genuinely fooled. For verdicts that must not rest on the model's word, use the **protected** cell, where a judge `ACCEPTED` is advisory only and is downgraded to require operator sign-off (unless a deterministic, non-LLM validator confirms it).
+- **Tamper-evidence assumes the signing key is out of the attacker's reach, and is not absolute against raw DB-file writes.** v3 signing binds each record's chain position, so in-place edits, reordering, and renumbering are detected. A holder of raw write access to the governance `.db` can still *delete* a record and re-chain, or rewrite a record's policy to a non-protected value and strip its protected markers ("modify-to-unsigned"), or truncate the tail — these are residuals of the conceded raw-file-write threat tier. The opt-in `HeadAnchor` mitigates truncation/rewind (with a documented anchor-replay caveat). Keep the governance store on storage only the operator controls.
+- **Durability tier.** The audit store runs `synchronous=FULL`, but a power loss can still drop the most recent un-checkpointed appends; the trail stays internally consistent (a shortened-but-valid tail), it does not corrupt.
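The claim that binding each signature to its chain position detects in-place edits, reordering, and renumbering can be illustrated with a toy trail. This is a sketch under stated assumptions, not the legis v3 format: the record layout, the SHA-256 chain, the HMAC-SHA256 signature, and this free-function form of `append_signed` are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, List


def _chain(prev_hash: str, payload: str) -> str:
    """Hash link from the previous record to this one (assumed construction)."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def _sign(key: bytes, seq: int, payload: str, chain_hash: str) -> str:
    """Signature covers the sequence number, so position is part of what is signed."""
    message = json.dumps([seq, payload, chain_hash], separators=(",", ":")).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_signed(trail: List[Dict], payload: str, key: bytes) -> None:
    seq = len(trail) + 1
    prev_hash = trail[-1]["chain_hash"] if trail else ""
    chain_hash = _chain(prev_hash, payload)
    trail.append({
        "seq": seq,
        "payload": payload,
        "chain_hash": chain_hash,
        "sig": _sign(key, seq, payload, chain_hash),
    })


def verify_trail(trail: List[Dict], key: bytes) -> List[str]:
    """Return a list of problems; an empty list means the trail verifies."""
    problems: List[str] = []
    prev_hash = ""
    for position, record in enumerate(trail, start=1):
        if record["seq"] != position:
            problems.append(f"position {position}: seq is not contiguous")
        if record["chain_hash"] != _chain(prev_hash, record["payload"]):
            problems.append(f"position {position}: chain hash mismatch")
        expected = _sign(key, record["seq"], record["payload"], record["chain_hash"])
        if not hmac.compare_digest(expected, record["sig"]):
            problems.append(f"position {position}: signature mismatch")
        prev_hash = record["chain_hash"]
    return problems
```

In this toy model every record is signed, so it says nothing about the "modify-to-unsigned" or tail-truncation residuals named above; those depend on details of the real store that the sketch does not model.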


tachyon-beep and others added 22 commits June 8, 2026 01:32

Copilot AI review requested due to automatic review settings June 9, 2026 11:09

Copilot started reviewing on behalf of tachyon-beep June 9, 2026 11:09

Copilot AI reviewed Jun 9, 2026

tachyon-beep wants to merge 22 commits into main from rc4
