Release 1.0.0rc4: doctor, C-9 store consolidation, dogfood friction-tail closeout#7
Conversation
These are filigree-generated agent instruction files (carrying the `<!-- filigree:instructions:... -->` marker) regenerated by the session-start hook on every run, so tracking them produces churn-only diffs as the filigree version bumps. They are already listed in .gitignore; untrack them so the ignore actually takes effect (git keeps tracking files added before they were ignored). They remain on disk for local use and continue to be regenerated. Mirrors PR #5's untracking of local integration config. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds the ignore entries that make the prior untracking stick — without these, the session-start hook's regenerated copies would show up as untracked and could be re-added. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
filigree/client.py was the lowest-covered module when arch-analysis filed this; subsequent PRs lifted it to 94%, leaving the transport/error paths a security reviewer cares about uncovered (lines 35, 43, 118-119, 127, 130). Add focused tests for the unsigned-transport seam (Q-M4): - _json_body_bytes(None) -> b"" (zero-byte signed body) - _path_and_query carries the query string the HMAC commits to - _urllib_fetch wraps urllib.error.URLError as a typed FiligreeError - _decode_json_response rejects non-JSON content type - _decode_json_response rejects oversized responses before decode client.py now at 100% line coverage; 527 passed. Closes legis-f675dc5cd4 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…7 / roadmap 11) CI ran --cov-fail-under=70 against ~91% actual coverage — 20 points of silent- regression headroom — and had no lint step. Two F401 unused imports existed (policy/grammar.py Hashable, mcp.py WardlinePayloadError). - Clear both F401s. - Add ruff (dev dep + [tool.ruff.lint] default rule set E4/E7/E9/F) and a `ruff check src` CI gate. Import-sorting/pyupgrade left off deliberately to avoid unrelated churn and F821 on the honesty-gate test fixtures. - Raise the global floor to 88% (--cov-fail-under and [tool.coverage.report]). - Add scripts/check_coverage_floors.py: per-package floors for the security- critical packages (enforcement 93, service 92, governance 90, api 88, mcp.py 80), set a few points below current so they catch a real regression without tripping on incidental churn. Wired as a CI step. 527 passed, total 91.12%, all floors hold; ruff/mypy/boundary-gate green. Closes legis-d16c4fab16 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sh (Q-L6) resolver.py probed the Loomweave sei capability once per instance and latched the result for the resolver's whole life. A long-lived service resolver whose Loomweave LOST the capability mid-life kept treating it as capable until a later call happened to raise (and symmetrically never noticed a capability regained). - Add a capability TTL (default 300s): the latch — positive or negative — ages out and is re-probed. Time is injected (`monotonic`, default time.monotonic) for deterministic tests, mirroring the client.py timestamp/nonce injection. Transient probe errors still clear the latch and retry on the next resolve. - Type-check content_hash from the resolve response: a non-string degrades to None rather than landing in the typed str|None content axis. 531 passed; ruff/mypy/coverage-floors green. Closes legis-5875d0f156 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…p 12) The per-PR live oracle in ci.yml is gated on `vars.LOOMWEAVE_URL != ''`, so an absent variable silently skips it and Loomweave endpoint/header drift sails through default CI. Releases inherited the same blind spot. - Add .github/workflows/loomweave-conformance.yml: a FAIL-CLOSED live oracle gate. A missing LOOMWEAVE_URL / LOOMWEAVE_LIVE_ORACLE_LOCATOR / HMAC secret is an ::error and exit 1, not a skip. Runs on a daily schedule (drift sweep), workflow_dispatch, and workflow_call. - Gate release publish on it: release.yml grows a `conformance` job that calls the reusable workflow, and `publish` now needs [build, conformance]. A release cannot be published unless live conformance passes. The per-PR ci.yml step stays opt-in (unchanged) — releases and the schedule are where conformance is now mandatory. Live behavior can only be exercised with a real Loomweave endpoint; YAML structure and the local skip path are verified. Closes legis-2087bfca94 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… test (Q-M5) AuditStore.transaction()'s all-or-nothing guarantee depended on an UNENFORCED contract: every public read (read_all / read_by_seq / verify_integrity / get_latest_sequence_and_hash) opens its own connection, so a read issued inside a held BEGIN IMMEDIATE batch would miss the uncommitted appends and contend with the write lock (SQLITE_BUSY on SQLite, possibly silent on other backends). Only governor's own discipline (resolve all entities before opening the batch) kept the gate append paths read-free — nothing in the store stopped a future gate method from introducing an in-batch read. - Enforce: each read method guards on the thread-local and raises a clear RuntimeError if called inside an active transaction() batch on the same thread — turning silent, backend-dependent contention into a deterministic, loud failure a future regression's tests catch immediately. - Regression test (on real on-disk SQLite, not in-memory/shared-cache): drives the real EnforcementEngine.submit_override and SignoffGate.request append paths through route_findings' batch and asserts they complete (proving read- free), plus guard-fires-on-read and all-or-nothing rollback tests. - Update transaction() docstrings (store + protocol) to state the invariant is now enforced. 539 passed; ruff/mypy/coverage-floors green. Closes legis-79f42be309 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-8785 (Q-L5/Q-L4) Q-L5 (reconcile fingerprints) — DONE: The runtime honesty gate (decorator.fingerprint via inspect.getsource) and the static scanner (boundary_scan via ast.get_source_segment of the FunctionDef) could compute divergent fingerprints for a decorated or class-method test_ref: inspect.getsource INCLUDES decorator lines, get_source_segment EXCLUDES them, and only the runtime path normalized CRLF or had a parse-failure fallback. - get_normalized_ast_str now strips decorator_list (FunctionDef/AsyncFunctionDef/ ClassDef), making the AST fingerprint decorator-insensitive. - New fingerprint_source(): the single canonicalization (CRLF->LF, dedent, normalized-AST hash, fallback) both paths now route through — they can no longer diverge. - Verified empirically: a decorated test fingerprinted e2b8.. (runtime) vs d833.. (static) before; both d833.. after. Undecorated module-level tests are unchanged, so the 7 real boundaries' stored fingerprints still verify (policy-boundary-check PASS). Q-L4 (RFC-8785) — ASSESSED & DEFERRED with evidence: RFC-8785 is gated on "when cross-language verification is needed" (ADR-0001/0002, arch-handover item 15). No current consumer verifies a legis hash from a non- Python runtime; content_hash always derives utf-8 bytes, so v1 is deterministic for single-language use today. The fingerprints are Python ast.dump output (not cross-language JSON) so RFC-8785 does not apply to them. Recorded the assessment in canonical.py; the single-choke-point design keeps it a one-file upgrade when a cross-language verifier lands. Building it now would be speculative. 543 passed; ruff/mypy/coverage-floors/boundary-gate green. Closes legis-b4445c2f42 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…L8 / roadmap 14) mcp.py was the single-file complexity hotspot. Three parts from the ticket: 1. Table-driven dispatch: call_tool was a ~360-line if/elif over the tool name. Each branch is now a `_tool_<name>(runtime, args)` handler behind a `_TOOL_HANDLERS` dict; call_tool is a 9-line shell that keeps the same argument-key validation, UNKNOWN_TOOL fallback, and _service_error wrapper. Extracted mechanically (AST-driven) to avoid transcription drift; all 39 mcp tests and the full suite pass unchanged — behavior-preserving. 2. Stdin line-size bound: the hand-rolled one-object-per-line framing read lines unbounded, so a peer sending a line with no newline forced an unbounded read. run_jsonrpc now reads via _read_bounded_line (default 16 MiB, override with LEGIS_MCP_MAX_REQUEST_BYTES); an over-long record is rejected with -32700 and the framing realigns at the next newline rather than mis-parsing. 3. Shared config module (mcp->api edge): already landed — DEFAULT_*_DB live in legis.config and mcp imports from there (no legis.api import remains). Verified, no further change needed. mcp.py coverage 82.2% -> 83.4%; 545 passed; ruff/mypy/coverage-floors green. Closes legis-d70b003e93 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Untrack the filigree-generated AGENTS.md and CLAUDE.md instruction files and add them to .gitignore so they stay local-only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… path wardline now ships `scan --format legis --allow-dirty`, emitting an UNSIGNED `dirty: true` dev artifact (signing stays clean-tree-only). The upstream contract the legis-side work was deferred on now exists, so: - keyless dev posture: a dirty artifact governs but records the marker honestly (artifact_status="dirty"), distinguishable from a clean unsigned scan. - CI posture (artifact key configured): a dirty dev artifact is a typed amber `SKIPPED_DIRTY_TREE` outcome on scan_route / /wardline/scan-results — not the generic red (WardlinePayloadError -> 422 / INVALID_ARGUMENT), and nothing is governed. `LEGIS_WARDLINE_ALLOW_DIRTY=1` is the explicit server-side dev-mode opt-in that governs it unsigned (recorded "dirty", never "verified"). Relaxation is scoped to exactly `dirty is True AND no signature`: a signed payload still verifies (a forged signature stays red), and a clean unsigned payload still requires a signature. `dirty` is checked as strict boolean True because the scan dict is caller-controlled, and dev-mode comes only from server config, never the payload — so the clean-tree signing guarantee is intact. Closes legis-d731c760c5 (P0 dirty-tree dev path), legis-7e85e8e7ba (P1 typed amber state). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`ruff check src tests` is now clean: - remove unused imports (EnforcementEngine, EntityKey in test_regressions; ExemptionRegistry in test_exemptions) — F401. - hoist mid-file `import pytest` / `TamperError` to the module top in test_protected_extensions — E402. - `# noqa: F821` on the honesty-gate fixture functions (fake_/weak_/nested_ boundary_test) whose free `handler` name is intentional: those functions are fingerprinted BY SOURCE and never executed, so `handler` stands for the real boundary call the gate looks for. AST-based behavioural check is unaffected (comments are ignored), and the fingerprint round-trip is computed at runtime. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Promote the dirty-tree dev path (typed SKIPPED_DIRTY_TREE amber + LEGIS_WARDLINE_ALLOW_DIRTY) and the test-suite lint cleanup into a tagged release candidate. `__version__` and pyproject agree at 1.0.0rc4; CHANGELOG [Unreleased] promoted to [1.0.0rc4] — 2026-06-06. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Its two commits (untrack + gitignore AGENTS.md/CLAUDE.md) are logically superseded by d7e7c81, which already deletes both files and adds them to .gitignore. Merging records the branch as integrated so nothing is left behind; the net file change is nil.
Tidy .gitignore and apply one consistent rule across the four loom tools: their working folders are *information about the conduct of the project* (issue data, code-archaeology index, scan cache, audit DBs) and do not belong in the repo that holds the *capability* (the shipping code + material). - group entries under clear headings; fold the scattered .loomweave/* lines into a single .loomweave/; add .wardline/ and the SQLite WAL sidecars (*.db-shm / *.db-wal) so the *.db audit-data ignore is complete. - ignore all four tools' working dirs: .filigree/, .loomweave/, .wardline/, legis audit *.db (filigree/wardline/legis already complied). - untrack the loomweave files committed under ADR-005 (config.json, instance_id, .gitignore). That ADR is loomweave's storage opinion for its own output; legis is a consumer and sets its own repo policy. Only conduct noise was tracked here anyway — loomweave.db was already untracked, leaving a near-empty config and a per-machine UUID. Files stay on disk (git rm --cached); loomweave regenerates them locally. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…versioning
Legis now "stands itself up" like its siblings: `legis install` injects a lean
agent-orientation block into CLAUDE.md / AGENTS.md, installs the legis-workflow
skill pack (Claude + Codex), registers a SessionStart hook, and extends
.gitignore with the local config surface (.legis/, legis.yaml).
The block carries a versioned, content-hashed marker
(<!-- legis:instructions:v{version}:{hash} -->); a drift check re-injects it
when either the bundled content or the package version changes. Two triggers
keep it fresh: the Claude Code SessionStart hook (`legis session-context`) and a
best-effort refresh on `legis mcp` boot — the latter closes the Codex-only-repo
gap that the hook-only approach (filigree, loomweave) leaves open.
Mirrors filigree's install/hooks mechanism (inject/replace/append, atomic write,
symlink rejection, idempotent hook registration), right-sized for legis (no
dashboard, no server mode). Lean block + skill-pack split keeps the injected
context small while the skill carries the full CLI + MCP-tool reference.
Design spec: docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md
Gates: ruff + mypy clean; 616 passed / 91.00% coverage; per-package floors hold;
wheel ships data/; end-to-end install + session-context smoke idempotent.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0127b66ebf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| prefix = shlex.join(_find_legis_command()) | ||
| session_context_cmd = f"{prefix} session-context" |
There was a problem hiding this comment.
Avoid writing host-specific hook paths
When legis is found on the installer's PATH, this writes the fully resolved executable path (for example a venv path under one developer's checkout) into .claude/settings.json. Claude Code documents .claude/settings.json as project-scoped/shareable, so if that file is committed or copied to another machine the SessionStart hook will invoke a path that usually does not exist and instruction refresh silently stops working for teammates; use a portable command or a local-only settings file for machine-specific paths.
Useful? React with 👍 / 👎.
| return [found] | ||
| import sys | ||
|
|
||
| return [sys.executable, "-P", "-m", "legis"] |
There was a problem hiding this comment.
Use an executable module fallback
When no legis console script is on PATH, the installed hook falls back to python -P -m legis session-context, but the package has no legis.__main__, so that command exits before running the refresh. This breaks the hook for source/dev invocations or stripped PATH environments where the console script cannot be resolved; point the fallback at an actually executable entry point or add __main__.py.
Useful? React with 👍 / 👎.
…upt settings
Applies the confirmed findings from the adversarial review of the
instruction-injection feature:
- Skill install no longer reports success over a destroyed pack: a genuine
(non-peer) rename failure during the staging->target swap now restores the
original pack from backup and returns (False, ...) instead of swallowing the
error and returning "Installed". The prior pack is only discarded once the
new one is in place.
- _upgrade_hook_commands now skips user-scoped SessionStart blocks (matcher not
in {None, "*"}), so a user's portable bare `legis session-context` is never
rewritten into a venv-specific absolute path. Matches the scope filter the
reuse/has-unscoped paths already use.
- Removed the dead matcher-block reuse loop (unreachable: needs_add implies no
unscoped legis hook exists) in favour of always appending a dedicated
matcher-less block.
- A valid settings.json whose nested hooks/SessionStart has the wrong type is
now backed up to settings.json.bak before the in-place reset, instead of
silently discarding that user data.
- ensure_gitignore appends only the absent rules, fixing a duplicate-write when
one of (.legis/, legis.yaml) was already present.
Documented the intentional version+hash freshness divergence from filigree in
_marker_token (legis git-ignores the regenerated CLAUDE.md/AGENTS.md, so a
version-bump re-inject produces no committed diff).
Adds regression tests for each fix plus the previously-untested version-bump
freshness path and Codex skill-pack drift path.
Gates: ruff + mypy clean; 623 passed / 91.19%; per-package floors hold;
install.py 82%->87%, hooks.py 94%; end-to-end smoke confirms no gitignore
duplication and scoped-block preservation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b2457108ab
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)): | ||
| node.decorator_list = [] |
There was a problem hiding this comment.
Keep behavior-changing test decorators pinned
For referenced tests that gain or lose a behavior-changing decorator such as @pytest.mark.skip, @pytest.mark.xfail, or a wrapper decorator, this now clears decorator_list before computing test_fingerprint, so the honesty gate still accepts the old fingerprint even though the evidence test no longer runs or no longer behaves the same. The scanner/runtime mismatch is real, but stripping decorators from the pinned hash creates a drift hole; include the decorators in both fingerprint paths instead of dropping them.
Useful? React with 👍 / 👎.
| if backup is not None and swap_done and backup.exists(): | ||
| shutil.rmtree(backup, ignore_errors=True) |
There was a problem hiding this comment.
Handle a non-directory existing skill target
If .claude/skills/legis-workflow or .agents/skills/legis-workflow already exists as a regular file, the rename above succeeds by moving that file into backup, the new skill directory is installed, and then this shutil.rmtree(backup) raises NotADirectoryError because backup is not a directory. That makes legis install --skills crash after a partially successful install instead of reporting or replacing the bad target cleanly; check the target type before treating the backup as a tree.
Useful? React with 👍 / 👎.
| return { | ||
| "artifact_status": "dirty", | ||
| **{key: value for key in ARTIFACT_PROVENANCE_FIELDS | ||
| if isinstance(value := scan.get(key), str) and value}, | ||
| } |
There was a problem hiding this comment.
Require provenance for allowed dirty artifacts
When LEGIS_WARDLINE_ALLOW_DIRTY=1 is set with an artifact key, any unsigned payload with dirty: true returns here before the normal missing-field check, so a scan with no scanner_identity, rule_set_version, commit_sha, or tree_sha is still governed and recorded only as artifact_status: dirty. The dev-mode relaxation should only waive the signature for a dirty tree; otherwise malformed dirty payloads lose the provenance that the signed-artifact posture requires for audit records.
Useful? React with 👍 / 👎.
| signature_present = isinstance(scan.get(ARTIFACT_SIGNATURE_FIELD), str) and bool( | ||
| scan.get(ARTIFACT_SIGNATURE_FIELD) | ||
| ) | ||
| is_dirty_dev_artifact = scan.get("dirty") is True and not signature_present |
There was a problem hiding this comment.
Treat malformed signatures as malformed
If a dirty scan includes artifact_signature with a non-string value, this treats it as absent, so the CI posture returns the amber SKIPPED_DIRTY_TREE path and LEGIS_WARDLINE_ALLOW_DIRTY=1 governs it as unsigned. That relaxation is meant for dirty artifacts with no signature; a present-but-malformed signature field is caller-controlled bad provenance and should stay on the normal payload-error path rather than being silently reclassified as unsigned.
Useful? React with 👍 / 👎.
The ternary guarded `_checks(runtime).for_pr(number)` on `runtime.check_surface is not None`, which evaluates the field BEFORE the lazy `_checks()` builder can initialise it. On a fresh build_runtime (check_surface=None) the guard short-circuited to `[]` and never built the surface, so a PR's CI outcomes were call-order-dependent: empty until some other tool (e.g. check_list) happened to initialise the surface first. For a governance tool that is the "but legis said it was clean" failure — an agent could be told a PR has no checks when failing checks exist. Call `_checks(runtime).for_pr(number)` unconditionally, matching _tool_check_list; `_checks()` already handles the None case. Confirmed the sole instance of the guard-before-lazy-builder anti-pattern in mcp.py. Regression test reproduces the real path (fresh runtime, checks reachable via LEGIS_CHECK_DB) and asserts the FAIL check is reported. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…test
A code-review re-flagged canonical_json(ensure_ascii=False) as a cross-tool
HMAC hazard ("non-ASCII findings fail verification"). Disproven: Wardline's
artifact signer (wardline/core/legis.py) is a deliberate byte-for-byte Python
replica using the identical ensure_ascii=False, pinned by a golden HMAC vector
from the real legis signer plus an "é" canonicalization test. Both sides are
the same Python json.dumps call, so non-ASCII round-trips and verifies —
ensure_ascii=False is what makes them match, not a hazard.
The Q-L4 note's *conclusion* (RFC-8785 deferral) was correct, but one clause
was inaccurate: "every hash is produced and checked in-process". Wardline
produces the artifact_signature out-of-process, cross-repo — it is cross-repo
and cross-process, but NOT cross-language, which is the RFC-8785 trigger.
Rewrite the clause to name the real cross-repo Python verifier and separate
the two guarantees (the ASCII-only cross-impl golden vector vs. each side's
own non-ASCII unit test); the cross-impl non-ASCII payload is a Wardline-side
follow-up.
Add a non-ASCII canonical test on the legis side (it had none) mirroring
Wardline's — locks legis's own ensure_ascii=False output.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The CHANGELOG jumped rc1 -> rc4 (the rc2 and rc3 tags had no entries) and the rc4 entry documented only the dirty-tree path + lint, omitting the headline install instruction-injection system, the table-driven MCP dispatch (Q-L8), and the Q-L5/Q-M5/Q-L6 fixes. - rc2: the MCP stdio surface (WP-M2/M3, verified run_jsonrpc present at the v1.0.0rc2 tag), deployable OpenRouter judge, Filigree closure-gate, git rename feed, policy-boundary honesty gate, Clarion->Loomweave/Loom->Weft rebrand, PyPI publishing. - rc3: the Q-* audit-remediation series (single-secret writer scope, judge advisory-only on protected, Weft-component HMAC transport, fail-closed policy-cell config, etc.). - rc4: added the self-install system, table-driven MCP dispatch, and the Q-L5/Q-M5/Q-L6 fixes + the pull_request_get checks fix from this branch. Fixed the link refs (rc2/rc3 were missing; all pointed at a placeholder URL; repo org corrected to foundryside-dev). rc2/rc3 entries are reconstructed from git history. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The boot-time refresh swallowed all exceptions with a bare `pass` (the broad catch is justified — boot must never break the server), but it was silent while the sibling SessionStart path (hooks.generate_session_context) logs. Add a logger.debug(..., exc_info=True) so a recurring refresh failure is diagnosable without breaking boot. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The re-export was annotated "(kept for symmetry / tests)", but no test imports it via legis.hooks — tests/test_install.py imports it straight from legis.install. Likely residue of the abandoned INSTRUCTIONS constant. Remove the unused import; every other name imported here is used by refresh_instructions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The stdin framing read passed LEGIS_MCP_MAX_REQUEST_BYTES (16 MiB) as the limit to TextIO.readline(), which counts CHARACTERS. A line of multibyte UTF-8 that fits the character cap could exceed the nominal byte cap up to ~4x — fail-safe (still bounded) but the limit did not mean what its name promises. Keep the char-capped readline for memory safety (a decoded str holds <=4 bytes per char, so the in-memory read stays bounded) and reject a complete line whose UTF-8 encoding exceeds the byte budget. The existing truncate-and-drain path for a too-long line is unchanged; the -32700 "maximum size ... bytes" message is now accurate. Test pins a multibyte record that is under the char cap but over the byte cap, and confirms the following record stays framed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ance red Two untested branches surfaced by review: - LEGIS_WARDLINE_ALLOW_DIRTY was only ever tested with "1". The opt-in gates governing UNSIGNED dirty artifacts, so the strict `== "1"` parse must fail safe: pin that "0"/"true"/"yes"/"" all stay the typed amber SKIPPED_DIRTY_TREE skip, guarding against a future drift to truthiness. - The CI-posture (key configured, clean tree) missing-provenance-field branch in verify_wardline_artifact raised no test. Pin that a scan missing a required provenance field is a generic red before signature verification, not an amber skip. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 129d0bbcde
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| scan, | ||
| agent_id=runtime.agent_id, | ||
| identity=runtime.identity, | ||
| engine=_engine(runtime), |
There was a problem hiding this comment.
Defer engine creation until after dirty-skip validation
When LEGIS_WARDLINE_ARTIFACT_KEY is configured and scan is an unsigned dirty: true artifact, route_wardline_scan() would raise WardlineDirtyTreeError before governing anything, but this argument is evaluated first. In environments where the default/ configured governance DB is unavailable or unwritable, _engine(runtime) can create/open the audit store or fail before the dirty-artifact check, so scan_route returns an internal/service error instead of the documented SKIPPED_DIRTY_TREE amber outcome; the same eager dependency provisioning pattern should be moved behind artifact verification.
Useful? React with 👍 / 👎.
The audit-store connect listener applied journal_mode=WAL/synchronous/ busy_timeout inside `try: ... except Exception: pass`. The real silent failure was never even an exception: PRAGMA journal_mode=WAL does NOT raise when WAL is unavailable (read-only mount, some network filesystems, in-memory DBs) — it returns the journal mode actually in force. So the connection ran without WAL and surfaced much later as opaque "database is locked" under concurrency, in a governance-critical store. Extract the listener body into a module-level _apply_sqlite_pragmas() (module -level to avoid an engine->listener->self reference cycle, and testable without constructing a store — in-memory AuditStore is unconstructable under NullPool). Log both failure channels at WARNING: a raising PRAGMA (with exc_info) and the silent WAL-not-applied case (by capturing the journal_mode return value). Keep best-effort semantics — a PRAGMA failure must not break connect. Delete the stale force_immediate_transaction comment (finding 2). Tests: WAL actually lands on the file (external sqlite3 reads journal_mode=wal) + busy_timeout=5000 on a listener-fired connection; WAL-not-applied warning fires; raising PRAGMA logs with exc_info and still closes the cursor. Closes legis-5a85c48c41 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Five outcome/status axes were bare strings where ingest.py already shipped WardlineSeverity(str, Enum) as the model. Convert each to a str,Enum — the member IS its wire string, so json.dumps/canonical_json emit byte-identical payloads (HMAC artifact-signature path unaffected: it signs the raw scan, not legis's provenance dict). - ScanOutcome (ROUTED / SKIPPED_DIRTY_TREE) — wardline ingest + scan_route boundaries (mcp.py, api/app.py). SKIPPED_DIRTY_TREE kept as a back-compat alias to the enum member. - ArtifactStatus (verified / dirty / unverified) — wardline/ingest.py. - IdentityResolutionStatus + LineageSnapshotStatus — resolver.py, with a __post_init__ bijection (alive None<->UNAVAILABLE, False<->NOT_ALIVE, True<->RESOLVED) so a self-contradictory frozen record is unrepresentable. Consumers updated: service/governance.py (dead getattr fallbacks dropped), governance/sei_backfill.py. - Suppressed (active / waived / suppressed / baselined / judged) — the field stays str on the wire-facing dataclass (validation timing/error-type unchanged); the enum is the vocabulary source of truth for the frozensets. Full suite + ruff + mypy green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The frozenset(p.strip() for p in split(",")) idiom was duplicated verbatim in
three composition roots (api/app.py, cli.py, mcp.py). A future change to the
delimiter/trim rule applied to one root would diverge the protected-policy set
between the API, the CLI override-rate gate, and the MCP server — and that set
decides whether a judge ACCEPTED is downgraded to operator sign-off, so a
divergence is a real authority split, not a cosmetic one.
Add config.protected_policies(), the single parse point, alongside the *_db_url
resolvers it mirrors. Read at call time (like those resolvers) because cli.py
writes the env var from --protected-policies before the downstream root reads
it. The bare frozenset() empty-default at app.py (no-verifier case) is a
different concern and left as-is.
rc4 review finding #9.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rc4 review #8 flagged _existing_idempotent_record's O(N) read + O(N) hash + O(N) HMAC and proposed a keyed O(1) lookup. That is the same optimization operator-confirmed declined as finding #7 (1805e37): the cost is _verified_records' whole-trail tamper check, and a keyed single-row lookup reaches O(1) only by skipping verification — the silent tamper window #7 deliberately refused. The #7 commit already names this idempotency path; add a pointer comment here so the equivalence is visible at the call site and the finding is not re-litigated a third time. No behavior change. rc4 review finding #8 (duplicate of #7, legis-4ab36517df). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e path loomweave install regenerated the workflow skill after loomweave moved its index under .weft/loomweave/ (the federation store convention). Doc-only path references; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New `legis doctor [--root] [--repair] [--format text|json]`: an operator/CLI health view + safe repair for legis's install and config layer, mirroring wardline doctor. Four check domains (install wiring incl. a new .mcp.json registration capability, config & stores, governance hash-chain integrity, runtime & sibling wiring). Doctrine-bound: C-9(b) makes doctor fully report-only on weft.toml (no scaffold — reversed from an earlier scoping choice once C-9(b) surfaced); repairs touch only legis's own per-member artifacts (install wiring, .mcp.json, .weft/legis/). Keys are presence-checked only, values never shown (Rust key sidecar planned). Not on the agent MCP surface. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bite-sized TDD tasks: DoctorCheck record + rendering, CLI wiring, the new register_mcp_json install capability, then the four check domains (install wiring, config & stores, governance integrity, runtime & siblings), an end-to-end --repair test, and docs/coverage. Each invariant from the spec (no weft.toml write, no key-value rendering, no-leak DB creation) is pinned by a named test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l-wiring drift Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, db overrides, legacy) Adds check_weft_toml (C-9(b): never writes, absent=ok, malformed=error), check_store_dir (absent=ok, present-unwritable=error, --repair creates), check_db_overrides (validate LEGIS_*_DB env URL syntax), and check_legacy_stray_db (warn on legis-*.db at repo root). All registered in collect_checks. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds check_audit_chain (absent DB=ok, no-leak: never creates DB file; tampered chain=error, report-only), _store_url helper (resolves DB paths relative to root, not cwd, so tests are cwd-agnostic), check_hmac_key (presence-only, never renders value; warns if protected policies configured without key), and check_sibling_url (validates LOOMWEAVE_API_URL/FILIGREE_API_URL as http(s)). All registered in collect_checks. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ighten override checks Review follow-ups on Chunk D: 1. Root-anchor store_dir resolution. New _store_dir_for(root) reads root/weft.toml (never cwd) and returns an absolute path, used by both check_store_dir and _store_url, so --root != cwd is consistent and an absolute cwd store_dir can't escape root. Malformed weft.toml falls back to the default (check_weft_toml reports it). 2. Source store identity from config. New config.STORE_DB_SPECS (env, name) tuples; doctor derives _DB_OVERRIDE_ENVS / _LEGACY_DB_NAMES from it so adding a store can't silently drop doctor coverage. 3. Match config's empty-override precedence: present-but-empty LEGIS_*_DB is a verbatim broken override (membership, not truthiness) in both check_db_overrides and _store_url. 4. Robust sqlite predicate via make_url(url).get_backend_name() == "sqlite". Tests: root-anchored store_dir via weft.toml; empty-string override -> error. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on health Fix 1: replace presence-only mcp_json check with mcp_entry_is_current() — a new predicate in install.py that verifies the entry exists, its args invoke mcp, and its command resolves to an existing executable. Byte-equality is deliberately avoided so valid binary-path variation (uv-tool vs venv) doesn't read as drift. check_mcp_json in doctor.py uses the predicate and re-verifies after repair. Fix 2: align render_text with render_json / exit-code semantics — the "ok" banner is now suppressed only on error, not on warn. A warn-only project shows "legis doctor: ok (N warning(s))" while still listing the warn checks below. Spec doc updated to describe the resolvability-based freshness check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
The hand-rolled MCP initialize handler rejected any protocolVersion outside _SUPPORTED_PROTOCOL_VERSIONS with a -32602 error. Newer MCP clients negotiate 2025-06-18, so legis failed to connect entirely while sibling servers (loomweave, wardline) connected fine. Per the MCP spec, when the client requests a version the server does not support (or omits it), the server must respond with a version it does support and let the client decide whether to proceed — not hard-error. Reply with _DEFAULT_PROTOCOL_VERSION in that case. Replaces test_initialize_rejects_unsupported_protocol_version with test_initialize_negotiates_unsupported_protocol_version. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…c verified_author gap Dogfood #2 friction-tail closeout for the legis-owned items (weft-f506e5f845, weft-9da517a67e): - LG-1: add a top-level `legis --version` flag (argparse version action over legis.__version__), closing the gap where the running build could only be identified indirectly. - Le1: the MCP `CELL_NOT_ENABLED` recovery hint now names the concrete enablement path (LEGIS_HMAC_KEY + LEGIS_POLICY_CELLS / policy/cells.toml / LEGIS_DEV_DEFAULT_CELLS=1) instead of a generic "ask the operator"; the per-cell message still names which cell is unenabled. - C3: charter records the self-asserted-write-actor gap (verified_author:null) as a known governance gap — acceptable trust-local, deferred multi-principal. Tests: new version-flag test (exits 0, prints version) and an assertion that the closure-gate CELL_NOT_ENABLED next_action names LEGIS_HMAC_KEY. Full suite green (754 passed), ruff + mypy + coverage floors hold. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Merges the
rc4line tomainand publishes 1.0.0rc4 (never tagged; only rc2/rc3 exist).rc4is 74 commits ahead ofmain, 0 behind.Headline (since rc3)
legis doctor— operator health view + safe repair for the install/config layer..weft/legis/;weft.toml [legis]enrich-only read; malformed/absentweft.tomlboots on defaults.legis installself-install +--mcpregistration; SessionStart hook;legis session-context.legis --version(LG-1, weft-9da517a67e)CELL_NOT_ENABLEDrecovery hint now names the enablement path (Le1, weft-f506e5f845)verified_author: nullself-asserted-write-actor gap (C3)Gates
ruff + mypy clean; 754 passed / 2 skipped; coverage 92.29% (floor 88%); per-package floors hold. CI: test + override-rate green.
Full detail in
CHANGELOG.md[1.0.0rc4].🤖 Generated with Claude Code