diff --git a/.agents/skills/filigree-workflow/SKILL.md b/.agents/skills/filigree-workflow/SKILL.md index 76e81e4..aae6e10 100644 --- a/.agents/skills/filigree-workflow/SKILL.md +++ b/.agents/skills/filigree-workflow/SKILL.md @@ -196,7 +196,7 @@ When parsing `--json` output or MCP responses, expect these unified envelopes: one of: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, `INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, `INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, - `CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, + `LOOMWEAVE_REGISTRY_VERSION_MISMATCH`, `LOOMWEAVE_OUT_OF_SYNC`, `BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. Branch on `code` for retry policy (`CONFLICT` → exit 4, retryable; everything at exit 1 needs operator diff --git a/.agents/skills/loomweave-workflow/.fingerprint b/.agents/skills/loomweave-workflow/.fingerprint index e44b7ed..f1af0a2 100644 --- a/.agents/skills/loomweave-workflow/.fingerprint +++ b/.agents/skills/loomweave-workflow/.fingerprint @@ -1 +1 @@ -fe04e6fd9d528b07738f527b41d817dff89344f051465af012fc42ed44377ea3 \ No newline at end of file +4c1af074f42ec147611923aafeb704eba54cd7dca4dcec2489907921b7f94233 \ No newline at end of file diff --git a/.agents/skills/loomweave-workflow/SKILL.md b/.agents/skills/loomweave-workflow/SKILL.md index 1b07457..5b8e4d8 100644 --- a/.agents/skills/loomweave-workflow/SKILL.md +++ b/.agents/skills/loomweave-workflow/SKILL.md @@ -26,7 +26,7 @@ calls this?" without reading a single file. - You need a function's neighborhood, execution paths, or which subsystem it belongs to. **Not for:** editing code, reading exact implementation bodies (use `summary` or -read the file once you have its path), or codebases with no `.loomweave/` index. +read the file once you have its path), or codebases with no `.weft/loomweave/` index. ## Entity IDs — the model @@ -65,18 +65,27 @@ tell which case you're in. 
| `execution_paths_from` | bounded call paths out of an entity | `{"id": "", "max_depth": 5}` | | `subsystem_members` | modules in a subsystem | `{"id": "core:subsystem:"}` | | `subsystem_of` | the subsystem an entity belongs to (reverse of `subsystem_members`) | `{"id": ""}` | -| `summary` | on-demand prose summary of one entity | `{"id": ""}` | +| `summary` † | on-demand prose summary of one entity | `{"id": ""}` | | `summary_preview_cost` | preview a `summary` call's cache status / cost before spending | `{"id": ""}` | | `issues_for` | Filigree issues attached to an entity | `{"id": ""}` | | `source_for_entity` | an entity's exact indexed source span + bounded context | `{"id": "", "context_lines": 10}` | | `call_sites` | the source line(s) behind a calls/references edge | `{"id": "", "role": "caller"}` | | `orientation_pack` | one deterministic orientation packet for an entity or file:line (entity + context + neighbors + paths + issues + freshness) | `{"file": "rel/path.py", "line": 42}` | | `index_diff` | index freshness / drift vs. the current working tree | `{}` | -| `analyze_start` | launch a background re-index, return its `run_id` | `{}` | +| `analyze_start` † | launch a background re-index, return its `run_id` | `{}` | | `analyze_status` | poll a started analyze (queued/running/terminal + progress) | `{"run_id": ""}` | -| `analyze_cancel` | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | +| `analyze_cancel` † | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | | `project_status` | index freshness, counts, LLM + Filigree status | `{}` | +† **Write-gated.** `summary` (`entity_summary_get`), `analyze_start`, +`analyze_cancel`, `propose_guidance`, and `promote_guidance` are registered only +when `serve.mcp.enable_write_tools: true` is set in `loomweave.yaml` (default +`false`). 
When the gate is off they do not appear in `tools/list` and a call +returns a tool-disabled error — run `loomweave config check` to see the active +policy. `summary` additionally requires the live LLM provider to be enabled +(`llm_policy.enabled: true` + `allow_live_provider: true`), or it serves cache +only. + `callers_of` / `neighborhood` / `execution_paths_from` take a `confidence` tier — one of `"resolved"` (default; only high-confidence edges), `"ambiguous"`, or `"inferred"`. There is no `"all"` value. When you suspect an @@ -152,7 +161,7 @@ honest-empty unless a plugin emits those tags. Likewise `high_churn` and `search_semantic` is also in the catalogue. It is opt-in under `semantic_search:`; when enabled, `loomweave analyze` populates the git-ignored -`.loomweave/embeddings.db` sidecar and the query path filters stale vectors by +`.weft/loomweave/embeddings.db` sidecar and the query path filters stale vectors by content hash. > Not in this catalogue: `emit_observation` as a general-purpose write surface. @@ -163,6 +172,7 @@ for team sharing). Agents may call `propose_guidance` to create a Filigree observation, but that proposal is inert until an operator promotes it through `promote_guidance` or the CLI. Promoted sheets reach you through `guidance_for` and are composed into `summary` prompts with a real guidance fingerprint. +(`propose_guidance` and `promote_guidance` are write-gated — see the † note above.) ## Workflow: orient, then navigate @@ -192,7 +202,7 @@ and are composed into `summary` prompts with a real guidance fingerprint. ## Launch -`loomweave serve --path ` where `` contains `.loomweave/loomweave.db` +`loomweave serve --path ` where `` contains `.weft/loomweave/loomweave.db` (built by `loomweave analyze `). In an MCP client the tools appear as `mcp__loomweave__find_entity`, etc. 
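Reviewer sketch (not part of the patch): the write-gated note in the SKILL.md hunk above says the gated tools are absent from `tools/list` while `serve.mcp.enable_write_tools` is `false`. A client could check for that up front, roughly as below. The helper name `missing_write_tools`, the `WRITE_GATED` mapping, and the idea that tools are listed under the short names from the table (optionally carrying the `mcp__loomweave__` client prefix) are assumptions made for illustration, not taken from Loomweave's code. In particular, the note gives `summary` the alternate name `entity_summary_get`, so this sketch accepts either spelling.

```python
# Hypothetical helper: every name here is illustrative, not Loomweave's API.
MCP_PREFIX = "mcp__loomweave__"  # client-side prefix mentioned in the Launch section

# Write-gated tools per the dagger note; each entry maps to its accepted spellings.
WRITE_GATED = {
    "summary": {"summary", "entity_summary_get"},
    "analyze_start": {"analyze_start"},
    "analyze_cancel": {"analyze_cancel"},
    "propose_guidance": {"propose_guidance"},
    "promote_guidance": {"promote_guidance"},
}


def missing_write_tools(listed_names):
    """Return the write-gated tools absent from a tools/list result, sorted."""
    # str.removeprefix needs Python 3.9+; names without the prefix pass through.
    short = {name.removeprefix(MCP_PREFIX) for name in listed_names}
    return sorted(
        tool for tool, aliases in WRITE_GATED.items() if not (aliases & short)
    )
```

If the returned list is non-empty, the note above points to `loomweave config check` for the active policy; retrying the call will not help.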
diff --git a/.claude/skills/filigree-workflow/SKILL.md b/.claude/skills/filigree-workflow/SKILL.md index 76e81e4..aae6e10 100644 --- a/.claude/skills/filigree-workflow/SKILL.md +++ b/.claude/skills/filigree-workflow/SKILL.md @@ -196,7 +196,7 @@ When parsing `--json` output or MCP responses, expect these unified envelopes: one of: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, `INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, `INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, - `CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, + `LOOMWEAVE_REGISTRY_VERSION_MISMATCH`, `LOOMWEAVE_OUT_OF_SYNC`, `BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. Branch on `code` for retry policy (`CONFLICT` → exit 4, retryable; everything at exit 1 needs operator diff --git a/.claude/skills/loomweave-workflow/.fingerprint b/.claude/skills/loomweave-workflow/.fingerprint index e44b7ed..f1af0a2 100644 --- a/.claude/skills/loomweave-workflow/.fingerprint +++ b/.claude/skills/loomweave-workflow/.fingerprint @@ -1 +1 @@ -fe04e6fd9d528b07738f527b41d817dff89344f051465af012fc42ed44377ea3 \ No newline at end of file +4c1af074f42ec147611923aafeb704eba54cd7dca4dcec2489907921b7f94233 \ No newline at end of file diff --git a/.claude/skills/loomweave-workflow/SKILL.md b/.claude/skills/loomweave-workflow/SKILL.md index 1b07457..5b8e4d8 100644 --- a/.claude/skills/loomweave-workflow/SKILL.md +++ b/.claude/skills/loomweave-workflow/SKILL.md @@ -26,7 +26,7 @@ calls this?" without reading a single file. - You need a function's neighborhood, execution paths, or which subsystem it belongs to. **Not for:** editing code, reading exact implementation bodies (use `summary` or -read the file once you have its path), or codebases with no `.loomweave/` index. +read the file once you have its path), or codebases with no `.weft/loomweave/` index. ## Entity IDs — the model @@ -65,18 +65,27 @@ tell which case you're in. 
| `execution_paths_from` | bounded call paths out of an entity | `{"id": "", "max_depth": 5}` | | `subsystem_members` | modules in a subsystem | `{"id": "core:subsystem:"}` | | `subsystem_of` | the subsystem an entity belongs to (reverse of `subsystem_members`) | `{"id": ""}` | -| `summary` | on-demand prose summary of one entity | `{"id": ""}` | +| `summary` † | on-demand prose summary of one entity | `{"id": ""}` | | `summary_preview_cost` | preview a `summary` call's cache status / cost before spending | `{"id": ""}` | | `issues_for` | Filigree issues attached to an entity | `{"id": ""}` | | `source_for_entity` | an entity's exact indexed source span + bounded context | `{"id": "", "context_lines": 10}` | | `call_sites` | the source line(s) behind a calls/references edge | `{"id": "", "role": "caller"}` | | `orientation_pack` | one deterministic orientation packet for an entity or file:line (entity + context + neighbors + paths + issues + freshness) | `{"file": "rel/path.py", "line": 42}` | | `index_diff` | index freshness / drift vs. the current working tree | `{}` | -| `analyze_start` | launch a background re-index, return its `run_id` | `{}` | +| `analyze_start` † | launch a background re-index, return its `run_id` | `{}` | | `analyze_status` | poll a started analyze (queued/running/terminal + progress) | `{"run_id": ""}` | -| `analyze_cancel` | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | +| `analyze_cancel` † | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | | `project_status` | index freshness, counts, LLM + Filigree status | `{}` | +† **Write-gated.** `summary` (`entity_summary_get`), `analyze_start`, +`analyze_cancel`, `propose_guidance`, and `promote_guidance` are registered only +when `serve.mcp.enable_write_tools: true` is set in `loomweave.yaml` (default +`false`). 
When the gate is off they do not appear in `tools/list` and a call +returns a tool-disabled error — run `loomweave config check` to see the active +policy. `summary` additionally requires the live LLM provider to be enabled +(`llm_policy.enabled: true` + `allow_live_provider: true`), or it serves cache +only. + `callers_of` / `neighborhood` / `execution_paths_from` take a `confidence` tier — one of `"resolved"` (default; only high-confidence edges), `"ambiguous"`, or `"inferred"`. There is no `"all"` value. When you suspect an @@ -152,7 +161,7 @@ honest-empty unless a plugin emits those tags. Likewise `high_churn` and `search_semantic` is also in the catalogue. It is opt-in under `semantic_search:`; when enabled, `loomweave analyze` populates the git-ignored -`.loomweave/embeddings.db` sidecar and the query path filters stale vectors by +`.weft/loomweave/embeddings.db` sidecar and the query path filters stale vectors by content hash. > Not in this catalogue: `emit_observation` as a general-purpose write surface. @@ -163,6 +172,7 @@ for team sharing). Agents may call `propose_guidance` to create a Filigree observation, but that proposal is inert until an operator promotes it through `promote_guidance` or the CLI. Promoted sheets reach you through `guidance_for` and are composed into `summary` prompts with a real guidance fingerprint. +(`propose_guidance` and `promote_guidance` are write-gated — see the † note above.) ## Workflow: orient, then navigate @@ -192,7 +202,7 @@ and are composed into `summary` prompts with a real guidance fingerprint. ## Launch -`loomweave serve --path ` where `` contains `.loomweave/loomweave.db` +`loomweave serve --path ` where `` contains `.weft/loomweave/loomweave.db` (built by `loomweave analyze `). In an MCP client the tools appear as `mcp__loomweave__find_entity`, etc. 
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index d77e3a3..8661f6e 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -15,8 +15,12 @@ jobs: enable-cache: true - name: Install dependencies run: uv sync --dev + - name: Run lint + run: uv run ruff check src - name: Run test suite - run: uv run pytest --cov=legis --cov-report=term-missing --cov-fail-under=70 + run: uv run pytest --cov=legis --cov-report=term-missing --cov-report=json --cov-fail-under=88 + - name: Enforce per-package coverage floors + run: uv run python scripts/check_coverage_floors.py - name: Run SEI conformance oracle run: uv run pytest tests/conformance/test_sei_oracle.py - name: Run live Loomweave oracle @@ -46,4 +50,6 @@ jobs: # Remove this once a real governance DB is wired into CI. env: LEGIS_ALLOW_MISSING_GOVERNANCE_DB: "1" - run: uv run legis governance-gate --db sqlite:///legis-governance.db + # No --db: use the resolved default store (.weft/legis/legis-governance.db), + # the same location the server/MCP write to. + run: uv run legis governance-gate diff --git a/.github/workflows/loomweave-conformance.yml b/.github/workflows/loomweave-conformance.yml new file mode 100644 index 0000000..7a28077 --- /dev/null +++ b/.github/workflows/loomweave-conformance.yml @@ -0,0 +1,64 @@ +name: loomweave-conformance + +# Live cross-repo Loomweave SEI conformance. +# +# Unlike the per-PR oracle step in ci.yml (opt-in, silently skipped when +# LOOMWEAVE_URL is unset), this gate is FAIL-CLOSED: a missing endpoint, locator +# fixture, or HMAC credential is an ERROR, not a pass. That closes the roadmap-12 +# hole where an absent var let Loomweave endpoint/header drift sail through CI. +# +# It runs on a schedule (catch drift between releases) and is callable as a +# reusable workflow (`workflow_call`) so the release pipeline gates publish on it +# — making conformance non-optional for releases. 
+ +on: + schedule: + - cron: "0 7 * * *" # daily 07:00 UTC drift sweep + workflow_dispatch: + workflow_call: + +permissions: + contents: read + +jobs: + live-loomweave-oracle: + name: Live Loomweave oracle (fail-closed) + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v5 + with: + enable-cache: true + - name: Install dependencies + run: uv sync --dev + - name: Require live Loomweave configuration + env: + LOOMWEAVE_URL: ${{ vars.LOOMWEAVE_URL }} + LOOMWEAVE_LIVE_ORACLE_LOCATOR: ${{ vars.LOOMWEAVE_LIVE_ORACLE_LOCATOR }} + LEGIS_LOOMWEAVE_HMAC_KEY: ${{ secrets.LEGIS_LOOMWEAVE_HMAC_KEY }} + run: | + missing=0 + if [ -z "${LOOMWEAVE_URL}" ]; then + echo "::error::LOOMWEAVE_URL variable is not set — live Loomweave conformance cannot run. Configure it under Settings → Secrets and variables → Actions → Variables." + missing=1 + fi + if [ -z "${LOOMWEAVE_LIVE_ORACLE_LOCATOR}" ]; then + echo "::error::LOOMWEAVE_LIVE_ORACLE_LOCATOR variable is not set — the round-trip locator fixture is required for conformance." + missing=1 + fi + if [ -z "${LEGIS_LOOMWEAVE_HMAC_KEY}" ]; then + echo "::error::LEGIS_LOOMWEAVE_HMAC_KEY secret is not set — the signed Loomweave channel credential is required." + missing=1 + fi + if [ "${missing}" -ne 0 ]; then + exit 1 + fi + - name: Run live Loomweave conformance oracle + env: + LOOMWEAVE_URL: ${{ vars.LOOMWEAVE_URL }} + LOOMWEAVE_LIVE_ORACLE_LOCATOR: ${{ vars.LOOMWEAVE_LIVE_ORACLE_LOCATOR }} + LEGIS_LOOMWEAVE_HMAC_KEY: ${{ secrets.LEGIS_LOOMWEAVE_HMAC_KEY }} + # -rs reports any skip in the log; the guard above makes the test file's + # own skipif conditions (unset URL / locator) unreachable, so a skip here + # would signal an unexpected gap rather than a benign opt-out. 
+ run: uv run pytest tests/conformance/test_live_loomweave_oracle.py -q -rs diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 2b903a5..09760d8 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -54,9 +54,17 @@ jobs: name: dist path: dist/ + conformance: + # Live cross-repo Loomweave SEI conformance, required before publish. The + # reusable workflow is fail-closed: a missing LOOMWEAVE_URL / locator / HMAC + # credential fails the release rather than silently skipping (roadmap 12). + name: Live Loomweave conformance + uses: ./.github/workflows/loomweave-conformance.yml + secrets: inherit + publish: name: Publish to PyPI - needs: build + needs: [build, conformance] runs-on: ubuntu-latest environment: name: pypi diff --git a/.gitignore b/.gitignore index ab4b814..5bfb44f 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,4 @@ -.worktrees/ +# OS / editor cruft .DS_Store Thumbs.db .idea/ @@ -8,14 +8,40 @@ Thumbs.db .venv/ __pycache__/ *.py[cod] -.pytest_cache/ *.egg-info/ -# Local audit/scratch databases (never commit audit data) -*.db -.filigree -.filigree.conf +.pytest_cache/ +.mypy_cache/ +.ruff_cache/ .coverage +coverage.json + +# Worktrees +.worktrees/ + +# Local tooling config (machine-specific, never commit) .mcp.json + +# Agent instruction files — filigree-generated, regenerated each session +AGENTS.md +CLAUDE.md + +# --- Weft suite working folders & local config (regenerated/local; never commit) --- +# Filigree — issue-tracker database + project config +.filigree/ +.filigree.conf +# Loomweave — code-archaeology index/cache + config +.loomweave/ loomweave.yaml +# Wardline — scanner cache + config +.wardline/ wardline.yaml -.loomweave/loomweave.lock +# Legis — local audit/scratch databases + their SQLite WAL sidecars +# (audit data is never committed) and local working dir / config +*.db +*.db-shm +*.db-wal +# Federated runtime-state subtree (legis is the sole writer; never .weft/ wholesale) 
+.weft/legis/ + +# Filigree issue tracker +.weft/ diff --git a/.loomweave/.gitignore b/.loomweave/.gitignore deleted file mode 100644 index e861d9e..0000000 --- a/.loomweave/.gitignore +++ /dev/null @@ -1,26 +0,0 @@ -# Loomweave .gitignore — ADR-005 tracked-vs-excluded list. -# Tracked (committed): loomweave.db, config.json, .gitignore itself. -# Excluded (ignored): WAL sidecars, shadow DB, per-run logs, tmp scratch. - -# SQLite write-ahead files never belong in the repo. -*-wal -*-shm -*.db-wal -*.db-shm - -# Shadow DB intermediate (ADR-011 --shadow-db). -*.shadow.db -*.db.new - -# Semantic-search embeddings sidecar (ADR-040): large + rebuildable, never -# committed (keeps loomweave.db unbloated). WAL files are covered by *.db-wal/-shm. -embeddings.db - -# Scratch / temp space. -tmp/ - -# Per-run log directories (see detailed-design §File layout). The run dir -# metadata (config.yaml, stats.json, partial.json) is tracked; only the -# raw LLM request/response log is excluded. -logs/ -runs/*/log.jsonl diff --git a/.loomweave/config.json b/.loomweave/config.json deleted file mode 100644 index d7ef3ef..0000000 --- a/.loomweave/config.json +++ /dev/null @@ -1,4 +0,0 @@ -{ - "schema_version": 1, - "last_run_id": null -} diff --git a/.loomweave/instance_id b/.loomweave/instance_id deleted file mode 100644 index 16ed381..0000000 --- a/.loomweave/instance_id +++ /dev/null @@ -1 +0,0 @@ -48bbdc71-c426-4b23-8217-a0ea17e349e7 diff --git a/AGENTS.md b/AGENTS.md deleted file mode 100644 index d2ea656..0000000 --- a/AGENTS.md +++ /dev/null @@ -1,119 +0,0 @@ - -## Filigree Issue Tracker - -`filigree` tracks tasks for this project. Data lives in `.filigree/`. Prefer -the MCP tools (`mcp__filigree__*`) when available; fall back to the `filigree` -CLI otherwise. 
- -### Workflow - -```bash -# At session start -filigree session-context # ready / in-progress / critical path - -# Pick up the next startable issue (atomic claim + transition into its working status) -filigree start-next-work --assignee -# ...or claim a specific issue -filigree start-work --assignee - -# Do the work, commit, then -filigree close -``` - -Use the atomic claim+transition verbs — `work_start` / `work_start_next` -(MCP) or `start-work` / `start-next-work` (CLI). Do **not** chain -`work_claim` (MCP) or `filigree claim` (CLI) with a subsequent status -update — the two-step form races against other agents; the combined verb is -atomic. - -**Ready ≠ startable.** The working status is type-specific (tasks → -`in_progress`, features → `building`). Bugs start at `triage`, which has no -single-hop transition into work (`triage → confirmed → fixing`), so a triage -bug is *ready* but not directly *startable*: `work_start` on one returns -`INVALID_TRANSITION` naming the next status, and `work_start_next` skips it. -`work_ready` items carry a `startable` flag (plus a `next_action` hint when -false). Pass `advance=true` (MCP) / `--advance` (CLI) to walk the soft -transitions to the nearest working status automatically. - -### Observations: when (and when not) to use them - -`observation_create` is a fire-and-forget scratchpad for *incidental* defects — things -you notice *outside the scope of your current task* (a code smell in a -neighbouring file, a stale TODO, a missing test for an edge case you happened -to spot). Notes expire after 14 days unless promoted. Include `file_path` and -`line` when relevant. At session end, skim `observation_list` and either -`observation_dismiss` or `observation_promote` for what has accumulated. - -**You fix bugs in your currently defined scope. 
You do NOT use observations -to finish work prematurely.** If a defect, gap, or follow-up belongs to your -current task, you own it — handle it as part of that task: fix it now, expand -the task's scope, file a proper issue with a dependency, or surface it to the -user. Filing it as an observation and closing the task is *not* completing -the task; it is shipping known-broken work and hiding the debt in a 14-day -expiring scratchpad. The test is "would I have noticed this even if I weren't -working on this task?" If no, it's task scope, not an observation. - -### Priority scale - -- P0: Critical (drop everything) -- P1: High (do next) -- P2: Medium (default) -- P3: Low -- P4: Backlog - -### Reaching for tools - -MCP tool schemas describe each tool; `filigree --help` and `filigree ---help` are the authoritative CLI reference. You do not need to memorise -either catalogue. The verbs you will reach for most: - -- **Find work:** `work_ready`, `work_blocked`, `issue_list`, `issue_search` -- **Claim work:** `work_start`, `work_start_next` -- **Update:** `comment_add`, `label_add`, `issue_update`, `issue_close` -- **Admin (irreversible):** `issue_delete` (MCP) / `delete-issue` (CLI) — - hard-deletes a terminal issue and its rows; `admin_undo_last` cannot reverse it. -- **Scratchpad:** `observation_create`, `observation_list`, `observation_promote`, `observation_dismiss` -- **Cross-product entity bindings (ADR-029):** `entity_association_add`, - `entity_association_remove`, `entity_association_list`, - `entity_association_list_by_entity`. Used when a sibling tool (e.g. - Clarion) needs to bind a Filigree issue to a function, class, or - module identifier it owns. The `entity_id` is an opaque external string - from Filigree's perspective and may be a `clarion:eid:...` SEI or a legacy - locator; callers may also supply `entity_kind` explicitly. The consumer (the sibling tool's read - path) does drift detection against the stored - `content_hash_at_attach`. 
`entity_association_list_by_entity` is the - reverse-lookup surface — given an opaque external entity ID, return every - Filigree issue bound to it (project isolation is by DB file). Also - reachable over HTTP as - `GET/POST /api/issue/{issue_id}/entity-associations`, - `DELETE /api/issue/{issue_id}/entity-associations?entity_id=…`, - and `GET /api/entity-associations?entity_id=…`. -- **Health:** `stats_get`, `metrics_get`, `mcp_status_get` - -Pass `--actor ` (CLI) so events attribute to your agent identity. It -works in either position — before the verb (`filigree --actor X update …`) or -after it (`filigree update … --actor X`); the post-verb value overrides the -group-level one. - -### Error handling - -Errors return `{error: str, code: ErrorCode, details?: dict}`. Switch on -`code`, not on message text. Codes: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, -`INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, -`INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, -`CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, -`BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. - -On `INVALID_TRANSITION`, call `workflow_transition_list` (MCP) or -`filigree transitions ` to see what the workflow allows from here. - -Two failure modes deserve a specific response: - -- **`SCHEMA_MISMATCH`** — the installed `filigree` is older than the project - database. The error message contains upgrade guidance. Surface it to the - user; do not retry. -- **`ForeignDatabaseError`** — filigree found a parent project's database - but no local `.filigree.conf`. Run `filigree init` in the current - directory. Do **not** `cd` upward to a different project unless that was - the actual intent. - diff --git a/CHANGELOG.md b/CHANGELOG.md index e6e160f..9b85bfc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,276 @@ All notable changes to Legis are documented here. 
The format follows versions per [PEP 440](https://peps.python.org/pep-0440/) / [SemVer](https://semver.org/) (pre-release: `1.0.0rc1`). +## [1.0.0rc4] — 2026-06-08 + +### Added +- **`legis --version`** — top-level version flag (LG-1, weft-9da517a67e); reports + the installed package version and exits. Closes the dogfood gap where the only + way to identify the running build was an indirect probe. +- **`legis doctor [--root] [--repair] [--format text|json]`** — operator health + view and safe repair for the install + config layer (instruction blocks, skills, + SessionStart hook, `.gitignore`, `.mcp.json` registration, store dir, audit + hash-chain integrity, key/sibling wiring). Report-only on `weft.toml` (C-9(b)) + and on hash chains; key values are never rendered. +- **`legis install --mcp`** — register the legis MCP server in `.mcp.json` + (also part of `legis install` with no flags). +- **Self-install (`legis install`)** — legis now stands itself up like its + siblings: it injects a lean, versioned agent-orientation block into CLAUDE.md / + AGENTS.md, installs the `legis-workflow` skill pack (Claude + Codex), registers + a `SessionStart` hook, and extends `.gitignore` with the local config surface + (`.legis/`, `legis.yaml`). The block carries a content-hashed, version-pinned + marker (``); a drift check + re-injects it when either the bundled content or the package version changes. + Two triggers keep it fresh — the SessionStart hook (`legis session-context`) + and a best-effort refresh on `legis mcp` boot, the latter closing the + Codex-only-repo gap a hook-only approach leaves open. Mirrors filigree's + inject/replace/append install mechanism (atomic write, symlink rejection, + idempotent hook registration), right-sized for legis; the lean block + + skill-pack split keeps the injected context small while the skill carries the + full CLI + MCP-tool reference. Design spec: + `docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md`. 
+ (legis-0127b66; hardening — skill swap, hook upgrade, gitignore, nested-corrupt + settings — in legis-b245710.) +- **Dirty-tree dev path** — `verify_wardline_artifact` now recognises the + unsigned `dirty: true` dev artifact emitted by `wardline scan --format legis + --allow-dirty`. In the keyless posture it governs but records the marker + honestly (`artifact_status: "dirty"`). In the CI posture (artifact key + configured) a dirty dev artifact is a typed amber **`SKIPPED_DIRTY_TREE`** + outcome on `scan_route` / `/wardline/scan-results` — distinguishable from the + generic red, never governed — unless `LEGIS_WARDLINE_ALLOW_DIRTY=1` opts into + governing it unsigned (recorded as `"dirty"`). The relaxation is scoped to + exactly `dirty is True AND no signature`: a signed payload still verifies + (a forged signature stays red) and a clean unsigned payload still requires a + signature, so the clean-tree signing guarantee is intact. (legis-d731c760c5, + legis-7e85e8e7ba; upstream wardline `--allow-dirty`.) + +### Changed +- **Typed outcome/status axes (str Enums)** — five stringly-typed axes are now + `str, Enum` following the existing `WardlineSeverity` model: `ScanOutcome` + (`ROUTED` / `SKIPPED_DIRTY_TREE`), `ArtifactStatus` + (`verified` / `dirty` / `unverified`), `IdentityResolutionStatus`, + `LineageSnapshotStatus`, and `Suppressed`. A `str, Enum` serializes identically + to the bare string, so wire payloads and HMAC artifact signatures are + byte-identical (the signature path signs the raw scan, not legis's + enum-bearing provenance). `IdentityResolution` gains a `__post_init__` + bijection (`alive` `None`↔`UNAVAILABLE`, `False`↔`NOT_ALIVE`, + `True`↔`RESOLVED`) so a self-contradictory frozen record is no longer + representable; the dead `getattr` fallbacks in `service/governance.py` are + dropped. 
The guard now covers the record's *other* half too — the lineage axis + (`lineage_snapshot` present iff `lineage_snapshot_status` is `VERIFIED`) — and + rejects a non-bool `alive` with its own `ValueError` rather than a `KeyError`. The `suppressed` field stays `str` on the wire-facing dataclass + (validation timing and error type unchanged); the enum is the vocabulary + source of truth. Behavior-preserving. (legis-bba4f22949; deferred from the + rc4 code review.) +- **Table-driven MCP dispatch (Q-L8)** — `call_tool` now routes through a tool + table instead of an if/elif ladder, and the stdio server bounds each stdin + line so a malformed client cannot stream unbounded input. Behavior-preserving. +- **`CELL_NOT_ENABLED` recovery hint names the enablement path (Le1, + weft-f506e5f845)** — the MCP error's `next_action` now tells the agent *how* to + enable a governance cell (set `LEGIS_HMAC_KEY`; configure policy cells via + `LEGIS_POLICY_CELLS` / `policy/cells.toml` / `LEGIS_DEV_DEFAULT_CELLS=1`) instead + of a generic "ask the operator". The per-cell message still names which cell is + unenabled. +- **Charter documents the self-asserted-write-actor gap (C3, weft-f506e5f845)** — + `docs/design/legis-charter.md` now records `verified_author: null` (federation + write attribution is self-asserted, not cryptographically verified) as a known + governance gap, acceptable for trust-local use and deferred for multi-principal. +- **Release CI gates** — the coverage floor is raised to 88% with a `ruff` lint + gate added (Q-L7), live Loomweave conformance is now non-optional for releases + (no silent skip when the oracle is down), and the Filigree client's transport / + error branches are covered. 
+ +### Fixed +- **Fingerprint reconciliation + RFC-8785 deferral (Q-L5 / Q-L4)** — the policy + gate and the static boundary scanner now extract the same fingerprint (they had + diverged); the RFC-8785 canonical-JSON upgrade is explicitly deferred (its + trigger is a *non-Python* verifier, and the one cross-tool verifier — Wardline — + is a byte-for-byte Python replica pinned by a golden vector). +- **AuditStore batch read-free invariant (Q-M5)** — the batch append path is + guarded against issuing a read mid-batch, with a regression test pinning the + three-layer append-only enforcement. +- **Capability-latch TTL revalidation (Q-L6)** — the SEI capability latch is + TTL-revalidated rather than cached indefinitely, and `content_hash` is + type-checked at its call sites. +- **Lint** — cleared the remaining `ruff` findings in the test suite (unused + imports, mid-file imports hoisted to module top, and `# noqa: F821` on the + honesty-gate fixture functions whose free `handler` name is fingerprinted by + source, not executed). `ruff check src tests` is now clean. +- **`pull_request_get` reports recorded checks unconditionally** — the tool no + longer short-circuits to an empty `checks` list on a fresh runtime whose check + surface has not yet been lazily initialised. A PR's CI outcomes are now + call-order-independent, so a governance agent can never be told a PR is clean + when failing checks exist. +- **Injector anchors on its own top-level fence, not a quoted marker** — the + instruction injector previously located its block with a bare substring search + for ``) with its own regex, + independently of `install.py`, which builds the marker and owns + `INSTRUCTIONS_MARKER`. A change to the marker spacing or token shape in the + writer would silently desync the reader, and the drift check — the hook's whole + job — would stop matching. 
The token-extraction helper (`_extract_marker_token`) + now lives next to the writer in `install.py`; its regex is `re.escape`d from the + `INSTRUCTIONS_MARKER` constant and captures the token opaquely (`\S+`), so it + cannot desync from the prefix and needs no edit if the token shape changes. A + round-trip test (`_extract_marker_token(_build_instructions_block())` == + `_marker_token()`) pins reader-to-writer, failing loudly on any future format + change. + +### Fixed +- **Ingest accepts realistic scans** — the over-strict Wardline ingest validator + was relaxed to accept the diagnostics a real scan carries while keeping the + trust-grammar projection. +- **CLI fails closed on protected override-rate trails** — a missing or + unverifiable protected trail exits non-zero rather than reporting a clean rate. +- **Override-rate gate no longer over-detects protected records** — the + keyless-branch protected-detector dropped its soft `file_fingerprint` / + `ast_path` extension sniffs, which a chill/coached record could carry via an + arbitrary `extra_extensions` dict and thereby fail-close a non-protected + deployment's `legis governance-gate`. It now keys off the policy set plus the + `protected_cell` / signature markers the simple-tier engine never writes; + `TrailVerifier`'s (deliberately over-inclusive) verify-path heuristic is + unchanged. +- Hardened the governance audit boundaries with regression coverage. + ## [1.0.0rc1] — 2026-06-03 First release candidate for 1.0. Everything built through Sprint 6 plus the @@ -49,4 +319,7 @@ WP-M1 service-layer extraction, consolidated behind a stable version. (Filigree signature column, live-Loomweave oracle + HMAC auth, operative git-rename feed) remain. 
-[1.0.0rc1]: https://peps.python.org/pep-0440/ +[1.0.0rc4]: https://github.com/foundryside-dev/legis/compare/v1.0.0rc3...HEAD +[1.0.0rc3]: https://github.com/foundryside-dev/legis/compare/v1.0.0rc2...v1.0.0rc3 +[1.0.0rc2]: https://github.com/foundryside-dev/legis/releases/tag/v1.0.0rc2 +[1.0.0rc1]: https://github.com/foundryside-dev/legis/releases/tag/v1.0.0rc1 diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index d2ea656..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,119 +0,0 @@ - -## Filigree Issue Tracker - -`filigree` tracks tasks for this project. Data lives in `.filigree/`. Prefer -the MCP tools (`mcp__filigree__*`) when available; fall back to the `filigree` -CLI otherwise. - -### Workflow - -```bash -# At session start -filigree session-context # ready / in-progress / critical path - -# Pick up the next startable issue (atomic claim + transition into its working status) -filigree start-next-work --assignee -# ...or claim a specific issue -filigree start-work --assignee - -# Do the work, commit, then -filigree close -``` - -Use the atomic claim+transition verbs — `work_start` / `work_start_next` -(MCP) or `start-work` / `start-next-work` (CLI). Do **not** chain -`work_claim` (MCP) or `filigree claim` (CLI) with a subsequent status -update — the two-step form races against other agents; the combined verb is -atomic. - -**Ready ≠ startable.** The working status is type-specific (tasks → -`in_progress`, features → `building`). Bugs start at `triage`, which has no -single-hop transition into work (`triage → confirmed → fixing`), so a triage -bug is *ready* but not directly *startable*: `work_start` on one returns -`INVALID_TRANSITION` naming the next status, and `work_start_next` skips it. -`work_ready` items carry a `startable` flag (plus a `next_action` hint when -false). Pass `advance=true` (MCP) / `--advance` (CLI) to walk the soft -transitions to the nearest working status automatically. 
- -### Observations: when (and when not) to use them - -`observation_create` is a fire-and-forget scratchpad for *incidental* defects — things -you notice *outside the scope of your current task* (a code smell in a -neighbouring file, a stale TODO, a missing test for an edge case you happened -to spot). Notes expire after 14 days unless promoted. Include `file_path` and -`line` when relevant. At session end, skim `observation_list` and either -`observation_dismiss` or `observation_promote` for what has accumulated. - -**You fix bugs in your currently defined scope. You do NOT use observations -to finish work prematurely.** If a defect, gap, or follow-up belongs to your -current task, you own it — handle it as part of that task: fix it now, expand -the task's scope, file a proper issue with a dependency, or surface it to the -user. Filing it as an observation and closing the task is *not* completing -the task; it is shipping known-broken work and hiding the debt in a 14-day -expiring scratchpad. The test is "would I have noticed this even if I weren't -working on this task?" If no, it's task scope, not an observation. - -### Priority scale - -- P0: Critical (drop everything) -- P1: High (do next) -- P2: Medium (default) -- P3: Low -- P4: Backlog - -### Reaching for tools - -MCP tool schemas describe each tool; `filigree --help` and `filigree ---help` are the authoritative CLI reference. You do not need to memorise -either catalogue. The verbs you will reach for most: - -- **Find work:** `work_ready`, `work_blocked`, `issue_list`, `issue_search` -- **Claim work:** `work_start`, `work_start_next` -- **Update:** `comment_add`, `label_add`, `issue_update`, `issue_close` -- **Admin (irreversible):** `issue_delete` (MCP) / `delete-issue` (CLI) — - hard-deletes a terminal issue and its rows; `admin_undo_last` cannot reverse it. 
-- **Scratchpad:** `observation_create`, `observation_list`, `observation_promote`, `observation_dismiss` -- **Cross-product entity bindings (ADR-029):** `entity_association_add`, - `entity_association_remove`, `entity_association_list`, - `entity_association_list_by_entity`. Used when a sibling tool (e.g. - Clarion) needs to bind a Filigree issue to a function, class, or - module identifier it owns. The `entity_id` is an opaque external string - from Filigree's perspective and may be a `clarion:eid:...` SEI or a legacy - locator; callers may also supply `entity_kind` explicitly. The consumer (the sibling tool's read - path) does drift detection against the stored - `content_hash_at_attach`. `entity_association_list_by_entity` is the - reverse-lookup surface — given an opaque external entity ID, return every - Filigree issue bound to it (project isolation is by DB file). Also - reachable over HTTP as - `GET/POST /api/issue/{issue_id}/entity-associations`, - `DELETE /api/issue/{issue_id}/entity-associations?entity_id=…`, - and `GET /api/entity-associations?entity_id=…`. -- **Health:** `stats_get`, `metrics_get`, `mcp_status_get` - -Pass `--actor ` (CLI) so events attribute to your agent identity. It -works in either position — before the verb (`filigree --actor X update …`) or -after it (`filigree update … --actor X`); the post-verb value overrides the -group-level one. - -### Error handling - -Errors return `{error: str, code: ErrorCode, details?: dict}`. Switch on -`code`, not on message text. Codes: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, -`INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, -`INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, -`CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, -`BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. - -On `INVALID_TRANSITION`, call `workflow_transition_list` (MCP) or -`filigree transitions ` to see what the workflow allows from here. 
- -Two failure modes deserve a specific response: - -- **`SCHEMA_MISMATCH`** — the installed `filigree` is older than the project - database. The error message contains upgrade guidance. Surface it to the - user; do not retry. -- **`ForeignDatabaseError`** — filigree found a parent project's database - but no local `.filigree.conf`. Run `filigree init` in the current - directory. Do **not** `cd` upward to a different project unless that was - the actual intent. - diff --git a/README.md b/README.md index 1a3fe9a..625ffd5 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ Legis is the fourth Weft product: the git/CI and governance side of the suite's ## Status -Legis is at **`1.0.0rc1`** — the first release candidate. The standalone git/CI surfaces, the graded 2×2 enforcement engine, the agent-programmable policy grammar, SEI-keyed attestations, and the Wardline/Filigree suite combinations are all built and tested; the git-rename provider to Loomweave is contract-locked, operative pending Loomweave's committed-range driving. The transport-agnostic service layer (WP-M1) underpinning the forthcoming agent-facing MCP surface has landed. See the combination matrix below for per-pairing status and `CHANGELOG.md` for the release notes. +Legis is at **`1.0.0rc4`** — the fourth release candidate. The standalone git/CI surfaces, the graded 2×2 enforcement engine, the agent-programmable policy grammar, SEI-keyed attestations, and the Wardline/Filigree suite combinations are all built and tested; the git-rename provider to Loomweave is contract-locked, operative pending Loomweave's committed-range driving. The transport-agnostic service layer (WP-M1) and the agent-facing MCP surface on top of it have landed (`legis mcp`), and Legis now stands itself up via `legis install` (instruction block + `legis-workflow` skill pack + SessionStart hook + `.mcp.json` registration). `legis doctor [--repair]` provides an operator health view and safe repair for the install + config layer. 
See the combination matrix below for per-pairing status and `CHANGELOG.md` for the release notes. ## The Weft suite diff --git a/docs/design/legis-charter.md b/docs/design/legis-charter.md index c9405d8..1ed449b 100644 --- a/docs/design/legis-charter.md +++ b/docs/design/legis-charter.md @@ -35,6 +35,19 @@ Legis can describe repository change and CI state on its own. Legis becomes the common operating picture for project change and governance while preserving the authority boundaries of the other Weft products. +## Known governance gaps + +- **Self-asserted write actor (`verified_author: null`).** Actor identity on + federation write events (e.g. a comment or status change attributed to an + agent) is self-asserted by the caller, not cryptographically verified. For + trust-local, single-operator use this is acceptable. A multi-principal + deployment that needs non-repudiable write attribution would require a + verified-identity binding at the write boundary — Legis governs *change* + provenance but does not today mint or verify the actor identity carried on a + sibling's write. Verified authorship is a deferred item in the governance + story, not a current guarantee. (Surfaced in the 2026-06 lacuna dogfood as + finding C3; tracked federation-side under the residual-friction tail.) + ## Near-term scope The initial repository is documentation-first. It should make the intended role reviewable before runtime implementation starts. diff --git a/docs/superpowers/plans/2026-06-07-legis-doctor.md b/docs/superpowers/plans/2026-06-07-legis-doctor.md new file mode 100644 index 0000000..a844d13 --- /dev/null +++ b/docs/superpowers/plans/2026-06-07-legis-doctor.md @@ -0,0 +1,1162 @@ +# Legis doctor Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** Add `legis doctor [--root .] [--repair] [--format {text,json}]` — an operator/CLI health view that diagnoses and (safely) repairs legis's install + config layer. + +**Architecture:** One new module `src/legis/doctor.py` (a `DoctorCheck` dataclass, one function per check, a `run_doctor` orchestrator, `machine_readable_doctor` for JSON), a thin `doctor` subparser in `cli.py`, and one new install capability `register_mcp_json` in `install.py` (with a `legis install --mcp` flag). Checks reuse existing `install.py` / `config.py` / `store` helpers; repairs touch only legis's own per-member artifacts. Bound by C-9(b): **never writes `weft.toml`**. + +**Tech Stack:** Python 3.12, argparse, stdlib `tomllib`/`json`, SQLAlchemy `make_url`, pytest, uv. + +**Spec:** `docs/superpowers/specs/2026-06-07-legis-doctor-design.md` + +--- + +## File Structure + +- **Create `src/legis/doctor.py`** — all doctor logic. Responsibilities: the `DoctorCheck` record, every check function (pure: `root: Path` + env → `DoctorCheck`, no mutation), the repair dispatch, the `run_doctor`/`machine_readable_doctor` orchestrators, and text/JSON rendering. +- **Modify `src/legis/install.py`** — add `register_mcp_json(project_root)` + `_legis_mcp_entry(agent_id)` (the `.mcp.json` writer/canonical entry), reusing `_find_legis_command`, `_atomic_write_text`, `reject_symlink`, `project_path`. +- **Modify `src/legis/cli.py`** — add the `doctor` subparser, a `--mcp` flag (+ optional `--agent-id`) on the `install` subparser and its step list, and a thin `_run_doctor` dispatcher. +- **Create `tests/test_doctor.py`** — mirrors `src/legis/doctor.py`. +- **Modify `tests/test_install.py`** — tests for `register_mcp_json`. +- **Modify `scripts/check_coverage_floors.py`** — (only if it enumerates modules) add a floor for `doctor.py`; otherwise the top-level src floor covers it. +- **Modify `CHANGELOG.md`**, **`README.md`** — document the new command. 
+ +Reused symbols (verify they exist before relying on them): +- `install.py`: `INSTRUCTIONS_MARKER`, `SKILL_NAME`, `SESSION_CONTEXT_COMMAND`, `_marker_token`, `_extract_marker_token`, `_get_skills_source_dir`, `_skill_tree_fingerprint`, `_has_unscoped_session_start_hook`, `_find_legis_command`, `_LEGIS_IGNORE_RULES`, `inject_instructions`, `install_skills`, `install_codex_skills`, `install_claude_code_hooks`, `ensure_gitignore`, `_atomic_write_text`, `reject_symlink`, `project_path`. +- `config.py`: `project_root`, `governance_db_url`, `binding_db_url`, `protected_policies`, `_store_dir`. +- `store/audit_store.py`: `AuditStore(url).verify_integrity() -> bool`. + +--- + +## Task 1: `DoctorCheck` record + rendering + empty orchestrator + +**Files:** +- Create: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# tests/test_doctor.py +from __future__ import annotations + +import json + +from legis.doctor import DoctorCheck, render_json, render_text + + +def test_doctorcheck_to_dict_omits_empty_message(): + assert DoctorCheck("a.b", "ok").to_dict() == {"id": "a.b", "status": "ok", "fixed": False} + assert DoctorCheck("a.b", "error", message="boom").to_dict() == { + "id": "a.b", + "status": "error", + "fixed": False, + "message": "boom", + } + + +def test_render_json_shape(): + checks = [DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")] + payload = json.loads(render_json(checks)) + assert payload["ok"] is False + assert payload["checks"][0] == {"id": "a", "status": "ok", "fixed": False} + assert payload["next_actions"] == ["b: bad"] + + +def test_render_text_lists_only_problems_when_healthy_says_ok(): + assert "legis doctor: ok" in render_text([DoctorCheck("a", "ok")]) + out = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")]) + assert "b: bad" in out + assert "legis doctor: ok" not in out +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest 
tests/test_doctor.py -v` +Expected: FAIL with `ModuleNotFoundError: No module named 'legis.doctor'`. + +- [ ] **Step 3: Write minimal implementation** + +```python +# src/legis/doctor.py +"""`legis doctor` — view and repair legis install/config health. + +Operator/CLI tool only: it inspects and repairs the *host* install and legis's +own per-member artifacts. It is NOT on the agent MCP surface or the service +layer, and per hub doctrine C-9(b) it NEVER writes weft.toml. +""" + +from __future__ import annotations + +import json +from dataclasses import dataclass +from typing import Any + + +@dataclass(frozen=True, slots=True) +class DoctorCheck: + id: str + status: str # "ok" | "warn" | "error" + fixed: bool = False + message: str | None = None + + @property + def ok(self) -> bool: + return self.status != "error" + + def to_dict(self) -> dict[str, Any]: + data: dict[str, Any] = {"id": self.id, "status": self.status, "fixed": self.fixed} + if self.message: + data["message"] = self.message + return data + + +def _next_actions(checks: list[DoctorCheck]) -> list[str]: + return [f"{c.id}: {c.message}" for c in checks if c.status != "ok" and c.message] + + +def render_json(checks: list[DoctorCheck]) -> str: + payload = { + "ok": all(c.ok for c in checks), + "checks": [c.to_dict() for c in checks], + "next_actions": _next_actions(checks), + } + return json.dumps(payload, indent=2, sort_keys=True) + + +def render_text(checks: list[DoctorCheck]) -> str: + healthy = all(c.status == "ok" for c in checks) + if healthy: + return "legis doctor: ok" + lines = ["legis doctor:"] + for c in checks: + if c.status == "ok": + continue + lines.append(f" {c.id}: {c.status} — {c.message}" if c.message else f" {c.id}: {c.status}") + return "\n".join(lines) +``` + +Note: `ok` is True for `warn` (non-fatal) and False only for `error`. `render_text`'s "all ok" banner uses strict `== "ok"` so warns still print. 
+ +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -v` +Expected: PASS (3 tests). + +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): DoctorCheck record + text/json rendering" +``` + +--- + +## Task 2: `collect_checks` orchestrator + `run_doctor` (still no real checks) + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from pathlib import Path + +from legis.doctor import run_doctor + + +def test_run_doctor_empty_is_healthy(tmp_path, capsys): + # With no checks registered yet, an empty list renders healthy, exit 0. + rc = run_doctor(tmp_path, repair=False, fmt="text") + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_run_doctor_json_format(tmp_path, capsys): + rc = run_doctor(tmp_path, repair=False, fmt="json") + assert rc == 0 + payload = json.loads(capsys.readouterr().out) + assert payload == {"ok": True, "checks": [], "next_actions": []} +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k run_doctor -v` +Expected: FAIL with `ImportError: cannot import name 'run_doctor'`. + +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +from pathlib import Path + + +def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: + """Run every check against *root*. Repairs run inside individual checks + when *repair* is True; each returned check reflects post-repair state.""" + checks: list[DoctorCheck] = [] + # Check functions are appended here in later tasks. 
+ return checks + + +def run_doctor(root: Path, *, repair: bool, fmt: str) -> int: + checks = collect_checks(root, repair=repair) + print(render_json(checks) if fmt == "json" else render_text(checks)) + return 0 if all(c.ok for c in checks) else 1 +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k run_doctor -v` +Expected: PASS (2 tests). + +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): collect_checks + run_doctor orchestrator skeleton" +``` + +--- + +## Task 3: CLI `doctor` subparser + dispatch (walking skeleton end-to-end) + +**Files:** +- Modify: `src/legis/cli.py` (subparser in `build_parser`, dispatch in `main`) +- Test: `tests/test_cli.py` (or `tests/test_doctor.py` — match where CLI tests live) + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.cli import main as cli_main + + +def test_cli_doctor_runs_and_exits_zero(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor"]) + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_cli_doctor_json(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor", "--format", "json"]) + assert rc == 0 + assert json.loads(capsys.readouterr().out)["ok"] is True +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k cli_doctor -v` +Expected: FAIL — argparse exits non-zero / `doctor` is not a known subcommand. 
+ +- [ ] **Step 3: Write minimal implementation** + +In `src/legis/cli.py`, inside `build_parser()` (after the `install` subparser block, before `return parser`): + +```python + doctor = subparsers.add_parser( + "doctor", + help="View and repair legis install/config health", + ) + doctor.add_argument("--root", default=".", help="Project root to inspect (default: cwd)") + doctor.add_argument("--repair", action="store_true", help="Apply safe repairs, then re-check") + doctor.add_argument( + "--format", choices=("text", "json"), default="text", + help="Output format: human text (default) or machine-readable json", + ) +``` + +Add a dispatcher function near `_check_override_rate`: + +```python +def _run_doctor(args) -> int: + from pathlib import Path + + from legis.doctor import run_doctor + + return run_doctor(Path(args.root), repair=args.repair, fmt=args.format) +``` + +In `main()`, add a branch alongside the other `args.command` checks: + +```python + if args.command == "doctor": + return _run_doctor(args) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k cli_doctor -v` +Expected: PASS (2 tests). 
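To sanity-check the flag shape outside the real CLI, the subparser can be exercised in isolation. This is a toy sketch: `build_parser` here is a cut-down, hypothetical parser containing only the `doctor` subcommand, and `dest="command"` is an assumption inferred from the `args.command` check in `main()`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Toy parser: only the `doctor` subcommand, mirroring the flags in Step 3.
    parser = argparse.ArgumentParser(prog="legis")
    subparsers = parser.add_subparsers(dest="command")
    doctor = subparsers.add_parser("doctor", help="View and repair legis install/config health")
    doctor.add_argument("--root", default=".")
    doctor.add_argument("--repair", action="store_true")
    doctor.add_argument("--format", choices=("text", "json"), default="text")
    return parser


# Bare invocation falls back to the defaults; explicit flags override them.
defaults = build_parser().parse_args(["doctor"])
custom = build_parser().parse_args(["doctor", "--root", "proj", "--repair", "--format", "json"])
print(defaults.command, defaults.root, defaults.repair, defaults.format)
print(custom.command, custom.root, custom.repair, custom.format)
```

Because `--format` is declared with `choices`, argparse rejects any other value before `_run_doctor` is reached, so the dispatcher does not need to validate it again.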
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/cli.py tests/test_doctor.py +git commit -m "feat(doctor): wire 'legis doctor' CLI subcommand" +``` + +--- + +## Task 4: `register_mcp_json` install capability + `legis install --mcp` + +**Files:** +- Modify: `src/legis/install.py` (add `_legis_mcp_entry`, `register_mcp_json`) +- Modify: `src/legis/cli.py` (`--mcp` flag + step in `_run_install`) +- Test: `tests/test_install.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_install.py +import json +from pathlib import Path + +from legis.install import register_mcp_json, _legis_mcp_entry + + +def test_register_mcp_json_creates_file_with_legis_entry(tmp_path): + ok, msg = register_mcp_json(tmp_path) + assert ok, msg + data = json.loads((tmp_path / ".mcp.json").read_text()) + entry = data["mcpServers"]["legis"] + assert entry["type"] == "stdio" + assert entry["args"][0] == "mcp" + assert "--agent-id" in entry["args"] + + +def test_register_mcp_json_preserves_sibling_entries(tmp_path): + (tmp_path / ".mcp.json").write_text( + json.dumps({"mcpServers": {"filigree": {"command": "x", "type": "stdio"}}}) + ) + ok, _ = register_mcp_json(tmp_path) + assert ok + data = json.loads((tmp_path / ".mcp.json").read_text()) + assert "filigree" in data["mcpServers"] + assert "legis" in data["mcpServers"] + + +def test_register_mcp_json_idempotent(tmp_path): + register_mcp_json(tmp_path) + first = (tmp_path / ".mcp.json").read_text() + register_mcp_json(tmp_path) + assert (tmp_path / ".mcp.json").read_text() == first +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_install.py -k mcp_json -v` +Expected: FAIL with `ImportError: cannot import name 'register_mcp_json'`. 
+ +- [ ] **Step 3: Write minimal implementation** + +In `src/legis/install.py` (after the `.gitignore` section), add: + +```python +# --------------------------------------------------------------------------- +# .mcp.json (agent MCP server registration) +# --------------------------------------------------------------------------- + +import shlex + +_DEFAULT_AGENT_ID = "claude-code" + + +def _legis_mcp_entry(agent_id: str = _DEFAULT_AGENT_ID) -> dict[str, Any]: + """The canonical legis stdio server entry for .mcp.json.""" + # Resolve the launcher once instead of calling _find_legis_command() three times. + cmd = _find_legis_command() + return { + "args": ["mcp", "--agent-id", agent_id], + "command": cmd[0] if len(cmd) == 1 else shlex.join(cmd), + "env": {}, + "type": "stdio", + } + + +def register_mcp_json(project_root: Path, agent_id: str = _DEFAULT_AGENT_ID) -> tuple[bool, str]: + """Register (or refresh) the legis server in /.mcp.json. + + Creates the file if absent; merges into mcpServers without disturbing + sibling entries. Preserves an existing legis entry's agent-id if it already + carries one (operator choice), refreshing only the command/args shape.
+ """ + try: + path = project_path(project_root, ".mcp.json") + except UnsafeInstallPathError as exc: + return False, str(exc) + + data: dict[str, Any] = {} + if path.exists(): + try: + parsed = json.loads(path.read_text(encoding="utf-8")) + if isinstance(parsed, dict): + data = parsed + except (json.JSONDecodeError, OSError): + return False, ".mcp.json present but unreadable; fix or remove it by hand" + + servers = data.get("mcpServers") + if not isinstance(servers, dict): + servers = {} + data["mcpServers"] = servers + + existing = servers.get("legis") + keep_agent = agent_id + if isinstance(existing, dict): + args = existing.get("args", []) + if isinstance(args, list) and "--agent-id" in args: + i = args.index("--agent-id") + if i + 1 < len(args) and isinstance(args[i + 1], str): + keep_agent = args[i + 1] + + desired = _legis_mcp_entry(keep_agent) + if existing == desired: + return True, "legis already registered in .mcp.json" + servers["legis"] = desired + _atomic_write_text(path, json.dumps(data, indent=2, sort_keys=True) + "\n") + return True, "Registered legis server in .mcp.json" +``` + +Note: `Any` and `json` are already imported at the top of `install.py`; if not, add `from typing import Any` and `import json`. Move `import shlex` to the module top if a linter flags the inline import. 
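The merge rules in `register_mcp_json` (keep sibling servers, keep an operator-chosen agent id, stay idempotent) can be checked without touching the filesystem. The sketch below is a pure-dict illustration under stated assumptions: `merge_legis_entry` is a hypothetical helper, not part of `install.py`, and the `"legis"` command string stands in for whatever `_find_legis_command` resolves to.

```python
from typing import Any


def merge_legis_entry(
    data: dict[str, Any], command: str = "legis", default_agent: str = "claude-code"
) -> dict[str, Any]:
    """Return a copy of *data* with the legis server entry added or refreshed."""
    servers = data.get("mcpServers")
    if not isinstance(servers, dict):
        servers = {}

    # Keep an existing agent id if the operator already chose one.
    agent = default_agent
    existing = servers.get("legis")
    if isinstance(existing, dict):
        args = existing.get("args", [])
        if isinstance(args, list) and "--agent-id" in args:
            i = args.index("--agent-id")
            if i + 1 < len(args) and isinstance(args[i + 1], str):
                agent = args[i + 1]

    merged = dict(servers)  # sibling entries are carried over untouched
    merged["legis"] = {
        "args": ["mcp", "--agent-id", agent],
        "command": command,
        "env": {},
        "type": "stdio",
    }
    return {**data, "mcpServers": merged}


before = {
    "mcpServers": {
        "filigree": {"command": "x", "type": "stdio"},
        "legis": {"args": ["mcp", "--agent-id", "reviewer-1"], "command": "old"},
    }
}
after = merge_legis_entry(before)
print(sorted(after["mcpServers"]), after["mcpServers"]["legis"]["args"])
```

Returning a new dict rather than mutating the input is a choice made for this sketch only; the function in Step 3 mutates the parsed data in place and then writes it atomically.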
+ +In `src/legis/cli.py` `build_parser()`, add to the `install` subparser: + +```python + install.add_argument("--mcp", action="store_true", help="Register the legis MCP server in .mcp.json only") + install.add_argument( + "--agent-id", default="claude-code", + help="Agent id stamped in the .mcp.json legis entry (default: claude-code)", + ) +``` + +In `_run_install` (the `steps` list and the imports from `legis.install`), add `register_mcp_json` to the import and a step: + +```python + (install_all or args.mcp, ".mcp.json", lambda: register_mcp_json(project_root, args.agent_id)), +``` + +and update the `install_all` computation to include `args.mcp`: + +```python + install_all = not any( + [args.claude_md, args.agents_md, args.skills, args.codex_skills, args.hooks, args.gitignore, args.mcp] + ) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_install.py -k mcp_json -v` +Expected: PASS (3 tests). + +- [ ] **Step 5: Commit** + +```bash +git add src/legis/install.py src/legis/cli.py tests/test_install.py +git commit -m "feat(install): register legis MCP server in .mcp.json (+ --mcp flag)" +``` + +--- + +## Task 5: doctor `.mcp.json` check + repair + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.doctor import check_mcp_json + + +def test_mcp_json_absent_is_error(tmp_path): + c = check_mcp_json(tmp_path, repair=False) + assert c.id == "install.mcp_json" + assert c.status == "error" + assert c.fixed is False + + +def test_mcp_json_repair_fixes_it(tmp_path): + c = check_mcp_json(tmp_path, repair=True) + assert c.status == "ok" + assert c.fixed is True + assert (tmp_path / ".mcp.json").exists() + + +def test_mcp_json_present_is_ok(tmp_path): + from legis.install import register_mcp_json + register_mcp_json(tmp_path) + c = check_mcp_json(tmp_path, repair=False) + assert c.status == "ok" + assert c.fixed is 
False +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k mcp_json -v` +Expected: FAIL — `cannot import name 'check_mcp_json'`. + +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +# (no new imports: `json` and `Path` are already imported at the top of doctor.py) + + +def check_mcp_json(root: Path, *, repair: bool) -> DoctorCheck: + cid = "install.mcp_json" + path = root / ".mcp.json" + present = False + if path.exists(): + try: + data = json.loads(path.read_text(encoding="utf-8")) + present = isinstance(data, dict) and isinstance(data.get("mcpServers"), dict) and "legis" in data["mcpServers"] + except (json.JSONDecodeError, OSError): + present = False + if present: + return DoctorCheck(cid, "ok") + if repair: + from legis.install import register_mcp_json + + ok, msg = register_mcp_json(root) + if ok: + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message="legis server not registered (run: legis install --mcp)") +``` + +`json` is already imported at the top of the module from Task 1, so this step adds no imports. Register the check in `collect_checks`: + +```python + checks.append(check_mcp_json(root, repair=repair)) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k mcp_json -v` +Expected: PASS (3 tests).
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): .mcp.json registration check + repair" +``` + +--- + +## Task 6: doctor install-wiring checks (blocks, skills, hook, gitignore) + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.doctor import check_instruction_block, check_skill_pack, check_hook, check_gitignore +from legis import install as legis_install + + +def test_instruction_block_absent_is_error(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=False) + assert c.id == "install.claude_md" + assert c.status == "error" + + +def test_instruction_block_repair_creates_it(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=True) + assert c.status == "ok" + assert c.fixed is True + assert legis_install.INSTRUCTIONS_MARKER in (tmp_path / "CLAUDE.md").read_text() + + +def test_gitignore_absent_is_error_then_repaired(tmp_path): + assert check_gitignore(tmp_path, repair=False).status == "error" + fixed = check_gitignore(tmp_path, repair=True) + assert fixed.status == "ok" and fixed.fixed is True + assert ".weft/legis/" in (tmp_path / ".gitignore").read_text() + + +def test_skill_pack_absent_is_error(tmp_path): + assert check_skill_pack(tmp_path, ".claude", repair=False).status == "error" + + +def test_skill_pack_repair_installs(tmp_path): + c = check_skill_pack(tmp_path, ".claude", repair=True) + assert c.status == "ok" and c.fixed is True +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k "instruction_block or gitignore or skill_pack" -v` +Expected: FAIL — those check functions don't exist yet. 
+ +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +from legis import install as _install + + +def _block_fresh(root: Path, filename: str) -> bool: + """True iff / has the legis block at the current token.""" + path = root / filename + if not path.exists(): + return False + try: + content = path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + if _install.INSTRUCTIONS_MARKER not in content: + return False + return _install._extract_marker_token(content) == _install._marker_token() + + +def check_instruction_block(root: Path, filename: str, *, repair: bool) -> DoctorCheck: + cid = "install.claude_md" if filename == "CLAUDE.md" else "install.agents_md" + if _block_fresh(root, filename): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.inject_instructions(root / filename) + if ok and _block_fresh(root, filename): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + missing = "missing" if not (root / filename).exists() else "block missing or drifted" + return DoctorCheck(cid, "error", message=f"{filename} {missing} (run: legis install)") + + +def _skill_fresh(root: Path, base: str) -> bool: + source = _install._get_skills_source_dir() / _install.SKILL_NAME + target = root / base / "skills" / _install.SKILL_NAME + if not source.is_dir() or not target.is_dir(): + return False + return _install._skill_tree_fingerprint(target) == _install._skill_tree_fingerprint(source) + + +def check_skill_pack(root: Path, base: str, *, repair: bool) -> DoctorCheck: + cid = "install.claude_skill" if base == ".claude" else "install.agents_skill" + installer = _install.install_skills if base == ".claude" else _install.install_codex_skills + if _skill_fresh(root, base): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = installer(root) + if ok and _skill_fresh(root, base): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", 
message=msg)
+    return DoctorCheck(cid, "error", message=f"{base}/skills/{_install.SKILL_NAME} missing or drifted (run: legis install)")
+
+
+def _hook_present(root: Path) -> bool:
+    settings_path = root / ".claude" / "settings.json"
+    if not settings_path.exists():
+        return False
+    try:
+        settings = json.loads(settings_path.read_text(encoding="utf-8"))
+    except (json.JSONDecodeError, UnicodeDecodeError, OSError):
+        # Unreadable, mis-encoded, or malformed settings all count as "no hook".
+        return False
+    return _install._has_unscoped_session_start_hook(settings, _install.SESSION_CONTEXT_COMMAND)
+
+
+def check_hook(root: Path, *, repair: bool) -> DoctorCheck:
+    cid = "install.hook"
+    if _hook_present(root):
+        return DoctorCheck(cid, "ok")
+    if repair:
+        ok, msg = _install.install_claude_code_hooks(root)
+        if ok and _hook_present(root):
+            return DoctorCheck(cid, "ok", fixed=True)
+        return DoctorCheck(cid, "error", message=msg)
+    return DoctorCheck(cid, "error", message="SessionStart hook not registered (run: legis install)")
+
+
+def _gitignore_present(root: Path) -> bool:
+    path = root / ".gitignore"
+    if not path.exists():
+        return False
+    try:
+        content = path.read_text(encoding="utf-8")
+    except (OSError, UnicodeDecodeError):
+        return False
+    present = {ln.strip() for ln in content.splitlines() if ln.strip() and not ln.lstrip().startswith("#")}
+    return all(rule in present for rule in _install._LEGIS_IGNORE_RULES)
+
+
+def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck:
+    cid = "install.gitignore"
+    if _gitignore_present(root):
+        return DoctorCheck(cid, "ok")
+    if repair:
+        ok, msg = _install.ensure_gitignore(root)
+        if ok and _gitignore_present(root):
+            return DoctorCheck(cid, "ok", fixed=True)
+        return DoctorCheck(cid, "error", message=msg)
+    return DoctorCheck(cid, "error", message=".weft/legis/ not in .gitignore (run: legis install)")
+```
+
+Register them in `collect_checks` (before the `.mcp.json` check):
+
+```python
+    checks.append(check_instruction_block(root, "CLAUDE.md", repair=repair))
+    checks.append(check_instruction_block(root, "AGENTS.md", repair=repair))
+    checks.append(check_skill_pack(root, ".claude", repair=repair))
+    checks.append(check_skill_pack(root, ".agents", repair=repair))
+    checks.append(check_hook(root, repair=repair))
+    checks.append(check_gitignore(root, repair=repair))
+```
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `uv run pytest tests/test_doctor.py -k "instruction_block or gitignore or skill_pack or hook" -v`
+Expected: PASS.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add src/legis/doctor.py tests/test_doctor.py
+git commit -m "feat(doctor): install-wiring checks (blocks, skills, hook, gitignore)"
+```
+
+---
+
+## Task 7: doctor config & store checks (weft.toml report-only, store dir, db overrides, legacy)
+
+**Files:**
+- Modify: `src/legis/doctor.py`
+- Test: `tests/test_doctor.py`
+
+- [ ] **Step 1: Write the failing test**
+
+```python
+# add to tests/test_doctor.py
+from legis.doctor import check_weft_toml, check_store_dir, check_db_overrides, check_legacy_stray_db
+
+
+def test_weft_toml_absent_is_ok(tmp_path):
+    assert check_weft_toml(tmp_path).status == "ok"
+
+
+def test_weft_toml_valid_legis_table_is_ok(tmp_path):
+    (tmp_path / "weft.toml").write_text('[legis]\nstore_dir = ".weft/legis"\n')
+    assert check_weft_toml(tmp_path).status == "ok"
+
+
+def test_weft_toml_malformed_is_error_and_unchanged(tmp_path):
+    wt = tmp_path / "weft.toml"
+    wt.write_text("[legis]\nstore_dir = \n")  # malformed TOML
+    before = wt.read_text()
+    c = check_weft_toml(tmp_path)
+    assert c.status == "error"
+    assert wt.read_text() == before  # C-9(b): never written
+
+
+def test_weft_toml_legis_not_a_table_is_error(tmp_path):
+    (tmp_path / "weft.toml").write_text('legis = "oops"\n')
+    assert check_weft_toml(tmp_path).status == "error"
+
+
+def test_store_dir_writable_parent_is_ok(tmp_path):
+    assert check_store_dir(tmp_path).status == "ok"
+
+
+def test_db_override_bad_url_is_error(tmp_path, monkeypatch):
+    monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "::not a url::")
+    assert check_db_overrides(tmp_path).status == "error"
+
+
+def test_legacy_stray_db_is_warn(tmp_path):
+    (tmp_path / "legis-governance.db").write_text("x")
+    assert check_legacy_stray_db(tmp_path).status == "warn"
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `uv run pytest tests/test_doctor.py -k "weft_toml or store_dir or db_override or legacy" -v`
+Expected: FAIL — functions undefined.
+
+- [ ] **Step 3: Write minimal implementation**
+
+```python
+# add to src/legis/doctor.py
+import os
+import tomllib
+
+from sqlalchemy.engine import make_url
+
+_DB_OVERRIDE_ENVS = ("LEGIS_CHECK_DB", "LEGIS_GOVERNANCE_DB", "LEGIS_BINDING_DB", "LEGIS_PULL_DB")
+_LEGACY_DB_NAMES = ("legis-checks.db", "legis-governance.db", "legis-binding.db", "legis-pulls.db")
+
+
+def check_weft_toml(root: Path) -> DoctorCheck:
+    """Report-only (C-9(b)): NEVER writes weft.toml. Distinguishes ABSENT (ok —
+    defaults intentional) from PRESENT-BUT-BROKEN (error — config silently not
+    applying), restoring the operator signal that C-9(c) silences at runtime."""
+    cid = "config.weft_toml"
+    path = root / "weft.toml"
+    if not path.exists():
+        return DoctorCheck(cid, "ok", message="absent (built-in defaults)")
+    try:
+        data = tomllib.loads(path.read_text(encoding="utf-8"))
+    except (tomllib.TOMLDecodeError, OSError, UnicodeDecodeError) as exc:
+        return DoctorCheck(cid, "error", message=f"present but unparseable; [legis] silently not applying ({exc})")
+    table = data.get("legis")
+    if table is not None and not isinstance(table, dict):
+        return DoctorCheck(cid, "error", message="[legis] in weft.toml must be a table")
+    return DoctorCheck(cid, "ok")
+
+
+def _nearest_existing(path: Path) -> Path:
+    p = path
+    while not p.exists() and p != p.parent:
+        p = p.parent
+    return p
+
+
+def check_store_dir(root: Path, *, repair: bool = False) -> DoctorCheck:
+    """An absent .weft/legis/ is ok (created lazily). A present-but-unwritable
+    dir is an error. --repair ensures the dir exists (explicit operator action)."""
+    cid = "store.dir"
+    from legis import config
+
+    # Resolve the configured store dir once; anchor relative paths at root.
+    configured = config._store_dir()
+    store_dir = configured if configured.is_absolute() else root / configured
+    if store_dir.exists():
+        if not os.access(store_dir, os.W_OK):
+            return DoctorCheck(cid, "error", message=f"{store_dir} not writable")
+        return DoctorCheck(cid, "ok")
+    if repair:
+        try:
+            store_dir.mkdir(parents=True, exist_ok=True)
+            return DoctorCheck(cid, "ok", fixed=True)
+        except OSError as exc:
+            return DoctorCheck(cid, "error", message=f"cannot create {store_dir}: {exc}")
+    anchor = _nearest_existing(store_dir)
+    if not os.access(anchor, os.W_OK):
+        return DoctorCheck(cid, "error", message=f"{store_dir} not creatable ({anchor} not writable)")
+    return DoctorCheck(cid, "ok", message="absent (created on first store open)")
+
+
+def check_db_overrides(root: Path) -> DoctorCheck:
+    cid = "store.db_overrides"
+    bad = []
+    for env in _DB_OVERRIDE_ENVS:
+        val = os.environ.get(env)
+        if not val:
+            continue
+        try:
+            make_url(val)
+        except Exception:  # noqa: BLE001 — any parse failure is a bad override
+            bad.append(env)
+    if bad:
+        return DoctorCheck(cid, "error", message="invalid URL in: " + ", ".join(bad))
+    return DoctorCheck(cid, "ok")
+
+
+def check_legacy_stray_db(root: Path) -> DoctorCheck:
+    cid = "store.legacy_stray"
+    stray = [n for n in _LEGACY_DB_NAMES if (root / n).is_file()]
+    if stray:
+        return DoctorCheck(cid, "warn", message="legacy DB at repo root (move to .weft/legis/): " + ", ".join(stray))
+    return DoctorCheck(cid, "ok")
+```
+
+Register in `collect_checks`:
+
+```python
+    checks.append(check_weft_toml(root))
+    checks.append(check_store_dir(root, repair=repair))
+    checks.append(check_db_overrides(root))
+    checks.append(check_legacy_stray_db(root))
+```
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `uv run pytest tests/test_doctor.py -k "weft_toml or store_dir or db_override or legacy" -v`
+Expected: PASS.
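The `store.dir` logic in Step 3 treats an absent store directory as healthy only when it could still be created, which it decides by walking up to the nearest existing ancestor and testing that for write access. A minimal standalone sketch of that walk, using only the standard library (the name `nearest_existing` and the temporary layout are illustrative stand-ins, not legis code):

```python
import tempfile
from pathlib import Path


def nearest_existing(path: Path) -> Path:
    """Walk upward until a path that exists is found (or the filesystem root)."""
    p = path
    while not p.exists() and p != p.parent:
        p = p.parent
    return p


# Illustrative layout: only the temp dir exists; the nested store dir does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    missing_store = root / ".weft" / "legis"
    anchor_is_root = nearest_existing(missing_store) == root
    # Once an intermediate dir exists, the walk stops there instead.
    (root / ".weft").mkdir()
    stops_at_weft = nearest_existing(missing_store) == root / ".weft"
```

The write-access test is then applied to whatever this walk returns, so a read-only ancestor surfaces as an error before any store is opened.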
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add src/legis/doctor.py tests/test_doctor.py
+git commit -m "feat(doctor): config & store checks (weft.toml report-only, store dir, db overrides, legacy)"
+```
+
+---
+
+## Task 8: doctor governance integrity + runtime/sibling checks
+
+**Files:**
+- Modify: `src/legis/doctor.py`
+- Test: `tests/test_doctor.py`
+
+- [ ] **Step 1: Write the failing test**
+
+```python
+# add to tests/test_doctor.py
+from legis.doctor import check_audit_chain, check_hmac_key, check_sibling_url
+
+
+def test_audit_chain_absent_db_is_ok(tmp_path):
+    c = check_audit_chain("store.governance_chain", "sqlite:///" + str(tmp_path / "nope.db"))
+    assert c.status == "ok"
+
+
+def test_audit_chain_intact_db_is_ok(tmp_path):
+    from legis.store.audit_store import AuditStore
+    url = "sqlite:///" + str(tmp_path / "gov.db")
+    AuditStore(url)  # creates schema
+    assert check_audit_chain("store.governance_chain", url).status == "ok"
+
+
+def test_hmac_key_warn_when_protected_set_without_key(tmp_path, monkeypatch):
+    monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read")
+    monkeypatch.delenv("LEGIS_HMAC_KEY", raising=False)
+    c = check_hmac_key(tmp_path)
+    assert c.status == "warn"
+
+
+def test_hmac_key_never_prints_value(tmp_path, monkeypatch):
+    monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read")
+    monkeypatch.setenv("LEGIS_HMAC_KEY", "super-secret-value")
+    c = check_hmac_key(tmp_path)
+    assert c.status == "ok"
+    assert "super-secret-value" not in (c.message or "")
+
+
+def test_sibling_url_invalid_is_error(tmp_path, monkeypatch):
+    monkeypatch.setenv("LOOMWEAVE_API_URL", "localhost:9620")  # no scheme
+    c = check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL")
+    assert c.status == "error"
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `uv run pytest tests/test_doctor.py -k "audit_chain or hmac_key or sibling_url" -v`
+Expected: FAIL — functions undefined.
+
+- [ ] **Step 3: Write minimal implementation**
+
+```python
+# add to src/legis/doctor.py
+from urllib.parse import urlsplit
+
+
+def check_audit_chain(cid: str, url: str) -> DoctorCheck:
+    """Report-only. Absent file store => ok (nothing to verify). A tampered
+    chain => error (a hash chain cannot/must not be auto-repaired)."""
+    try:
+        parsed = make_url(url)
+    except Exception:  # noqa: BLE001
+        return DoctorCheck(cid, "ok", message="store URL not a file store")
+    db = parsed.database
+    if not str(parsed.drivername).startswith("sqlite") or not db or db == ":memory:":
+        return DoctorCheck(cid, "ok", message="not a file store")
+    if not Path(db).exists():
+        return DoctorCheck(cid, "ok", message="no store yet")
+    from legis.store.audit_store import AuditStore
+
+    try:
+        intact = AuditStore(url).verify_integrity()
+    except Exception as exc:  # noqa: BLE001 — surface any verify failure, never raise from doctor
+        return DoctorCheck(cid, "error", message=f"integrity check failed: {exc}")
+    return DoctorCheck(cid, "ok") if intact else DoctorCheck(cid, "error", message="hash chain verification FAILED (report-only; cannot repair)")
+
+
+def check_hmac_key(root: Path) -> DoctorCheck:
+    """Presence-only; NEVER renders the key value."""
+    cid = "runtime.hmac_key"
+    from legis import config
+
+    if not config.protected_policies():
+        return DoctorCheck(cid, "ok", message="no protected policies configured")
+    if os.environ.get("LEGIS_HMAC_KEY"):
+        return DoctorCheck(cid, "ok")
+    return DoctorCheck(cid, "warn", message="protected policies configured but LEGIS_HMAC_KEY not set; protected submissions will fail")
+
+
+def check_sibling_url(cid: str, env: str) -> DoctorCheck:
+    url = os.environ.get(env)
+    if not url:
+        return DoctorCheck(cid, "ok", message="not configured")
+    parsed = urlsplit(url)
+    if parsed.scheme.lower() in {"http", "https"} and parsed.netloc:
+        return DoctorCheck(cid, "ok")
+    return DoctorCheck(cid, "error", message=f"{env} invalid URL: {url!r}")
+```
+
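The sibling-URL rule in `check_sibling_url` accepts a value only when it has an `http`/`https` scheme and a non-empty network location. The sketch below isolates that predicate to show why the test's `localhost:9620` value is rejected; `is_http_url` is a hypothetical helper name used for illustration only.

```python
from urllib.parse import urlsplit


def is_http_url(value: str) -> bool:
    """True only for http(s) URLs that carry a host part."""
    parsed = urlsplit(value)
    return parsed.scheme.lower() in {"http", "https"} and bool(parsed.netloc)


# A bare host:port has no "//" authority section, so netloc is empty and the
# value is rejected; adding the scheme makes the same host acceptable.
bare = is_http_url("localhost:9620")
full = is_http_url("http://localhost:9620")
wrong_scheme = is_http_url("ftp://example.org/repo")
```

Checking `netloc` as well as the scheme is what rejects values such as `http:localhost`, which have a valid scheme but no host.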
+Register in `collect_checks`:
+
+```python
+    from legis import config
+
+    checks.append(check_audit_chain("store.governance_chain", config.governance_db_url()))
+    checks.append(check_audit_chain("store.binding_chain", config.binding_db_url()))
+    checks.append(check_hmac_key(root))
+    checks.append(check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL"))
+    checks.append(check_sibling_url("runtime.filigree_url", "FILIGREE_API_URL"))
+```
+
+Note: `config.governance_db_url()` / `binding_db_url()` resolve cwd-relative URLs, so they are only correct when `root` is the cwd. `collect_checks` must resolve the store URLs relative to `root` instead. Rather than `os.chdir` into `root`, compute the path directly in a small helper that joins `config._store_dir()` to `root`; this also avoids cwd coupling in tests:
+
+```python
+def _store_url(root: Path, db_name: str, env: str) -> str:
+    val = os.environ.get(env)
+    if val:
+        return val
+    from legis import config
+
+    store_dir = config._store_dir()
+    base = store_dir if store_dir.is_absolute() else (root / store_dir)
+    return "sqlite:///" + (base / db_name).as_posix()
+```
+
+and call it in place of the two `check_audit_chain` lines above:
+
+```python
+    checks.append(check_audit_chain("store.governance_chain", _store_url(root, "legis-governance.db", "LEGIS_GOVERNANCE_DB")))
+    checks.append(check_audit_chain("store.binding_chain", _store_url(root, "legis-binding.db", "LEGIS_BINDING_DB")))
+```
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `uv run pytest tests/test_doctor.py -k "audit_chain or hmac_key or sibling_url" -v`
+Expected: PASS.
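The resolution rule behind `_store_url` in Step 3 can be exercised without importing legis. The sketch below is a hypothetical stand-in: it takes the store dir and the override as plain arguments instead of reading `config._store_dir()` and the environment, and uses `PurePosixPath` so the result is the same on any platform.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_store_url(
    root: PurePosixPath,
    store_dir: PurePosixPath,
    db_name: str,
    override: Optional[str],
) -> str:
    """Hypothetical stand-in for `_store_url`: an explicit override wins,
    otherwise the SQLite file lives under the store dir anchored at `root`."""
    if override:
        return override
    base = store_dir if store_dir.is_absolute() else root / store_dir
    return "sqlite:///" + (base / db_name).as_posix()


root = PurePosixPath("/srv/project")
relative = resolve_store_url(root, PurePosixPath(".weft/legis"), "legis-governance.db", None)
absolute = resolve_store_url(root, PurePosixPath("/var/lib/legis"), "legis-binding.db", None)
overridden = resolve_store_url(root, PurePosixPath(".weft/legis"), "legis-binding.db", "sqlite:///:memory:")
```

For an absolute POSIX path the result carries four slashes (`sqlite:////srv/...`), which is the SQLAlchemy form for an absolute SQLite file path.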
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add src/legis/doctor.py tests/test_doctor.py
+git commit -m "feat(doctor): governance-chain integrity + runtime/sibling checks"
+```
+
+---
+
+## Task 9: end-to-end `--repair` re-check + JSON regression test
+
+**Files:**
+- Test: `tests/test_doctor.py` (no new logic — repairs already run inside checks; this proves the whole pipeline)
+
+- [ ] **Step 1: Write the failing test**
+
+```python
+# add to tests/test_doctor.py
+def test_repair_makes_fresh_project_healthy(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    # First run: unhealthy (no install artifacts, no .mcp.json).
+    assert run_doctor(tmp_path, repair=False, fmt="text") == 1
+    # Repair run: install-wiring + .mcp.json get fixed; re-check is healthy.
+    assert run_doctor(tmp_path, repair=True, fmt="text") == 0
+    # Third run, no repair: stays healthy.
+    assert run_doctor(tmp_path, repair=False, fmt="text") == 0
+
+
+def test_repair_never_writes_weft_toml(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    (tmp_path / "weft.toml").write_text("[legis]\nstore_dir = \n")  # malformed
+    before = (tmp_path / "weft.toml").read_text()
+    run_doctor(tmp_path, repair=True, fmt="json")
+    assert (tmp_path / "weft.toml").read_text() == before
+
+
+def test_json_output_has_no_secret(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read")
+    monkeypatch.setenv("LEGIS_HMAC_KEY", "TOP-SECRET")
+    import io, contextlib
+    buf = io.StringIO()
+    with contextlib.redirect_stdout(buf):
+        run_doctor(tmp_path, repair=False, fmt="json")
+    assert "TOP-SECRET" not in buf.getvalue()
+```
+
+- [ ] **Step 2: Run test to verify it fails or passes**
+
+Run: `uv run pytest tests/test_doctor.py -k "repair_makes or never_writes or no_secret" -v`
+Expected: PASS if Tasks 5–8 are wired correctly. If `test_repair_makes_fresh_project_healthy` fails, the offending check's `repair=True` branch isn't reaching `ok` — fix that check, not this test.
+
+- [ ] **Step 3: (only if a test failed) fix the implicated check**
+
+No new code if green. If red, the failing check is reported by name in the assertion — return to that check's task and correct its repair branch.
+
+- [ ] **Step 4: Run the full doctor test file**
+
+Run: `uv run pytest tests/test_doctor.py -v`
+Expected: PASS (all).
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add tests/test_doctor.py
+git commit -m "test(doctor): end-to-end repair pipeline + weft.toml/secret invariants"
+```
+
+---
+
+## Task 10: docs, coverage floor, and full gate run
+
+**Files:**
+- Modify: `CHANGELOG.md`, `README.md`
+- Modify: `scripts/check_coverage_floors.py` (only if it lists modules explicitly)
+
+- [ ] **Step 1: Update CHANGELOG and README**
+
+Add to `CHANGELOG.md` under the unreleased/rc4 section:
+
+```markdown
+### Added
+- `legis doctor [--root] [--repair] [--format text|json]` — operator health view
+  and safe repair for the install + config layer (instruction blocks, skills,
+  SessionStart hook, `.gitignore`, `.mcp.json` registration, store dir, audit
+  hash-chain integrity, key/sibling wiring). Report-only on `weft.toml` (C-9(b))
+  and on hash chains; key values are never rendered.
+- `legis install --mcp` — register the legis MCP server in `.mcp.json`
+  (also part of `legis install` with no flags).
+```
+
+In `README.md`, under the surfaces/commands section, add a `legis doctor` line mirroring the existing `legis install` description.
+
+- [ ] **Step 2: Run the full test suite + lint + types**
+
+Run:
+```bash
+uv run ruff check src
+uv run mypy src/legis
+uv run pytest -q
+```
+Expected: ruff clean, mypy clean, all tests pass.
+
+- [ ] **Step 3: Run coverage floors**
+
+Run: `uv run pytest --cov=legis --cov-report=term-missing && uv run python scripts/check_coverage_floors.py`
+Expected: floors hold. If `check_coverage_floors.py` enumerates packages and `doctor.py` is top-level (not in a covered package dir), confirm it falls under the global floor; if the script needs a per-module entry, add one a few points below the achieved coverage.
+
+- [ ] **Step 4: Manual smoke test**
+
+Run:
+```bash
+cd /tmp && rm -rf doctortest && mkdir doctortest && cd doctortest
+legis doctor               # expect: several errors (fresh dir), exit 1
+legis doctor --repair      # expect: install wiring + .mcp.json fixed, exit 0
+legis doctor --format json # expect: {"ok": true, ...}
+```
+Expected: matches the comments.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add CHANGELOG.md README.md scripts/check_coverage_floors.py
+git commit -m "docs(doctor): changelog + readme for legis doctor; coverage floor"
+```
+
+---
+
+## Self-Review notes (for the implementer)
+
+- **Spec coverage:** install wiring (T6) ✓, `.mcp.json` install+check (T4/T5) ✓, config & stores (T7) ✓, governance integrity (T8) ✓, runtime & siblings (T8) ✓, `--repair` model (repairs live inside checks; T9 proves it) ✓, JSON shape + exit codes (T1/T2/T9) ✓, weft.toml never-written invariant (T7/T9) ✓, key-value-never-shown invariant (T8/T9) ✓.
+- **C-9(b) guard** is asserted by `test_weft_toml_malformed_is_error_and_unchanged` and `test_repair_never_writes_weft_toml`.
+- **No-leak guard:** `check_audit_chain` constructs `AuditStore` only when the DB file already exists; `check_store_dir` creates `.weft/legis/` only under `--repair`.
+- **Verify reused private symbols exist** before Task 6/8 (`SESSION_CONTEXT_COMMAND`, `SKILL_NAME`, `_has_unscoped_session_start_hook`, `_LEGIS_IGNORE_RULES`, `_extract_marker_token`, `_marker_token`). If any name differs, adjust the call site — do not duplicate the logic.
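The last self-review note asks the implementer to confirm that the reused private symbols exist before wiring them in. One way to do that mechanically is sketched below; `missing_symbols` is a hypothetical helper, and the stand-in module with its placeholder values exists only so the example runs without legis installed (in practice the module would be `legis.install`).

```python
import types


def missing_symbols(module: types.ModuleType, names: list[str]) -> list[str]:
    """Return the names that `module` does not define, preserving order."""
    return [name for name in names if not hasattr(module, name)]


# Stand-in module for illustration; the attribute values are placeholders.
fake_install = types.ModuleType("fake_install")
fake_install.SKILL_NAME = "placeholder-skill"
fake_install.SESSION_CONTEXT_COMMAND = "placeholder command"

required = ["SESSION_CONTEXT_COMMAND", "SKILL_NAME", "_LEGIS_IGNORE_RULES"]
missing = missing_symbols(fake_install, required)
```

An empty result means every call site can reference the real name; a non-empty one lists exactly which call sites need adjusting.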
diff --git a/docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md b/docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md
new file mode 100644
index 0000000..2d6a3d5
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md
@@ -0,0 +1,197 @@
+# Legis instruction injection — design spec
+
+**Date:** 2026-06-06
+**Status:** Approved for implementation (ultracode)
+**Author:** John Morrissey (with Claude)
+
+## Goal
+
+Make legis "stand itself up" the way its siblings do: a coding agent that opens
+a legis project finds an **agent-calibrated orientation block** in
+`CLAUDE.md` / `AGENTS.md` plus a `legis-workflow` **skill pack**, and that
+content stays **automatically fresh** (versioned content hash; re-injected on
+drift) for **both** Claude Code and Codex agents.
+
+This mirrors Filigree's proven mechanism
+(`filigree/src/filigree/install.py`, `hooks.py`,
+`install_support/hooks.py`) and adopts Loomweave's skill-tree fingerprint
+drift detection — with one improvement over both siblings: refresh also fires
+on **MCP server boot**, closing the "Codex-only repo never refreshes" gap.
+
+## Doctrine anchor
+
+From `README.md`: *"Each tool stands itself up preloaded with agent-calibrated
+instructions — the instruction layer is the configuration mechanism."* and
+*"Agent-first: humans on the loop, not in the loop."* This feature is the legis
+realization of that instruction layer.
+
+## Architecture
+
+### Two-tier content (best practice: lean block + skill pack)
+
+1. **Lean orientation block** (~20 lines) injected into `CLAUDE.md` / `AGENTS.md`.
+   - States what legis is (the git/CI + governance layer of Weft), how to reach
+     it (`mcp__legis__*` tools when present; `legis` CLI fallback), and points to
+     the `legis-workflow` skill for the full reference.
+   - Delimited by versioned markers:
+     - open: ``
+     - close: ``
+   - `{version}` = `importlib.metadata.version("legis")` → falls back to
+     `legis.__version__` (currently `1.0.0rc4`).
+   - `{hash}` = first 8 hex chars of `sha256(block_body_text)`.
+   - **Freshness compares the full `v{version}:{hash}` token**, so a body edit
+     (hash drift) *or* a package-version bump both trigger re-injection and keep
+     the marker truthful. (Filigree compares hash-only; legis compares both so
+     "automatic versioning" actually tracks the version.)
+
+2. **`legis-workflow` skill pack** carrying the depth: CLI command reference,
+   MCP tool catalogue, error-code/recovery table, workflow patterns. Shipped as
+   package data; installed into `.claude/skills/legis-workflow/` and
+   `.agents/skills/legis-workflow/` (Codex). Drift-detected via a skill-tree
+   fingerprint (sorted relative POSIX path + bytes, sha256[:8]).
+
+### Refresh triggers (two — full coverage)
+
+- **Claude Code SessionStart hook** (`legis session-context`) registered in
+  `.claude/settings.json`. Refreshes block + skill drift when Claude opens the
+  repo.
+- **`legis mcp` startup** — best-effort `refresh_instructions(cwd)` invoked from
+  the CLI `mcp` branch before the stdio loop starts. This is the **load-bearing
+  trigger for Codex-only repos** (no `.claude/` hook). Idempotent: writes only
+  when the embedded hash differs, so no git churn in steady state. All failures
+  are swallowed — the refresh must never block or crash the MCP server.
+
+Both triggers call the same `refresh_instructions(root)`. Refresh **only updates
+files/skills that already carry the marker** (drift refresh in place). Initial
+**creation** is the job of `legis install` — an MCP boot or hook never
+surprise-creates `CLAUDE.md`. (Matches Filigree's freshness semantics.)
+
+## Components
+
+### `src/legis/data/instructions.md`
+The lean block body (no markers — markers are added programmatically). Content:
+what legis is, `mcp__legis__*` + CLI fallback, the six CLI subcommands, and a
+pointer to the `legis-workflow` skill.
+
+### `src/legis/data/skills/legis-workflow/SKILL.md`
+Skill pack with YAML frontmatter (`name: legis-workflow`, a `description:` that
+triggers on governance/override/policy-cell/CI-gate/git-rename/closure-gate
+tasks). Body documents:
+- CLI: `serve`, `mcp`, `check-override-rate`, `governance-gate`,
+  `sei-backfill`, `policy-boundary-check`.
+- MCP tools: `policy_explain`, `override_submit`, `signoff_status_get`,
+  `policy_evaluate`, `scan_route`, `git_branch_list`, `git_commit_get`,
+  `git_rename_list`, `git_rename_feed_get`, `filigree_closure_gate_get`,
+  `pull_request_get`, `check_list`, `override_rate_get`.
+- Error codes / recovery (sourced from `legis/mcp.py` `_recovery_for`).
+
+### `src/legis/install.py`
+Mirrors Filigree's injection core, right-sized (no dashboard, no server mode):
+- `INSTRUCTIONS_MARKER = "`) to the current
+  version+hash, re-inject on mismatch; for each installed skill root, compare
+  tree fingerprint to source and reinstall on mismatch. Returns human-readable
+  update messages. `root` defaults to the caller's cwd; the MCP-boot caller
+  passes `Path.cwd()` and accepts that a non-project cwd simply no-ops (refresh
+  only ever touches marker-bearing files).
+  Best-effort: callers guard against `OSError`/`UnicodeDecodeError`/`ValueError`.
+- `generate_session_context() -> str | None`: run `refresh_instructions(cwd)`;
+  return the joined update messages, or `None` when nothing changed (silent —
+  no governance snapshot, no DB dependency).
+
+### `src/legis/cli.py`
+- `legis install` subcommand: flags `--claude-md`, `--agents-md`, `--skills`,
+  `--codex-skills`, `--hooks`, `--gitignore`; no flags ⇒ all. Steps: inject
+  `CLAUDE.md`, inject `AGENTS.md`, install skills, install codex skills, install
+  hooks, ensure gitignore. Print a per-step result table.
+- `legis session-context` subcommand: prints `generate_session_context()` (or
+  nothing) and exits 0.
+- In the existing `mcp` branch: call `refresh_instructions(Path.cwd())` inside a
+  broad `try/except` (swallow all) **before** `mcp_main(...)`.
+
+### `pyproject.toml`
+Ensure `src/legis/data/**` (the `instructions.md` and the skill tree) ships in
+the wheel/sdist under `uv_build`. Verify via
+`importlib.resources.files("legis.data")` at test time.
+
+### `.gitignore`
+Extend the existing `# Legis —` stanza so it also ignores the (prophylactic,
+sibling-consistent) local config surface:
+```
+# Legis — local audit/scratch databases + their SQLite WAL sidecars
+# and local working dir / config (regenerated/local; never commit)
+*.db
+*.db-shm
+*.db-wal
+.legis/
+legis.yaml
+```
+
+## Out of scope (YAGNI)
+
+- Dashboard / ephemeral-port / server-mode machinery (legis has none).
+- A PreToolUse hook (no dashboard to restart).
+- A Codex-native hook (the MCP-boot refresh supersedes it).
+- Changing how `CLAUDE.md`/`AGENTS.md` are tracked — they remain gitignored
+  regenerated artifacts; the legis block coexists with whatever else regenerates
+  them.
+
+## Testing
+
+Mirror Filigree/Loomweave coverage (repo floor: 88%):
+- `inject_instructions`: create / append / replace / malformed (missing end
+  marker) / idempotent re-run.
+- `_instructions_hash` stable; `_build_instructions_block` marker shape;
+  marker-hash regex extraction.
+- `_skill_tree_fingerprint` changes on content/path change; `refresh_instructions`
+  updates a drifted `CLAUDE.md` **and** `AGENTS.md` and a drifted skill pack;
+  no-ops (returns `[]`) when fresh; skips files without the marker.
+- `install_claude_code_hooks`: fresh install, idempotent re-run, bare→absolute
+  upgrade, malformed `settings.json` backup, does not duplicate, reuses only
+  unscoped blocks.
+- `ensure_gitignore`: adds `.legis/`/`legis.yaml`, idempotent, preserves
+  existing content.
+- `_atomic_write_text`: preserves existing file mode; new file respects umask;
+  rejects symlink target.
+- CLI: `legis install` (all + each selective flag) writes expected artifacts;
+  `legis session-context` prints refresh messages / nothing; `mcp` branch
+  refresh is best-effort (a raising `refresh_instructions` does not break
+  `mcp` startup).
+- Packaging: `importlib.resources.files("legis.data")` resolves the template and
+  skill tree.
+
+## Gates
+
+`ruff`, `mypy` (py312, the repo's strict config), `pytest` with the 88% floor,
+all green before done.
diff --git a/docs/superpowers/specs/2026-06-07-legis-doctor-design.md b/docs/superpowers/specs/2026-06-07-legis-doctor-design.md
new file mode 100644
index 0000000..81819bf
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-07-legis-doctor-design.md
@@ -0,0 +1,249 @@
+# Legis doctor — design spec
+
+**Date:** 2026-06-07
+**Status:** Approved for implementation
+**Author:** John Morrissey (with Claude)
+
+## Goal
+
+Give legis a `legis doctor` command that **views and repairs install/config
+problems**, the way its siblings do (`wardline doctor`,
+`filigree`'s `install_support/doctor.py`; loomweave has none). One command
+answers *"is my legis wiring healthy, and if not, fix what's safe to fix."*
+
+Two distinct gaps motivate it:
+
+1. **No affirmative health view.** legis already self-heals install drift on
+   `SessionStart` / MCP boot (`hooks.refresh_instructions`), but that path is
+   silent on success — `session-context` prints nothing whether everything is
+   current or nothing was checked. There is no way to *ask* "is this healthy?"
+   and get an affirmative answer.
+2. **No coverage of the config/store layer.** The install path checks
+   instruction blocks / skills / hook / `.gitignore`, but nothing checks
+   `weft.toml` parseability, the `.weft/legis/` stores, audit-chain integrity,
+   the `.mcp.json` server registration, or key/sibling-URL wiring.
+
+These were surfaced concretely while scoping this work (see **Worked examples**):
+legis was absent from `.mcp.json` entirely, and `session-context` returned
+nothing — both real, both exactly what doctor should catch.
+
+## Doctrine anchors
+
+- **C-9(a) — per-member subtree.** Each member is the **sole writer of its own
+  `.weft//` subtree** and never reads/writes a sibling's. doctor may
+  create/repair `.weft/legis/`.
+- **C-9(b) — `weft.toml` is operator-write-only; `doctor` is named.** *"No
+  member's installer / CLI / `doctor` writes or rewrites `weft.toml`."* Precedent
+  is the multi-writer truncation gate `weft-eb3dee402f`. **`legis doctor` is
+  fully report-only on `weft.toml` — it does not even scaffold an absent
+  `[legis]` table.** Matches `wardline doctor` ("never weft.toml, never a
+  sibling's").
+- **C-9(c) — malformed = absent (silent fallback) at runtime.** A
+  malformed/unreadable `weft.toml` must still boot on defaults. doctor's job is
+  to **restore the operator signal** that runtime silences: it reports
+  malformed `weft.toml` as an **error** (your config is silently not applying)
+  — a diagnostic, never a write.
+- **Capability honesty / key carve-out.** Operator signing keys are
+  capability-confined and not agent-reachable (`config.py`). doctor
+  **presence-checks** keys only — it never prints, logs, or writes a key value.
+  legis operator keys are held securely (a Rust key sidecar is planned);
+  filigree's auto-generated federation comms key is a separate concern.
+- **Agent-first, humans on the loop.** doctor is an **operator/CLI** tool. It
+  inspects and repairs the *host* install and operator files, which is not an
+  agent-reachable concern, so it is **not** added to the legis MCP tool surface
+  or the transport-agnostic `service/` decision layer.
+
+## Architecture
+
+A single new module plus thin CLI wiring and one install capability —
+mirroring `wardline/install/doctor.py` and matching legis's flat-module style
+(`config.py`, `install.py`, `hooks.py`).
+
+- **`src/legis/doctor.py`** — the logic. A `DoctorCheck` dataclass, one function
+  per check, a `run_doctor(root, *, repair, fmt) -> int` orchestrator, and
+  `machine_readable_doctor(root, *, repair) -> dict` for the JSON shape.
+- **`src/legis/cli.py`** — a `doctor` subparser and a thin `_run_doctor`
+  dispatcher (I/O shell + exit code only; same pattern as `_check_override_rate`).
+- **`src/legis/install.py`** — a new `register_mcp_json(project_root) ->
+  tuple[bool, str]` (and a matching `--mcp` install flag, included in
+  install-all), so the `.mcp.json` check has a repair capability to call. This
+  closes the asymmetry where `wardline install` registers `.mcp.json` but
+  `legis install` did not.
+
+**Reuse (no logic duplication):**
+- `install.py`: `INSTRUCTIONS_MARKER`, `_extract_marker_token`, `_marker_token`,
+  `_skill_tree_fingerprint`, `_get_skills_source_dir`, `inject_instructions`,
+  `install_skills`, `install_codex_skills`, `install_claude_code_hooks`,
+  `ensure_gitignore`, and the new `register_mcp_json`.
+- `config.py`: `project_root`, `_weft_legis_config`, `_store_dir`,
+  `*_db_url`, `protected_policies`, `ensure_sqlite_parent`.
+- `store/audit_store.py`: `verify_integrity`.
+
+### `DoctorCheck`
+
+```python
+@dataclass(frozen=True, slots=True)
+class DoctorCheck:
+    id: str  # stable, e.g. "install.mcp_json", "store.governance_chain"
+    status: str  # "ok" | "warn" | "error"
+    fixed: bool = False  # True if --repair changed state from not-ok to ok
+    message: str | None = None
+
+    @property
+    def ok(self) -> bool:
+        return self.status == "ok"
+```
+
+`warn` is non-fatal (does not affect exit code); `error` is fatal (exit 1).
+
+## Surface
+
+```
+legis doctor [--root .] [--repair] [--format {text,json}]
+```
+
+- **default** — report-only, human text. Exit `0` if no `error` checks, else `1`.
+- **`--repair`** — apply safe repairs (see model below), **re-check**, then
+  report the post-repair state.
+- **`--format json`** — emit the federation machine-readable shape:
+  `{"ok": bool, "checks": [DoctorCheck.to_dict()...], "next_actions": [str...]}`.
+  `next_actions` lists `"{id}: {message}"` for each non-ok check with a message.
+
+`--format` (not wardline's `--fix`) is deliberate: it matches legis's *own*
+existing `policy-boundary-check --format {text,json}` convention. `--repair` and
+`--format` are orthogonal (you can `--repair --format json`). Exit `2` on usage
+error.
+
+## Checks
+
+### Install wiring (repairable)
+- `install.claude_md` — CLAUDE.md instruction block present and **not drifted**
+  (marker token = current `version:hash`).
+- `install.agents_md` — AGENTS.md block present and not drifted.
+- `install.claude_skill` — `.claude` skill pack present, tree fingerprint fresh.
+- `install.agents_skill` — `.agents` (Codex) skill pack present, fingerprint fresh.
+- `install.hook` — Claude Code `SessionStart` hook registered.
+- `install.gitignore` — legis `.gitignore` rules present.
+- `install.mcp_json` — `.mcp.json` has a usable `legis` server entry: present,
+  args invoke `mcp`, and `command` resolves to an existing executable. Deliberately
+  NOT byte-canonical — a valid but differently-resolved legis binary (uv-tool vs
+  venv path) must not read as drift; only a missing entry, malformed args, or a
+  dead `command` path is stale. `--repair` writes the canonical entry via
+  `register_mcp_json` (resolved binary at repair time).
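The `install.mcp_json` rule above (entry present, args invoke `mcp`, `command` resolves) can be sketched as a pure function over the parsed `.mcp.json` contents. This is a hypothetical illustration, not the planned implementation: the helper name, the assumption that `mcp` is the first argument, and the acceptance of bare command names found on `PATH` are all choices made for the example.

```python
import os
import shutil
import sys


def legis_entry_usable(mcp_config: dict) -> bool:
    """Hypothetical check: entry present, args invoke `mcp`, command resolves."""
    servers = mcp_config.get("mcpServers")
    entry = servers.get("legis") if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return False
    args = entry.get("args")
    # Assumption for this sketch: the subcommand is the first argument.
    if not isinstance(args, list) or not args or args[0] != "mcp":
        return False
    command = entry.get("command")
    if not isinstance(command, str) or not command:
        return False
    # Accept either an absolute path to an executable or a name found on PATH.
    if os.path.isabs(command):
        return os.path.isfile(command) and os.access(command, os.X_OK)
    return shutil.which(command) is not None


# sys.executable stands in for a resolved legis binary in this illustration.
good = {"mcpServers": {"legis": {"command": sys.executable, "args": ["mcp", "--agent-id", "claude-code"]}}}
dead = {"mcpServers": {"legis": {"command": "/nonexistent/bin/legis", "args": ["mcp"]}}}
wrong_args = {"mcpServers": {"legis": {"command": sys.executable, "args": ["serve"]}}}
```

Because the predicate never compares the entry against a canonical byte form, two different but valid binary paths both count as healthy, which is the non-drift behaviour the bullet describes.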
+ +### Config & stores +- `config.weft_toml` — **report-only.** ABSENT → `ok` (defaults intentional); + PRESENT-and-`[legis]`-valid → `ok`; PRESENT-but-unparseable, or `[legis]` not a + table → `error` ("weft.toml present but malformed; legis is booting on + defaults and your `[legis]` config is silently not applying"). +- `store.dir` — the resolved `store_dir` is usable: its parent is writable so + stores can be created. An **absent** `.weft/legis/` is `ok` (created lazily on + first store open — preserves the import-time no-leak guarantee + `test_build_runtime_initialize_does_not_create_local_state`); a + **present-but-unwritable** dir is `error`. `--repair` ensures the dir exists as + a convenience — an explicit operator action, categorically distinct from the + import-time no-leak guarantee (C-9(a)). +- `store.db_overrides` — any set `LEGIS_*_DB` env var is a well-formed URL. + Report-only. +- `store.legacy_stray` — legacy `legis-*.db` at the repo root → `warn` + (informational; never deleted — operator data). + +### Governance integrity (report-only) +- `store.governance_chain` — `AuditStore(governance_db_url()).verify_integrity()`. + Absent DB → `ok` (nothing to verify, not an error). Tamper/broken chain → + `error` (report-only; a hash chain cannot and must not be auto-repaired). +- `store.binding_chain` — same for the binding ledger. + +### Runtime & siblings (report-only) +- `runtime.hmac_key` — if `LEGIS_PROTECTED_POLICIES` is non-empty (protected / + structured cells configured) but no signing key is available → `warn` + ("protected policies configured but no signing key; protected submissions + will fail"). **Presence only; the value is never read out or shown.** +- `runtime.loomweave_url` / `runtime.filigree_url` — if set, well-formed + http(s) URL; unset → `ok` ("not configured"). Report-only. + +## Repair model + +`--repair` mutates **only legis's own per-member artifacts**: + +| Artifact | Repaired? 
| How | +|---|---|---| +| CLAUDE.md / AGENTS.md blocks | ✅ | `inject_instructions` (idempotent, drift-aware) | +| `.claude` / `.agents` skills | ✅ | `install_skills` / `install_codex_skills` | +| SessionStart hook | ✅ | `install_claude_code_hooks` | +| `.gitignore` | ✅ | `ensure_gitignore` | +| `.mcp.json` legis entry | ✅ | `register_mcp_json` (new) | +| `.weft/legis/` dir | ✅ | `ensure_sqlite_parent` / `mkdir` | +| `weft.toml` | ❌ never | C-9(b) — report-only, even when absent | +| Audit hash chains | ❌ never | tamper-evidence; report-only | +| Keys, sibling URLs | ❌ never | secrets/values; report-only with guidance | + +After repair, every check is **re-run** so the report reflects true post-repair +state and `fixed=True` is set only where a not-ok check became ok. + +## `.mcp.json` registration (new install capability) + +`register_mcp_json(project_root)` adds/updates a `legis` entry under +`mcpServers` in `/.mcp.json` (creating the file if absent), merging +without disturbing sibling entries. The canonical entry: + +```json +"legis": { + "args": ["mcp", "--agent-id", ""], + "command": "", + "env": {}, + "type": "stdio" +} +``` + +- **Binary resolution** reuses the same logic as the hook installer + (`install._find_legis_command`) so the entry points at the real `legis`. +- **Agent id**: `legis mcp` requires `--agent-id` (it stamps the governance + actor). Default `"claude-code"`; overridable via a `--agent-id` option on + `legis install --mcp` (and `legis doctor --repair` uses the default unless an + existing entry already carries one, which it preserves). +- Wired into `legis install` as `--mcp` and included in install-all. + +## What doctor does NOT do + +- Never writes `weft.toml` (C-9(b)). +- Never repairs a hash chain (tamper-evidence is the point). +- Never prints, logs, or writes a key value. +- Never deletes operator data (legacy stray DBs are warned, not removed). +- Not exposed on the agent MCP surface or the `service/` layer. 
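The merge behaviour described under `.mcp.json` registration (create the file if absent, leave sibling servers alone, preserve an existing agent id unless one is passed) can be sketched as below. This is an illustration under stated assumptions, not the real `register_mcp_json`: `merge_legis_entry` and `_existing_agent_id` are hypothetical names, the resolved binary is passed in as `command` rather than looked up via `install._find_legis_command`, and refusing to overwrite a malformed file is an assumption.

```python
from __future__ import annotations

import json
from pathlib import Path

DEFAULT_AGENT_ID = "claude-code"


def _existing_agent_id(entry: object) -> str | None:
    """Return the value following --agent-id in an existing entry, if any."""
    if not isinstance(entry, dict):
        return None
    args = entry.get("args")
    if not isinstance(args, list):
        return None
    try:
        value = args[args.index("--agent-id") + 1]
    except (ValueError, IndexError):
        return None
    return value if isinstance(value, str) else None


def merge_legis_entry(
    project_root: Path, command: str, agent_id: str | None = None
) -> tuple[bool, str]:
    """Hypothetical sketch: add or update the legis entry under mcpServers."""
    path = project_root / ".mcp.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        data = {}  # file is created on write
    except json.JSONDecodeError as exc:
        # Assumption: a malformed file is reported, not overwritten.
        return False, f".mcp.json is not valid JSON: {exc}"
    if not isinstance(data, dict):
        return False, ".mcp.json top level is not an object"

    servers = data.setdefault("mcpServers", {})
    if not isinstance(servers, dict):
        return False, "mcpServers is not an object"

    # An explicit id wins; otherwise keep the one already present, else the default.
    if agent_id is None:
        agent_id = _existing_agent_id(servers.get("legis")) or DEFAULT_AGENT_ID

    # Only the "legis" key is replaced; sibling entries are not touched.
    servers["legis"] = {
        "args": ["mcp", "--agent-id", agent_id],
        "command": command,
        "env": {},
        "type": "stdio",
    }
    path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return True, f"registered legis in {path.name}"
```

Returning `(ok, message)` matches the `tuple[bool, str]` shape the other install steps use, so the caller can print it with the same per-step `[OK]`/`[FAIL]` rendering.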
+ +## Testing + +`tests/test_doctor.py` (mirrors `src/legis/doctor.py`), `tmp_path` project +roots, with the **Worked examples** below as red→green fixtures: + +- missing `.mcp.json` legis entry → `error`; `--repair` → `fixed=True`, re-check `ok`. +- drifted instruction block (stale marker token) → `error` → repaired. +- absent `weft.toml` → `ok`; malformed `weft.toml` → `error` and **file + unchanged after `--repair`** (asserts C-9(b)). +- tampered governance chain → `error`, **report-only** (file unchanged after `--repair`). +- `LEGIS_PROTECTED_POLICIES` set with no key → `warn`; assert **no key value + appears** anywhere in text/JSON output. +- JSON shape: `{ok, checks:[{id,status,fixed,message?}], next_actions}`. +- exit codes: `0` healthy, `1` any error, `2` usage error. + +A new per-package coverage floor entry covers `doctor.py`. + +## Worked examples (the findings that motivated this) + +1. **legis absent from `.mcp.json`** — its `mcp__legis__*` tools never loaded. + `install.mcp_json` → `error`; repaired by `register_mcp_json`. (Fixed + manually during scoping; doctor makes it self-diagnosing.) +2. **`session-context` returns nothing** — honest-empty by design + (`refresh_instructions` → `[]` on no drift). doctor supplies the missing + affirmative "all current" signal. +3. **wardline rc1↔rc4 version skew (reported)** — not reproducible in this + environment (uniformly rc4). Cross-tool *version* reconciliation is **out of + scope** for v1 (doctor checks legis's own wiring, not sibling tool versions); + noted as a candidate future check. + +## Out of scope / future + +- Cross-tool version-skew checks (sibling binary versions). +- Reading keys from the planned Rust key sidecar (doctor stays presence-only; + it will check availability through whatever resolution path exists then). +- Any `weft.toml` write capability (blocked by C-9(b)). 
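The JSON shape and exit-code rules from the Surface and Testing sections can be sketched as below. This is a hedged illustration rather than the module itself: `exit_code` and `machine_readable` are hypothetical helper names, omitting `message` when it is `None` is inferred from the `message?` notation, and making the top-level `ok` mirror the exit code (so a `warn` leaves it `true`) is an assumption the spec does not state.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DoctorCheck:
    id: str
    status: str  # "ok" | "warn" | "error"
    fixed: bool = False
    message: str | None = None

    @property
    def ok(self) -> bool:
        return self.status == "ok"

    def to_dict(self) -> dict:
        # Assumption: message is optional in the JSON and dropped when unset.
        out: dict = {"id": self.id, "status": self.status, "fixed": self.fixed}
        if self.message is not None:
            out["message"] = self.message
        return out


def exit_code(checks: list[DoctorCheck]) -> int:
    """0 when no check is an error; warn is non-fatal."""
    return 1 if any(c.status == "error" for c in checks) else 0


def machine_readable(checks: list[DoctorCheck]) -> dict:
    return {
        # Assumption: top-level ok tracks the exit code, not "every check is ok".
        "ok": exit_code(checks) == 0,
        "checks": [c.to_dict() for c in checks],
        # One "{id}: {message}" line per non-ok check that carries a message.
        "next_actions": [
            f"{c.id}: {c.message}" for c in checks if not c.ok and c.message
        ],
    }
```

Keeping `exit_code` as the single place that decides fatality means the text and JSON outputs cannot disagree about whether a run passed.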
diff --git a/pyproject.toml b/pyproject.toml index 0f23bc0..8809ce7 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "legis" -version = "1.0.0rc3" +version = "1.0.0rc4" description = "Legis — the git/CI + governance layer of the Weft suite" readme = "README.md" license = "MIT" @@ -40,6 +40,7 @@ dev = [ "pytest-cov>=5.0", "httpx>=0.27", "mypy>=1.19", + "ruff>=0.8", "types-PyYAML>=6.0", ] @@ -62,3 +63,23 @@ python_version = "3.12" files = ["src/legis"] show_error_codes = true warn_unused_configs = true + +[tool.ruff] +target-version = "py312" +src = ["src"] + +[tool.ruff.lint] +# Ruff's default rule set (pyflakes F + the safe slice of pycodestyle E). This +# is the level that caught the F401 unused imports this floor-raise cleared; the +# CI gate (`ruff check src`) keeps src clean. Import-sorting (I) / pyupgrade (UP) +# are deliberately not enabled here — turning them on would impose unrelated +# reformatting churn and trip F821 on the honesty-gate test fixtures that inject +# `handler` dynamically; out of scope for a lint-gating change. +select = ["E4", "E7", "E9", "F"] + +[tool.coverage.report] +# Global silent-regression floor. CI passes --cov-fail-under explicitly (same +# value) so a local `pytest --cov` matches the gate. Per-package floors for the +# security-critical packages are enforced separately by +# scripts/check_coverage_floors.py. +fail_under = 88 diff --git a/scripts/check_coverage_floors.py b/scripts/check_coverage_floors.py new file mode 100644 index 0000000..5d421ce --- /dev/null +++ b/scripts/check_coverage_floors.py @@ -0,0 +1,96 @@ +#!/usr/bin/env python3 +"""Per-package coverage-floor gate (roadmap 11 / Q-L7). + +The global ``--cov-fail-under`` floor closes the aggregate silent-regression +headroom, but a regression concentrated in one security-critical package can +hide behind a high total. This gate enforces a minimum line-coverage percentage +per package (or single module) against ``coverage.json``. 
+ +Floors are intentionally set a few points below current coverage: tight enough +to catch a real regression, loose enough not to trip on incidental churn. Raise +a floor when a package's coverage rises and you want to lock the gain in. + +Usage: + python scripts/check_coverage_floors.py [coverage.json] + +Exit status 0 if every floor holds, 1 otherwise (with a per-package report). +""" + +from __future__ import annotations + +import json +import sys + +# path-prefix (relative to repo root, as coverage records it) -> floor percent. +# A prefix ending in ".py" matches a single module; otherwise it matches a +# package subtree. Current coverage (2026-06-06) shown in the trailing comment. +FLOORS: dict[str, float] = { + "src/legis/enforcement/": 93.0, # currently ~95.0 + "src/legis/service/": 92.0, # currently ~94.1 + "src/legis/governance/": 90.0, # currently ~92.7 + "src/legis/api/": 88.0, # currently ~89.8 + "src/legis/mcp.py": 80.0, # currently ~82 + "src/legis/doctor.py": 88.0, # currently ~91 +} + + +def _load(path: str) -> dict: + with open(path, encoding="utf-8") as fh: + return json.load(fh) + + +def _aggregate(files: dict, prefix: str) -> tuple[int, int]: + """Sum (covered_lines, num_statements) over files matching ``prefix``.""" + covered = statements = 0 + for path, info in files.items(): + norm = path.replace("\\", "/") + if prefix.endswith(".py"): + match = norm == prefix + else: + match = norm.startswith(prefix) + if match: + summary = info["summary"] + covered += summary["covered_lines"] + statements += summary["num_statements"] + return covered, statements + + +def main(argv: list[str]) -> int: + report_path = argv[1] if len(argv) > 1 else "coverage.json" + try: + data = _load(report_path) + except FileNotFoundError: + print( + f"coverage report not found: {report_path}\n" + "Run pytest with --cov-report=json first.", + file=sys.stderr, + ) + return 1 + + files = data.get("files", {}) + failures: list[str] = [] + print(f"Per-package coverage floors 
({report_path}):") + for prefix, floor in sorted(FLOORS.items()): + covered, statements = _aggregate(files, prefix) + if statements == 0: + failures.append(f" {prefix}: no statements measured (prefix matched nothing)") + continue + pct = 100.0 * covered / statements + status = "ok" if pct >= floor else "FAIL" + print(f" [{status}] {prefix:28} {pct:5.1f}% (floor {floor:.1f}%, {covered}/{statements})") + if pct < floor: + failures.append( + f" {prefix}: {pct:.1f}% < floor {floor:.1f}%" + ) + + if failures: + print("\nCoverage floor breach:", file=sys.stderr) + for line in failures: + print(line, file=sys.stderr) + return 1 + print("All per-package coverage floors hold.") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main(sys.argv)) diff --git a/src/legis/__init__.py b/src/legis/__init__.py index df1f691..7986973 100644 --- a/src/legis/__init__.py +++ b/src/legis/__init__.py @@ -1,3 +1,3 @@ """Legis — the git/CI + governance layer of the Weft suite.""" -__version__ = "1.0.0rc3" +__version__ = "1.0.0rc4" diff --git a/src/legis/api/app.py b/src/legis/api/app.py index 15f7448..cc0df06 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -26,10 +26,15 @@ from pydantic import BaseModel from legis import __version__ -# Re-exported so existing `from legis.api.app import DEFAULT_*_DB` call sites -# keep working, while the canonical definition lives in the transport-agnostic -# config module instead of the HTTP layer (Q-H2). -from legis.config import DEFAULT_CHECK_DB, DEFAULT_GOVERNANCE_DB +# Store-location resolvers live in the transport-agnostic config module, not the +# HTTP layer, so `mcp` and any other composition root share one source (Q-H2). 
+from legis.config import ( + binding_db_url, + check_db_url, + governance_db_url, + protected_policies, + pull_db_url, +) from legis.checks.models import CheckOutcome, CheckRun from legis.checks.surface import CheckSurface from legis.enforcement.engine import EnforcementEngine @@ -44,7 +49,12 @@ from legis.governance.signoff_binding import bind_signoff_to_issue from legis.identity.entity_key import EntityKey from legis.identity.resolver import IdentityResolver -from legis.service.errors import AuditIntegrityError, InvalidArgumentError, NotEnabledError +from legis.service.errors import ( + AuditIntegrityError, + InvalidArgumentError, + NotEnabledError, + WardlineRoutingError, +) from legis.service.governance import compute_override_rate as _compute_override_rate from legis.service.governance import evaluate_policy as _evaluate_policy from legis.service.governance import request_signoff as _request_signoff @@ -54,12 +64,19 @@ from legis.service.governance import submit_override as _submit_override from legis.service.governance import submit_protected_override as _submit_protected_override from legis.service.governance import verified_records as _verified_records -from legis.service.wardline import route_wardline_scan as _route_wardline_scan +from legis.service.wardline import ( + resolve_scan_routing, + route_wardline_scan as _route_wardline_scan, +) from legis.policy.grammar import PolicyGrammar, default_grammar from legis.pulls.models import PullRequest, PullRequestState from legis.pulls.surface import PullSurface from legis.wardline.governor import WardlineCellPolicy -from legis.wardline.ingest import WardlinePayloadError, WardlineSeverity +from legis.wardline.ingest import ( + ScanOutcome, + WardlineDirtyTreeError, + WardlinePayloadError, +) security = HTTPBearer(auto_error=False) @@ -238,20 +255,13 @@ class CheckRunIn(BaseModel): finished_at: str | None = None -def _parse_wardline_cell_map(raw: str) -> dict[WardlineSeverity, WardlineCellPolicy]: - mapping: 
dict[WardlineSeverity, WardlineCellPolicy] = {} - for part in raw.split(","): - if not part.strip(): - continue - severity_raw, sep, cell_raw = part.partition("=") - if not sep: - raise ValueError("cell map entries must be SEVERITY=cell") - mapping[WardlineSeverity[severity_raw.strip()]] = WardlineCellPolicy( - cell_raw.strip() - ) - if not mapping: - raise ValueError("cell map must not be empty") - return mapping +# Wardline scan-routing rejections (raised by service.resolve_scan_routing) map +# to HTTP status by kind; the MCP adapter collapses the same kinds to one code. +_WARDLINE_ROUTING_STATUS = { + WardlineRoutingError.SERVER_MISCONFIGURED: 500, + WardlineRoutingError.SERVER_OWNED: 403, + WardlineRoutingError.MALFORMED: 422, +} def _check_to_dict(run: CheckRun) -> dict: @@ -331,18 +341,15 @@ def create_app( from legis.clock import SystemClock from legis.store.audit_store import AuditStore - gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB) + gov_db_url = governance_db_url() gov_store = AuditStore(gov_db_url) clock = SystemClock() - protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") - protected_policies = frozenset( - p.strip() for p in protected_policies_str.split(",") if p.strip() - ) + protected = protected_policies() if trail_verifier is None: from legis.enforcement.protected import TrailVerifier - trail_verifier = TrailVerifier(hmac_key, protected_policies) + trail_verifier = TrailVerifier(hmac_key, protected) if protected_gate is None: from legis.enforcement.judge_factory import build_judge_from_env @@ -353,7 +360,7 @@ def create_app( # downgraded and the agent must obtain operator sign-off. 
protected_gate = ProtectedGate( gov_store, clock, build_judge_from_env("API"), hmac_key, - protected_policies=protected_policies, + protected_policies=protected, ) if signoff_gate is None: @@ -362,7 +369,7 @@ def create_app( if binding_ledger is None: from legis.governance.binding_ledger import BindingLedger - bind_db_url = os.environ.get("LEGIS_BINDING_DB", "sqlite:///legis-binding.db") + bind_db_url = binding_db_url() binding_ledger = BindingLedger(AuditStore(bind_db_url), clock, hmac_key) state: dict[str, Any] = { "checks": check_surface, @@ -376,13 +383,13 @@ def git() -> GitSurface: def checks() -> CheckSurface: if state["checks"] is None: - check_db = os.environ.get("LEGIS_CHECK_DB", DEFAULT_CHECK_DB) + check_db = check_db_url() state["checks"] = CheckSurface(check_db) return state["checks"] def pulls() -> PullSurface: if state["pulls"] is None: - pull_db = os.environ.get("LEGIS_PULL_DB", "sqlite:///legis-pulls.db") + pull_db = pull_db_url() state["pulls"] = PullSurface(pull_db) return state["pulls"] @@ -391,7 +398,7 @@ def engine() -> EnforcementEngine: from legis.clock import SystemClock from legis.store.audit_store import AuditStore - gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB) + gov_db_url = governance_db_url() state["enforcement"] = EnforcementEngine( AuditStore(gov_db_url), SystemClock() ) @@ -763,62 +770,27 @@ def policy_evaluate(body: PolicyEvalIn, actor: str = Depends(verify_writer)) -> @app.post("/wardline/scan-results") def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_writer)) -> dict: - server_cell = os.environ.get("LEGIS_WARDLINE_CELL") - server_cell_by_severity = os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY") - if server_cell and server_cell_by_severity: - raise HTTPException(status_code=500, detail="server Wardline routing is misconfigured") - server_routing = server_cell is not None or server_cell_by_severity is not None - if server_routing and ( - body.cell is not None or 
body.cell_by_severity is not None or body.fail_on is not None - ): - raise HTTPException(status_code=403, detail="Wardline routing is server-owned") - if not server_routing: - if os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") != "1": - raise HTTPException( - status_code=403, - detail="Wardline routing is server-owned; configure LEGIS_WARDLINE_CELL or LEGIS_WARDLINE_CELL_BY_SEVERITY", - ) - if body.fail_on is not None: - if body.cell is None or body.cell_by_severity is not None: - raise HTTPException( - status_code=422, - detail="fail_on routing requires cell and forbids cell_by_severity", - ) - elif (body.cell is None) == (body.cell_by_severity is None): - raise HTTPException(status_code=422, - detail="provide exactly one of cell or cell_by_severity") - if body.cell_by_severity is not None and not body.cell_by_severity: - raise HTTPException(status_code=422, detail="cell_by_severity must not be empty") - - policy: WardlineCellPolicy | None = None - cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None - fail_on: WardlineSeverity | None = None try: - if server_cell_by_severity is not None: - cell_map = _parse_wardline_cell_map(server_cell_by_severity) - cells = set(cell_map.values()) - elif server_cell is not None: - policy = WardlineCellPolicy(server_cell) - cells = {policy} - elif body.cell_by_severity is not None: - cell_map = {WardlineSeverity[sev]: WardlineCellPolicy(cell) - for sev, cell in body.cell_by_severity.items()} - cells = set(cell_map.values()) - else: - policy = WardlineCellPolicy(body.cell) - if body.fail_on is not None: - fail_on = WardlineSeverity[body.fail_on] - cells = {policy, WardlineCellPolicy.SURFACE_ONLY} - else: - cells = {policy} - except (KeyError, ValueError) as exc: - raise HTTPException(status_code=422, detail=f"unknown cell/severity: {exc}") + routing = resolve_scan_routing( + server_cell=os.environ.get("LEGIS_WARDLINE_CELL"), + server_cell_by_severity=os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY"), + 
request_cell=body.cell, + request_severity_map=body.cell_by_severity, + request_fail_on=body.fail_on, + allow_request_routing=( + os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") == "1" + ), + ) + except WardlineRoutingError as exc: + raise HTTPException( + status_code=_WARDLINE_ROUTING_STATUS[exc.kind], detail=str(exc) + ) from exc # Only provision the governance store when a surface cell can actually run: - # engine() lazily creates legis-governance.db, so a pure block_escalate scan + # engine() lazily creates .weft/legis/legis-governance.db, so a pure block_escalate scan # must not touch it. signoff_gate is an injected param (no side effect). - needs_engine = bool(cells & {WardlineCellPolicy.SURFACE_OVERRIDE, - WardlineCellPolicy.SURFACE_ONLY}) + needs_engine = bool(routing.cells & {WardlineCellPolicy.SURFACE_OVERRIDE, + WardlineCellPolicy.SURFACE_ONLY}) try: routed = _route_wardline_scan( body.scan, @@ -826,19 +798,29 @@ def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_write identity=identity, engine=engine() if needs_engine else None, signoff=signoff_gate, - policy=policy, - cell_map=cell_map, - fail_on=fail_on, + policy=routing.policy, + cell_map=routing.cell_map, + fail_on=routing.fail_on, artifact_key=( os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") else None ), + allow_dirty=os.environ.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1", ) + except WardlineDirtyTreeError as exc: + # Amber, not red: a dirty dev tree is "environment not ready", not a + # broken/tampered scan. 200 with a typed skip so a harness can tell + # it apart from the 422 generic failure and nothing is governed. 
+ return { + "outcome": exc.reason, + "routed": [], + "detail": str(exc), + } except WardlinePayloadError as exc: raise HTTPException(status_code=422, detail=f"invalid Wardline scan: {exc}") except ValueError as exc: raise HTTPException(status_code=409, detail=str(exc)) - return {"routed": routed} + return {"outcome": ScanOutcome.ROUTED, "routed": routed} return app diff --git a/src/legis/canonical.py b/src/legis/canonical.py index eb5df71..7816473 100644 --- a/src/legis/canonical.py +++ b/src/legis/canonical.py @@ -2,7 +2,33 @@ v1 uses sorted-key, tight-separator JSON for deterministic hashing. RFC 8785 is a future hardening (elspeth uses RFC 8785); legis should converge there before -the protected cell ships cryptographic guarantees (see ADR-0001). +the protected cell ships cryptographic guarantees (see ADR-0001 / ADR-0002). + +Q-L4 deferral (assessed 2026-06-06; clause corrected 2026-06-06): RFC-8785 is +gated on "when cross-language verification is needed." One consumer verifies a +hash this module did NOT produce — ``wardline/ingest.verify_wardline_artifact`` +checks the ``artifact_signature`` Wardline computes in its OWN repo/process over +``canonical_json(scan-minus-signature)``. That is genuinely cross-repo and +cross-process, but it is NOT cross-language: Wardline's signer +(``wardline/src/wardline/core/legis.py``) is a deliberate byte-for-byte Python +replica using the same ``ensure_ascii=False`` params. Two guarantees back this, +and they are NOT the same: a golden HMAC vector captured from the real legis +signer is the *cross-impl* pin (it proves the two signers agree byte-for-byte — +but its payload is ASCII-only today); a separate ``"é"`` canonicalization unit +test on each side proves that side preserves a non-ASCII char as the literal +byte rather than a ``\\uXXXX`` escape. 
Because both serializers are the identical +Python ``json.dumps`` call, non-ASCII findings round-trip and verify — the +``ensure_ascii=False`` choice is what makes them match, not a hazard. The +*cross-impl non-ASCII* case is therefore guaranteed by construction but not yet +pinned by a golden vector; doing so (a non-ASCII payload in the shared golden +HMAC vector) is a Wardline-side follow-up, because that vector lives in +Wardline's repo and only Wardline's repo can detect Wardline drifting. RFC-8785 +is needed only the day a *non-Python* verifier lands; because this is the single +canonicalization choke point, that upgrade stays a one-file change. The +companion Q-L5 fingerprint +reconciliation (decorator.py / boundary_scan.py) is independent and is done — +those fingerprints are Python ``ast.dump`` output, not cross-language JSON, so +RFC-8785 does not apply to them. """ from __future__ import annotations diff --git a/src/legis/checks/models.py b/src/legis/checks/models.py index ea687c2..2aea94d 100644 --- a/src/legis/checks/models.py +++ b/src/legis/checks/models.py @@ -10,6 +10,8 @@ from dataclasses import dataclass from enum import Enum +from legis.provenance import Provenance + class CheckOutcome(str, Enum): PASS = "pass" @@ -37,4 +39,4 @@ class CheckRun: # "unauthenticated" so a consumer is never misled into treating a # writer-asserted "pass" as authoritative. An authenticated path (a signed # forge webhook) would set a stronger value; none exists today. 
- provenance: str = "unauthenticated" + provenance: str = Provenance.UNAUTHENTICATED diff --git a/src/legis/checks/surface.py b/src/legis/checks/surface.py index d627ef8..c15414e 100644 --- a/src/legis/checks/surface.py +++ b/src/legis/checks/surface.py @@ -23,10 +23,14 @@ from sqlalchemy.pool import NullPool from legis.checks.models import CheckOutcome, CheckRun +from legis.provenance import Provenance class CheckSurface: def __init__(self, db_url: str) -> None: + from legis.config import ensure_sqlite_parent + + ensure_sqlite_parent(db_url) self._engine = create_engine(db_url, future=True, poolclass=NullPool) self._md = MetaData() self._runs = Table( @@ -108,7 +112,7 @@ def _to_run(r) -> CheckRun: finished_at=r.finished_at, recorded_by=r.recorded_by, # Rows written before this column existed are still writer-asserted. - provenance=r.provenance or "unauthenticated", + provenance=r.provenance or Provenance.UNAUTHENTICATED, ) def for_commit(self, sha: str) -> list[CheckRun]: diff --git a/src/legis/cli.py b/src/legis/cli.py index d9532f3..e2dcc31 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -1,16 +1,20 @@ import argparse import json +import logging import sys from pathlib import Path import uvicorn +from legis import __version__ from legis.clock import SystemClock from legis.governance.sei_backfill import run_pre_sei_backfill from legis.identity.loomweave_client import HttpLoomweaveIdentity, loomweave_hmac_key_from_env from legis.policy.boundary_scan import scan_policy_boundaries from legis.store.audit_store import AuditStore +logger = logging.getLogger(__name__) + def _add_judge_flags(parser: argparse.ArgumentParser) -> None: parser.add_argument( @@ -31,6 +35,12 @@ def _add_judge_flags(parser: argparse.ArgumentParser) -> None: def build_parser() -> argparse.ArgumentParser: parser = argparse.ArgumentParser(prog="legis", description="Legis CLI") + parser.add_argument( + "--version", + action="version", + version=f"legis {__version__}", + help="Print the 
legis version and exit", + ) subparsers = parser.add_subparsers(dest="command") serve = subparsers.add_parser("serve", help="Run the Legis API server") @@ -86,15 +96,15 @@ def build_parser() -> argparse.ArgumentParser: ) _add_judge_flags(mcp) - import os - gov_db_default = os.environ.get("LEGIS_GOVERNANCE_DB", "sqlite:///legis-governance.db") + from legis.config import governance_db_url + gov_db_default = governance_db_url() rate = subparsers.add_parser( "check-override-rate", help="Fail (exit 1) if the override-rate gate is FAIL — for CI", ) rate.add_argument( "--db", default=gov_db_default, - help="Governance store URL (mirrors the server's DEFAULT_GOVERNANCE_DB)", + help="Governance store URL (defaults to the server's governance store)", ) gate = subparsers.add_parser( "governance-gate", @@ -102,7 +112,7 @@ def build_parser() -> argparse.ArgumentParser: ) gate.add_argument( "--db", default=gov_db_default, - help="Governance store URL (mirrors the server's DEFAULT_GOVERNANCE_DB)", + help="Governance store URL (defaults to the server's governance store)", ) backfill = subparsers.add_parser( "sei-backfill", @@ -140,6 +150,39 @@ def build_parser() -> argparse.ArgumentParser: help="Output format: human-readable text (default) or machine-readable json", ) + install = subparsers.add_parser( + "install", + help="Inject legis instructions, install the legis-workflow skill, and register the hook", + ) + install.add_argument("--claude-md", action="store_true", help="Inject instructions into CLAUDE.md only") + install.add_argument("--agents-md", action="store_true", help="Inject instructions into AGENTS.md only") + install.add_argument("--skills", action="store_true", help="Install the Claude Code skill pack only") + install.add_argument("--codex-skills", action="store_true", help="Install the Codex skill pack only") + install.add_argument("--hooks", action="store_true", help="Register the Claude Code SessionStart hook only") + install.add_argument("--gitignore", 
action="store_true", help="Add legis config rules to .gitignore only") + install.add_argument("--mcp", action="store_true", help="Register the legis MCP server in .mcp.json only") + install.add_argument( + "--agent-id", default=None, + help="Agent id stamped in the .mcp.json legis entry " + "(default: claude-code, or preserve an existing entry's id)", + ) + + subparsers.add_parser( + "session-context", + help="SessionStart hook: refresh drifted legis instructions/skills in the cwd", + ) + + doctor = subparsers.add_parser( + "doctor", + help="View and repair legis install/config health", + ) + doctor.add_argument("--root", default=".", help="Project root to inspect (default: cwd)") + doctor.add_argument("--repair", action="store_true", help="Apply safe repairs, then re-check") + doctor.add_argument( + "--format", choices=("text", "json"), default="text", + help="Output format: human text (default) or machine-readable json", + ) + return parser @@ -169,6 +212,7 @@ def _apply_judge_env(args) -> None: def _check_override_rate(db_url: str) -> int: import os + from legis.config import protected_policies from legis.enforcement.lifecycle import GateStatus from legis.service.errors import AuditIntegrityError, ProtectedKeyRequiredError from legis.service.governance import evaluate_override_rate_gate @@ -198,10 +242,6 @@ def _check_override_rate(db_url: str) -> int: return 1 records = store.read_all() - protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") - protected_policies = frozenset( - p.strip() for p in protected_policies_str.split(",") if p.strip() - ) # The detect -> require-key -> verify -> score decision lives in the service # layer (Q-H2), so the cli, the api, and any future consumer all measure the @@ -210,7 +250,7 @@ def _check_override_rate(db_url: str) -> int: res = evaluate_override_rate_gate( records, hmac_key=os.environ.get("LEGIS_HMAC_KEY"), - protected_policies=protected_policies, + protected_policies=protected_policies(), ) except 
(ProtectedKeyRequiredError, AuditIntegrityError) as exc: print(f"Error: {exc}", file=sys.stderr) @@ -221,6 +261,71 @@ def _check_override_rate(db_url: str) -> int: return 1 if res.status is GateStatus.FAIL else 0 +def _run_doctor(args) -> int: + from legis.doctor import run_doctor + + return run_doctor(Path(args.root), repair=args.repair, fmt=args.format) + + +def _run_install(args) -> int: + from legis.install import ( + ensure_gitignore, + inject_instructions, + install_claude_code_hooks, + install_codex_skills, + install_skills, + register_mcp_json, + ) + + project_root = Path.cwd() + install_all = not any( + [args.claude_md, args.agents_md, args.skills, args.codex_skills, args.hooks, args.gitignore, args.mcp] + ) + + steps: list[tuple[bool, str, object]] = [ + (install_all or args.claude_md, "CLAUDE.md", lambda: inject_instructions(project_root / "CLAUDE.md")), + (install_all or args.agents_md, "AGENTS.md", lambda: inject_instructions(project_root / "AGENTS.md")), + (install_all or args.skills, "Claude Code skill", lambda: install_skills(project_root)), + (install_all or args.codex_skills, "Codex skill", lambda: install_codex_skills(project_root)), + (install_all or args.hooks, "Claude Code hook", lambda: install_claude_code_hooks(project_root)), + (install_all or args.gitignore, ".gitignore", lambda: ensure_gitignore(project_root)), + (install_all or args.mcp, ".mcp.json", lambda: register_mcp_json(project_root, args.agent_id)), + ] + + failures = 0 + for selected, name, step in steps: + if not selected: + continue + try: + ok, message = step() # type: ignore[operator] + except Exception as exc: # noqa: BLE001 — one bad step must not abort the rest + # Stay consistent with the per-step [OK]/[FAIL] model instead of + # aborting the whole install with a traceback and leaving it + # half-applied. Render the failure, count it, keep going. 
+ logger.warning("install step %r raised", name, exc_info=True) + print(f"[FAIL] {name}: {exc}") + failures += 1 + continue + mark = "OK" if ok else "FAIL" + print(f"[{mark}] {name}: {message}") + if not ok: + failures += 1 + return 1 if failures else 0 + + +def _refresh_instructions_best_effort() -> None: + """Refresh drifted legis instructions on MCP boot. Never raises.""" + try: + from legis.hooks import refresh_instructions + + for message in refresh_instructions(Path.cwd()): + print(message, file=sys.stderr) + except Exception: # noqa: BLE001 (boot refresh must never break the server) + # Best-effort: never break the server, but don't vanish silently either — + # the sibling SessionStart path (hooks.generate_session_context) logs too. + logger.warning("Best-effort instruction refresh on MCP boot failed", exc_info=True) + + def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: if argv is None: argv = sys.argv[1:] @@ -247,6 +352,17 @@ def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: run("legis.api.app:create_app", host=args.host, port=args.port, factory=True) return 0 + if args.command == "install": + return _run_install(args) + + if args.command == "session-context": + from legis.hooks import generate_session_context + + context = generate_session_context() + if context: + print(context) + return 0 + if args.command in {"check-override-rate", "governance-gate"}: return _check_override_rate(args.db) @@ -275,6 +391,12 @@ def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: os.environ["LEGIS_POLICY_CELLS"] = args.policy_cells _apply_judge_env(args) + # Universal refresh trigger: every agent (Claude or Codex) reaches legis + # by booting this MCP server, so refreshing here keeps the instruction + # block + skill pack fresh even in Codex-only repos with no SessionStart + # hook. Best-effort — it must never block or break server startup. 
+ _refresh_instructions_best_effort() + from legis.mcp import main as mcp_main return mcp_main(args.agent_id) @@ -290,5 +412,8 @@ def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: print("policy-boundary-check: PASS") return 1 if findings else 0 + if args.command == "doctor": + return _run_doctor(args) + parser.print_help(sys.stderr) return 2 diff --git a/src/legis/config.py b/src/legis/config.py index c3ea9b7..c89fca6 100644 --- a/src/legis/config.py +++ b/src/legis/config.py @@ -1,13 +1,200 @@ -"""Shared default store locations — the single source for the governance and -check database URLs. +"""Store-location resolver — the single source for legis's database URLs. These previously lived on ``legis.api.app``, which forced ``mcp`` (and any other composition root) to import from the HTTP layer just to learn where the governance store lives (Q-H2). They are transport-agnostic configuration, so they belong here; ``api`` and ``mcp`` both import them from this module. + +**Federated store layout.** legis's machine-written runtime state lives under +``.weft/legis/`` at the project root — the federation convention shared with +the other weft members. legis is the *sole writer* of this subtree. Resolution +is anchored at the current working directory: the same notion the installer +uses (``cli.py`` sets ``project_root = Path.cwd()``), and every member resolves +``.weft/`` against that same cwd, so running each tool from the project root +keeps them in agreement. The default URLs are therefore cwd-relative +(``sqlite:///.weft/legis/...``), preserving the historical resolution semantics. + +**weft.toml is enrich-only, never load-bearing.** The operator-authored +``weft.toml`` may carry a ``[legis]`` table; we read it but never write it. +The single enrichment knob is ``store_dir`` (relocate the subtree; relative to +the project root, or absolute). 
Per-DB overrides remain the ``LEGIS_*_DB`` env +vars, which take precedence over weft.toml — a precedence the ``*_db_url()`` +resolvers below implement directly (via ``_resolve_db_url``), so every consumer +gets it by calling the resolver, not by re-wrapping it. An absent file, an +absent ``[legis]`` section, or even a malformed weft.toml must still boot on the +built-in defaults — legis never *depends* on weft.toml (Doctrine §5 deletion +test). + +**Clean break.** There is no fallback to the old cwd-root locations +(``legis-governance.db`` &c.). Existing deployments move their files into +``.weft/legis/`` or pin the ``LEGIS_*_DB`` env vars. + +**Keys are out of scope.** Operator-held signing keys are the authority-key +carve-out — capability-confined and deliberately not agent-reachable. They are +env-provided secrets, not files under this subtree; nothing here touches key +storage. """ from __future__ import annotations -DEFAULT_CHECK_DB = "sqlite:///legis-checks.db" -DEFAULT_GOVERNANCE_DB = "sqlite:///legis-governance.db" +import logging +import os +import tomllib +from pathlib import Path + +from sqlalchemy.engine import make_url + +logger = logging.getLogger(__name__) + +WEFT_MEMBER = "legis" + +# Built-in DB filenames under the member's runtime-state subtree. The legacy +# names are preserved so a clean-break move is a relocation, not a rename. +_CHECK_DB_NAME = "legis-checks.db" +_GOVERNANCE_DB_NAME = "legis-governance.db" +_BINDING_DB_NAME = "legis-binding.db" +_PULL_DB_NAME = "legis-pulls.db" + +# Per-DB override env vars. Highest precedence (see ``_resolve_db_url``). +_CHECK_DB_ENV = "LEGIS_CHECK_DB" +_GOVERNANCE_DB_ENV = "LEGIS_GOVERNANCE_DB" +_BINDING_DB_ENV = "LEGIS_BINDING_DB" +_PULL_DB_ENV = "LEGIS_PULL_DB" + +# Public, stably-ordered (override env var, default filename) for every store. +# THE single source of store identity so consumers (e.g. 
``legis doctor``) never +# re-list the env vars / filenames: adding a 5th store here automatically extends +# their coverage instead of silently dropping it. +STORE_DB_SPECS: tuple[tuple[str, str], ...] = ( + (_CHECK_DB_ENV, _CHECK_DB_NAME), + (_GOVERNANCE_DB_ENV, _GOVERNANCE_DB_NAME), + (_BINDING_DB_ENV, _BINDING_DB_NAME), + (_PULL_DB_ENV, _PULL_DB_NAME), +) + +# Protected-policy set: the policy names whose judge-ACCEPTED verdicts are +# downgraded to operator sign-off (Q-H3). Composition-root config like the DB +# URLs above, so resolved here. +_PROTECTED_POLICIES_ENV = "LEGIS_PROTECTED_POLICIES" + + +def project_root() -> Path: + """The directory the federation treats as project root (the cwd).""" + return Path.cwd() + + +def _weft_legis_config() -> dict: + """Read the operator-authored ``[legis]`` table from ``weft.toml``. + + Returns an empty enrichment ({}) when the file is absent, has no ``[legis]`` + table, or cannot be parsed — weft.toml is never load-bearing, so a missing + or broken operator file degrades to built-in defaults rather than failing + boot. We are READ-ONLY here; this function never writes weft.toml. + """ + path = project_root() / "weft.toml" + try: + with path.open("rb") as fh: + data = tomllib.load(fh) + except FileNotFoundError: + return {} + except (OSError, tomllib.TOMLDecodeError): + # A broken operator file must not be load-bearing. Surface it on the log + # (so a fat-fingered weft.toml is diagnosable) but boot on defaults. + logger.warning( + "weft.toml present but unreadable (%s); legis booting on built-in " + "store defaults", + path, + exc_info=True, + ) + return {} + section = data.get(WEFT_MEMBER) + return section if isinstance(section, dict) else {} + + +def _store_dir() -> Path: + """The runtime-state subtree: ``.weft/legis`` by default, or the operator's + ``[legis] store_dir`` if set. Relative paths resolve against cwd at connect + time (three-slash URL); an absolute store_dir yields an absolute URL. 
+ """ + configured = _weft_legis_config().get("store_dir") + if isinstance(configured, str) and configured: + return Path(configured) + return Path(".weft") / WEFT_MEMBER + + +def _sqlite_url(path: Path) -> str: + """Render a filesystem path as a SQLite URL, preserving relative-ness. + + A relative path stays relative (``sqlite:///.weft/legis/x.db``, resolved by + SQLite against cwd); an absolute path renders with the leading slash intact + (``sqlite:////abs/x.db``). + """ + return f"sqlite:///{path.as_posix()}" + + +def _resolve_db_url(env_var: str, db_name: str) -> str: + """Resolve a store URL with the documented precedence (module docstring): + the per-DB ``LEGIS_*_DB`` override wins; otherwise the URL is composed from + the weft.toml ``store_dir`` (or the built-in ``.weft/legis`` default) under + the canonical filename. + + This is THE single resolution point — callers invoke the ``*_db_url()`` + function directly and never re-implement the env layering, so changing + precedence or adding an alias is a one-line edit here, not ~11 call sites. + ``env_var in os.environ`` (not ``.get(...) or``) so a present-but-empty + override is returned verbatim rather than silently falling through. + """ + if env_var in os.environ: + return os.environ[env_var] + return _sqlite_url(_store_dir() / db_name) + + +def check_db_url() -> str: + return _resolve_db_url(_CHECK_DB_ENV, _CHECK_DB_NAME) + + +def governance_db_url() -> str: + return _resolve_db_url(_GOVERNANCE_DB_ENV, _GOVERNANCE_DB_NAME) + + +def binding_db_url() -> str: + return _resolve_db_url(_BINDING_DB_ENV, _BINDING_DB_NAME) + + +def pull_db_url() -> str: + return _resolve_db_url(_PULL_DB_ENV, _PULL_DB_NAME) + + +def protected_policies() -> frozenset[str]: + """Resolve the protected-policy set from ``LEGIS_PROTECTED_POLICIES``. 
+ + THE single parse point for the env var: the API factory, the MCP runtime, + and the CLI override-rate gate all call this rather than re-implementing the + ``frozenset(split(","))`` idiom, so the delimiter/trim rule cannot diverge + between composition roots (it decides whether a judge ACCEPTED is downgraded + to sign-off, so a divergence would be a real authority split). Read at call + time — like the ``*_db_url()`` resolvers — because ``cli.py`` writes the env + var from ``--protected-policies`` before the downstream root reads it. Empty, + whitespace-only, and absent all yield the empty set. + """ + raw = os.environ.get(_PROTECTED_POLICIES_ENV, "") + return frozenset(p.strip() for p in raw.split(",") if p.strip()) + + +def ensure_sqlite_parent(url: str) -> None: + """Create the parent directory for a SQLite *file* URL, if needed. + + Called at store-open time (not at URL-compute time) so that merely importing + config or computing a default URL never litters ``.weft/`` directories — the + subtree appears only when a DB is actually opened. No-op for in-memory or + non-SQLite URLs. SQLite creates the ``.db`` file but never its parent, so + without this an open against a fresh ``.weft/legis/`` raises "unable to open + database file". + """ + parsed = make_url(url) + if not parsed.drivername.startswith("sqlite"): + return + database = parsed.database + if not database or database == ":memory:": + return + Path(database).expanduser().parent.mkdir(parents=True, exist_ok=True) diff --git a/src/legis/data/instructions.md b/src/legis/data/instructions.md new file mode 100644 index 0000000..e951079 --- /dev/null +++ b/src/legis/data/instructions.md @@ -0,0 +1,16 @@ +## Legis (git/CI + governance) + +Legis is the git/CI and governance layer of the Weft suite. 
Reach for it when a policy fires at the CI/git boundary and a change needs a *recordable* override or human sign-off, when you need governance attestations keyed to stable code identity (SEI), or when you need git/CI context — branches, commits, pull requests, check outcomes, and the Loomweave-bound rename feed — around the work. Enforcement is graded: agent-programmable policy cells decide whether a violation self-clears with an audit trail, is judged inline, or escalates to a human; every decision lands in an append-only, SEI-keyed audit trail that survives rename/move. + +Prefer the `mcp__legis__*` MCP tools when available; fall back to the `legis` CLI. + +CLI subcommands: + +- `serve` — run the Legis API server. +- `mcp` — run the Legis MCP stdio server (launch-bound `--agent-id`). +- `check-override-rate` — exit 1 if the override-rate gate is FAIL (for CI). +- `governance-gate` — run governance CI gates (currently the override-rate gate). +- `sei-backfill` — resolve legacy locator-keyed governance records through Loomweave batch resolve. +- `policy-boundary-check` — fail when `@policy_boundary` metadata lacks current behavioural evidence. + +Full command + MCP-tool reference: see the `legis-workflow` skill. 
diff --git a/src/legis/data/skills/legis-workflow/SKILL.md b/src/legis/data/skills/legis-workflow/SKILL.md new file mode 100644 index 0000000..8056e00 --- /dev/null +++ b/src/legis/data/skills/legis-workflow/SKILL.md @@ -0,0 +1,249 @@ +--- +name: legis-workflow +description: > + This skill should be used when the user asks to explain or evaluate a policy cell, + submit a graded override, check the override-rate CI gate, run a governance gate, + read git branch/commit context, read the git-rename feed for Loomweave, gate a + Filigree closure on verified binding evidence, route Wardline scan findings through + governance, read recorded pull-request or CI check outcomes, run the + policy-boundary-check, or back-fill SEI-keyed governance records — or when working + in a project that uses legis for git/CI governance and graded enforcement. +--- + +# Legis Workflow + +Legis is the git/CI and **governance** side of the Weft suite. This skill is the +depth behind the lean `CLAUDE.md` block: the full CLI reference, the MCP tool +catalogue, the error/recovery table, and the worked patterns an agent actually +runs. Keep it faithful to the installed `legis` — when in doubt, `legis --help` +and `legis --help` are authoritative. + +## What legis is + +Legis answers *what changed, in which branch/commit/PR/check context, and what +governance or attestation state exists for that change?* It is an SEI **consumer** +(Loomweave remains the identity authority) and the suite's single governed judge — +**Wardline analyses trust; Legis governs it, one judge not two**. It does not own +issue state (Filigree) or code identity (Loomweave); it adds branch/commit/PR/check +context and a graded enforcement layer on top. 
+ +Enforcement is a **2×2** of policy *cells*, each agent-set, each a distinct +override flow: + +| | Judge OFF | Judge ON | +|---|---|---| +| **Simple** | **chill** — agent self-reports a recordable override; human reviews async (`ACCEPTED_SELF`) | **coached** — an LLM wall evaluates the override *before* it records; `ACCEPTED_BY_JUDGE` or `BLOCKED` (not self-clearable) | +| **Complex** | **structured** — block + escalate; a human operator must sign off before the gate clears (`ESCALATED_PENDING`) | **protected** — full machinery: HMAC-signed verdicts, decay sweep, override-rate gate, operator override | + +The operating invariant is **agent-first: humans on the loop, not in the loop.** +Every cell produces an append-only audit trail keyed on SEI, so the record survives +rename/move. The recorded override is the safety mechanism — an attributable audit +event, never a silent pass. + +## Reaching the tools + +Prefer the MCP tools (`mcp__legis__*`) when a Legis MCP server is attached; fall +back to the `legis` CLI otherwise. Each surface maps thinly over the same service +layer, so they agree on outcomes. + +**Identity is launch-bound.** The MCP server is started with +`legis mcp --agent-id `; that `--agent-id` is the actor for every override, +sign-off, and audit record the session produces. **No tool schema accepts an actor +argument** — you cannot spoof or override identity from a call. (Contrast the CLI's +`sei-backfill --actor`, which stamps appended backfill events from a one-shot +command, not an interactive session.) + +The MCP transport is stdio JSON-RPC (one object per line). Tool errors come back as +`isError` results with a `structuredContent` envelope carrying `error_code`, +`message`, `recoverable`, and `next_action` (see Error handling). + +## CLI reference + +`legis [flags]`. Most stores fall back to environment variables; flags +override. 
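
The fallback order described here (an explicit flag, then the environment variable, then the built-in default) can be sketched as below. This is a minimal illustration, not legis's own code: `resolve_store_url` and its parameters are hypothetical names, and the default URL used in the usage note is the cwd-relative `.weft/legis/` location described in `legis.config`.

```python
from __future__ import annotations

import os


def resolve_store_url(flag_value: str | None, env_var: str, default_url: str) -> str:
    """Hypothetical helper: explicit flag wins, then the env var, then the default."""
    if flag_value is not None:
        return flag_value
    # Membership test rather than `.get(...) or`: a present-but-empty
    # override is returned verbatim instead of silently falling through.
    if env_var in os.environ:
        return os.environ[env_var]
    return default_url
```

For example, `resolve_store_url(None, "LEGIS_GOVERNANCE_DB", "sqlite:///.weft/legis/legis-governance.db")` returns the environment value when `LEGIS_GOVERNANCE_DB` is set and the default URL otherwise.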
+ +### `legis serve` — run the Legis API server +- `--host` (default `127.0.0.1`), `--port` (default `8000`) — bind address. +- `--governance-db` — governance store URL (env `LEGIS_GOVERNANCE_DB`). +- `--check-db` — check store URL (env `LEGIS_CHECK_DB`). +- `--protected-policies` — comma-separated protected policy list (env `LEGIS_PROTECTED_POLICIES`). +- `--loomweave-url` — Loomweave identity API URL (env `LOOMWEAVE_API_URL`). +- `--filigree-url` — Filigree issue-tracker API URL (env `FILIGREE_API_URL`). +- `--binding-db` — sign-off binding ledger URL (env `LEGIS_BINDING_DB`). +- Judge flags (shared): `--judge-provider` (`openrouter`; omit to keep protected cells fail-closed), `--judge-model` (env `LEGIS_JUDGE_MODEL`), `--judge-max-tokens` (env `LEGIS_JUDGE_MAX_TOKENS`). + +### `legis mcp` — run the MCP stdio server +- `--agent-id` (**required**) — launch-bound agent identity; the actor for all records this session. +- `--governance-db` (env `LEGIS_GOVERNANCE_DB`), `--check-db` (env `LEGIS_CHECK_DB`). +- `--policy-cells` — policy cell registry TOML path (env `LEGIS_POLICY_CELLS`). +- `--protected-policies` (env `LEGIS_PROTECTED_POLICIES`), `--loomweave-url` (env `LOOMWEAVE_API_URL`). +- Judge flags (shared): `--judge-provider`, `--judge-model`, `--judge-max-tokens`. + +### `legis check-override-rate` — CI gate +Fails (exit 1) if the override-rate gate is `FAIL`. For CI use. +- `--db` — governance store URL (default mirrors the server's `LEGIS_GOVERNANCE_DB` / `DEFAULT_GOVERNANCE_DB`). + +Prints `override-rate gate: (rate=…, sample=…)`. A missing SQLite DB under +`CI=true` (without `LEGIS_ALLOW_MISSING_GOVERNANCE_DB=1`) fails; otherwise it prints +`PASS_WITH_NOTICE` and exits 0. A failed hash-chain integrity check exits 1. + +### `legis governance-gate` — run governance CI gates +Currently runs the override-rate gate (same implementation and `--db` semantics as +`check-override-rate`). Use this name for the general CI gate entry point. 
+ +### `legis sei-backfill` — resolve legacy locator-keyed records +Resolves legacy locator-keyed governance records through Loomweave batch resolve and +emits a JSON report. +- `--db` — governance store URL (env `LEGIS_GOVERNANCE_DB`). +- `--loomweave-url` (**required**) — Loomweave identity API URL. +- `--execute` — append backfill events (omit for a dry-run report). +- `--actor` (default `legis-sei-backfill`) — actor stamped on appended events. + +### `legis policy-boundary-check` — boundary-evidence gate +Fails (exit 1) when `@policy_boundary` metadata lacks current behavioural evidence. +- `--root` (default `src`) — Python source root to scan. +- `--repo-root` (default `.`) — repo root for `test_ref` resolution. +- `--format` (`text` | `json`, default `text`) — human-readable lines vs machine-readable findings. + +Prints `policy-boundary-check: PASS` (exit 0) when clean; otherwise one +`path:line: rule_id: qualname: reason` per finding (exit 1). + +## MCP tool catalogue + +All tools return a `structuredContent` JSON payload. Names are exact. + +### Governance / policy +| Tool | Purpose | +|---|---| +| `policy_explain` | Explain which governance cell controls a policy/entity pair, whether that cell is enabled here, and which move the agent may make next. | +| `policy_evaluate` | Evaluate a policy against a target **without recording an override**. Returns outcome, detail, and any `provenance_gap`. | +| `override_submit` | Submit an override as the launch-bound agent. Routes to the governing cell and returns a discriminated outcome envelope (`ACCEPTED_SELF` / `ACCEPTED_BY_JUDGE` / `BLOCKED` / `ESCALATED_PENDING` / `NEED_INPUTS`). | +| `signoff_status_get` | Poll whether a **structured** sign-off request (by `seq`) has been cleared. | +| `override_rate_get` | Read the fixed operator force-past override-rate gate (status / rate / sample_size). Measures operator force-pasts; **not** movable by agent retries. 
| +| `scan_route` | Route Wardline scan findings through one cell, a `severity_map`, or a cell + `fail_on` threshold. Returns `ROUTED` or `SKIPPED_DIRTY_TREE` (typed amber skip). | + +### Git +| Tool | Purpose | +|---|---| +| `git_branch_list` | List local git branches and upstream divergence facts. | +| `git_commit_get` | Read one git commit by SHA or safe ref. | +| `git_rename_list` | List git rename evidence for a revision range (`rev_range`). | +| `git_rename_feed_get` | Loomweave-ready rename feed: committed renames over `base..head` plus optional uncommitted working-tree renames (`include_worktree`). | + +### Pulls / checks +| Tool | Purpose | +|---|---| +| `pull_request_get` | Read recorded pull-request metadata (`number`) with joined check outcomes. | +| `check_list` | Read recorded CI/check outcomes for a `target_type` of `commit`, `branch`, or `pr` plus a `target`. | + +### Filigree binding +| Tool | Purpose | +|---|---| +| `filigree_closure_gate_get` | Read whether legis holds **verified binding evidence** for closing a Filigree issue (`issue_id`). Requires the binding ledger to be enabled. | + +### Override-submit outcomes (by cell) +- **chill** → `ACCEPTED_SELF` — self-cleared; human reviews asynchronously. +- **coached** / **protected** → `ACCEPTED_BY_JUDGE` (may be re-judged later) or `BLOCKED`. A `BLOCKED` verdict carries a `blocked_reason_code` (`RATIONALE_INSUFFICIENT` / `CODE_VIOLATION` / `POLICY_HARD_BLOCK` / `UNCLASSIFIED`), `self_clearable: false`, and `next_actions: [REVISE_CODE, REVISE_RATIONALE]`. A blocked attempt **does not count toward your override-rate** — you cannot self-clear past the judge. +- **structured** → `ESCALATED_PENDING` — human sign-off required; poll `signoff_status_get` with the returned `seq`. +- **protected** with missing inputs → `NEED_INPUTS` — supply the listed fields (e.g. `file_fingerprint`, `ast_path`) and resubmit. 
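
The outcome list above can be turned into a small dispatch. The sketch below is illustrative only: it assumes the discriminator is carried under an `outcome` key (an assumption, not confirmed by the catalogue), and `next_step` is a hypothetical helper name. The field names `seq` and `next_actions` are the ones named in the list.

```python
from typing import Any


def next_step(result: dict[str, Any]) -> str:
    """Map an override_submit result to the agent's next move.

    Assumption: the discriminator lives under an "outcome" key.
    """
    outcome = result.get("outcome")
    if outcome in ("ACCEPTED_SELF", "ACCEPTED_BY_JUDGE"):
        # Recorded and cleared; the audit trail is reviewed asynchronously.
        return "proceed"
    if outcome == "BLOCKED":
        # Not self-clearable: revise the code or rationale, never resubmit verbatim.
        actions = result.get("next_actions", ["REVISE_CODE", "REVISE_RATIONALE"])
        return "revise: " + ", ".join(actions)
    if outcome == "ESCALATED_PENDING":
        # Human sign-off required; poll with the returned sequence number.
        return f"poll signoff_status_get with seq={result['seq']}"
    if outcome == "NEED_INPUTS":
        return "supply the listed fields and resubmit"
    raise ValueError(f"unrecognised outcome: {outcome!r}")
```

Raising on an unknown outcome keeps a newly added outcome from being treated as a silent pass.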
+ +Pass an `idempotency_key` on `override_submit` to make retries safe: a repeat with +the same request returns the original outcome; a reused key with a *different* +request is rejected (`INVALID_ARGUMENT`). + +## Error handling + +Tool errors carry `error_code`, `message`, `recoverable`, and a `next_action` hint. +Branch on `error_code`, not message text. + +| `error_code` | Recoverable | `next_action` | +|---|---|---| +| `INVALID_ARGUMENT` | yes | Correct the tool arguments and retry. | +| `INVALID_CELL_SPEC` | yes | Use server-owned routing or a valid cell configuration. | +| `CELL_NOT_ENABLED` | yes | Ask the operator to enable the required governance cell. | +| `NO_SUCH_REQUEST` | yes | Poll a known sign-off sequence returned by `override_submit`. | +| `NOT_FOUND` | yes | Refresh the target identifier and retry. | +| `UNKNOWN_TOOL` | yes | Call `tools/list` and use one of the advertised tool names. | +| `GIT_ERROR` | yes | Check the git ref or revision range and retry. | +| `SERVICE_ERROR` | yes | Inspect the error message before retrying. | +| `AUDIT_INTEGRITY_FAILURE` | **no** | Stop and ask an operator to inspect the governance trail. | +| `INTERNAL_ERROR` | **no** | Inspect the error message before retrying. | + +`AUDIT_INTEGRITY_FAILURE` (raised on a failed hash-chain verification or a binding +ledger error) and `INTERNAL_ERROR` are **not recoverable** — do not retry; surface +them to a human. Everything else is recoverable by fixing the input or asking the +operator to enable a cell. + +Two routing-specific notes for `scan_route`: +- Wardline routing is **server-owned**. Passing `cell` / `severity_map` / `fail_on` + when the server already configures routing (`LEGIS_WARDLINE_CELL` / + `LEGIS_WARDLINE_CELL_BY_SEVERITY`) returns `INVALID_CELL_SPEC`. Request-side + routing is only honoured under the explicit `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1` + escape hatch. 
+- An unsigned dirty-tree dev artifact arriving where signed provenance is required + is **not** an error — it returns `outcome: SKIPPED_DIRTY_TREE` (a typed amber skip; + nothing is governed). Commit for a signed artifact, or set + `LEGIS_WARDLINE_ALLOW_DIRTY=1` to govern it unsigned in dev. + +## Workflow patterns + +### Evaluate a policy cell, then submit a graded override +``` +policy_explain {policy, entity} # which cell governs, is it enabled, what move is next +# read explanation.cell and available_moves (already filtered to agent-callable tools) +override_submit {policy, entity, rationale [, file_fingerprint, ast_path, idempotency_key]} +``` +- **chill** → `ACCEPTED_SELF`; you are done, the human reviews the trail async. +- **coached/protected** → if `BLOCKED`, do not retry verbatim — `REVISE_CODE` or + `REVISE_RATIONALE` per `next_actions`; the judge cannot be talked past and the + blocked attempt costs you nothing on the override-rate. +- **structured** → `ESCALATED_PENDING`; poll `signoff_status_get {seq}` until + `cleared: true`. Do not proceed on the gated change until then. +- **protected** → if `NEED_INPUTS`, supply `file_fingerprint` + `ast_path` (the + bytes and AST node the judge binds its verdict to) and resubmit. + +### Check the override-rate gate in CI +The gate measures **operator force-pasts**, not agent retries — a high rate means +the policy is miscalibrated or an operator is breaking their own rules. +``` +# in-session read: +override_rate_get {} # → {status, rate, sample_size} +# CI step (exit 1 on FAIL): +legis check-override-rate --db +# or the general entry point: +legis governance-gate --db +``` + +### Read the git-rename feed for Loomweave +Legis is the (contract-locked) rename provider Loomweave's SEI re-binding matcher +consumes. 
+``` +git_rename_feed_get {base, head?, include_worktree?} +# committed renames over base..head, plus optional uncommitted working-tree renames +# lower-level evidence over an explicit range: +git_rename_list {rev_range} +``` + +### Gate a Filigree closure on verified binding evidence +Before closing a governed Filigree issue, confirm Legis holds verified, SEI-keyed +sign-off binding evidence for it. +``` +filigree_closure_gate_get {issue_id} # requires the binding ledger to be enabled +# only close in Filigree once this reports verified binding evidence; +# Filigree retains lifecycle authority — Legis only certifies the evidence. +``` +If the ledger is not enabled you get `CELL_NOT_ENABLED` — ask the operator to wire +`LEGIS_BINDING_DB` / `--binding-db`. + +### Route Wardline findings through governance +``` +scan_route {scan} # routing is server-owned; pass only the scan +# → ROUTED (governed into the configured cell) or SKIPPED_DIRTY_TREE (commit, or +# set LEGIS_WARDLINE_ALLOW_DIRTY=1 in dev) +``` + +### Gate boundary evidence in CI +``` +legis policy-boundary-check --root src --repo-root . --format json +# exit 1 with findings when @policy_boundary metadata lacks current behavioural evidence +``` diff --git a/src/legis/doctor.py b/src/legis/doctor.py new file mode 100644 index 0000000..fb64234 --- /dev/null +++ b/src/legis/doctor.py @@ -0,0 +1,394 @@ +"""`legis doctor` — view and repair legis install/config health. + +Operator/CLI tool only: it inspects and repairs the *host* install and legis's +own per-member artifacts. It is NOT on the agent MCP surface or the service +layer, and per hub doctrine C-9(b) it NEVER writes weft.toml. 
+""" + +from __future__ import annotations + +import json +import os +import tomllib +from dataclasses import dataclass +from pathlib import Path +from typing import Any +from urllib.parse import urlsplit + +from sqlalchemy.engine import make_url + +from legis import config +from legis import install as _install + + +@dataclass(frozen=True, slots=True) +class DoctorCheck: + id: str + status: str # "ok" | "warn" | "error" + fixed: bool = False + message: str | None = None + + @property + def ok(self) -> bool: + return self.status != "error" + + def to_dict(self) -> dict[str, Any]: + data: dict[str, Any] = {"id": self.id, "status": self.status, "fixed": self.fixed} + if self.message: + data["message"] = self.message + return data + + +def _next_actions(checks: list[DoctorCheck]) -> list[str]: + return [f"{c.id}: {c.message}" for c in checks if c.status != "ok" and c.message] + + +def render_json(checks: list[DoctorCheck]) -> str: + payload = { + "ok": all(c.ok for c in checks), + "checks": [c.to_dict() for c in checks], + "next_actions": _next_actions(checks), + } + return json.dumps(payload, indent=2, sort_keys=True) + + +def render_text(checks: list[DoctorCheck]) -> str: + has_error = any(c.status == "error" for c in checks) + has_warn = any(c.status == "warn" for c in checks) + problems = [c for c in checks if c.status != "ok"] + if not has_error: + # warn-only or all-ok: the project is healthy; surface any warns below + if has_warn: + warn_count = sum(1 for c in checks if c.status == "warn") + lines = [f"legis doctor: ok ({warn_count} warning(s))"] + else: + return "legis doctor: ok" + else: + lines = ["legis doctor:"] + for c in problems: + lines.append(f" {c.id}: {c.status} — {c.message}" if c.message else f" {c.id}: {c.status}") + return "\n".join(lines) + + +def check_mcp_json(root: Path, *, repair: bool) -> DoctorCheck: + """Check that `.mcp.json` has a current legis server entry. 
+ + 'Current' means: a legis entry exists, its args invoke `mcp`, and its + command resolves to an existing executable. Byte-equality with the canonical + entry is deliberately NOT required — a valid but differently-resolved legis + binary (uv-tool vs venv path) must not read as drift. + """ + cid = "install.mcp_json" + if _install.mcp_entry_is_current(root): + return DoctorCheck(cid, "ok") + if repair: + from legis.install import register_mcp_json + + ok, msg = register_mcp_json(root) + if ok and _install.mcp_entry_is_current(root): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck( + cid, "error", message="legis server missing or stale (run: legis install --mcp)" + ) + + +# --------------------------------------------------------------------------- +# Install-wiring checks (Task 6) +# --------------------------------------------------------------------------- + + +def _block_fresh(root: Path, filename: str) -> bool: + """True iff / has the legis block at the current token.""" + path = root / filename + if not path.exists(): + return False + try: + content = path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + if _install.INSTRUCTIONS_MARKER not in content: + return False + return _install._extract_marker_token(content) == _install._marker_token() + + +def check_instruction_block(root: Path, filename: str, *, repair: bool) -> DoctorCheck: + """Check that / has the legis instruction block at the current token.""" + cid = "install.claude_md" if filename == "CLAUDE.md" else "install.agents_md" + if _block_fresh(root, filename): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.inject_instructions(root / filename) + if ok and _block_fresh(root, filename): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + missing = "missing" if not (root / filename).exists() else "block missing or drifted" + return 
DoctorCheck(cid, "error", message=f"{filename} {missing} (run: legis install)") + + +def _skill_fresh(root: Path, base: str) -> bool: + """True iff the skill pack under <root>/<base>/skills/ matches the source fingerprint.""" + source = _install._get_skills_source_dir() / _install.SKILL_NAME + target = root / base / "skills" / _install.SKILL_NAME + if not source.is_dir() or not target.is_dir(): + return False + return _install._skill_tree_fingerprint(target) == _install._skill_tree_fingerprint(source) + + +def check_skill_pack(root: Path, base: str, *, repair: bool) -> DoctorCheck: + """Check that the legis skill pack under <root>/<base>/skills/ is present and fresh.""" + cid = "install.claude_skill" if base == ".claude" else "install.agents_skill" + installer = _install.install_skills if base == ".claude" else _install.install_codex_skills + if _skill_fresh(root, base): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = installer(root) + if ok and _skill_fresh(root, base): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck( + cid, + "error", + message=f"{base}/skills/{_install.SKILL_NAME} missing or drifted (run: legis install)", + ) + + +def _hook_present(root: Path) -> bool: + """True iff the SessionStart hook is registered in .claude/settings.json.""" + settings_path = root / ".claude" / "settings.json" + if not settings_path.exists(): + return False + try: + settings = json.loads(settings_path.read_text(encoding="utf-8")) + except (json.JSONDecodeError, OSError): + return False + return _install._has_unscoped_session_start_hook(settings, _install.SESSION_CONTEXT_COMMAND) + + +def check_hook(root: Path, *, repair: bool) -> DoctorCheck: + """Check that the legis SessionStart hook is registered.""" + cid = "install.hook" + if _hook_present(root): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.install_claude_code_hooks(root) + if ok and _hook_present(root): + return DoctorCheck(cid, "ok", fixed=True) +
return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message="SessionStart hook not registered (run: legis install)") + + +def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck: + """Check that legis .gitignore rules are present.""" + cid = "install.gitignore" + if _install.gitignore_rules_present(root): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.ensure_gitignore(root) + if ok and _install.gitignore_rules_present(root): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message=".weft/legis/ not in .gitignore (run: legis install)") + + +# --------------------------------------------------------------------------- +# Task 7: config & store checks +# --------------------------------------------------------------------------- + +# Sourced from config's single store-identity registry so adding a store there +# can't silently drop doctor coverage (review #2). +_DB_OVERRIDE_ENVS = tuple(env for env, _ in config.STORE_DB_SPECS) +_LEGACY_DB_NAMES = tuple(name for _, name in config.STORE_DB_SPECS) + + +def _store_dir_for(root: Path) -> Path: + """legis's store dir resolved from root/weft.toml (root-anchored, never cwd). + Returns an absolute path: an operator-set absolute store_dir is honored as-is; + otherwise the (relative) store_dir / default is joined to root. 
Malformed + weft.toml falls back to the default (check_weft_toml reports the malformed file).""" + configured: Path | None = None + wt = root / "weft.toml" + if wt.exists(): + try: + data = tomllib.loads(wt.read_text(encoding="utf-8")) + except (tomllib.TOMLDecodeError, OSError, UnicodeDecodeError): + data = {} + legis = data.get("legis") + if isinstance(legis, dict): + sd = legis.get("store_dir") + if isinstance(sd, str) and sd: + configured = Path(sd) + store_dir = configured if configured is not None else Path(".weft") / "legis" + return store_dir if store_dir.is_absolute() else (root / store_dir) + + +def check_weft_toml(root: Path) -> DoctorCheck: + """Report-only (C-9(b)): NEVER writes weft.toml. Distinguishes ABSENT (ok — + defaults intentional) from PRESENT-BUT-BROKEN (error — config silently not + applying), restoring the operator signal that C-9(c) silences at runtime.""" + cid = "config.weft_toml" + path = root / "weft.toml" + if not path.exists(): + return DoctorCheck(cid, "ok", message="absent (built-in defaults)") + try: + data = tomllib.loads(path.read_text(encoding="utf-8")) + except (tomllib.TOMLDecodeError, OSError, UnicodeDecodeError) as exc: + return DoctorCheck( + cid, + "error", + message=f"present but unparseable; [legis] silently not applying ({exc})", + ) + table = data.get("legis") + if table is not None and not isinstance(table, dict): + return DoctorCheck(cid, "error", message="[legis] in weft.toml must be a table") + return DoctorCheck(cid, "ok") + + +def _nearest_existing(path: Path) -> Path: + p = path + while not p.exists() and p != p.parent: + p = p.parent + return p + + +def check_store_dir(root: Path, *, repair: bool = False) -> DoctorCheck: + """An absent .weft/legis/ is ok (created lazily). A present-but-unwritable + dir is an error. 
--repair ensures the dir exists (explicit operator action).""" + cid = "store.dir" + store_dir = _store_dir_for(root) + if store_dir.exists(): + if not os.access(store_dir, os.W_OK): + return DoctorCheck(cid, "error", message=f"{store_dir} not writable") + return DoctorCheck(cid, "ok") + if repair: + try: + store_dir.mkdir(parents=True, exist_ok=True) + return DoctorCheck(cid, "ok", fixed=True) + except OSError as exc: + return DoctorCheck(cid, "error", message=f"cannot create {store_dir}: {exc}") + anchor = _nearest_existing(store_dir) + if not os.access(anchor, os.W_OK): + return DoctorCheck(cid, "error", message=f"{store_dir} not creatable ({anchor} not writable)") + return DoctorCheck(cid, "ok", message="absent (created on first store open)") + + +def check_db_overrides(root: Path) -> DoctorCheck: # noqa: ARG001 + cid = "store.db_overrides" + bad = [] + for env in _DB_OVERRIDE_ENVS: + # Match config's precedence: a present-but-empty override is a verbatim + # (broken) override, not "unset" — so validate membership, not truthiness. 
+ if env not in os.environ: + continue + try: + make_url(os.environ[env]) + except Exception: # noqa: BLE001 — any parse failure is a bad override + bad.append(env) + if bad: + return DoctorCheck(cid, "error", message="invalid URL in: " + ", ".join(bad)) + return DoctorCheck(cid, "ok") + + +def check_legacy_stray_db(root: Path) -> DoctorCheck: + cid = "store.legacy_stray" + stray = [n for n in _LEGACY_DB_NAMES if (root / n).is_file()] + if stray: + return DoctorCheck( + cid, + "warn", + message="legacy DB at repo root (move to .weft/legis/): " + ", ".join(stray), + ) + return DoctorCheck(cid, "ok") + + +# --------------------------------------------------------------------------- +# Task 8: governance integrity + runtime/sibling checks +# --------------------------------------------------------------------------- + + +def _store_url(root: Path, db_name: str, env: str) -> str: + """Resolve a store URL anchored at *root* via ``root/weft.toml`` (never cwd). + The LEGIS_*_DB override wins when set (present-but-empty included, matching + config's verbatim-override precedence); otherwise a file URL is built under + the root-anchored store_dir.""" + if env in os.environ: + return os.environ[env] + return "sqlite:///" + (_store_dir_for(root) / db_name).as_posix() + + +def check_audit_chain(cid: str, url: str) -> DoctorCheck: + """Report-only. Absent file store => ok (nothing to verify; must NOT create + the DB). 
A tampered chain => error (cannot/must not be auto-repaired).""" + try: + parsed = make_url(url) + except Exception: # noqa: BLE001 + return DoctorCheck(cid, "ok", message="store URL not a file store") + db = parsed.database + if parsed.get_backend_name() != "sqlite" or not db or db == ":memory:": + return DoctorCheck(cid, "ok", message="not a file store") + if not Path(db).exists(): + return DoctorCheck(cid, "ok", message="no store yet") + from legis.store.audit_store import AuditStore + + try: + intact = AuditStore(url).verify_integrity() + except Exception as exc: # noqa: BLE001 — surface any verify failure, never raise from doctor + return DoctorCheck(cid, "error", message=f"integrity check failed: {exc}") + if intact: + return DoctorCheck(cid, "ok") + return DoctorCheck( + cid, "error", message="hash chain verification FAILED (report-only; cannot repair)" + ) + + +def check_hmac_key(root: Path) -> DoctorCheck: # noqa: ARG001 + """Presence-only; NEVER renders the key value.""" + cid = "runtime.hmac_key" + if not config.protected_policies(): + return DoctorCheck(cid, "ok", message="no protected policies configured") + if os.environ.get("LEGIS_HMAC_KEY"): + return DoctorCheck(cid, "ok") + return DoctorCheck( + cid, + "warn", + message="protected policies configured but LEGIS_HMAC_KEY not set; protected submissions will fail", + ) + + +def check_sibling_url(cid: str, env: str) -> DoctorCheck: + url = os.environ.get(env) + if not url: + return DoctorCheck(cid, "ok", message="not configured") + parsed = urlsplit(url) + if parsed.scheme.lower() in {"http", "https"} and parsed.netloc: + return DoctorCheck(cid, "ok") + return DoctorCheck(cid, "error", message=f"{env} invalid URL: {url!r}") + + +def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: + """Run every check against *root*. 
Repairs run inside individual checks + when *repair* is True; each returned check reflects post-repair state.""" + checks: list[DoctorCheck] = [] + checks.append(check_instruction_block(root, "CLAUDE.md", repair=repair)) + checks.append(check_instruction_block(root, "AGENTS.md", repair=repair)) + checks.append(check_skill_pack(root, ".claude", repair=repair)) + checks.append(check_skill_pack(root, ".agents", repair=repair)) + checks.append(check_hook(root, repair=repair)) + checks.append(check_gitignore(root, repair=repair)) + checks.append(check_mcp_json(root, repair=repair)) + checks.append(check_weft_toml(root)) + checks.append(check_store_dir(root, repair=repair)) + checks.append(check_db_overrides(root)) + checks.append(check_legacy_stray_db(root)) + checks.append(check_audit_chain("store.governance_chain", _store_url(root, "legis-governance.db", "LEGIS_GOVERNANCE_DB"))) + checks.append(check_audit_chain("store.binding_chain", _store_url(root, "legis-binding.db", "LEGIS_BINDING_DB"))) + checks.append(check_hmac_key(root)) + checks.append(check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL")) + checks.append(check_sibling_url("runtime.filigree_url", "FILIGREE_API_URL")) + return checks + + +def run_doctor(root: Path, *, repair: bool, fmt: str) -> int: + checks = collect_checks(root, repair=repair) + print(render_json(checks) if fmt == "json" else render_text(checks)) + return 0 if all(c.ok for c in checks) else 1 diff --git a/src/legis/enforcement/protected.py b/src/legis/enforcement/protected.py index 16f7390..9e33be9 100644 --- a/src/legis/enforcement/protected.py +++ b/src/legis/enforcement/protected.py @@ -16,7 +16,7 @@ from legis.clock import Clock from legis.enforcement.judge import Judge -from legis.enforcement.signing import SIG_PREFIX_V1, sign, verify +from legis.enforcement.signing import sign, verify from legis.enforcement.signoff import signoff_signing_fields from legis.enforcement.verdict import Verdict from legis.identity.entity_key 
import EntityKey @@ -78,21 +78,6 @@ def signing_fields(payload: dict[str, Any]) -> dict[str, Any]: return fields -def legacy_signing_fields(payload: dict[str, Any]) -> dict[str, Any]: - """Protected override fields signed by legacy ``hmac-sha256:v1`` records.""" - ext = payload.get("extensions") or {} - return { - "policy": payload.get("policy"), - "entity": payload.get("entity_key"), - "verdict": ext.get("judge_verdict"), - "model": ext.get("judge_model"), - "recorded_at": payload.get("recorded_at"), - "rationale": payload.get("rationale"), - "file_fingerprint": ext.get("file_fingerprint"), - "ast_path": ext.get("ast_path"), - } - - class TrailVerifier: """Load-time signature check. A record whose policy is protected MUST carry a valid signature; a missing or mismatched signature is tampering. @@ -153,11 +138,7 @@ def verify(self, records) -> None: raise TamperError( f"protected record seq={rec.seq} is structurally malformed: {exc}" ) from exc - if not verify(fields, sig, self._key) and not ( - isinstance(sig, str) - and sig.startswith(SIG_PREFIX_V1) - and verify(legacy_signing_fields(rec.payload), sig, self._key) - ): + if not verify(fields, sig, self._key): raise TamperError( f"protected record seq={rec.seq} signature does not verify" ) diff --git a/src/legis/enforcement/signing.py b/src/legis/enforcement/signing.py index 8f99d2e..2853528 100644 --- a/src/legis/enforcement/signing.py +++ b/src/legis/enforcement/signing.py @@ -2,9 +2,10 @@ The Sprint 0 hash chain detects edits by an actor who *cannot* recompute it; an actor with DB-file access can re-chain a forged record. The HMAC closes that: -without the key, a forged record cannot carry a valid signature. Versioned -(`v2` pins the expanded audit field set and canonical-JSON v1) so future -canonicalisation or field-set upgrades can be introduced without ambiguity. +without the key, a forged record cannot carry a valid signature. 
Every signature +carries a version tag (currently `v2`, which pins the audit field set and +canonical-JSON v1) so a future canonicalisation or field-set change can be +introduced as a new tag without ambiguity. """ from __future__ import annotations @@ -14,14 +15,11 @@ from legis.canonical import canonical_json -SIG_PREFIX_V1 = "hmac-sha256:v1:" SIG_PREFIX_V2 = "hmac-sha256:v2:" SIG_PREFIX = SIG_PREFIX_V2 def _prefix_for(version: str) -> str: - if version == "v1": - return SIG_PREFIX_V1 if version == "v2": return SIG_PREFIX_V2 raise ValueError(f"unsupported signature version: {version}") @@ -41,7 +39,4 @@ def sign(fields: dict, key: bytes, *, version: str = "v2") -> str: def verify(fields: dict, signature: str, key: bytes) -> bool: if signature.startswith(SIG_PREFIX_V2): return hmac.compare_digest(_signed(fields, key, SIG_PREFIX_V2), signature) - if signature.startswith(SIG_PREFIX_V1): - return hmac.compare_digest(_signed(fields, key, SIG_PREFIX_V1), signature) - else: - return False + return False diff --git a/src/legis/filigree/client.py b/src/legis/filigree/client.py index 5bbf190..87608b8 100644 --- a/src/legis/filigree/client.py +++ b/src/legis/filigree/client.py @@ -8,8 +8,6 @@ from __future__ import annotations -import hashlib -import hmac import json import ipaddress import os @@ -20,6 +18,13 @@ import urllib.request from typing import Any, Callable, Protocol, runtime_checkable +from legis.weft_signing import ( + sign_weft_request, + weft_body_bytes, + weft_hmac_key_from_env, + weft_path_and_query, +) + Fetch = Callable[[str, str, "dict | None"], dict] @@ -30,18 +35,13 @@ class FiligreeError(RuntimeError): MAX_RESPONSE_BYTES = 1_000_000 -def _json_body_bytes(body: dict | None) -> bytes: - if body is None: - return b"" - return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8") - - -def _path_and_query(url: str) -> str: - parsed = urllib.parse.urlsplit(url) - path_and_query = parsed.path or "/" - if parsed.query: - path_and_query = 
f"{path_and_query}?{parsed.query}" - return path_and_query +# The Weft-component transport-HMAC scheme is shared with the Loomweave channel; +# both delegate to ``weft_signing`` so the wire format (canonicalization + +# ``X-Weft-*`` headers) has a single definition and cannot silently diverge. The +# module-level ``_json_body_bytes`` / ``_path_and_query`` aliases keep the +# internal transport and existing call sites stable. +_json_body_bytes = weft_body_bytes +_path_and_query = weft_path_and_query def sign_filigree_request( @@ -55,28 +55,15 @@ def sign_filigree_request( ) -> dict[str, str]: """Weft-component HMAC headers for a legis->Filigree request (Q-M4). - Mirrors ``identity.loomweave_client.sign_loomweave_request`` so the Filigree - channel has the same transport authentication the Loomweave channel already - had. The attach ``signature`` is an app-level attestation about WHAT is + Delegates to the shared ``weft_signing`` seam (same scheme as the Loomweave + channel). The attach ``signature`` is an app-level attestation about WHAT is bound; this proves WHO is calling. ``timestamp`` and ``nonce`` are injected - (not generated here) so the signature is deterministically testable. - - Canonicalization contract: the body hash is taken over ``_json_body_bytes`` - (sorted keys, compact ``(",", ":")`` separators). The wire transport - (``_urllib_fetch``) sends those exact bytes, and a Filigree verifier MUST - canonicalize the received body identically before hashing — any spacing or - key-ordering drift on either side breaks every signature. See ADR-0003. + (not generated here) so the signature is deterministically testable. See + ``weft_signing`` for the canonicalization contract and ADR-0003. 
""" - body_hash = hashlib.sha256(_json_body_bytes(body)).hexdigest() - message = ( - f"{method}\n{_path_and_query(url)}\n{body_hash}\n{timestamp}\n{nonce}" - ).encode("utf-8") - signature = hmac.new(key, message, hashlib.sha256).hexdigest() - return { - "X-Weft-Component": f"filigree:{signature}", - "X-Weft-Timestamp": str(timestamp), - "X-Weft-Nonce": nonce, - } + return sign_weft_request( + "filigree", key, method, url, body, timestamp=timestamp, nonce=nonce + ) def filigree_hmac_key_from_env() -> bytes | None: @@ -85,8 +72,7 @@ def filigree_hmac_key_from_env() -> bytes | None: Absent key -> unsigned (backward compatible with deployments that have not provisioned the channel key yet), mirroring ``loomweave_hmac_key_from_env``. """ - value = os.environ.get("LEGIS_FILIGREE_HMAC_KEY") or os.environ.get("LEGIS_HMAC_KEY") - return value.encode("utf-8") if value else None + return weft_hmac_key_from_env("LEGIS_FILIGREE_HMAC_KEY") @runtime_checkable diff --git a/src/legis/governance/sei_backfill.py b/src/legis/governance/sei_backfill.py index 60c2309..9024b7b 100644 --- a/src/legis/governance/sei_backfill.py +++ b/src/legis/governance/sei_backfill.py @@ -16,6 +16,7 @@ from legis.clock import Clock from legis.identity.loomweave_client import LoomweaveIdentity from legis.identity.entity_key import EntityKey +from legis.identity.resolver import IdentityResolutionStatus, LineageSnapshotStatus from legis.store.protocol import AppendOnlyStore, AuditRecordLike SEI_PREFIX = "loomweave:eid:" @@ -206,7 +207,7 @@ def _resolved_event( "alive": True, "content_hash": resolution.get("content_hash"), "lineage_snapshot": lineage_snapshot, - "identity_resolution_status": "resolved", + "identity_resolution_status": IdentityResolutionStatus.RESOLVED, "lineage_snapshot_status": lineage_status, }, "backfill": { @@ -226,7 +227,11 @@ def _unresolved_event( reason: str, ) -> dict[str, Any]: locator_key = EntityKey.from_dict(rec.payload["entity_key"]) - status = "invalid" if reason == "invalid" 
else "not_alive" + status = ( + IdentityResolutionStatus.INVALID + if reason == "invalid" + else IdentityResolutionStatus.NOT_ALIVE + ) return { "event": "SEI_BACKFILL_UNRESOLVED", "original_seq": rec.seq, @@ -239,7 +244,7 @@ def _unresolved_event( "loomweave": { "alive": False, "identity_resolution_status": status, - "lineage_snapshot_status": "not_applicable", + "lineage_snapshot_status": LineageSnapshotStatus.NOT_APPLICABLE, }, "backfill": { "source": "pre_sei_locator", @@ -252,9 +257,12 @@ def _unresolved_event( def _lineage_snapshot( client: LoomweaveIdentity, sei: str -) -> tuple[dict[str, Any] | None, str]: +) -> tuple[dict[str, Any] | None, LineageSnapshotStatus]: try: lineage = client.lineage(sei) except Exception: - return None, "unavailable" - return {"length": len(lineage), "hash": content_hash(lineage)}, "verified" + return None, LineageSnapshotStatus.UNAVAILABLE + return ( + {"length": len(lineage), "hash": content_hash(lineage)}, + LineageSnapshotStatus.VERIFIED, + ) diff --git a/src/legis/hooks.py b/src/legis/hooks.py new file mode 100644 index 0000000..9a95813 --- /dev/null +++ b/src/legis/hooks.py @@ -0,0 +1,108 @@ +"""SessionStart / MCP-boot refresh for legis instruction artifacts. + +Two callers drive ``refresh_instructions``: + +- the ``legis session-context`` CLI subcommand, registered as a Claude Code + SessionStart hook, and +- ``legis mcp`` startup (best-effort), which is the universal trigger that also + covers Codex-only repos with no ``.claude/`` hook. + +Both refresh *in place* only — they never create a block or skill pack that is +not already present (that is ``legis install``'s job). A non-project cwd simply +produces no work, because the refresh only ever touches marker-bearing files. 
+""" + +from __future__ import annotations + +import logging +from pathlib import Path + +from legis.install import ( + INSTRUCTIONS_MARKER, + SKILL_NAME, + _extract_marker_token, + _get_skills_source_dir, + _marker_token, + _skill_tree_fingerprint, + inject_instructions, + install_codex_skills, + install_skills, +) + +logger = logging.getLogger(__name__) + + +def refresh_instructions(root: Path) -> list[str]: + """Refresh drifted legis instruction blocks and skill packs under *root*. + + Compares the embedded ``v{version}:{hash}`` token against the current one + for ``CLAUDE.md`` / ``AGENTS.md`` (re-injecting on drift), and each installed + skill pack's tree fingerprint against the bundled source (reinstalling on + drift). Returns human-readable update messages (empty when everything is + current). Only marker-bearing files and already-installed skill packs are + touched. + """ + messages: list[str] = [] + current_token = _marker_token() + + for filename in ("CLAUDE.md", "AGENTS.md"): + md_path = root / filename + if not md_path.exists(): + continue + try: + content = md_path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + logger.debug("Could not read %s for freshness check", md_path, exc_info=True) + continue + if INSTRUCTIONS_MARKER not in content: + continue + if _extract_marker_token(content) == current_token: + continue + ok, reason = inject_instructions(md_path) + if ok: + messages.append(f"Updated legis instructions in {filename}") + else: + # Drift was detected and re-injection was attempted but failed + # (e.g. a symlinked target → (False, reason), not a raise). Never + # drop it: agents would keep running on drifted instructions with no + # signal. Surface it for the operator (peer of the boot-log path). 
+ logger.warning( + "legis instruction refresh failed for %s: %s", md_path, reason + ) + + source_root = _get_skills_source_dir() / SKILL_NAME + if source_root.is_dir(): + source_hash = _skill_tree_fingerprint(source_root) + skill_targets = ( + (root / ".claude" / "skills" / SKILL_NAME, install_skills, "Updated legis skill pack"), + (root / ".agents" / "skills" / SKILL_NAME, install_codex_skills, "Updated legis Codex skill pack"), + ) + for target_root, installer, msg in skill_targets: + if not target_root.is_dir(): + continue + if _skill_tree_fingerprint(target_root) != source_hash: + ok, reason = installer(root) + if ok: + messages.append(msg) + else: + logger.warning( + "legis skill refresh failed for %s: %s", target_root, reason + ) + + return messages + + +def generate_session_context() -> str | None: + """Refresh instruction drift in the cwd and return any update messages. + + Returns ``None`` when nothing changed (silent SessionStart output — legis + keeps no project snapshot and depends on no governance database here). 
+ """ + try: + messages = refresh_instructions(Path.cwd()) + except (OSError, UnicodeDecodeError, ValueError): + logger.warning("Instruction freshness check failed", exc_info=True) + return None + if not messages: + return None + return "\n".join(messages) diff --git a/src/legis/identity/loomweave_client.py b/src/legis/identity/loomweave_client.py index 4ff897e..19e1d7c 100644 --- a/src/legis/identity/loomweave_client.py +++ b/src/legis/identity/loomweave_client.py @@ -18,8 +18,6 @@ from __future__ import annotations -import hashlib -import hmac import json import ipaddress import os @@ -31,6 +29,13 @@ from collections.abc import Mapping from typing import Any, Callable, Protocol, runtime_checkable +from legis.weft_signing import ( + sign_weft_request, + weft_body_bytes, + weft_hmac_key_from_env, + weft_path_and_query, +) + Fetch = Callable[[str, str, "dict | None", Mapping[str, str]], dict] @@ -50,18 +55,12 @@ def resolve_sei(self, sei: str) -> dict[str, Any]: ... def lineage(self, sei: str) -> list[dict[str, Any]]: ... -def _json_body_bytes(body: dict | None) -> bytes: - if body is None: - return b"" - return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8") - - -def _path_and_query(url: str) -> str: - parsed = urllib.parse.urlsplit(url) - path_and_query = parsed.path or "/" - if parsed.query: - path_and_query = f"{path_and_query}?{parsed.query}" - return path_and_query +# The Weft-component transport-HMAC scheme is shared with the Filigree channel; +# both delegate to ``weft_signing`` so the wire format has a single definition +# (the module-level ``_json_body_bytes`` / ``_path_and_query`` aliases keep the +# internal transport and existing call sites stable). 
+_json_body_bytes = weft_body_bytes +_path_and_query = weft_path_and_query def sign_loomweave_request( @@ -74,23 +73,14 @@ def sign_loomweave_request( nonce: str, ) -> dict[str, str]: """Return Loomweave's current Weft-component HMAC request headers.""" - body_bytes = _json_body_bytes(body) - body_hash = hashlib.sha256(body_bytes).hexdigest() - message = ( - f"{method}\n{_path_and_query(url)}\n{body_hash}\n{timestamp}\n{nonce}" - ).encode("utf-8") - signature = hmac.new(key, message, hashlib.sha256).hexdigest() - return { - "X-Weft-Component": f"loomweave:{signature}", - "X-Weft-Timestamp": str(timestamp), - "X-Weft-Nonce": nonce, - } + return sign_weft_request( + "loomweave", key, method, url, body, timestamp=timestamp, nonce=nonce + ) def loomweave_hmac_key_from_env() -> bytes | None: """Resolve Loomweave HMAC key material from env without making it mandatory.""" - value = os.environ.get("LEGIS_LOOMWEAVE_HMAC_KEY") or os.environ.get("LEGIS_HMAC_KEY") - return value.encode("utf-8") if value else None + return weft_hmac_key_from_env("LEGIS_LOOMWEAVE_HMAC_KEY") def _urllib_fetch( diff --git a/src/legis/identity/resolver.py b/src/legis/identity/resolver.py index e9f6589..c0de786 100644 --- a/src/legis/identity/resolver.py +++ b/src/legis/identity/resolver.py @@ -9,13 +9,45 @@ from __future__ import annotations +import logging +import time from dataclasses import dataclass -from typing import Any +from enum import Enum +from typing import Any, Callable from legis.canonical import content_hash from legis.identity.loomweave_client import LoomweaveIdentity from legis.identity.entity_key import EntityKey +logger = logging.getLogger(__name__) + +# A long-lived resolver re-probes the Loomweave sei capability at most once per +# this window. Without it a positive latch is permanent: a Loomweave that loses +# the capability mid-life would be trusted forever (Q-L6). 
+_DEFAULT_CAPABILITY_TTL_SECONDS = 300.0 + + +class IdentityResolutionStatus(str, Enum): + """The identity axis verdict (str,Enum — serializes as the bare string). + + ``INVALID`` is produced only by the SEI backfill path (it keys raw dicts, + not :class:`IdentityResolution`); the resolver itself emits only the other + three. + """ + + RESOLVED = "resolved" + NOT_ALIVE = "not_alive" + UNAVAILABLE = "unavailable" + INVALID = "invalid" + + +class LineageSnapshotStatus(str, Enum): + """The REQ-L-01 lineage-snapshot verdict (str,Enum — bare-string wire).""" + + VERIFIED = "verified" + UNAVAILABLE = "unavailable" + NOT_APPLICABLE = "not_applicable" + @dataclass(frozen=True) class IdentityResolution: @@ -23,14 +55,64 @@ class IdentityResolution: alive: bool | None # identity axis; None when no capability/decision content_hash: str | None # content axis; None when unavailable lineage_snapshot: dict[str, Any] | None # {"length": N, "hash": ...} or None - identity_resolution_status: str - lineage_snapshot_status: str + identity_resolution_status: IdentityResolutionStatus + lineage_snapshot_status: LineageSnapshotStatus + + def __post_init__(self) -> None: + # The identity axis and its status are two views of one fact — keep them + # from contradicting each other at construction. A "resolved" record with + # alive=False (or any other crossed pair) was representable before this + # guard; the invariant lived only in the construction sites. The bijection + # is exactly the three shapes the resolver actually builds. + # + # ``alive`` must be exactly None/False/True by identity: a bare ``in`` / + # dict lookup would alias ints (1 == True, 0 == False) and a non-bool + # would surface as a KeyError, not this guard's ValueError. 
+ if not any(self.alive is v for v in (None, False, True)): + raise ValueError( + f"IdentityResolution.alive must be None/False/True, got {self.alive!r}" + ) + expected = { + None: IdentityResolutionStatus.UNAVAILABLE, + False: IdentityResolutionStatus.NOT_ALIVE, + True: IdentityResolutionStatus.RESOLVED, + }[self.alive] + if self.identity_resolution_status is not expected: + raise ValueError( + f"contradictory IdentityResolution: alive={self.alive!r} " + f"requires identity_resolution_status=" + f"{expected.value!r}, got {self.identity_resolution_status.value!r}" + ) + # The lineage axis is the record's other half: a snapshot is present iff + # the status is VERIFIED (the resolver pairs them so — VERIFIED carries a + # snapshot; UNAVAILABLE/NOT_APPLICABLE carry None). Keep that pairing from + # contradicting itself too, so the whole record — not just the identity + # axis — is impossible to construct in a self-contradictory state. + snapshot_present = self.lineage_snapshot is not None + verified = self.lineage_snapshot_status is LineageSnapshotStatus.VERIFIED + if snapshot_present != verified: + raise ValueError( + f"contradictory IdentityResolution: lineage_snapshot " + f"{'present' if snapshot_present else 'absent'} requires " + f"lineage_snapshot_status" + f"{'==' if snapshot_present else '!='} VERIFIED, got " + f"{self.lineage_snapshot_status.value!r}" + ) class IdentityResolver: - def __init__(self, client: LoomweaveIdentity | None) -> None: + def __init__( + self, + client: LoomweaveIdentity | None, + *, + capability_ttl: float = _DEFAULT_CAPABILITY_TTL_SECONDS, + monotonic: Callable[[], float] = time.monotonic, + ) -> None: self._client = client - self._capable: bool | None = None # probe once per instance + self._capable: bool | None = None # cached probe result; None = unknown + self._capable_checked_at: float | None = None + self._capability_ttl = capability_ttl + self._monotonic = monotonic @property def client(self) -> LoomweaveIdentity | None: @@ 
-40,19 +122,51 @@ def client(self) -> LoomweaveIdentity | None: def _capability(self) -> bool: if self._client is None: return False - if self._capable is None: + now = self._monotonic() + checked_at = self._capable_checked_at + # The latch (positive OR negative) is fresh only while within the TTL. + # The original code latched the first result for the resolver's whole + # life, so a capability lost (or gained) upstream was never noticed by a + # long-lived resolver (Q-L6). + fresh = ( + self._capable is not None + and checked_at is not None + and now - checked_at < self._capability_ttl + ) + if not fresh: try: self._capable = bool(self._client.capability()) except Exception: - return False # honest transient degrade — retry on next resolve - return self._capable + # Honest transient degrade — clear the latch so the next resolve + # retries rather than trusting a stale value. Log it: the typed + # return is indistinguishable from a Loomweave that genuinely has + # no sei capability, so the warning is the only operator signal + # that the integration is broken rather than absent. 
+                logger.warning(
+                    "Loomweave sei-capability probe failed; degrading to locator keys",
+                    exc_info=True,
+                )
+                self._capable = None
+                self._capable_checked_at = None
+                return False
+            self._capable_checked_at = now
+        return self._capable if self._capable is not None else False

-    def _snapshot(self, sei: str) -> tuple[dict[str, Any] | None, str]:
+    def _snapshot(
+        self, sei: str
+    ) -> tuple[dict[str, Any] | None, LineageSnapshotStatus]:
         try:
             lineage = self._client.lineage(sei)  # type: ignore[union-attr]
         except Exception:
-            return None, "unavailable"
-        return {"length": len(lineage), "hash": content_hash(lineage)}, "verified"
+            logger.warning(
+                "Loomweave lineage snapshot failed; recording lineage as unavailable",
+                exc_info=True,
+            )
+            return None, LineageSnapshotStatus.UNAVAILABLE
+        return (
+            {"length": len(lineage), "hash": content_hash(lineage)},
+            LineageSnapshotStatus.VERIFIED,
+        )

     def resolve(self, locator: str) -> IdentityResolution:
         degraded = IdentityResolution(
@@ -60,14 +174,18 @@ def resolve(self, locator: str) -> IdentityResolution:
             None,
             None,
             None,
-            "unavailable",
-            "not_applicable",
+            IdentityResolutionStatus.UNAVAILABLE,
+            LineageSnapshotStatus.NOT_APPLICABLE,
         )
         if not self._capability():
             return degraded
         try:
             res = self._client.resolve_locator(locator)  # type: ignore[union-attr]
         except Exception:
+            logger.warning(
+                "Loomweave locator resolve failed; degrading to locator key",
+                exc_info=True,
+            )
             return degraded
         if not isinstance(res, dict):
             return degraded
@@ -79,18 +197,23 @@ def resolve(self, locator: str) -> IdentityResolution:
                 False,
                 None,
                 None,
-                "not_alive",
-                "not_applicable",
+                IdentityResolutionStatus.NOT_ALIVE,
+                LineageSnapshotStatus.NOT_APPLICABLE,
             )
         sei = res.get("sei")
         if not isinstance(sei, str) or not sei:
             return degraded
         snapshot, snapshot_status = self._snapshot(sei)
+        # content_hash is carried verbatim into the governance record; trust only
+        # a string. A non-string from a buggy/hostile Loomweave degrades to None
+        # rather than polluting the typed content axis (Q-L6).
+        raw_content_hash = res.get("content_hash")
+        content_hash_value = raw_content_hash if isinstance(raw_content_hash, str) else None
         return IdentityResolution(
             EntityKey.from_sei(sei),
             True,
-            res.get("content_hash"),
+            content_hash_value,
             snapshot,
-            "resolved",
+            IdentityResolutionStatus.RESOLVED,
             snapshot_status,
         )
diff --git a/src/legis/install.py b/src/legis/install.py
new file mode 100644
index 0000000..2a0e0ba
--- /dev/null
+++ b/src/legis/install.py
@@ -0,0 +1,809 @@
+"""Project installation helpers for legis.
+
+Legis "stands itself up": ``legis install`` injects a lean agent-orientation
+block into ``CLAUDE.md`` / ``AGENTS.md``, installs the ``legis-workflow`` skill
+pack, registers a Claude Code SessionStart hook, and extends ``.gitignore``.
+
+The block carries a versioned, content-hashed marker
+(````) so a drift check can
+re-inject it when either the bundled content or the package version changes.
+This mirrors filigree's mechanism (``filigree/src/filigree/install.py`` and
+``install_support/``), right-sized for legis: no dashboard, no server mode.
+"""
+
+from __future__ import annotations
+
+import hashlib
+import importlib.metadata
+import importlib.resources
+import json
+import logging
+import os
+import re
+import shlex
+import shutil
+import stat
+import tempfile
+from pathlib import Path
+from typing import Any
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Constants
+# ---------------------------------------------------------------------------
+
+INSTRUCTIONS_MARKER = ""
+
+# Recognises ANY tool's instruction-block fence (open or close) by its vendor
+# namespace, so legis can bound its own rewrite at a *foreign* fence and never
+# delete a co-resident sibling block (wardline/filigree) in a shared
+# CLAUDE.md/AGENTS.md (peer of filigree-bcbd4d66fd). The namespace match is
+# case-insensitive: an uppercase-namespaced sibling must still register as a
+# boundary. The cross-tool multi-owner block contract lives in weft
+# conventions.md (C-4).
+_INSTR_FENCE_RE = re.compile(r""
+    return f"{opening}\n{text}{_END_MARKER}"
+
+
+# Reader counterpart to the opening marker built in `_build_instructions_block`.
+# It lives next to the writer (and is derived from the same `INSTRUCTIONS_MARKER`
+# constant) so the freshness check cannot silently desync from the marker format:
+# the prefix is `re.escape`d from the constant, and the token is captured as an
+# opaque `\S+` rather than re-encoding its `v{version}:{hash}` shape — so a future
+# change to the token shape needs no edit here. The round-trip is pinned by a test.
+_MARKER_TOKEN_RE = re.compile(re.escape(INSTRUCTIONS_MARKER) + r":(\S+) -->")
+
+
+def _extract_marker_token(content: str) -> str | None:
+    """Return the token from the first legis instruction marker, or ``None``."""
+    m = _MARKER_TOKEN_RE.search(content)
+    return m.group(1) if m else None
+
+
+def _atomic_write_text(path: Path, content: str) -> None:
+    """Write *content* to *path* atomically (temp + rename), preserving mode."""
+    # Refuse-to-empty guard (filigree-04bad2a2bf parity). Every caller of this
+    # writer (instruction injection, .gitignore management, settings.json) always
+    # has non-empty content; an empty or whitespace-only payload can only be
+    # corruption or a logic bug. Refuse loudly rather than rename an empty temp
+    # file over a populated CLAUDE.md/AGENTS.md/.gitignore.
+    if not content.strip():
+        msg = f"refusing to write empty content to {path}"
+        raise ValueError(msg)
+    reject_symlink(path)
+    existing_mode: int | None
+    try:
+        existing_mode = stat.S_IMODE(path.stat().st_mode)
+    except FileNotFoundError:
+        existing_mode = None
+
+    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp", prefix=path.name)
+    try:
+        with os.fdopen(fd, "w", encoding="utf-8") as f:
+            f.write(content)
+        if existing_mode is not None:
+            os.chmod(tmp, existing_mode)
+        else:
+            umask = os.umask(0)
+            os.umask(umask)
+            os.chmod(tmp, 0o666 & ~umask)
+        os.replace(tmp, path)
+    except BaseException:
+        Path(tmp).unlink(missing_ok=True)
+        raise
+
+
+def inject_instructions(file_path: Path) -> tuple[bool, str]:
+    """Inject legis workflow instructions into a markdown file.
+
+    - missing file → create with just the block;
+    - has the marker → replace the block in place;
+    - exists without the marker → append the block.
+    """
+    try:
+        reject_symlink(file_path)
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+
+    block = _build_instructions_block()
+
+    if not file_path.exists():
+        _atomic_write_text(file_path, block + "\n")
+        return True, f"Created {file_path}"
+
+    content = file_path.read_text(encoding="utf-8")
+    start = _first_own_open_fence_pos(content)
+    if start != -1:
+        # Bound legis's writable region at the first of:
+        #   (a) its own close marker, *if* that close precedes any foreign fence
+        #       → normal in-place replace;
+        #   (b) the next foreign-namespace fence — bounded recovery for a
+        #       malformed/unclosed block, and for the unclosed-first / closed-
+        #       later "Shape 2" where a bare ``find`` would otherwise jump over a
+        #       foreign block to a later legis close;
+        #   (c) EOF.
+        # Own-namespace fences are absorbed (see _first_foreign_fence_pos), so
+        # duplicate/unclosed legis blocks still collapse to one clean block —
+        # preserving the orphan-tail idempotency invariant. Monotonic safety:
+        # in every branch ``bound`` ≤ the old code's cut point, so this can only
+        # *preserve* bytes the old code deleted, never delete bytes it kept.
+        # ``start`` is legis's own top-level open fence (see
+        # _first_own_open_fence_pos), never a marker quoted inside a sibling block.
+        own_end = content.find(_END_MARKER, start)
+        foreign = _first_foreign_fence_pos(content, start + len(INSTRUCTIONS_MARKER))
+        if own_end != -1 and own_end < foreign:
+            bound = own_end + len(_END_MARKER)
+            tail = content[bound:]
+            sep = ""
+        else:
+            # Bounded recovery: stop at the foreign fence (or EOF). Re-insert the
+            # separating newline we may have eaten, so our close marker is never
+            # glued mid-line against a following foreign fence — keeping us
+            # independent of whether a sibling's block detector is line-anchored.
+            bound = foreign
+            tail = content[bound:]
+            sep = "\n" if (bound < len(content) and not tail.startswith("\n")) else ""
+            if _first_own_open_fence_pos(tail) != -1:
+                # A second legis block survives beyond the boundary because
+                # canonicalising it would mean reaching across a block we don't own.
+                # It is STALE, conflicting guidance — not a harmless duplicate — so
+                # surface it instead of silently shipping a split brain
+                # (foreign-safety wins over own-dedup).
+                logger.warning(
+                    "legis instruction block in %s has a duplicate that could not be "
+                    "canonicalised without crossing another tool's block; the stale copy "
+                    "was left in place. Resolve it by hand.",
+                    file_path,
+                )
+        content = content[:start] + block + sep + tail
+        _atomic_write_text(file_path, content)
+        return True, f"Updated instructions in {file_path}"
+
+    if not content.strip():
+        # An existing empty / whitespace-only file is effectively a create: write
+        # just the block rather than leaving leading blank-line artifacts.
+        _atomic_write_text(file_path, block + "\n")
+        return True, f"Created {file_path}"
+
+    if not content.endswith("\n"):
+        content += "\n"
+    content += "\n" + block + "\n"
+    _atomic_write_text(file_path, content)
+    return True, f"Appended instructions to {file_path}"
+
+
+# ---------------------------------------------------------------------------
+# Skill pack
+# ---------------------------------------------------------------------------
+
+
+def _get_skills_source_dir() -> Path:
+    """Return the path to the bundled skills directory inside the package."""
+    return Path(__file__).parent / "data" / "skills"
+
+
+def _skill_tree_fingerprint(root: Path) -> str:
+    """Return a short hash of every file under *root* (relative path + bytes)."""
+    digest = hashlib.sha256()
+    files = sorted(p for p in root.rglob("*") if p.is_file())
+    for path in files:
+        rel = path.relative_to(root).as_posix().encode("utf-8")
+        digest.update(rel)
+        digest.update(b"\0")
+        try:
+            digest.update(path.read_bytes())
+        except OSError:
+            digest.update(b"")
+        digest.update(b"\0")
+    return digest.hexdigest()[:8]
+
+
+def _install_skill_to(project_root: Path, target_subpath: Path) -> tuple[bool, str]:
+    """Copy the legis skill pack into *target_subpath* under *project_root*.
+
+    Idempotent — overwrites existing skill files to track the installed legis
+    version. Safe under concurrent invocation: each call stages into a unique
+    directory and tolerates a peer winning the final rename race.
+    """
+    skill_source = _get_skills_source_dir() / SKILL_NAME
+    if not skill_source.is_dir():
+        return False, f"Skill source not found at {skill_source}"
+
+    try:
+        target_parent = ensure_project_dir(project_root, *target_subpath.parts)
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+    target_dir = target_parent / SKILL_NAME
+    try:
+        reject_symlink(target_dir)
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+
+    staging = Path(tempfile.mkdtemp(dir=target_dir.parent, prefix=f"{SKILL_NAME}.installing."))
+    staging.rmdir()
+    staging_consumed = False
+    swap_done = False
+    backup: Path | None = None
+    try:
+        shutil.copytree(skill_source, staging)
+        if target_dir.exists():
+            backup_holder = Path(tempfile.mkdtemp(dir=target_dir.parent, prefix=f"{SKILL_NAME}.old."))
+            backup_holder.rmdir()
+            try:
+                os.rename(target_dir, backup_holder)
+                backup = backup_holder
+            except FileNotFoundError:
+                pass
+        try:
+            os.rename(staging, target_dir)
+            staging_consumed = True
+            swap_done = True
+        except OSError:
+            # Distinguish a peer winning the race (target now holds their
+            # identical content) from a genuine failure. Only the former is
+            # safe to report as success — otherwise we would claim a successful
+            # install over a pack we just destroyed.
+            if target_dir.exists() and target_dir.is_dir():
+                swap_done = True
+            else:
+                # Genuine failure: restore the original pack we set aside and
+                # report failure rather than a false-positive "Installed".
+                if backup is not None and backup.exists():
+                    try:
+                        os.rename(backup, target_dir)
+                        backup = None
+                    except OSError:
+                        # Could not restore — leave the backup in place (it may
+                        # be the only surviving copy) rather than delete it.
+                        pass
+                return False, f"Failed to install skill pack to {target_dir}: swap failed"
+    finally:
+        if not staging_consumed and staging.exists():
+            shutil.rmtree(staging, ignore_errors=True)
+        # Only discard the prior pack once the new one is in place. If the swap
+        # failed we must not delete the backup — it may be the only copy left.
+        if backup is not None and swap_done and backup.exists():
+            shutil.rmtree(backup, ignore_errors=True)
+
+    return True, f"Installed skill pack to {target_dir}"
+
+
+def install_skills(project_root: Path) -> tuple[bool, str]:
+    """Copy the legis skill pack into ``.claude/skills/`` for the project."""
+    return _install_skill_to(project_root, Path(".claude") / "skills")
+
+
+def install_codex_skills(project_root: Path) -> tuple[bool, str]:
+    """Copy the legis skill pack into ``.agents/skills/`` for Codex."""
+    return _install_skill_to(project_root, Path(".agents") / "skills")
+
+
+# ---------------------------------------------------------------------------
+# Claude Code SessionStart hook
+# ---------------------------------------------------------------------------
+
+
+def _find_legis_command() -> list[str]:
+    """Resolve how to invoke legis for a hook command.
+
+    Prefer a ``legis`` binary on PATH; otherwise fall back to the safe-path
+    module form `` -P -m legis`` so module resolution does not prepend
+    the project directory.
+    """
+    found = shutil.which("legis")
+    if found:
+        return [found]
+    import sys
+
+    return [sys.executable, "-P", "-m", "legis"]
+
+
+def _hook_cmd_matches(hook_command: str, bare_command: str) -> bool:
+    """Whether *hook_command* is a bare, absolute-path, or module form of *bare_command*."""
+    if hook_command == bare_command:
+        return True
+    try:
+        hook_tokens = shlex.split(hook_command)
+        bare_tokens = shlex.split(bare_command)
+    except ValueError:
+        return False
+    if not hook_tokens or not bare_tokens:
+        return False
+    n = len(bare_tokens)
+    bare_bin = bare_tokens[0]  # "legis"
+
+    if len(hook_tokens) == n:
+        if hook_tokens[1:] != bare_tokens[1:]:
+            return False
+        hook_bin = hook_tokens[0]
+        if hook_bin == bare_bin:
+            return True
+        hook_base = hook_bin.rsplit("/", 1)[-1].rsplit("\\", 1)[-1]
+        return hook_base.lower() in {bare_bin.lower(), f"{bare_bin.lower()}.exe"}
+
+    module_prefixes = (["-m", bare_bin], ["-P", "-m", bare_bin])
+    for prefix in module_prefixes:
+        if len(hook_tokens) == n + len(prefix) and hook_tokens[1 : 1 + len(prefix)] == prefix:
+            return hook_tokens[1 + len(prefix) :] == bare_tokens[1:]
+
+    return False
+
+
+def _has_unscoped_session_start_hook(settings: dict[str, Any], command: str) -> bool:
+    """Whether *command* appears in an unscoped/wildcard SessionStart block."""
+    if not isinstance(settings, dict):
+        return False
+    hooks = settings.get("hooks", {})
+    if not isinstance(hooks, dict):
+        return False
+    session_start = hooks.get("SessionStart", [])
+    if not isinstance(session_start, list):
+        return False
+    for matcher in session_start:
+        if not isinstance(matcher, dict):
+            continue
+        if "matcher" in matcher and matcher.get("matcher") not in (None, "*"):
+            continue
+        hook_list = matcher.get("hooks", [])
+        if not isinstance(hook_list, list):
+            continue
+        for hook in hook_list:
+            if isinstance(hook, dict) and _hook_cmd_matches(hook.get("command", ""), command):
+                return True
+    return False
+
+
+def _upgrade_hook_commands(settings: dict[str, Any], bare_command: str, new_command: str) -> bool:
+    """Replace hook commands matching *bare_command* with *new_command*."""
+    changed = False
+    hooks = settings.get("hooks", {})
+    if not isinstance(hooks, dict):
+        return False
+    session_start = hooks.get("SessionStart", [])
+    if not isinstance(session_start, list):
+        return False
+    for matcher in session_start:
+        if not isinstance(matcher, dict):
+            continue
+        # Only upgrade commands in unscoped blocks legis owns. A user's scoped
+        # block (e.g. {"matcher": "resume"}) is their config — never rewrite a
+        # portable bare command there into a venv-specific absolute path.
+        if "matcher" in matcher and matcher.get("matcher") not in (None, "*"):
+            continue
+        hook_list = matcher.get("hooks", [])
+        if not isinstance(hook_list, list):
+            continue
+        for hook in hook_list:
+            if not isinstance(hook, dict):
+                continue
+            cmd = hook.get("command", "")
+            if _hook_cmd_matches(cmd, bare_command) and cmd != new_command:
+                hook["command"] = new_command
+                changed = True
+    return changed
+
+
+def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]:
+    """Register ``legis session-context`` as a Claude Code SessionStart hook.
+
+    Idempotent: re-running upgrades a bare/stale command to the resolved binary
+    and never duplicates the entry. Reuses only an unscoped block already
+    carrying the legis hook; otherwise appends a dedicated matcher-less block so
+    the hook fires on every SessionStart source.
+    """
+    try:
+        claude_dir = ensure_project_dir(project_root, ".claude")
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+    settings_path = claude_dir / "settings.json"
+    try:
+        reject_symlink(settings_path)
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+
+    recovered_backup: str | None = None  # set when a corrupt file was backed up
+
+    settings: dict[str, Any] = {}
+    if settings_path.exists():
+        try:
+            parsed = json.loads(settings_path.read_text(encoding="utf-8"))
+            if not isinstance(parsed, dict):
+                raise ValueError("settings.json is not a JSON object")
+            settings = parsed
+        except (json.JSONDecodeError, ValueError):
+            backup = settings_path.with_suffix(".json.bak")
+            try:
+                reject_symlink(backup)
+            except UnsafeInstallPathError as exc:
+                return False, str(exc)
+            shutil.copy2(settings_path, backup)
+            recovered_backup = backup.name
+            logger.warning(
+                "malformed .claude/settings.json backed up to %s and replaced with "
+                "a fresh file; reconcile any lost settings by hand",
+                backup.name,
+            )
+
+    prefix = shlex.join(_find_legis_command())
+    session_context_cmd = f"{prefix} session-context"
+
+    upgraded = _upgrade_hook_commands(settings, SESSION_CONTEXT_COMMAND, session_context_cmd)
+    needs_add = not _has_unscoped_session_start_hook(settings, SESSION_CONTEXT_COMMAND)
+
+    if not needs_add:
+        _atomic_write_text(settings_path, json.dumps(settings, indent=2) + "\n")
+        if upgraded:
+            return True, f"Upgraded hook command in .claude/settings.json to use {prefix}"
+        return True, "Hook already registered in .claude/settings.json"
+
+    # A valid top-level object whose "hooks"/"SessionStart" is the wrong type
+    # parses cleanly (so the malformed-JSON backup above did not fire), but the
+    # resets below would silently drop that user data — preserve a recoverable
+    # copy first.
+    existing_hooks = settings.get("hooks")
+    existing_ss = existing_hooks.get("SessionStart") if isinstance(existing_hooks, dict) else None
+    nested_corrupt = (existing_hooks is not None and not isinstance(existing_hooks, dict)) or (
+        isinstance(existing_hooks, dict) and "SessionStart" in existing_hooks and not isinstance(existing_ss, list)
+    )
+    if nested_corrupt and settings_path.exists():
+        backup = settings_path.with_suffix(".json.bak")
+        try:
+            reject_symlink(backup)
+        except UnsafeInstallPathError as exc:
+            return False, str(exc)
+        shutil.copy2(settings_path, backup)
+        recovered_backup = backup.name
+        logger.warning(
+            "corrupt hooks structure in .claude/settings.json backed up to %s "
+            "before resetting it; reconcile any lost hooks by hand",
+            backup.name,
+        )
+
+    if not isinstance(settings.get("hooks"), dict):
+        settings["hooks"] = {}
+    hooks = settings["hooks"]
+    if not isinstance(hooks.get("SessionStart"), list):
+        hooks["SessionStart"] = []
+    session_start = hooks["SessionStart"]
+
+    # needs_add is True only when no unscoped block already carries the legis
+    # hook (see _has_unscoped_session_start_hook), so there is never a reusable
+    # block to find — append a dedicated matcher-less block that fires on every
+    # SessionStart source regardless of how neighbouring blocks are scoped.
+    session_start.append(
+        {"hooks": [{"type": "command", "command": session_context_cmd, "timeout": 5000}]}
+    )
+
+    _atomic_write_text(settings_path, json.dumps(settings, indent=2) + "\n")
+    msg = f"Registered hook in .claude/settings.json: {session_context_cmd}"
+    if recovered_backup is not None:
+        msg += f" (backed up malformed settings.json to {recovered_backup})"
+    return True, msg
+
+
+# ---------------------------------------------------------------------------
+# .gitignore
+# ---------------------------------------------------------------------------
+
+# Only legis's OWN rules — never another member's. ``.weft/legis/`` is legis's
+# machine-written runtime-state subtree (DBs &c.); ``.weft/`` as a whole is the
+# shared federation namespace and must NOT be claimed wholesale here. The legacy
+# ``.legis/`` / ``legis.yaml`` surfaces were retired with the weft store
+# consolidation — no legis code reads them (``legis.yaml`` was the per-member
+# config that ``weft.toml`` ``[legis]`` now replaces).
+_LEGIS_IGNORE_RULES = (".weft/legis/",)
+_LEGIS_IGNORE_BLOCK = (
+    "\n# Legis — machine-written runtime state (regenerated/local; never commit)\n"
+    ".weft/legis/\n"
+)
+
+
+def gitignore_rules_present(project_root: Path) -> bool:
+    """True iff every legis ignore rule is already a non-comment line in .gitignore."""
+    try:
+        gitignore = project_path(project_root, ".gitignore")
+    except UnsafeInstallPathError:
+        return False
+    if not gitignore.exists():
+        return False
+    try:
+        content = gitignore.read_text(encoding="utf-8")
+    except (OSError, UnicodeDecodeError):
+        return False
+    present = {ln.strip() for ln in content.splitlines() if ln.strip() and not ln.lstrip().startswith("#")}
+    return all(rule in present for rule in _LEGIS_IGNORE_RULES)
+
+
+def mcp_entry_is_current(project_root: Path) -> bool:
+    """True iff .mcp.json has a usable legis stdio server entry: a dict whose
+    args invoke `mcp` and whose command resolves to an existing executable.
+    Deliberately NOT byte-equality with the canonical entry — a valid but
+    differently-resolved legis binary (uv-tool vs venv path) must not read as
+    drift. Only a missing entry, malformed args, or a dead command path is stale.
+    """
+    try:
+        path = project_path(project_root, ".mcp.json")
+    except UnsafeInstallPathError:
+        return False
+    if not path.is_file():
+        return False
+    try:
+        data = json.loads(path.read_text(encoding="utf-8"))
+    except (json.JSONDecodeError, OSError):
+        return False
+    if not isinstance(data, dict):
+        return False
+    servers = data.get("mcpServers")
+    entry = servers.get("legis") if isinstance(servers, dict) else None
+    if not isinstance(entry, dict):
+        return False
+    args = entry.get("args")
+    if not (isinstance(args, list) and "mcp" in args):
+        return False
+    command = entry.get("command")
+    if not isinstance(command, str) or not command:
+        return False
+    # command resolves: absolute/relative existing file OR found on PATH
+    return bool(shutil.which(command)) or Path(command).is_file()
+
+
+def ensure_gitignore(project_root: Path) -> tuple[bool, str]:
+    """Ensure legis's runtime-state subtree (``.weft/legis/``) is ignored."""
+    try:
+        gitignore = project_path(project_root, ".gitignore")
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+
+    if gitignore.exists():
+        if gitignore_rules_present(project_root):
+            return True, "legis config already in .gitignore"
+        content = gitignore.read_text(encoding="utf-8")
+        present = {
+            line.strip() for line in content.splitlines() if line.strip() and not line.lstrip().startswith("#")
+        }
+        missing = [rule for rule in _LEGIS_IGNORE_RULES if rule not in present]
+        if not content.endswith("\n"):
+            content += "\n"
+        # Append only the rules that are actually absent — writing the whole
+        # block when one rule is already present would duplicate the other.
+        content += "\n# Legis — local working dir / config (regenerated/local; never commit)\n"
+        content += "".join(f"{rule}\n" for rule in missing)
+        _atomic_write_text(gitignore, content)
+        return True, f"Added {', '.join(missing)} to .gitignore"
+
+    _atomic_write_text(gitignore, _LEGIS_IGNORE_BLOCK.lstrip("\n"))
+    return True, "Created .gitignore with legis config rules"
+
+
+# ---------------------------------------------------------------------------
+# .mcp.json (agent MCP server registration)
+# ---------------------------------------------------------------------------
+
+_DEFAULT_AGENT_ID = "claude-code"
+
+
+def _legis_mcp_entry(agent_id: str = _DEFAULT_AGENT_ID) -> dict[str, Any]:
+    """The canonical legis stdio server entry for .mcp.json.
+
+    Splits the resolved invocation into a bare ``command`` (the executable an
+    MCP client execs directly) plus ``args`` so the module-fallback form
+    (`` -P -m legis ...``) launches correctly — a single joined string
+    in ``command`` would not be exec'd as separate argv tokens.
+    """
+    cmd = _find_legis_command()
+    return {
+        "args": cmd[1:] + ["mcp", "--agent-id", agent_id],
+        "command": cmd[0],
+        "env": {},
+        "type": "stdio",
+    }
+
+
+def register_mcp_json(
+    project_root: Path, agent_id: str | None = None
+) -> tuple[bool, str]:
+    """Register (or refresh) the legis server in /.mcp.json.
+
+    Creates the file if absent; merges into mcpServers without disturbing
+    sibling entries. An explicit *agent_id* always wins; when it is ``None``
+    (the default), an existing legis entry's agent-id is preserved (operator
+    choice), falling back to ``_DEFAULT_AGENT_ID`` for a fresh entry. Refreshes
+    only the command/args shape otherwise.
+    """
+    try:
+        path = project_path(project_root, ".mcp.json")
+    except UnsafeInstallPathError as exc:
+        return False, str(exc)
+
+    data: dict[str, Any] = {}
+    if path.exists():
+        try:
+            parsed = json.loads(path.read_text(encoding="utf-8"))
+        except (json.JSONDecodeError, OSError):
+            return False, ".mcp.json present but unreadable; fix or remove it by hand"
+        if not isinstance(parsed, dict):
+            return False, ".mcp.json present but not a JSON object; fix or remove it by hand"
+        data = parsed
+
+    servers = data.get("mcpServers")
+    if not isinstance(servers, dict):
+        servers = {}
+        data["mcpServers"] = servers
+
+    existing = servers.get("legis")
+    if agent_id is not None:
+        keep_agent = agent_id  # explicit caller wins
+    else:
+        keep_agent = _DEFAULT_AGENT_ID  # default...
+        if isinstance(existing, dict):  # ...but preserve an existing entry's id
+            args = existing.get("args", [])
+            if isinstance(args, list) and "--agent-id" in args:
+                i = args.index("--agent-id")
+                if i + 1 < len(args) and isinstance(args[i + 1], str):
+                    keep_agent = args[i + 1]
+
+    desired = _legis_mcp_entry(keep_agent)
+    if existing == desired:
+        return True, "legis already registered in .mcp.json"
+    servers["legis"] = desired
+    _atomic_write_text(path, json.dumps(data, indent=2, sort_keys=True) + "\n")
+    return True, "Registered legis server in .mcp.json"
diff --git a/src/legis/mcp.py b/src/legis/mcp.py
index 53e901e..25c0070 100644
--- a/src/legis/mcp.py
+++ b/src/legis/mcp.py
@@ -8,8 +8,10 @@

 from __future__ import annotations

+from collections.abc import Callable
 from dataclasses import asdict, dataclass
 import json
+import logging
 import os
 from pathlib import Path
 import sys
@@ -41,6 +43,7 @@
     NotEnabledError,
     NotFoundError,
     ServiceError,
+    WardlineRoutingError,
 )
 from legis.service.explain import explain_policy
 from legis.service.governance import (
@@ -51,10 +54,9 @@
     request_signoff,
     verified_records as service_verified_records,
 )
-from legis.service.wardline import route_wardline_scan
+from legis.service.wardline import resolve_scan_routing, route_wardline_scan
 from legis.store.audit_store import AuditStore
-from legis.wardline.governor import WardlineCellPolicy
-from legis.wardline.ingest import WardlinePayloadError, WardlineSeverity
+from legis.wardline.ingest import ScanOutcome, WardlineDirtyTreeError


 _AGENT_TOOLS = frozenset(
@@ -78,6 +80,42 @@
 _SUPPORTED_PROTOCOL_VERSIONS = ("2024-11-05", "2025-03-26")
 _DEFAULT_PROTOCOL_VERSION = _SUPPORTED_PROTOCOL_VERSIONS[-1]

+# Upper bound on a single JSON-RPC line read from stdin. The hand-rolled framing
+# is one object per line; without a bound a peer (or a corrupted pipe) sending a
+# line with no newline forces an unbounded read into memory. 16 MiB comfortably
+# fits a maximal scan_route request (MAX_FINDINGS=500 with properties) while
+# refusing a pathological one. Override with LEGIS_MCP_MAX_REQUEST_BYTES.
+_DEFAULT_MAX_REQUEST_BYTES = 16 * 1024 * 1024
+
+logger = logging.getLogger(__name__)
+
+
+def _max_request_bytes() -> int:
+    raw = os.environ.get("LEGIS_MCP_MAX_REQUEST_BYTES")
+    if raw:
+        try:
+            value = int(raw)
+        except ValueError:
+            logger.warning(
+                "LEGIS_MCP_MAX_REQUEST_BYTES=%r is not an integer; ignoring it "
+                "and using the default %d-byte bound",
+                raw,
+                _DEFAULT_MAX_REQUEST_BYTES,
+            )
+            return _DEFAULT_MAX_REQUEST_BYTES
+        if value > 0:
+            return value
+        # A non-positive bound (a fat-fingered 0 or negative) would otherwise
+        # fall through silently — the operator meant to lower the cap and it was
+        # ignored. Say so.
+        logger.warning(
+            "LEGIS_MCP_MAX_REQUEST_BYTES=%r is not positive; ignoring it and "
+            "using the default %d-byte bound",
+            raw,
+            _DEFAULT_MAX_REQUEST_BYTES,
+        )
+    return _DEFAULT_MAX_REQUEST_BYTES
+

 @dataclass
 class McpRuntime:
@@ -96,6 +134,7 @@ class McpRuntime:
     grammar: PolicyGrammar | None = None
     source_root: str | Path | None = None
     wardline_artifact_key: bytes | None = None
+    wardline_allow_dirty: bool = False
     binding_ledger: Any | None = None


@@ -119,7 +158,7 @@ def _load_policy_cell_registry() -> PolicyCellRegistry:


 def build_runtime(agent_id: str) -> McpRuntime:
-    from legis.config import DEFAULT_GOVERNANCE_DB
+    from legis.config import binding_db_url, governance_db_url, protected_policies

     clock = SystemClock()
     engine = None
@@ -140,26 +179,23 @@ def build_runtime(agent_id: str) -> McpRuntime:
     hmac_key = os.environ.get("LEGIS_HMAC_KEY")
     if hmac_key:
         key = hmac_key.encode("utf-8")
-        store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB))
-        protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "")
-        protected_policies = frozenset(
-            p.strip() for p in protected_policies_str.split(",") if p.strip()
-        )
-        trail_verifier = TrailVerifier(key, protected_policies)
+        store = AuditStore(governance_db_url())
+        protected = protected_policies()
+        trail_verifier = TrailVerifier(key, protected)
         # Protected policies: the LLM judge is advisory only (Q-H3). With no
         # deterministic validator wired, a judge ACCEPTED is downgraded and the
         # agent must escalate to operator sign-off.
         protected_gate = ProtectedGate(
             store,
             clock,
             build_judge_from_env("MCP"),
             key,
-            protected_policies=protected_policies,
+            protected_policies=protected,
         )
         signoff_gate = SignoffGate(store, clock, signer=True, key=key)
         from legis.governance.binding_ledger import BindingLedger
         binding_ledger = BindingLedger(
-            AuditStore(os.environ.get("LEGIS_BINDING_DB", "sqlite:///legis-binding.db")),
+            AuditStore(binding_db_url()),
             clock,
             key,
         )
@@ -182,6 +218,7 @@ def build_runtime(agent_id: str) -> McpRuntime:
             if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY")
             else None
         ),
+        wardline_allow_dirty=os.environ.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1",
         binding_ledger=binding_ledger,
     )

@@ -249,7 +286,12 @@ def tool_definitions() -> list[dict[str, Any]]:
             "name": "scan_route",
             "description": (
                 "Route Wardline scan findings through one cell, a severity_map "
-                "policy, or a cell plus fail_on threshold."
+                "policy, or a cell plus fail_on threshold. Returns a discriminated "
+                "outcome: ROUTED (governed) or SKIPPED_DIRTY_TREE (an unsigned "
+                "dirty-tree dev artifact arrived where signed provenance is "
+                "required — a typed amber skip, not a failure; commit for a "
+                "signed artifact, or set LEGIS_WARDLINE_ALLOW_DIRTY=1 to govern "
+                "it unsigned in dev)."
             ),
             "inputSchema": _schema(
                 ["scan"],
@@ -332,7 +374,13 @@ def _recovery_for(code: str) -> dict[str, Any]:
     next_actions = {
         "INVALID_ARGUMENT": "Correct the tool arguments and retry.",
         "INVALID_CELL_SPEC": "Use server-owned routing or a valid cell configuration.",
-        "CELL_NOT_ENABLED": "Ask the operator to enable the required governance cell.",
+        "CELL_NOT_ENABLED": (
+            "Enable the cell by wiring its backing store: set LEGIS_HMAC_KEY "
+            "(enables the binding ledger + protected/structured gates), and "
+            "configure the policy cells via LEGIS_POLICY_CELLS or policy/cells.toml "
+            "(LEGIS_DEV_DEFAULT_CELLS=1 for the dev posture). The error message "
+            "names which cell is unenabled."
+        ),
         "NO_SUCH_REQUEST": "Poll a known sign-off sequence returned by override_submit.",
         "NOT_FOUND": "Refresh the target identifier and retry.",
         "UNKNOWN_TOOL": "Call tools/list and use one of the advertised tool names.",
@@ -369,12 +417,23 @@ def _service_error(exc: Exception) -> dict[str, Any]:
         return _tool_error("NOT_FOUND", str(exc))
     if isinstance(exc, InvalidArgumentError):
         return _tool_error("INVALID_ARGUMENT", str(exc))
+    if isinstance(exc, WardlineRoutingError):
+        # All three routing kinds (server-misconfigured / server-owned /
+        # malformed) collapse to one MCP code; the HTTP adapter splits them by
+        # status. Must precede the generic ServiceError case below.
+        return _tool_error("INVALID_CELL_SPEC", str(exc))
     if isinstance(exc, GitError):
         return _tool_error("GIT_ERROR", str(exc))
     if isinstance(exc, ServiceError):
         return _tool_error("SERVICE_ERROR", str(exc))
     if isinstance(exc, ValueError):
         return _tool_error("INVALID_ARGUMENT", str(exc))
+    # Unexpected: the typed cases above are expected and reach the caller as their
+    # own codes, so they stay quiet. This fall-through is a genuine surprise — the
+    # caller gets INTERNAL_ERROR, but the operator/Sentry would see nothing unless
+    # we log it here with the exception. (exc_info=exc, not True: _service_error
+    # may be called outside an active except block.)
+ logger.error("unhandled MCP tool error: %s", exc, exc_info=exc) return _tool_error("INTERNAL_ERROR", str(exc)) @@ -468,22 +527,6 @@ def _registry(runtime: McpRuntime) -> PolicyCellRegistry: return runtime.cell_registry or fail_closed_policy_cells() -def _parse_wardline_cell_map(raw: str) -> dict[WardlineSeverity, WardlineCellPolicy]: - mapping: dict[WardlineSeverity, WardlineCellPolicy] = {} - for part in raw.split(","): - if not part.strip(): - continue - severity_raw, sep, cell_raw = part.partition("=") - if not sep: - raise ValueError("cell map entries must be SEVERITY=cell") - mapping[WardlineSeverity[severity_raw.strip()]] = WardlineCellPolicy( - cell_raw.strip() - ) - if not mapping: - raise ValueError("cell map must not be empty") - return mapping - - def _explanation_payload(explanation) -> dict[str, Any]: payload = explanation.to_payload() payload["available_moves"] = [ @@ -508,28 +551,26 @@ def _git(runtime: McpRuntime) -> GitSurface: def _engine(runtime: McpRuntime) -> EnforcementEngine: if runtime.engine is None: - from legis.config import DEFAULT_GOVERNANCE_DB + from legis.config import governance_db_url - store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB)) + store = AuditStore(governance_db_url()) runtime.engine = EnforcementEngine(store, SystemClock()) return runtime.engine def _checks(runtime: McpRuntime) -> CheckSurface: if runtime.check_surface is None: - from legis.config import DEFAULT_CHECK_DB + from legis.config import check_db_url - runtime.check_surface = CheckSurface( - os.environ.get("LEGIS_CHECK_DB", DEFAULT_CHECK_DB) - ) + runtime.check_surface = CheckSurface(check_db_url()) return runtime.check_surface def _pulls(runtime: McpRuntime) -> PullSurface: if runtime.pull_surface is None: - runtime.pull_surface = PullSurface( - os.environ.get("LEGIS_PULL_DB", "sqlite:///legis-pulls.db") - ) + from legis.config import pull_db_url + + runtime.pull_surface = PullSurface(pull_db_url()) return runtime.pull_surface @@ 
-601,6 +642,11 @@ def _override_idempotency_request_hash( def _existing_idempotent_record( runtime: McpRuntime, key: str, request_hash: str ) -> Any | None: + # The O(N) hash + HMAC cost of the scan below is `_verified_records`' whole- + # trail tamper check, paid deliberately on this interactive path — NOT a + # keyed single-row lookup, which would skip verification (the optimization + # operator-confirmed declined in rc4 review #7; see service.verified_records' + # cost note). The scan itself is over the already-verified list. for rec in _verified_records(runtime): ext = rec.payload.get("extensions", {}) if ext.get("mcp_idempotency_key") != key: @@ -688,365 +734,377 @@ def _verified_records(runtime: McpRuntime) -> list[Any]: return runtime.engine.records() -def call_tool(runtime: McpRuntime, name: str, args: dict[str, Any]) -> dict[str, Any]: - try: - _validate_argument_keys(name, args) - if name == "policy_explain": - explanation = explain_policy( - _registry(runtime), - policy=_require(args, "policy"), - entity=_require(args, "entity"), - engine=runtime.engine, - protected_gate=runtime.protected_gate, - signoff_gate=runtime.signoff_gate, - ) - return _tool_result(_explanation_payload(explanation)) - - if name == "override_submit": - policy = _require(args, "policy") - entity = _require(args, "entity") - rationale = _require(args, "rationale") - idempotency_key = _optional_string(args, "idempotency_key") - simple_engine = ( - _engine(runtime) - if _registry(runtime).cell_for(policy) in ("chill", "coached") - else runtime.engine - ) - explanation = explain_policy( - _registry(runtime), - policy=policy, - entity=entity, - engine=simple_engine, - protected_gate=runtime.protected_gate, - signoff_gate=runtime.signoff_gate, - ) - if not explanation.enabled: - raise NotEnabledError( - f"cell {explanation.cell!r} is not enabled for override submission" - ) - idempotency_request_hash = ( - _override_idempotency_request_hash( - agent_id=runtime.agent_id, - policy=policy, 
- entity=entity, - rationale=rationale, - cell=explanation.cell, - file_fingerprint=_optional_string(args, "file_fingerprint"), - ast_path=_optional_string(args, "ast_path"), - ) - if idempotency_key is not None - else None +def _tool_policy_explain(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + explanation = explain_policy( + _registry(runtime), + policy=_require(args, "policy"), + entity=_require(args, "entity"), + engine=runtime.engine, + protected_gate=runtime.protected_gate, + signoff_gate=runtime.signoff_gate, + ) + return _tool_result(_explanation_payload(explanation)) + + +def _tool_override_submit(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + policy = _require(args, "policy") + entity = _require(args, "entity") + rationale = _require(args, "rationale") + idempotency_key = _optional_string(args, "idempotency_key") + simple_engine = ( + _engine(runtime) + if _registry(runtime).cell_for(policy) in ("chill", "coached") + else runtime.engine + ) + explanation = explain_policy( + _registry(runtime), + policy=policy, + entity=entity, + engine=simple_engine, + protected_gate=runtime.protected_gate, + signoff_gate=runtime.signoff_gate, + ) + if not explanation.enabled: + raise NotEnabledError( + f"cell {explanation.cell!r} is not enabled for override submission" + ) + idempotency_request_hash = ( + _override_idempotency_request_hash( + agent_id=runtime.agent_id, + policy=policy, + entity=entity, + rationale=rationale, + cell=explanation.cell, + file_fingerprint=_optional_string(args, "file_fingerprint"), + ast_path=_optional_string(args, "ast_path"), + ) + if idempotency_key is not None + else None + ) + extra_extensions = ( + { + "mcp_idempotency_key": idempotency_key, + "mcp_idempotency_request_hash": idempotency_request_hash, + "mcp_cell": explanation.cell, + } + if idempotency_key is not None + else {"mcp_cell": explanation.cell} + ) + if idempotency_key is not None and idempotency_request_hash is not None: + existing = 
_existing_idempotent_record( + runtime, idempotency_key, idempotency_request_hash + ) + if existing is not None: + return _tool_result( + _idempotent_override_response(existing.payload, existing.seq) ) - extra_extensions = ( + if explanation.cell in ("chill", "coached"): + override_result = submit_override( + _engine(runtime), + identity=runtime.identity, + policy=policy, + entity=entity, + rationale=rationale, + agent_id=runtime.agent_id, + extra_extensions=extra_extensions, + ) + if explanation.cell == "chill": + return _tool_result( { - "mcp_idempotency_key": idempotency_key, - "mcp_idempotency_request_hash": idempotency_request_hash, - "mcp_cell": explanation.cell, + "outcome": "ACCEPTED_SELF", + "cell": "chill", + "seq": override_result.seq, + "note": "self-cleared; human reviews asynchronously", } - if idempotency_key is not None - else {"mcp_cell": explanation.cell} ) - if idempotency_key is not None and idempotency_request_hash is not None: - existing = _existing_idempotent_record( - runtime, idempotency_key, idempotency_request_hash - ) - if existing is not None: - return _tool_result( - _idempotent_override_response(existing.payload, existing.seq) - ) - if explanation.cell in ("chill", "coached"): - override_result = submit_override( - _engine(runtime), - identity=runtime.identity, - policy=policy, - entity=entity, - rationale=rationale, - agent_id=runtime.agent_id, - extra_extensions=extra_extensions, - ) - if explanation.cell == "chill": - return _tool_result( - { - "outcome": "ACCEPTED_SELF", - "cell": "chill", - "seq": override_result.seq, - "note": "self-cleared; human reviews asynchronously", - } - ) - return _tool_result( - _judged_result_payload( - cell="coached", - seq=override_result.seq, - accepted=override_result.accepted, - judge_model=override_result.judge_model, - judge_rationale=override_result.judge_rationale, - ) - ) - if explanation.cell == "structured": - signoff = request_signoff( - runtime.signoff_gate, - identity=runtime.identity, - 
policy=policy, - entity=entity, - rationale=rationale, - agent_id=runtime.agent_id, - extra_extensions=extra_extensions, - ) - return _tool_result( - { - "outcome": "ESCALATED_PENDING", - "cell": "structured", - "seq": signoff.seq, - "cleared": signoff.cleared, - "human_required": True, - "operator_instruction": ( - f"Human sign-off required for seq {signoff.seq}." - ), - "poll_tool": "signoff_status_get", - "poll_handle": signoff.seq, - } - ) - if explanation.cell == "protected": - missing = [ - item.to_payload() - for item in explanation.required_inputs - if not _optional_string(args, item.field) - ] - if missing: - return _tool_result( - { - "outcome": "NEED_INPUTS", - "cell": "protected", - "required_inputs": missing, - } - ) - protected = submit_protected_override( - runtime.protected_gate, - identity=runtime.identity, - policy=policy, - entity=entity, - rationale=rationale, - agent_id=runtime.agent_id, - file_fingerprint=_require(args, "file_fingerprint"), - ast_path=_require(args, "ast_path"), - source_root=runtime.source_root, - extra_extensions=extra_extensions, - ) - return _tool_result( - _judged_result_payload( - cell="protected", - seq=protected.seq, - accepted=protected.accepted, - judge_model=protected.judge_model, - judge_rationale=protected.judge_rationale, - ) - ) - raise NotEnabledError(f"unsupported policy cell {explanation.cell!r}") - - if name == "signoff_status_get": - seq = _require_int(args, "seq") - if runtime.signoff_gate is None: - raise NotEnabledError("structured cell not enabled") - request = runtime.signoff_gate.request_record(seq) - if request is None: - return _tool_error("NO_SUCH_REQUEST", f"no sign-off request at seq {seq}") - if not runtime.signoff_gate.is_cleared(seq): - return _tool_result({"cleared": False, "seq": seq}) - signed = _signoff_signed_record(runtime, seq) - payload: dict[str, Any] = {"cleared": True, "seq": seq} - if signed is not None: - payload["signed_by"] = signed.get("agent_id") - payload["signed_at"] = 
signed.get("recorded_at") - return _tool_result(payload) - - if name == "policy_evaluate": - ev = evaluate_policy( - _grammar(runtime), - engine=_engine(runtime), - policy=_require(args, "policy"), - target=_require_object(args, "target"), + return _tool_result( + _judged_result_payload( + cell="coached", + seq=override_result.seq, + accepted=override_result.accepted, + judge_model=override_result.judge_model, + judge_rationale=override_result.judge_rationale, ) + ) + if explanation.cell == "structured": + signoff = request_signoff( + runtime.signoff_gate, + identity=runtime.identity, + policy=policy, + entity=entity, + rationale=rationale, + agent_id=runtime.agent_id, + extra_extensions=extra_extensions, + ) + return _tool_result( + { + "outcome": "ESCALATED_PENDING", + "cell": "structured", + "seq": signoff.seq, + "cleared": signoff.cleared, + "human_required": True, + "operator_instruction": ( + f"Human sign-off required for seq {signoff.seq}." + ), + "poll_tool": "signoff_status_get", + "poll_handle": signoff.seq, + } + ) + if explanation.cell == "protected": + missing = [ + item.to_payload() + for item in explanation.required_inputs + if not _optional_string(args, item.field) + ] + if missing: return _tool_result( { - "outcome": ev.result.value, - "detail": ev.detail, - "provenance_gap": ev.provenance_gap, + "outcome": "NEED_INPUTS", + "cell": "protected", + "required_inputs": missing, } ) + protected = submit_protected_override( + runtime.protected_gate, + identity=runtime.identity, + policy=policy, + entity=entity, + rationale=rationale, + agent_id=runtime.agent_id, + file_fingerprint=_require(args, "file_fingerprint"), + ast_path=_require(args, "ast_path"), + source_root=runtime.source_root, + extra_extensions=extra_extensions, + ) + return _tool_result( + _judged_result_payload( + cell="protected", + seq=protected.seq, + accepted=protected.accepted, + judge_model=protected.judge_model, + judge_rationale=protected.judge_rationale, + ) + ) + raise 
NotEnabledError(f"unsupported policy cell {explanation.cell!r}") + + +def _tool_signoff_status_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + seq = _require_int(args, "seq") + if runtime.signoff_gate is None: + raise NotEnabledError("structured cell not enabled") + request = runtime.signoff_gate.request_record(seq) + if request is None: + return _tool_error("NO_SUCH_REQUEST", f"no sign-off request at seq {seq}") + if not runtime.signoff_gate.is_cleared(seq): + return _tool_result({"cleared": False, "seq": seq}) + signed = _signoff_signed_record(runtime, seq) + payload: dict[str, Any] = {"cleared": True, "seq": seq} + if signed is not None: + payload["signed_by"] = signed.get("agent_id") + payload["signed_at"] = signed.get("recorded_at") + return _tool_result(payload) + + +def _tool_policy_evaluate(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + ev = evaluate_policy( + _grammar(runtime), + engine=_engine(runtime), + policy=_require(args, "policy"), + target=_require_object(args, "target"), + ) + return _tool_result( + { + "outcome": ev.result.value, + "detail": ev.detail, + "provenance_gap": ev.provenance_gap, + } + ) - if name == "scan_route": - server_cell = os.environ.get("LEGIS_WARDLINE_CELL") - server_cell_by_severity = os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY") - if server_cell and server_cell_by_severity: - return _tool_error( - "INVALID_CELL_SPEC", "server Wardline routing is misconfigured" - ) - has_cell = "cell" in args - has_map = "severity_map" in args - has_fail_on = "fail_on" in args - server_routing = server_cell is not None or server_cell_by_severity is not None - if server_routing and (has_cell or has_map or has_fail_on): - return _tool_error( - "INVALID_CELL_SPEC", "Wardline routing is server-owned" + +def _tool_scan_route(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + # "severity_map" must be an object if present (transport-type check); the + # governance decision — is request routing 
allowed, and is the spec + # well-formed? — lives in resolve_scan_routing, shared with the HTTP adapter. + # A WardlineRoutingError propagates to call_tool's translator → INVALID_CELL_SPEC. + request_severity_map = ( + _require_object(args, "severity_map") if "severity_map" in args else None + ) + routing = resolve_scan_routing( + server_cell=os.environ.get("LEGIS_WARDLINE_CELL"), + server_cell_by_severity=os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY"), + request_cell=args.get("cell"), + request_severity_map=request_severity_map, + request_fail_on=args.get("fail_on"), + allow_request_routing=( + os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") == "1" + ), + ) + scan = _require_object(args, "scan") + try: + routed = route_wardline_scan( + scan, + agent_id=runtime.agent_id, + identity=runtime.identity, + engine=_engine(runtime), + signoff=runtime.signoff_gate, + policy=routing.policy, + cell_map=routing.cell_map, + fail_on=routing.fail_on, + artifact_key=( + runtime.wardline_artifact_key + or ( + os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") + if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") + else None ) - if not server_routing: - if os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") != "1": - return _tool_error( - "INVALID_CELL_SPEC", - "Wardline routing is server-owned; configure " - "LEGIS_WARDLINE_CELL or LEGIS_WARDLINE_CELL_BY_SEVERITY", - ) - if has_fail_on: - if not has_cell or has_map: - return _tool_error( - "INVALID_CELL_SPEC", - "fail_on routing requires cell and forbids severity_map", - ) - elif has_cell == has_map: - return _tool_error( - "INVALID_CELL_SPEC", - "provide exactly one of cell or severity_map", - ) - scan = _require_object(args, "scan") - scan_policy: WardlineCellPolicy | None = None - scan_cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None - scan_fail_on: WardlineSeverity | None = None - try: - if server_cell_by_severity is not None: - scan_cell_map = 
_parse_wardline_cell_map(server_cell_by_severity) - elif server_cell is not None: - scan_policy = WardlineCellPolicy(server_cell) - elif has_cell: - scan_policy = WardlineCellPolicy(_require(args, "cell")) - if has_fail_on: - scan_fail_on = WardlineSeverity[_require(args, "fail_on")] - else: - raw_map = _require_object(args, "severity_map") - scan_cell_map = { - WardlineSeverity[severity]: WardlineCellPolicy(cell) - for severity, cell in raw_map.items() - } - except (KeyError, ValueError) as exc: - return _tool_error("INVALID_CELL_SPEC", str(exc)) - routed = route_wardline_scan( - scan, - agent_id=runtime.agent_id, - identity=runtime.identity, - engine=_engine(runtime), - signoff=runtime.signoff_gate, - policy=scan_policy, - cell_map=scan_cell_map, - fail_on=scan_fail_on, - artifact_key=( - runtime.wardline_artifact_key - or ( - os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") - if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") - else None - ) - ), - ) - return _tool_result({"outcome": "ROUTED", "routed": routed}) + ), + allow_dirty=( + runtime.wardline_allow_dirty + or os.environ.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1" + ), + ) + except WardlineDirtyTreeError as exc: + # Amber, not red (INVALID_ARGUMENT): a dirty dev tree is "environment + # not ready", not a broken/tampered scan. A typed outcome lets a harness + # tell "commit first" apart from a genuine legis/scan fault; nothing is + # governed. 
+ return _tool_result( + {"outcome": exc.reason, "routed": [], "detail": str(exc)} + ) + return _tool_result({"outcome": ScanOutcome.ROUTED, "routed": routed}) - if name == "git_branch_list": - return _tool_result( - {"branches": [asdict(branch) for branch in _git(runtime).branches()]} - ) - if name == "git_commit_get": - return _tool_result( - {"commit": asdict(_git(runtime).commit(_require(args, "sha")))} - ) +def _tool_git_branch_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + return _tool_result( + {"branches": [asdict(branch) for branch in _git(runtime).branches()]} + ) - if name == "git_rename_list": - return _tool_result( - { - "renames": [ - asdict(rename) - for rename in _git(runtime).renames(_require(args, "rev_range")) - ] - } - ) - if name == "git_rename_feed_get": - from legis.git.rename_feed import build_rename_feed +def _tool_git_commit_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + return _tool_result( + {"commit": asdict(_git(runtime).commit(_require(args, "sha")))} + ) - return _tool_result( - build_rename_feed( - runtime.source_root or os.getcwd(), - base=_require(args, "base"), - head=args.get("head", "HEAD"), - include_worktree=bool(args.get("include_worktree", False)), - ) - ) - if name == "filigree_closure_gate_get": - from legis.governance.filigree_gate import evaluate_issue_closure +def _tool_git_rename_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + return _tool_result( + { + "renames": [ + asdict(rename) + for rename in _git(runtime).renames(_require(args, "rev_range")) + ] + } + ) - if runtime.binding_ledger is None: - raise NotEnabledError("binding ledger not enabled") - return _tool_result( - evaluate_issue_closure(runtime.binding_ledger, issue_id=_require(args, "issue_id")) - ) - if name == "pull_request_get": - number = _require_int(args, "number") - pull = _pulls(runtime).get(number) - if pull is None: - return _tool_error("NOT_FOUND", f"unknown PR: {number}") - 
pull_payload = asdict(pull) - pull_payload["state"] = pull.state.value - pull_checks = ( - _checks(runtime).for_pr(number) - if runtime.check_surface is not None - else [] - ) - pull_payload["checks"] = [_check_to_dict(run) for run in pull_checks] - return _tool_result(pull_payload) - - if name == "check_list": - check_surface = _checks(runtime) - target_type = _require(args, "target_type") - target = _require(args, "target") - if target_type == "commit": - checks = check_surface.for_commit(target) - response_target: str | int = target - elif target_type == "branch": - checks = check_surface.for_branch(target) - response_target = target - elif target_type == "pr": - try: - pr_number = int(target) - except ValueError as exc: - raise InvalidArgumentError( - "target_type 'pr' requires an integer target" - ) from exc - checks = check_surface.for_pr(pr_number) - response_target = pr_number - else: - raise InvalidArgumentError( - "target_type must be one of: commit, branch, pr" - ) - return _tool_result( - { - "target_type": target_type, - "target": response_target, - "checks": [_check_to_dict(run) for run in checks], - } - ) +def _tool_git_rename_feed_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + from legis.git.rename_feed import build_rename_feed + + return _tool_result( + build_rename_feed( + runtime.source_root or os.getcwd(), + base=_require(args, "base"), + head=args.get("head", "HEAD"), + include_worktree=bool(args.get("include_worktree", False)), + ) + ) - if name == "override_rate_get": - rate = compute_override_rate(_verified_records(runtime)) - return _tool_result( - { - "status": rate.status.value, - "rate": rate.rate, - "sample_size": rate.sample_size, - "note": _OVERRIDE_RATE_NOTE, - } - ) - return _tool_error("UNKNOWN_TOOL", f"unknown tool: {name}") +def _tool_filigree_closure_gate_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + from legis.governance.filigree_gate import evaluate_issue_closure + + if 
runtime.binding_ledger is None: + raise NotEnabledError("binding ledger not enabled") + return _tool_result( + evaluate_issue_closure(runtime.binding_ledger, issue_id=_require(args, "issue_id")) + ) + + +def _tool_pull_request_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + number = _require_int(args, "number") + pull = _pulls(runtime).get(number) + if pull is None: + return _tool_error("NOT_FOUND", f"unknown PR: {number}") + pull_payload = asdict(pull) + pull_payload["state"] = pull.state.value + # Build the check surface unconditionally — `_checks()` lazily initialises it + # from LEGIS_CHECK_DB. Guarding on `runtime.check_surface is not None` made the + # result call-order-dependent: a fresh runtime (build_runtime sets it to None) + # reported no checks until some other tool happened to initialise the surface + # first, so an agent could be told a PR is clean when checks exist and fail. + pull_checks = _checks(runtime).for_pr(number) + pull_payload["checks"] = [_check_to_dict(run) for run in pull_checks] + return _tool_result(pull_payload) + + +def _tool_check_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + check_surface = _checks(runtime) + target_type = _require(args, "target_type") + target = _require(args, "target") + if target_type == "commit": + checks = check_surface.for_commit(target) + response_target: str | int = target + elif target_type == "branch": + checks = check_surface.for_branch(target) + response_target = target + elif target_type == "pr": + try: + pr_number = int(target) + except ValueError as exc: + raise InvalidArgumentError( + "target_type 'pr' requires an integer target" + ) from exc + checks = check_surface.for_pr(pr_number) + response_target = pr_number + else: + raise InvalidArgumentError( + "target_type must be one of: commit, branch, pr" + ) + return _tool_result( + { + "target_type": target_type, + "target": response_target, + "checks": [_check_to_dict(run) for run in checks], + } + ) + + 
+def _tool_override_rate_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + rate = compute_override_rate(_verified_records(runtime)) + return _tool_result( + { + "status": rate.status.value, + "rate": rate.rate, + "sample_size": rate.sample_size, + "note": _OVERRIDE_RATE_NOTE, + } + ) + + +_TOOL_HANDLERS: dict[str, Callable[["McpRuntime", dict[str, Any]], dict[str, Any]]] = { + "policy_explain": _tool_policy_explain, + "override_submit": _tool_override_submit, + "signoff_status_get": _tool_signoff_status_get, + "policy_evaluate": _tool_policy_evaluate, + "scan_route": _tool_scan_route, + "git_branch_list": _tool_git_branch_list, + "git_commit_get": _tool_git_commit_get, + "git_rename_list": _tool_git_rename_list, + "git_rename_feed_get": _tool_git_rename_feed_get, + "filigree_closure_gate_get": _tool_filigree_closure_gate_get, + "pull_request_get": _tool_pull_request_get, + "check_list": _tool_check_list, + "override_rate_get": _tool_override_rate_get, +} + + +def call_tool(runtime: McpRuntime, name: str, args: dict[str, Any]) -> dict[str, Any]: + try: + _validate_argument_keys(name, args) + handler = _TOOL_HANDLERS.get(name) + if handler is None: + return _tool_error("UNKNOWN_TOOL", f"unknown tool: {name}") + return handler(runtime, args) except Exception as exc: return _service_error(exc) @@ -1070,17 +1128,15 @@ def handle_request(request: dict[str, Any], runtime: McpRuntime) -> dict[str, An "error": {"code": -32602, "message": "initialize params must be an object"}, } requested = params.get("protocolVersion") - if requested is not None and requested not in _SUPPORTED_PROTOCOL_VERSIONS: - return { - "jsonrpc": "2.0", - "id": request_id, - "error": { - "code": -32602, - "message": f"unsupported protocolVersion: {requested}", - "data": {"supported": list(_SUPPORTED_PROTOCOL_VERSIONS)}, - }, - } - runtime.protocol_version = requested or _DEFAULT_PROTOCOL_VERSION + if requested in _SUPPORTED_PROTOCOL_VERSIONS: + runtime.protocol_version = requested + 
else: + # MCP spec: when the client requests a protocolVersion the server + # does not support (or omits it), the server responds with a version + # it does support and lets the client decide whether to proceed — + # it must not hard-error. Hard-erroring here made newer clients + # (e.g. those negotiating 2025-06-18) fail to connect entirely. + runtime.protocol_version = _DEFAULT_PROTOCOL_VERSION runtime.initialized = True result = { "protocolVersion": runtime.protocol_version, @@ -1113,8 +1169,61 @@ def handle_request(request: dict[str, Any], runtime: McpRuntime) -> dict[str, An return {"jsonrpc": "2.0", "id": request_id, "result": result} +def _read_bounded_line(stream: TextIO, max_bytes: int) -> tuple[str, bool]: + """Read one newline-terminated record, bounded to ``max_bytes`` UTF-8 bytes. + + Returns ``(line, overflow)``. ``overflow`` is True when the record exceeded + the bound. ``readline(max_bytes + 1)`` caps the *character* read — a decoded + ``str`` holds at most 4 bytes per char, so this keeps the in-memory read + bounded — and is the cheap first gate: a record longer than the cap in + characters comes back without a trailing newline, so its physical remainder + is drained to the next newline to keep framing aligned. A record that fits in + characters but whose UTF-8 encoding still exceeds ``max_bytes`` (multibyte + content) is rejected too, so the limit means bytes as its name promises. + Returns ``("", False)`` at EOF. + """ + line = stream.readline(max_bytes + 1) + if line == "": + return "", False + if len(line) > max_bytes and not line.endswith("\n"): + # Truncated mid-record at the character cap: drain the rest of the + # physical line so the next read starts on a record boundary. 
+ while True: + extra = stream.readline(max_bytes + 1) + if extra == "" or extra.endswith("\n"): + break + return line, True + if len(line.encode("utf-8")) > max_bytes: + # Complete record (newline-terminated, or the final EOF record with no + # trailing newline) but over the byte budget; framing is already aligned + # — nothing follows the read — so no drain is needed. + return line, True + return line, False + + def run_jsonrpc(input_stream: TextIO, output_stream: TextIO, runtime: McpRuntime) -> None: - for line in input_stream: + max_bytes = _max_request_bytes() + while True: + line, overflow = _read_bounded_line(input_stream, max_bytes) + if not line: + break # EOF + if overflow: + output_stream.write( + json.dumps( + { + "jsonrpc": "2.0", + "id": None, + "error": { + "code": -32700, + "message": f"request exceeds maximum size of {max_bytes} bytes", + }, + }, + separators=(",", ":"), + ) + + "\n" + ) + output_stream.flush() + continue if not line.strip(): continue try: diff --git a/src/legis/policy/boundary_scan.py b/src/legis/policy/boundary_scan.py index fa44cee..38cd505 100644 --- a/src/legis/policy/boundary_scan.py +++ b/src/legis/policy/boundary_scan.py @@ -3,13 +3,11 @@ from __future__ import annotations import ast -import textwrap from dataclasses import asdict, dataclass from pathlib import Path from typing import Any, cast -from legis.canonical import content_hash -from legis.policy.decorator import get_normalized_ast_str +from legis.policy.decorator import fingerprint_source from legis.policy.evidence import evaluate_test_evidence @@ -154,9 +152,10 @@ def _visit_function(self, node: ast.FunctionDef | ast.AsyncFunctionDef) -> None: test_source, test_node = test_result test_segment = ast.get_source_segment(test_source, test_node) or "" - actual_fingerprint = content_hash( - get_normalized_ast_str(textwrap.dedent(test_segment)) - ) + # Same canonicalization the runtime honesty gate uses — CRLF/dedent + # normalization and a decorator-insensitive AST hash 
— so the two + # paths cannot diverge for a decorated / class-method test_ref (Q-L5). + actual_fingerprint = fingerprint_source(test_segment) if actual_fingerprint != test_fingerprint: self._add( "POLICY_BOUNDARY_TEST_FINGERPRINT_MISMATCH", diff --git a/src/legis/policy/decorator.py b/src/legis/policy/decorator.py index aa32c14..fdf19d8 100644 --- a/src/legis/policy/decorator.py +++ b/src/legis/policy/decorator.py @@ -104,16 +104,44 @@ def wrapper(*args: Any, **kwargs: Any) -> Any: def get_normalized_ast_str(source: str) -> str: import ast parsed = ast.parse(source) - # Strip docstrings for node in ast.walk(parsed): + # Strip docstrings. if isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.Module)): if node.body and isinstance(node.body[0], ast.Expr): val = node.body[0].value if isinstance(val, ast.Constant) and isinstance(val.value, str): node.body.pop(0) + # Strip decorators so the fingerprint does not depend on whether the + # extracted source carried the decorator lines. The runtime gate reads + # the test via inspect.getsource (decorators INCLUDED); the static + # scanner reads it via ast.get_source_segment of the FunctionDef + # (decorators EXCLUDED). Without this, a decorated or class-method + # test_ref fingerprints differently on each path (Q-L5). + if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)): + node.decorator_list = [] return ast.dump(parsed) +def fingerprint_source(source: str) -> str: + """The single canonicalization both fingerprint paths share (Q-L5). + + Normalizes platform line endings (CRLF->LF) and indentation, then hashes the + docstring- and decorator-stripped AST. Falls back to hashing the normalized + source text when it cannot be parsed (e.g. an extracted fragment). The + runtime honesty gate (``fingerprint``) and the static scanner + (``boundary_scan``) MUST both route through here so they can never compute + divergent fingerprints for the same referenced test. 
+ """ + import textwrap + + source = source.replace("\r\n", "\n") + source = textwrap.dedent(source) + try: + return content_hash(get_normalized_ast_str(source)) + except Exception: + return content_hash(source) + + def fingerprint(test_fn: Callable[..., Any]) -> str: """Content hash of a test function's source — the gate's anti-vibe teeth. @@ -126,16 +154,9 @@ def fingerprint(test_fn: Callable[..., Any]) -> str: except (OSError, TypeError) as exc: raise OSError(f"Source code not available for test: {exc}") from exc - # Normalize CRLF to LF to handle platform line ending differences - source = source.replace("\r\n", "\n") - - try: - import textwrap - source = textwrap.dedent(source) - normalized = get_normalized_ast_str(source) - return content_hash(normalized) - except Exception: - return content_hash(source) + # Route through the shared canonicalization the static scanner also uses, so + # the two paths cannot diverge (Q-L5). + return fingerprint_source(source) @dataclass(frozen=True) diff --git a/src/legis/policy/grammar.py b/src/legis/policy/grammar.py index 0517928..7b654f9 100644 --- a/src/legis/policy/grammar.py +++ b/src/legis/policy/grammar.py @@ -12,7 +12,7 @@ from __future__ import annotations -from collections.abc import Hashable, Mapping +from collections.abc import Mapping from dataclasses import dataclass from enum import Enum from typing import Any, Protocol, runtime_checkable diff --git a/src/legis/provenance.py b/src/legis/provenance.py new file mode 100644 index 0000000..4be22af --- /dev/null +++ b/src/legis/provenance.py @@ -0,0 +1,27 @@ +"""Provenance vocabulary shared by recorded forge/CI facts. + +``CheckRun`` and ``PullRequest`` are both *writer-supplied claims* — legis +records what a writer asserted, not what a forge cryptographically attested. The +provenance axis names how far that claim is backed. Today there is exactly one +member; an authenticated path (e.g. 
a signed forge webhook) would add a stronger +value here rather than as another hand-typed string literal. + +This is the single vocabulary source for both ``checks`` and ``pulls``; neither +package imports the other, so the enum lives at the package root they share. The +field stays typed ``str`` on the wire-facing dataclasses (matching the +``Suppressed`` precedent in the rc-series str,Enum conversion): a ``str,Enum`` +member *is* its wire string, so ``json.dumps`` / ``canonical_json`` emit +byte-identical payloads, and raw values read back out of the ``Text`` DB columns +never need coercion that could raise on a legacy/unexpected value. +""" + +from __future__ import annotations + +from enum import Enum + + +class Provenance(str, Enum): + """How far a recorded forge/CI claim is backed.""" + + # A writer-asserted fact with no signature or forge attestation behind it. + UNAUTHENTICATED = "unauthenticated" diff --git a/src/legis/pulls/models.py b/src/legis/pulls/models.py index 7141742..bba946f 100644 --- a/src/legis/pulls/models.py +++ b/src/legis/pulls/models.py @@ -5,6 +5,8 @@ from dataclasses import dataclass from enum import Enum +from legis.provenance import Provenance + class PullRequestState(str, Enum): OPEN = "open" @@ -24,4 +26,4 @@ class PullRequest: # Q-M4: recorded PR metadata is a writer-supplied claim, not forge-verified. # "unauthenticated" so a consumer never treats writer-asserted PR state as # authoritative (see CheckRun.provenance). 
- provenance: str = "unauthenticated" + provenance: str = Provenance.UNAUTHENTICATED diff --git a/src/legis/pulls/surface.py b/src/legis/pulls/surface.py index 7c17eb6..3e883de 100644 --- a/src/legis/pulls/surface.py +++ b/src/legis/pulls/surface.py @@ -5,11 +5,15 @@ from sqlalchemy import Column, Integer, MetaData, String, Table, Text, create_engine, delete, insert, select from sqlalchemy.pool import NullPool +from legis.provenance import Provenance from legis.pulls.models import PullRequest, PullRequestState class PullSurface: def __init__(self, db_url: str) -> None: + from legis.config import ensure_sqlite_parent + + ensure_sqlite_parent(db_url) self._engine = create_engine(db_url, future=True, poolclass=NullPool) self._md = MetaData() self._pulls = Table( @@ -69,5 +73,5 @@ def get(self, number: int) -> PullRequest | None: state=PullRequestState(row.state), url=row.url, recorded_by=row.recorded_by, - provenance=row.provenance or "unauthenticated", + provenance=row.provenance or Provenance.UNAUTHENTICATED, ) diff --git a/src/legis/service/errors.py b/src/legis/service/errors.py index 0b952e2..94065d3 100644 --- a/src/legis/service/errors.py +++ b/src/legis/service/errors.py @@ -28,6 +28,25 @@ class InvalidArgumentError(ServiceError): """Caller input is structurally valid for the transport but invalid for Legis.""" +class WardlineRoutingError(ServiceError): + """A Wardline scan-routing request is not permitted or is malformed. + + Carries a ``kind`` discriminator so each adapter can preserve its own + taxonomy without re-implementing the decision: the HTTP adapter maps + ``server_misconfigured`` → 500, ``server_owned`` → 403, ``malformed`` → 422, + while the MCP adapter collapses all three to ``INVALID_CELL_SPEC``. Adapters + switch on the ``kind`` attribute, never on message text. 
+ """ + + SERVER_MISCONFIGURED = "server_misconfigured" + SERVER_OWNED = "server_owned" + MALFORMED = "malformed" + + def __init__(self, kind: str, message: str) -> None: + super().__init__(message) + self.kind = kind + + class ProtectedKeyRequiredError(ServiceError): """A protected trail was read without the HMAC key needed to verify it. diff --git a/src/legis/service/governance.py b/src/legis/service/governance.py index 2fc1582..24f2747 100644 --- a/src/legis/service/governance.py +++ b/src/legis/service/governance.py @@ -51,20 +51,15 @@ def resolve_for_record( res = identity.resolve(locator) ext: dict = {} if res.alive is not None: - identity_status = getattr( - res, "identity_resolution_status", "resolved" if res.alive else "not_alive" - ) - lineage_status = getattr( - res, - "lineage_snapshot_status", - "verified" if res.lineage_snapshot is not None else "not_applicable", - ) + # Both status axes are mandatory str,Enum fields on IdentityResolution now, + # so read them directly — the old getattr fallbacks guarded a shape the + # type no longer permits. The members serialize as their bare strings. ext["loomweave"] = { "alive": res.alive, "content_hash": res.content_hash, "lineage_snapshot": res.lineage_snapshot, - "identity_resolution_status": identity_status, - "lineage_snapshot_status": lineage_status, + "identity_resolution_status": res.identity_resolution_status, + "lineage_snapshot_status": res.lineage_snapshot_status, } return res.entity_key, ext @@ -89,6 +84,23 @@ def verified_records( owner exposing ``records()`` / ``verify_integrity()`` and a verifier exposing ``verify()``) so the service layer is not coupled to the enforcement concrete types. + + Cost note (rc4 review #7): this verifies the *whole* trail on every call — + ``verify_integrity()`` re-hashes the chain (O(N)) and ``trail_verifier.verify`` + re-checks signatures (O(N)) — including on interactive paths (the keyed + override-submit idempotency check and every override-rate read). 
That cost is + the tamper-evidence property, not an oversight: there is no load-time or + open-time verification anywhere (``AuditStore.__init__`` only creates the + schema), so this path is the only thing standing between a tampered record and + an interactive read. Two tempting optimizations are deliberately NOT taken: + reserving full verification for the explicit governance-gate would leave every + interactive read unverified (a silent tamper window); and incremental + verification (trusting a cached last-verified prefix and re-hashing only the + new tail) cannot detect out-of-band tampering of an already-verified record — + exactly what the hash chain exists to catch — and still would not reach O(1), + because the signature pass is O(N) regardless. If trail size ever makes this + latency-bound, the honest lever is trail retention/compaction, not narrowing + what each read verifies. """ if trail_owner is not None: records = trail_owner.records() @@ -119,14 +131,30 @@ def compute_override_rate(records: list): def _requires_protected_verification(payload: dict[str, Any], protected_policies) -> bool: + """Gate-local protected-detection for the KEYLESS branch of the override-rate + gate: would refusing to score this record be right because it genuinely needs + a signature we have no key to check? + + The discriminator is *status-claim vs incidental metadata*. The markers kept + below — ``protected_cell`` and the signature keys — are a record purporting to + BE protected, so failing closed on them in a keyless deployment is correct + even if injected. ``file_fingerprint`` / ``ast_path`` carry no such claim: + they are ordinary metadata, and the simple-tier engine accepts an arbitrary + ``extensions`` dict, so they can ride on a never-signed chill/coached record — + flagging them would fail-close a non-protected deployment on a record that has + nothing to verify. That over-reach is why those two sniffs are dropped here. 
+ + Intentionally NARROWER than ``TrailVerifier._requires_verification`` (the + verify path, which must stay over-inclusive): the two answer different + questions — keyless "must I refuse to score this?" vs with-key "must this be + signed?" — so do NOT re-merge them. + """ ext = payload.get("extensions", {}) or {} return ( payload.get("policy") in protected_policies or ext.get("protected_cell") is True or "judge_metadata_signature" in ext or "signoff_signature" in ext - or "file_fingerprint" in ext - or "ast_path" in ext ) diff --git a/src/legis/service/wardline.py b/src/legis/service/wardline.py index cb86e9e..33c0aef 100644 --- a/src/legis/service/wardline.py +++ b/src/legis/service/wardline.py @@ -3,6 +3,7 @@ from __future__ import annotations from collections.abc import Mapping +from dataclasses import dataclass from typing import Any from legis.canonical import content_hash @@ -10,6 +11,7 @@ from legis.enforcement.signoff import SignoffGate from legis.identity.entity_key import EntityKey from legis.identity.resolver import IdentityResolver +from legis.service.errors import WardlineRoutingError from legis.service.governance import resolve_for_record from legis.wardline.governor import WardlineCellPolicy, route_findings from legis.wardline.ingest import ( @@ -21,6 +23,136 @@ from legis.wardline.policy import resolve_cell +@dataclass(frozen=True) +class ResolvedRouting: + """The resolved Wardline routing intent for a single scan. + + Exactly one of ``policy`` / ``cell_map`` is set unless ``fail_on`` is given + (then ``policy`` is the gate cell and per-finding resolution happens inside + ``route_wardline_scan``). ``cells`` is the set of cells that may actually run + — an adapter uses it to decide whether the governance engine is needed. 
+ """ + + policy: WardlineCellPolicy | None + cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None + fail_on: WardlineSeverity | None + cells: frozenset[WardlineCellPolicy] + + +def _parse_cell_map_env(raw: str) -> dict[WardlineSeverity, WardlineCellPolicy]: + mapping: dict[WardlineSeverity, WardlineCellPolicy] = {} + for part in raw.split(","): + if not part.strip(): + continue + severity_raw, sep, cell_raw = part.partition("=") + if not sep: + raise ValueError("cell map entries must be SEVERITY=cell") + mapping[WardlineSeverity[severity_raw.strip()]] = WardlineCellPolicy( + cell_raw.strip() + ) + if not mapping: + raise ValueError("cell map must not be empty") + return mapping + + +def resolve_scan_routing( + *, + server_cell: str | None, + server_cell_by_severity: str | None, + request_cell: str | None, + request_severity_map: dict[str, str] | None, + request_fail_on: str | None, + allow_request_routing: bool, +) -> ResolvedRouting: + """Resolve a scan-routing request to a ``ResolvedRouting`` or reject it. + + This is the single home for the governance decision the two transports used + to hand-copy: *is request-side routing allowed, and is the cell-spec + well-formed?* The caller passes already-read server-config values (env stays + in the adapter) plus the normalized request fields; every rejection is a + ``WardlineRoutingError`` whose ``kind`` the adapter maps to its own taxonomy. + + Routing is server-owned by default: a deployment pins the cell(s) via env and + callers may not override. ``allow_request_routing`` (the + ``LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING`` opt-in) is the only path to a + caller-supplied spec. Check order is part of the contract: + misconfigured → server-owned → malformed. 
+ """ + if server_cell is not None and server_cell_by_severity is not None: + raise WardlineRoutingError( + WardlineRoutingError.SERVER_MISCONFIGURED, + "server Wardline routing is misconfigured", + ) + server_routing = server_cell is not None or server_cell_by_severity is not None + request_routing = ( + request_cell is not None + or request_severity_map is not None + or request_fail_on is not None + ) + if server_routing: + if request_routing: + raise WardlineRoutingError( + WardlineRoutingError.SERVER_OWNED, "Wardline routing is server-owned" + ) + else: + if not allow_request_routing: + raise WardlineRoutingError( + WardlineRoutingError.SERVER_OWNED, + "Wardline routing is server-owned; configure LEGIS_WARDLINE_CELL " + "or LEGIS_WARDLINE_CELL_BY_SEVERITY", + ) + if request_fail_on is not None: + if request_cell is None or request_severity_map is not None: + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, + "fail_on routing requires cell and forbids a per-severity map", + ) + elif (request_cell is None) == (request_severity_map is None): + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, + "provide exactly one of cell or a per-severity map", + ) + if request_severity_map is not None and not request_severity_map: + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, "per-severity map must not be empty" + ) + + policy: WardlineCellPolicy | None = None + cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None + fail_on: WardlineSeverity | None = None + try: + if server_cell_by_severity is not None: + cell_map = _parse_cell_map_env(server_cell_by_severity) + elif server_cell is not None: + policy = WardlineCellPolicy(server_cell) + elif request_severity_map is not None: + cell_map = { + WardlineSeverity[sev]: WardlineCellPolicy(cell) + for sev, cell in request_severity_map.items() + } + else: + policy = WardlineCellPolicy(request_cell) # type: ignore[arg-type] + if request_fail_on is not None: + fail_on = 
WardlineSeverity[request_fail_on] + except (KeyError, ValueError) as exc: + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, f"unknown cell/severity: {exc}" + ) from exc + + if fail_on is not None: + cells = {policy, WardlineCellPolicy.SURFACE_ONLY} + elif cell_map is not None: + cells = set(cell_map.values()) + else: + cells = {policy} + return ResolvedRouting( + policy=policy, + cell_map=cell_map, + fail_on=fail_on, + cells=frozenset(c for c in cells if c is not None), + ) + + def route_wardline_scan( scan: Mapping[str, Any], *, @@ -32,8 +164,11 @@ def route_wardline_scan( cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None, fail_on: WardlineSeverity | None = None, artifact_key: bytes | None = None, + allow_dirty: bool = False, ) -> list[dict[str, Any]]: - artifact_provenance = verify_wardline_artifact(scan, artifact_key) + artifact_provenance = verify_wardline_artifact( + scan, artifact_key, allow_dirty=allow_dirty + ) findings = active_defects(scan) def resolve(qualname: str | None) -> tuple[EntityKey, dict[str, Any]]: diff --git a/src/legis/store/audit_store.py b/src/legis/store/audit_store.py index c17b623..c999ddc 100644 --- a/src/legis/store/audit_store.py +++ b/src/legis/store/audit_store.py @@ -16,6 +16,7 @@ import hashlib import json +import logging import threading from collections.abc import Iterator from contextlib import contextmanager @@ -37,9 +38,51 @@ from legis.canonical import canonical_json, content_hash +logger = logging.getLogger(__name__) + GENESIS = "0" * 64 +def _apply_sqlite_pragmas(dbapi_connection: Any, url: str) -> None: + """Apply the durability/concurrency PRAGMAs to a freshly-opened connection. + + Best-effort: a PRAGMA failure must not break connection setup (the store is + still usable without WAL), but it must NOT vanish silently either. Two + distinct failure channels are surfaced: + + * An exception while issuing a PRAGMA → logged with ``exc_info``. 
+ * WAL silently not taking effect → ``PRAGMA journal_mode=WAL`` does *not* + raise when WAL is unavailable (read-only mount, some network filesystems, + in-memory DBs); it returns the journal mode actually in force. The old + ``except Exception: pass`` never caught this most-likely case, so the + connection ran without WAL and the symptom surfaced much later as an + opaque "database is locked" under concurrency. Detect and log it here. + """ + cursor = dbapi_connection.cursor() + try: + journal_row = cursor.execute("PRAGMA journal_mode=WAL").fetchone() + cursor.execute("PRAGMA synchronous=NORMAL") + cursor.execute("PRAGMA busy_timeout=5000") + journal_mode = journal_row[0] if journal_row else None + if journal_mode is not None and str(journal_mode).lower() != "wal": + logger.warning( + "audit store SQLite did not enter WAL mode (journal_mode=%r, " + "url=%s); concurrent appends may surface as opaque 'database is " + "locked' errors instead of waiting", + journal_mode, + url, + ) + except Exception: # noqa: BLE001 (PRAGMA failure must not break connect) + logger.warning( + "audit store failed to apply SQLite PRAGMAs (url=%s); connection " + "falls back to defaults (no WAL / default busy_timeout)", + url, + exc_info=True, + ) + finally: + cursor.close() + + @dataclass(frozen=True) class AuditRecord: seq: int @@ -55,6 +98,11 @@ def _chain(prev_hash: str, c_hash: str) -> str: class AuditStore: def __init__(self, url: str) -> None: + # The federated store subtree (.weft/legis) is created lazily, here at + # open time — SQLite makes the .db file but never its parent directory. + from legis.config import ensure_sqlite_parent + + ensure_sqlite_parent(url) # NullPool: hold no connection between operations — an append-only # audit store wants no lingering locks and clean resource lifecycle. 
self._engine = create_engine(url, future=True, poolclass=NullPool) @@ -68,17 +116,7 @@ def __init__(self, url: str) -> None: @event.listens_for(self._engine, "connect") def set_sqlite_pragma(dbapi_connection, connection_record): if "sqlite" in url: - cursor = dbapi_connection.cursor() - try: - cursor.execute("PRAGMA journal_mode=WAL") - cursor.execute("PRAGMA synchronous=NORMAL") - cursor.execute("PRAGMA busy_timeout=5000") - except Exception: - pass - finally: - cursor.close() - - # Remove the global force_immediate_transaction event listener to prevent locking on read-only queries. + _apply_sqlite_pragmas(dbapi_connection, url) self._md = MetaData() self._log = Table( @@ -122,13 +160,16 @@ def transaction(self) -> Iterator[None]: connection thread-locally; nested ``transaction()`` calls reuse the outer one. - Appends only. ``read_all`` / ``read_by_seq`` / ``verify_integrity`` open - their own connection via ``self._engine.begin()`` — they will NOT see - this batch's uncommitted appends, and on SQLite a read connection can - hit ``SQLITE_BUSY`` against the held ``BEGIN IMMEDIATE`` write lock. Do - all reads before entering the context (as ``wardline.governor`` does: it - resolves every entity before opening the batch). Only ``append``'s own - chain-head read is safe here, because it runs on the ambient connection. + Appends only, and now *enforced*: ``read_all`` / ``read_by_seq`` / + ``verify_integrity`` / ``get_latest_sequence_and_hash`` open their own + connection via ``self._engine.begin()``, so a read issued inside this + context would not see the batch's uncommitted appends and on SQLite would + hit ``SQLITE_BUSY`` against the held ``BEGIN IMMEDIATE`` write lock. Each + guards on the thread-local and raises ``RuntimeError`` rather than + contending silently (``_assert_no_batch_in_progress``). Do all reads + before entering the context (as ``wardline.governor`` does: it resolves + every entity before opening the batch). 
Only ``append``'s own chain-head + read is safe here, because it runs on the ambient connection. """ if getattr(self._txn, "conn", None) is not None: # Already inside a batch on this thread — reuse it (nested no-op). @@ -143,6 +184,26 @@ def transaction(self) -> Iterator[None]: finally: self._txn.conn = None + def _assert_no_batch_in_progress(self, method: str) -> None: + """Fail loudly if a fresh-connection read runs inside a held batch (Q-M5). + + ``transaction()`` holds a ``BEGIN IMMEDIATE`` write lock on the ambient + thread-local connection. Every public read opens its OWN connection, so + a read issued while the batch is held would (a) contend with that lock + (``SQLITE_BUSY`` on SQLite, and possibly no error on other backends) and + (b) miss the batch's uncommitted appends. The original contract relied on + callers never doing this; this guard *enforces* it, turning a silent, + backend-dependent contention into an explicit, deterministic error so a + future in-batch read in a gate append path fails its tests immediately. + """ + if getattr(self._txn, "conn", None) is not None: + raise RuntimeError( + f"AuditStore.{method}() called inside an active transaction() batch " + "on this thread. Fresh-connection reads contend with the batch's " + "held BEGIN IMMEDIATE write lock and cannot see its uncommitted " + "appends — resolve all reads before opening the batch (Q-M5)." 
+ ) + def _insert(self, conn: Any, payload: dict[str, Any]) -> int: c_hash = content_hash(payload) prev = conn.execute( @@ -176,6 +237,7 @@ def append(self, payload: dict[str, Any]) -> int: return self._insert(conn, payload) def read_all(self) -> list[AuditRecord]: + self._assert_no_batch_in_progress("read_all") with self._engine.begin() as conn: rows = conn.execute( select(self._log).order_by(self._log.c.seq.asc()) @@ -192,6 +254,7 @@ def read_all(self) -> list[AuditRecord]: ] def read_by_seq(self, seq: int) -> AuditRecord | None: + self._assert_no_batch_in_progress("read_by_seq") with self._engine.begin() as conn: row = conn.execute( select(self._log).where(self._log.c.seq == seq) @@ -207,10 +270,24 @@ def read_by_seq(self, seq: int) -> AuditRecord | None: ) def verify_integrity(self) -> bool: + # O(N) by design: a full chain re-hash is the only way to detect + # out-of-band tampering of an arbitrary record (the hash chain gives O(1) + # verification of *appends*, never of a mutated prefix). Callers on + # interactive read paths (service.verified_records) pay this deliberately; + # see that function's cost note (rc4 review #7) for why it is not narrowed. + self._assert_no_batch_in_progress("verify_integrity") prev_hash = GENESIS try: records = self.read_all() except (json.JSONDecodeError, TypeError, ValueError): + # No seq survives a decode failure of the whole read; name the + # failure mode so an investigator knows the trail is unreadable + # rather than merely mismatched. 
+ logger.error( + "audit trail integrity check failed: a record payload did not " + "decode as JSON", + exc_info=True, + ) return False for rec in records: # json.loads accepts Infinity/NaN, so a directly-tampered payload @@ -220,17 +297,43 @@ def verify_integrity(self) -> bool: try: computed = content_hash(rec.payload) except (ValueError, TypeError): + logger.error( + "audit trail integrity check failed at seq=%s: payload is " + "not canonicalizable (tamper)", + rec.seq, + exc_info=True, + ) return False if computed != rec.content_hash: + logger.error( + "audit trail integrity check failed at seq=%s: content hash " + "mismatch (recorded %s, recomputed %s)", + rec.seq, + rec.content_hash, + computed, + ) return False if rec.prev_hash != prev_hash: + logger.error( + "audit trail integrity check failed at seq=%s: broken chain " + "link (prev_hash %s != expected %s)", + rec.seq, + rec.prev_hash, + prev_hash, + ) return False if rec.chain_hash != _chain(rec.prev_hash, rec.content_hash): + logger.error( + "audit trail integrity check failed at seq=%s: chain hash " + "does not match prev+content", + rec.seq, + ) return False prev_hash = rec.chain_hash return True def get_latest_sequence_and_hash(self) -> tuple[int, str]: + self._assert_no_batch_in_progress("get_latest_sequence_and_hash") with self._engine.begin() as conn: row = conn.execute( select(self._log.c.seq, self._log.c.chain_hash) diff --git a/src/legis/store/protocol.py b/src/legis/store/protocol.py index dc0a3e8..db10c6f 100644 --- a/src/legis/store/protocol.py +++ b/src/legis/store/protocol.py @@ -37,6 +37,8 @@ def transaction(self) -> AbstractContextManager[None]: ``read_by_seq``, ``verify_integrity``) is NOT guaranteed to observe uncommitted appends from the same batch — it sees a pre-batch snapshot — and on a single-connection backend (SQLite) may contend with the - held write transaction. Resolve all reads before opening the batch. + held write transaction. Resolve all reads before opening the batch. 
The + SQLite implementation (``AuditStore``) *enforces* this: an in-batch read + on the same thread raises ``RuntimeError`` instead of contending. """ ... diff --git a/src/legis/wardline/ingest.py b/src/legis/wardline/ingest.py index 825c36c..538f723 100644 --- a/src/legis/wardline/ingest.py +++ b/src/legis/wardline/ingest.py @@ -53,6 +53,52 @@ class WardlinePayloadError(ValueError): """A Wardline scan payload is not shaped like the trusted wire contract.""" +class ArtifactStatus(str, Enum): + """How far the Wardline artifact's provenance verified (str,Enum — the member + IS its bare-string wire value, so records serialize byte-identically).""" + + VERIFIED = "verified" + DIRTY = "dirty" + UNVERIFIED = "unverified" + + +class ScanOutcome(str, Enum): + """The ``scan_route`` boundary outcome (str,Enum — bare-string wire). + + ``ROUTED`` — findings were governed into the configured cell. A dirty working + tree is not a malformed payload — it is "the dev environment is not ready for + a signed artifact yet". wardline emits an UNSIGNED, ``dirty: true`` dev + artifact for this case (signing stays clean-tree-only); legis classifies it + as the typed amber ``SKIPPED_DIRTY_TREE`` state, NOT a generic red, so a + harness can tell "commit first" apart from "legis/the scan is broken". + """ + + ROUTED = "ROUTED" + SKIPPED_DIRTY_TREE = "SKIPPED_DIRTY_TREE" + + +# Back-compat alias for the bare-string constant callers/tests imported before the +# enum existed; ``== "SKIPPED_DIRTY_TREE"`` still holds (str,Enum). +SKIPPED_DIRTY_TREE = ScanOutcome.SKIPPED_DIRTY_TREE + + +class WardlineDirtyTreeError(Exception): + """A dirty-tree dev artifact arrived where signed CI provenance is required. + + Deliberately NOT a ``WardlinePayloadError`` (which boundaries map to a + generic red — HTTP 422 / MCP ``INVALID_ARGUMENT``): the whole point is that + this amber/skipped state is *distinguishable* from a malformed-or-tampered + payload. 
Raised only in the CI posture (artifact key configured) when the + dirty dev artifact is unsigned and the dev-mode opt-in is off. Boundaries + catch it and surface a typed ``SKIPPED_DIRTY_TREE`` outcome. + """ + + # A ScanOutcome member (via the alias). Boundaries put it straight into the + # response as ``{"outcome": exc.reason}`` (app.py / mcp.py), so it is relied + # on to serialize as the bare ``"SKIPPED_DIRTY_TREE"`` string on the wire. + reason = SKIPPED_DIRTY_TREE + + def wardline_artifact_fields(scan: Mapping[str, Any]) -> dict[str, Any]: """The Wardline artifact payload covered by ``artifact_signature``.""" if not isinstance(scan, Mapping): @@ -67,25 +113,66 @@ def wardline_artifact_fields(scan: Mapping[str, Any]) -> dict[str, Any]: def verify_wardline_artifact( scan: Mapping[str, Any], artifact_key: bytes | None, + *, + allow_dirty: bool = False, ) -> dict[str, Any]: """Validate optional server-required artifact authentication. When ``artifact_key`` is configured, the scan must carry signed scanner, rule-set, commit, and tree provenance. Without a configured key we still record any supplied metadata, but mark it explicitly unverified. + + Dirty-tree dev artifacts (``dirty: true`` + no signature — wardline + ``--allow-dirty``) are a typed amber case, never a generic red: + + * keyless dev posture — already permissive; the scan governs, but the + dirty marker is recorded honestly (``artifact_status == "dirty"``) so a + dirty dev scan is distinguishable from a clean unsigned one. + * CI posture (``artifact_key`` configured) — by default a dirty dev + artifact raises :class:`WardlineDirtyTreeError` (the boundary surfaces a + typed ``SKIPPED_DIRTY_TREE`` outcome). ``allow_dirty`` is the explicit + server-side dev-mode opt-in that lets legis govern it UNSIGNED, recorded + as ``"dirty"`` (never ``"verified"``). 
+ + The relaxation is scoped to exactly ``dirty is True AND no signature``: a + signed payload still verifies normally (so a forged signature stays red), + and a clean unsigned payload still requires a signature (``allow_dirty`` + relaxes only the dirty case, not "any unsigned"). ``dirty`` is checked as + strict boolean ``True`` because the scan dict is caller-controlled. """ fields = wardline_artifact_fields(scan) - provenance = { - "artifact_status": "unverified", + provenance: dict[str, Any] = { + "artifact_status": ArtifactStatus.UNVERIFIED, } for key in ARTIFACT_PROVENANCE_FIELDS: value = scan.get(key) if isinstance(value, str) and value: provenance[key] = value + signature_present = isinstance(scan.get(ARTIFACT_SIGNATURE_FIELD), str) and bool( + scan.get(ARTIFACT_SIGNATURE_FIELD) + ) + is_dirty_dev_artifact = scan.get("dirty") is True and not signature_present + if artifact_key is None: + if is_dirty_dev_artifact: + provenance["artifact_status"] = ArtifactStatus.DIRTY return provenance + if is_dirty_dev_artifact: + if not allow_dirty: + raise WardlineDirtyTreeError( + "wardline emitted an unsigned dirty-tree dev artifact " + "(dirty: true); signing is clean-tree-only. Commit for a " + "signed artifact, or set LEGIS_WARDLINE_ALLOW_DIRTY=1 to " + "govern it unsigned in dev." 
+ ) + return { + "artifact_status": ArtifactStatus.DIRTY, + **{key: value for key in ARTIFACT_PROVENANCE_FIELDS + if isinstance(value := scan.get(key), str) and value}, + } + missing = [ key for key in ARTIFACT_PROVENANCE_FIELDS if not isinstance(scan.get(key), str) or not scan[key] @@ -101,7 +188,7 @@ def verify_wardline_artifact( if not verify(fields, signature, artifact_key): raise WardlinePayloadError("Wardline artifact signature does not verify") return { - "artifact_status": "verified", + "artifact_status": ArtifactStatus.VERIFIED, **{key: scan[key] for key in ARTIFACT_PROVENANCE_FIELDS}, "artifact_signature": signature, } @@ -169,8 +256,28 @@ def from_wire(cls, d: Mapping[str, Any]) -> "WardlineFinding": # be able to silently dismiss a defect. Non-agent suppressions # (``baselined`` / ``judged``) are simply not active and carry no proof. Any # other state is malformed and rejected. -AGENT_SUPPRESSED: frozenset[str] = frozenset({"waived", "suppressed"}) -NON_AGENT_SUPPRESSED: frozenset[str] = frozenset({"baselined", "judged"}) +class Suppressed(str, Enum): + """The finding suppression-state vocabulary (str,Enum — bare-string wire). + + The ``suppressed`` field stays ``str`` on the wire-facing dataclass so the + validation timing is unchanged (any string is accepted off the wire; only a + *defect* with an out-of-vocabulary state is rejected, in ``active_defects``). + This enum is the single source of truth for the vocabulary — members compare + and hash equal to their strings, so the frozensets below match the bare + ``suppressed`` strings carried verbatim from the scan. 
+ """ + + ACTIVE = "active" + WAIVED = "waived" + SUPPRESSED = "suppressed" + BASELINED = "baselined" + JUDGED = "judged" + + +AGENT_SUPPRESSED: frozenset[Suppressed] = frozenset({Suppressed.WAIVED, Suppressed.SUPPRESSED}) +NON_AGENT_SUPPRESSED: frozenset[Suppressed] = frozenset( + {Suppressed.BASELINED, Suppressed.JUDGED} +) def _has_suppression_proof(finding: Mapping[str, Any]) -> bool: @@ -209,7 +316,7 @@ def active_defects(scan: Mapping[str, Any]) -> list[WardlineFinding]: f = WardlineFinding.from_wire(raw) if f.kind != "defect": continue - if f.suppressed == "active": + if f.suppressed == Suppressed.ACTIVE: out.append(f) continue if f.suppressed in AGENT_SUPPRESSED: diff --git a/src/legis/weft_signing.py b/src/legis/weft_signing.py new file mode 100644 index 0000000..bfa4f24 --- /dev/null +++ b/src/legis/weft_signing.py @@ -0,0 +1,82 @@ +"""Shared Weft-component transport-HMAC seam. + +The Loomweave SEI client (``identity/loomweave_client.py``) and the Filigree +association client (``filigree/client.py``) authenticate their requests to a +sibling Weft component with the *same* wire scheme: an +``X-Weft-Component: :`` header alongside ``X-Weft-Timestamp`` and +``X-Weft-Nonce``, where the HMAC is computed over +``METHOD\\npath?query\\nsha256(body)\\ntimestamp\\nnonce``. This module is the +single definition of that scheme so the two channels cannot silently diverge — +a change to the canonicalization or header shape now happens in one place. + +Canonicalization contract: the signed body bytes are +``json.dumps(body, sort_keys=True, separators=(",", ":"))`` with the default +``ensure_ascii=True``. This is deliberately **NOT** ``canonical.canonical_json``, +whose ``ensure_ascii=False`` is the byte-for-byte HMAC contract shared with +Wardline; routing a transport body through it would change every signed +request's bytes. The wire transport MUST send exactly ``weft_body_bytes(body)`` +and a verifier MUST recanonicalize identically before hashing. 
+""" + +from __future__ import annotations + +import hashlib +import hmac +import json +import os +import urllib.parse + + +def weft_body_bytes(body: dict | None) -> bytes: + """Serialize a request body to the exact bytes the signature commits to.""" + if body is None: + return b"" + return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8") + + +def weft_path_and_query(url: str) -> str: + """The path (plus query, if any) the signed message commits to.""" + parsed = urllib.parse.urlsplit(url) + path_and_query = parsed.path or "/" + if parsed.query: + path_and_query = f"{path_and_query}?{parsed.query}" + return path_and_query + + +def sign_weft_request( + component: str, + key: bytes, + method: str, + url: str, + body: dict | None, + *, + timestamp: int, + nonce: str, +) -> dict[str, str]: + """Return the Weft-component HMAC request headers for ``component``. + + ``timestamp`` and ``nonce`` are injected (not generated here) so the + signature is deterministically testable. + """ + body_hash = hashlib.sha256(weft_body_bytes(body)).hexdigest() + message = ( + f"{method}\n{weft_path_and_query(url)}\n{body_hash}\n{timestamp}\n{nonce}" + ).encode("utf-8") + signature = hmac.new(key, message, hashlib.sha256).hexdigest() + return { + "X-Weft-Component": f"{component}:{signature}", + "X-Weft-Timestamp": str(timestamp), + "X-Weft-Nonce": nonce, + } + + +def weft_hmac_key_from_env(component_env_var: str) -> bytes | None: + """Resolve a channel HMAC key without making it mandatory. + + The channel-specific variable (e.g. ``LEGIS_LOOMWEAVE_HMAC_KEY``) wins; an + absent channel key falls back to the shared ``LEGIS_HMAC_KEY``; absent both, + the channel is unsigned (backward compatible with deployments that have not + provisioned a key yet). 
+ """ + value = os.environ.get(component_env_var) or os.environ.get("LEGIS_HMAC_KEY") + return value.encode("utf-8") if value else None diff --git a/tests/api/test_combinations_api.py b/tests/api/test_combinations_api.py index 80a885a..16ca506 100644 --- a/tests/api/test_combinations_api.py +++ b/tests/api/test_combinations_api.py @@ -386,7 +386,7 @@ def test_scan_results_rejects_both_or_neither_cell_form(tmp_path): def test_scan_results_block_escalate_only_needs_no_engine(tmp_path): # A pure block_escalate scan must route with only a signoff gate wired — no - # enforcement engine, so engine()'s lazy legis-governance.db is never created. + # enforcement engine, so engine()'s lazy .weft/legis/legis-governance.db is never created. sg = SignoffGate(AuditStore(f"sqlite:///{tmp_path / 's.db'}"), FixedClock("2026-06-02T12:00:00+00:00")) c = TestClient(create_app(signoff_gate=sg)) # NOT _client: no enforcement injected @@ -556,6 +556,74 @@ def test_scan_results_records_verified_artifact_provenance(tmp_path, monkeypatch assert wardline["artifact_signature"].startswith("hmac-sha256:v2:") +def _dirty_wardline_scan(): + return { + "scanner_identity": "wardline@1.0.0rc1", + "rule_set_version": "rules@abc123", + "commit_sha": "a" * 40, + "tree_sha": "b" * 40, + "dirty": True, + "findings": [ + {"rule_id": "R", "message": "m", "severity": "INFO", "kind": "defect", + "fingerprint": "fp", "qualname": "m.f", "properties": {}, "suppressed": "active"} + ], + } + + +def test_scan_results_dirty_tree_is_amber_skip_not_red(tmp_path, monkeypatch): + # P1: key configured, dirty + unsigned, no dev-mode -> HTTP 200 typed amber + # SKIPPED_DIRTY_TREE (distinguishable from the 422 generic red); nothing + # governed. 
+ monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.delenv("LEGIS_WARDLINE_ALLOW_DIRTY", raising=False) + c = _client(tmp_path) + + resp = c.post("/wardline/scan-results", + json={"cell": "surface_only", "agent_id": "a", + "scan": _dirty_wardline_scan()}) + + assert resp.status_code == 200 + body = resp.json() + assert body["outcome"] == "SKIPPED_DIRTY_TREE" + assert body["routed"] == [] + assert c.get("/overrides").json() == [] + + +def test_scan_results_dirty_tree_governs_under_devmode_optin(tmp_path, monkeypatch): + # P0: the explicit dev-mode opt-in governs the unsigned dirty artifact, + # recorded honestly as artifact_status="dirty". + monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.setenv("LEGIS_WARDLINE_ALLOW_DIRTY", "1") + c = _client(tmp_path) + + resp = c.post("/wardline/scan-results", + json={"cell": "surface_only", "agent_id": "a", + "scan": _dirty_wardline_scan()}) + + assert resp.status_code == 200 + assert resp.json()["outcome"] == "ROUTED" + wardline = c.get("/overrides").json()[0]["extensions"]["wardline"] + assert wardline["artifact_status"] == "dirty" + assert "artifact_signature" not in wardline + + +def test_scan_results_devmode_optin_is_strict_and_fails_safe(tmp_path, monkeypatch): + # The dev-mode opt-in is `LEGIS_WARDLINE_ALLOW_DIRTY == "1"` exactly. A + # governing knob that gates UNSIGNED artifacts must fail safe: any value other + # than "1" (truthy-looking "true", "0", "yes") must NOT govern — it stays the + # typed amber skip. Pins the strict parse against a future drift to truthiness. 
+ monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + for value in ("0", "true", "True", "yes", "2", ""): + monkeypatch.setenv("LEGIS_WARDLINE_ALLOW_DIRTY", value) + c = _client(tmp_path) + resp = c.post("/wardline/scan-results", + json={"cell": "surface_only", "agent_id": "a", + "scan": _dirty_wardline_scan()}) + assert resp.status_code == 200, value + assert resp.json()["outcome"] == "SKIPPED_DIRTY_TREE", value + assert resp.json()["routed"] == [], value + + def test_scan_results_single_cell_still_works(tmp_path): c = _client(tmp_path) body = {"cell": "surface_override", "agent_id": "agent-1", "scan": {"findings": [ diff --git a/tests/conftest.py b/tests/conftest.py index 2db5518..0f466f8 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -15,6 +15,30 @@ } +@pytest.fixture(autouse=True) +def _isolate_legis_store_locations( + tmp_path_factory: pytest.TempPathFactory, monkeypatch: pytest.MonkeyPatch +) -> None: + """Redirect every legis store to a per-test tmp dir. + + Store URLs default to the cwd-relative ``.weft/legis/`` subtree (see + ``legis.config``); a test that builds a default-path store without pinning a + location would otherwise drop that subtree into the repo working tree. + Pointing the four ``LEGIS_*_DB`` env vars at a unique tmp directory isolates + them centrally for the whole suite (legis-3d295a6f7f). A test that sets — or + deletes — its own ``LEGIS_*_DB`` still overrides this, since its monkeypatch + runs after the fixture. 
+ """ + store = tmp_path_factory.mktemp("legis-store") + for var, name in ( + ("LEGIS_CHECK_DB", "legis-checks.db"), + ("LEGIS_GOVERNANCE_DB", "legis-governance.db"), + ("LEGIS_BINDING_DB", "legis-binding.db"), + ("LEGIS_PULL_DB", "legis-pulls.db"), + ): + monkeypatch.setenv(var, f"sqlite:///{(store / name).as_posix()}") + + @pytest.fixture def unsafe_dev_auth(monkeypatch: pytest.MonkeyPatch) -> None: monkeypatch.setenv("LEGIS_UNSAFE_DEV_AUTH", "1") diff --git a/tests/enforcement/test_protected_extensions.py b/tests/enforcement/test_protected_extensions.py index a49021b..c3b6176 100644 --- a/tests/enforcement/test_protected_extensions.py +++ b/tests/enforcement/test_protected_extensions.py @@ -1,5 +1,12 @@ +import pytest + from legis.clock import FixedClock -from legis.enforcement.protected import ProtectedGate, TrailVerifier, signing_fields +from legis.enforcement.protected import ( + ProtectedGate, + TamperError, + TrailVerifier, + signing_fields, +) from legis.enforcement.signing import verify from legis.enforcement.verdict import JudgeOpinion, Verdict from legis.identity.entity_key import EntityKey @@ -48,10 +55,6 @@ def test_loomweave_block_does_not_break_the_signature(tmp_path): assert verify(signing_fields(payload), sig, KEY) is True -import pytest -from legis.enforcement.protected import TamperError - - def test_mutating_loomweave_block_invalidates_the_signature(tmp_path): # Discriminating regression lock for WP-A1/L-05: the loomweave block must be bound # to the signed field set. Mutating it after signing MUST break the signature. 
diff --git a/tests/enforcement/test_regressions.py b/tests/enforcement/test_regressions.py index ca43c97..ba20af2 100644 --- a/tests/enforcement/test_regressions.py +++ b/tests/enforcement/test_regressions.py @@ -5,10 +5,8 @@ from legis.api.app import create_app from legis.cli import main from legis.clock import FixedClock -from legis.enforcement.engine import EnforcementEngine from legis.enforcement.signoff import SignoffGate from legis.git.surface import GitSurface, GitError -from legis.identity.entity_key import EntityKey from legis.policy.decorator import check_policy_boundary, policy_boundary, fingerprint from legis.policy.grammar import PolicyGrammar, PolicyResult from legis.policy.exemptions import ExemptionRegistry, Exemption diff --git a/tests/enforcement/test_signing.py b/tests/enforcement/test_signing.py index afb5514..524171b 100644 --- a/tests/enforcement/test_signing.py +++ b/tests/enforcement/test_signing.py @@ -1,4 +1,6 @@ -from legis.enforcement.signing import SIG_PREFIX, SIG_PREFIX_V1, sign, verify +import pytest + +from legis.enforcement.signing import SIG_PREFIX, sign, verify def test_sign_is_prefixed_and_deterministic(): @@ -19,8 +21,13 @@ def test_verify_round_trips_and_rejects_wrong_key_or_tamper(): assert verify(fields, "", b"key-1") is False -def test_verify_accepts_explicit_legacy_v1_signature(): +def test_verify_rejects_unknown_prefix(): fields = {"verdict": "ACCEPTED", "policy": "p"} - sig = sign(fields, b"key-1", version="v1") - assert sig.startswith(SIG_PREFIX_V1) - assert verify(fields, sig, b"key-1") is True + sig = sign(fields, b"key-1") + forged = sig.replace("v2", "v1", 1) # a tag verify no longer recognises + assert verify(fields, forged, b"key-1") is False + + +def test_sign_rejects_unknown_version(): + with pytest.raises(ValueError, match="unsupported signature version"): + sign({"verdict": "ACCEPTED"}, b"key-1", version="v1") diff --git a/tests/enforcement/test_trail_verify.py b/tests/enforcement/test_trail_verify.py index 
3c32654..a67edb0 100644 --- a/tests/enforcement/test_trail_verify.py +++ b/tests/enforcement/test_trail_verify.py @@ -7,9 +7,7 @@ ProtectedGate, TamperError, TrailVerifier, - legacy_signing_fields, ) -from legis.enforcement.signing import sign from legis.enforcement.verdict import JudgeOpinion, Verdict from legis.identity.entity_key import EntityKey from legis.store.audit_store import GENESIS, AuditStore, _chain @@ -55,19 +53,6 @@ def test_clean_protected_trail_verifies(tmp_path): TrailVerifier(KEY, PROTECTED).verify(store.read_all()) # no raise -def test_legacy_v1_protected_signature_still_verifies(tmp_path): - g, store = _gate(tmp_path / "gov.db") - _submit(g) - - def replace_with_legacy_signature(p): - p["extensions"]["judge_metadata_signature"] = sign( - legacy_signing_fields(p), KEY, version="v1" - ) - - _edit_payload_and_rechain(tmp_path / "gov.db", replace_with_legacy_signature) - TrailVerifier(KEY, PROTECTED).verify(store.read_all()) # no raise - - def test_missing_signature_on_protected_policy_is_tampering(tmp_path): g, store = _gate(tmp_path / "gov.db") _submit(g) diff --git a/tests/filigree/test_client.py b/tests/filigree/test_client.py index 6eaf477..052fb07 100644 --- a/tests/filigree/test_client.py +++ b/tests/filigree/test_client.py @@ -1,5 +1,6 @@ import pytest +import legis.filigree.client as client_mod from legis.filigree.client import FiligreeError, HttpFiligreeClient @@ -213,3 +214,64 @@ def fake_urlopen(req, timeout=None): ).encode("utf-8") expected = hmac.new(key, message, hashlib.sha256).hexdigest() assert signature == expected + + +# --- roadmap 13: transport / error-path branches (the surface a security +# reviewer cares about, and the unsigned-transport seam tied to Q-M4) --- + +def test_json_body_bytes_none_is_empty(): + # A None body signs and sends zero bytes (the body-hash is over b""). 
+ assert client_mod._json_body_bytes(None) == b"" + + +def test_path_and_query_includes_query_string(): + # The signed message commits to path AND query; a verifier that dropped the + # query would compute a different signature, so the query must be carried. + assert ( + client_mod._path_and_query("https://filigree/api/entity-associations?entity_id=x") + == "/api/entity-associations?entity_id=x" + ) + # No query -> bare path; empty path -> "/". + assert client_mod._path_and_query("https://filigree/api/x") == "/api/x" + assert client_mod._path_and_query("https://filigree") == "/" + + +def test_urllib_fetch_wraps_transport_error(monkeypatch): + # A urllib URLError (DNS failure, connection refused, timeout) surfaces as a + # typed FiligreeError, never an unhandled urllib exception. + import urllib.request + + def boom(req, timeout=None): + raise urllib.error.URLError("connection refused") + + monkeypatch.setattr(urllib.request, "urlopen", boom) + with pytest.raises(FiligreeError, match="connection refused"): + client_mod._urllib_fetch("GET", "https://filigree.example/api/x", None) + + +def test_decode_rejects_non_json_content_type(): + # A proxy/error page returning text/html must not be json-parsed; it is a + # typed transport error. + class _HtmlResp: + headers = {"Content-Type": "text/html; charset=utf-8"} + + def read(self, _n): # pragma: no cover - not reached; type check first + return b"503" + + with pytest.raises(FiligreeError, match="non-JSON content type"): + client_mod._decode_json_response(_HtmlResp(), "GET /api/x") + + +def test_decode_rejects_oversized_response(): + # A response larger than MAX_RESPONSE_BYTES is rejected before decode so a + # hostile/buggy Filigree cannot exhaust memory. 
+ big = b"x" * (client_mod.MAX_RESPONSE_BYTES + 1) + + class _BigResp: + headers = {"Content-Type": "application/json"} + + def read(self, n): + return big[:n] + + with pytest.raises(FiligreeError, match="response too large"): + client_mod._decode_json_response(_BigResp(), "GET /api/x") diff --git a/tests/identity/test_resolver.py b/tests/identity/test_resolver.py index 8461c77..d3bb159 100644 --- a/tests/identity/test_resolver.py +++ b/tests/identity/test_resolver.py @@ -1,14 +1,34 @@ +import logging + +import pytest + from legis.canonical import content_hash -from legis.identity.resolver import IdentityResolver +from legis.identity.entity_key import EntityKey +from legis.identity.resolver import ( + IdentityResolution, + IdentityResolutionStatus, + IdentityResolver, + LineageSnapshotStatus, +) class FakeClient: - def __init__(self, *, capable=True, resolve=None, lineage=None, boom=False, lineage_boom=False): + def __init__( + self, + *, + capable=True, + resolve=None, + lineage=None, + boom=False, + lineage_boom=False, + resolve_boom=False, + ): self._capable = capable self._resolve = resolve or {"alive": False} self._lineage = lineage or [] self._boom = boom self._lineage_boom = lineage_boom + self._resolve_boom = resolve_boom def capability(self): if self._boom: @@ -16,6 +36,8 @@ def capability(self): return self._capable def resolve_locator(self, locator): + if self._resolve_boom: + raise RuntimeError("resolve_locator down") return self._resolve def resolve_sei(self, sei): # not used by the resolver @@ -45,6 +67,119 @@ def test_alive_sei_is_keyed_opaquely_with_two_axes(): assert res.lineage_snapshot_status == "verified" +# --- the str,Enum axes + the IdentityResolution construction invariant --- + + +def test_status_axes_are_str_enums_serializing_to_bare_strings(): + # str,Enum members ARE their wire string — comparison and serialization + # are byte-identical to the old bare strings (the whole compat argument). 
+ assert IdentityResolutionStatus.RESOLVED == "resolved" + assert LineageSnapshotStatus.NOT_APPLICABLE == "not_applicable" + assert content_hash({"s": IdentityResolutionStatus.NOT_ALIVE}) == content_hash( + {"s": "not_alive"} + ) + + +def test_identity_resolution_rejects_contradictory_status_alive(): + # The sharpest case: a frozen record claiming "resolved" while alive is False + # is self-contradictory and must be unrepresentable at construction. + ek = EntityKey.from_locator("python:function:m.f") + with pytest.raises(ValueError): + IdentityResolution( + ek, + False, + None, + None, + IdentityResolutionStatus.RESOLVED, + LineageSnapshotStatus.NOT_APPLICABLE, + ) + with pytest.raises(ValueError): + IdentityResolution( + ek, + None, + None, + None, + IdentityResolutionStatus.NOT_ALIVE, + LineageSnapshotStatus.NOT_APPLICABLE, + ) + with pytest.raises(ValueError): + IdentityResolution( + ek, + True, + None, + None, + IdentityResolutionStatus.UNAVAILABLE, + LineageSnapshotStatus.NOT_APPLICABLE, + ) + + +def test_identity_resolution_accepts_the_three_consistent_shapes(): + ek = EntityKey.from_locator("python:function:m.f") + # alive None ↔ UNAVAILABLE, False ↔ NOT_ALIVE, True ↔ RESOLVED + IdentityResolution( + ek, None, None, None, + IdentityResolutionStatus.UNAVAILABLE, LineageSnapshotStatus.NOT_APPLICABLE, + ) + IdentityResolution( + ek, False, None, None, + IdentityResolutionStatus.NOT_ALIVE, LineageSnapshotStatus.NOT_APPLICABLE, + ) + IdentityResolution( + ek, True, "h", {"length": 1, "hash": "x"}, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + + +def test_identity_resolution_rejects_contradictory_lineage_axis(): + # The lineage axis is the other half of the record: a snapshot is present + # iff the status is VERIFIED. Any crossed pair is self-contradictory. + ek = EntityKey.from_locator("python:function:m.f") + # VERIFIED but no snapshot. 
+ with pytest.raises(ValueError): + IdentityResolution( + ek, True, "h", None, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + # Snapshot present but status NOT_APPLICABLE. + with pytest.raises(ValueError): + IdentityResolution( + ek, False, None, {"length": 1, "hash": "x"}, + IdentityResolutionStatus.NOT_ALIVE, LineageSnapshotStatus.NOT_APPLICABLE, + ) + # Snapshot present but status UNAVAILABLE. + with pytest.raises(ValueError): + IdentityResolution( + ek, True, "h", {"length": 1, "hash": "x"}, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.UNAVAILABLE, + ) + + +def test_identity_resolution_accepts_resolved_with_unavailable_lineage(): + # A real producer shape: RESOLVED identity but the lineage probe failed — + # snapshot None, status UNAVAILABLE. Must construct. + ek = EntityKey.from_locator("python:function:m.f") + IdentityResolution( + ek, True, "h", None, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.UNAVAILABLE, + ) + + +def test_identity_resolution_rejects_non_bool_alive_as_value_error(): + # A non-bool alive (and int aliases like 1/0 that collide with True/False) + # must raise the guard's own ValueError, not a KeyError. 
+ ek = EntityKey.from_locator("python:function:m.f") + with pytest.raises(ValueError): + IdentityResolution( + ek, "yes", None, None, # type: ignore[arg-type] + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + with pytest.raises(ValueError): + IdentityResolution( + ek, 1, None, None, # type: ignore[arg-type] + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + + def test_capability_absent_degrades_to_locator(): r = IdentityResolver(FakeClient(capable=False)) res = r.resolve("python:function:m.f") @@ -108,3 +243,118 @@ def test_alive_sei_with_lineage_failure_records_unavailable_status(): assert res.lineage_snapshot is None assert res.identity_resolution_status == "resolved" assert res.lineage_snapshot_status == "unavailable" + + +# --- each degrade path must leave an operator-visible trail. A broken Loomweave +# (auth/network/HMAC failure) returns the SAME typed-degraded record as a genuine +# "no SEI" — so when governance shows `unavailable` en masse, the WARNING is the +# only thing telling an operator "integration broken" from "nothing to resolve". +# One test per except block, each needing a differently-configured fake. 
--- + + +def test_capability_probe_failure_is_logged_with_exc_info(caplog): + r = IdentityResolver(FakeClient(boom=True)) + with caplog.at_level(logging.WARNING, logger="legis.identity.resolver"): + res = r.resolve("python:function:m.f") + assert res.entity_key.identity_stable is False # typed return unchanged + assert caplog.records, "expected a warning when capability() raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + + +def test_resolve_locator_failure_is_logged_with_exc_info(caplog): + r = IdentityResolver(FakeClient(resolve_boom=True)) + with caplog.at_level(logging.WARNING, logger="legis.identity.resolver"): + res = r.resolve("python:function:m.f") + assert res.entity_key.identity_stable is False # typed return unchanged + assert caplog.records, "expected a warning when resolve_locator() raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + + +def test_lineage_snapshot_failure_is_logged_with_exc_info(caplog): + r = IdentityResolver(FakeClient(resolve=ALIVE, lineage_boom=True)) + with caplog.at_level(logging.WARNING, logger="legis.identity.resolver"): + res = r.resolve("python:function:m.f") + # The resolution still succeeds; only the lineage axis degrades — but the + # failure must still surface. + assert res.alive is True + assert res.lineage_snapshot_status == "unavailable" + assert caplog.records, "expected a warning when lineage() raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + + +# --- Q-L6: the capability latch must revalidate (TTL), and content_hash must be +# type-checked, not trusted verbatim from the Loomweave response. 
--- + + +class _Probe(FakeClient): + """A client whose capability can be flipped, counting probes.""" + + def __init__(self, *, capable=True, resolve=None, lineage=None): + super().__init__(capable=capable, resolve=resolve, lineage=lineage) + self.probes = 0 + + def capability(self): + self.probes += 1 + return self._capable + + +def test_capability_is_cached_within_ttl(): + # Within the TTL window the positive latch is reused — one probe across many + # resolves (the caching the original code intended). + clock = {"t": 1000.0} + client = _Probe(resolve=ALIVE, lineage=[{"event": "born"}]) + r = IdentityResolver(client, capability_ttl=300.0, monotonic=lambda: clock["t"]) + for _ in range(5): + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + assert client.probes == 1 + + +def test_capability_latch_revalidates_after_ttl(): + # A Loomweave that LOSES the sei capability mid-life must not be treated as + # capable forever by a long-lived resolver. After the TTL elapses the latch + # is re-probed and the resolver honestly degrades. + clock = {"t": 1000.0} + client = _Probe(resolve=ALIVE, lineage=[{"event": "born"}]) + r = IdentityResolver(client, capability_ttl=300.0, monotonic=lambda: clock["t"]) + + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + assert client.probes == 1 + + client._capable = False # capability revoked upstream + clock["t"] += 299.0 # still within TTL → stale latch reused + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + assert client.probes == 1 + + clock["t"] += 2.0 # now past TTL → re-probe, sees the loss + assert r.resolve("python:function:m.f").entity_key.identity_stable is False + assert client.probes == 2 + + +def test_capability_regained_after_ttl_is_noticed(): + # Symmetric to revocation: a negative latch must also age out, so a Loomweave + # that GAINS the capability is eventually picked up. 
+ clock = {"t": 0.0} + client = _Probe(capable=False, resolve=ALIVE, lineage=[{"event": "born"}]) + r = IdentityResolver(client, capability_ttl=300.0, monotonic=lambda: clock["t"]) + + assert r.resolve("python:function:m.f").entity_key.identity_stable is False + client._capable = True + clock["t"] += 301.0 + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + + +def test_non_string_content_hash_is_dropped(): + # content_hash is carried verbatim into the record; a non-string value from a + # buggy/hostile Loomweave must not land in the typed str|None field. + for bad in (12345, {"nested": "obj"}, ["list"], 3.14): + resolve = {**ALIVE, "content_hash": bad} + r = IdentityResolver(FakeClient(resolve=resolve, lineage=[{"event": "born"}])) + res = r.resolve("python:function:m.f") + assert res.entity_key.value == "loomweave:eid:deadbeef" + assert res.content_hash is None diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index d48c991..15b0411 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1,5 +1,6 @@ import io import json +import logging import sqlite3 from legis.canonical import canonical_json, content_hash @@ -208,7 +209,10 @@ def test_tools_reject_before_initialize(tmp_path): assert responses[0]["error"]["code"] == -32002 -def test_initialize_rejects_unsupported_protocol_version(tmp_path): +def test_initialize_negotiates_unsupported_protocol_version(tmp_path): + # MCP spec: an unsupported (or newer) requested version must not hard-error; + # the server replies with a version it does support and lets the client + # decide. This is what lets newer clients (e.g. 2025-06-18) connect. 
runtime, _store = _runtime(tmp_path) runtime.initialized = False @@ -218,14 +222,15 @@ def test_initialize_rejects_unsupported_protocol_version(tmp_path): "jsonrpc": "2.0", "id": 1, "method": "initialize", - "params": {"protocolVersion": "1999-01-01"}, + "params": {"protocolVersion": "2025-06-18"}, } ), runtime, ) - assert responses[0]["error"]["code"] == -32602 - assert "2025-03-26" in responses[0]["error"]["data"]["supported"] + assert "error" not in responses[0] + assert responses[0]["result"]["protocolVersion"] == "2025-03-26" + assert responses[0]["result"]["serverInfo"]["name"] == "legis" def test_build_runtime_initialize_does_not_create_local_state(tmp_path, monkeypatch): @@ -244,9 +249,12 @@ def test_build_runtime_initialize_does_not_create_local_state(tmp_path, monkeypa ) assert responses[0]["result"]["serverInfo"]["name"] == "legis" - assert not (tmp_path / "legis-governance.db").exists() - assert not (tmp_path / "legis-checks.db").exists() - assert not (tmp_path / "legis-pulls.db").exists() + # The federated store subtree must not be created on the initialize path — + # stores are opened lazily, so neither the .weft/legis dir nor any DB appears. + assert not (tmp_path / ".weft").exists() + assert not (tmp_path / ".weft" / "legis" / "legis-governance.db").exists() + assert not (tmp_path / ".weft" / "legis" / "legis-checks.db").exists() + assert not (tmp_path / ".weft" / "legis" / "legis-pulls.db").exists() def test_policy_explain_returns_service_explanation_payload(tmp_path): @@ -862,6 +870,31 @@ def test_scan_route_requires_exactly_one_cell_spec_and_routes_findings(tmp_path, } +def test_scan_route_rejects_empty_severity_map(tmp_path, monkeypatch): + # Drift fix: the HTTP adapter already rejected an empty cell_by_severity, but + # MCP silently accepted an empty severity_map (routed nothing). Both transports + # now reject it up front via the shared resolver — no silent governance skip. 
+ monkeypatch.setenv("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING", "1") + runtime, store = _runtime(tmp_path) + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": { + "name": "scan_route", + "arguments": {"scan": _active_scan(), "severity_map": {}}, + }, + } + ), + runtime, + )[0]["result"] + assert result["isError"] is True + assert result["structuredContent"]["error_code"] == "INVALID_CELL_SPEC" + assert store.read_all() == [] + + def test_scan_route_rejects_request_routing_when_server_owned(tmp_path, monkeypatch): monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") runtime, store = _runtime(tmp_path) @@ -997,6 +1030,109 @@ def test_scan_route_records_verified_artifact_provenance(tmp_path, monkeypatch): assert wardline["artifact_signature"].startswith("hmac-sha256:v2:") +def _dirty_scan(): + return { + "scanner_identity": "wardline@1.0.0rc1", + "rule_set_version": "rules@abc123", + "commit_sha": "a" * 40, + "tree_sha": "b" * 40, + "dirty": True, + **_active_scan(), + } + + +def test_scan_route_dirty_tree_is_amber_skip_not_red(tmp_path, monkeypatch): + # P1: a dirty dev artifact in the CI posture (key configured) is a typed + # amber SKIPPED_DIRTY_TREE outcome, NOT the generic INVALID_ARGUMENT red, + # and nothing is governed. 
+ monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") + monkeypatch.delenv("LEGIS_WARDLINE_ALLOW_DIRTY", raising=False) + runtime, store = _runtime(tmp_path) + + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": {"name": "scan_route", "arguments": {"scan": _dirty_scan()}}, + } + ), + runtime, + )[0]["result"] + + assert result.get("isError") is not True + structured = result["structuredContent"] + assert structured["outcome"] == "SKIPPED_DIRTY_TREE" + assert structured["routed"] == [] + assert store.read_all() == [] + + +def test_scan_route_dirty_tree_governs_under_devmode_optin(tmp_path, monkeypatch): + # P0: the explicit server-side dev-mode opt-in governs the unsigned dirty + # artifact, recorded honestly as artifact_status="dirty". + monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") + monkeypatch.setenv("LEGIS_WARDLINE_ALLOW_DIRTY", "1") + runtime, store = _runtime(tmp_path) + + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": {"name": "scan_route", "arguments": {"scan": _dirty_scan()}}, + } + ), + runtime, + )[0]["result"]["structuredContent"] + + assert result["outcome"] == "ROUTED" + assert result["routed"][0]["mode"] == "surface_only" + wardline = store.read_all()[0].payload["extensions"]["wardline"] + assert wardline["artifact_status"] == "dirty" + assert "artifact_signature" not in wardline + + +def test_scan_route_malformed_finding_is_invalid_argument_red(tmp_path, monkeypatch): + # The other half of the dirty-vs-malformed contract (cf. the amber test + # above): a malformed finding — here an unknown severity — is a generic red + # INVALID_ARGUMENT, NOT the amber SKIPPED_DIRTY_TREE. 
WardlinePayloadError is + # deliberately not a WardlineDirtyTreeError, so the boundary keeps "broken or + # tampered scan" distinct from "commit first". Nothing is governed. + monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") + runtime, store = _runtime(tmp_path) + malformed = { + "findings": [ + { + "rule_id": "PY-WL-101", + "message": "untrusted reaches trusted", + "severity": "NOT_A_SEVERITY", + "kind": "defect", + "fingerprint": "fp1", + } + ] + } + + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": {"name": "scan_route", "arguments": {"scan": malformed}}, + } + ), + runtime, + )[0]["result"] + + assert result["isError"] is True + assert result["structuredContent"]["error_code"] == "INVALID_ARGUMENT" + assert store.read_all() == [] + + def test_scan_route_fail_on_threshold_routes_each_finding(tmp_path, monkeypatch): monkeypatch.setenv("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING", "1") runtime, _store = _runtime(tmp_path) @@ -1184,6 +1320,62 @@ def test_read_tools_return_git_pull_checks_and_override_rate(tmp_path, git_repo) assert rate["note"] == "measures operator force-pasts; not movable by agent retries" +def test_pull_request_get_returns_checks_on_a_fresh_runtime(tmp_path, monkeypatch): + # Regression: build_runtime yields check_surface=None, and the first tool + # call an agent makes may be pull_request_get (no prior check_list to lazily + # initialise the surface). The result must NOT be call-order-dependent — a PR + # with recorded checks must report them, or a governance agent is told a PR is + # clean when checks exist and may be failing. 
+ checks = CheckSurface(f"sqlite:///{tmp_path / 'checks.db'}") + checks.record( + CheckRun( + check_name="unit", + run_id="run-1", + commit_sha="abc123", + outcome=CheckOutcome.FAIL, + pr=7, + ran_against="abc123", + ) + ) + # The lazy _checks() builder resolves the DB from LEGIS_CHECK_DB, exactly as a + # deployed server does — so the surface is uninitialised but reachable. + monkeypatch.setenv("LEGIS_CHECK_DB", f"sqlite:///{tmp_path / 'checks.db'}") + pulls = PullSurface(f"sqlite:///{tmp_path / 'pulls.db'}") + pulls.record( + PullRequest( + number=7, + title="Feature", + base="main", + head="feature", + state=PullRequestState.OPEN, + url="https://example.test/pr/7", + ) + ) + # Fresh runtime: check_surface left at its build_runtime default (None). + runtime, _store = _runtime(tmp_path, check_surface=None) + runtime.pull_surface = pulls + + responses = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": { + "name": "pull_request_get", + "arguments": {"number": "7"}, + }, + }, + ), + runtime, + ) + + pr = responses[0]["result"]["structuredContent"] + assert pr["number"] == 7 + assert pr["checks"][0]["check_name"] == "unit" + assert pr["checks"][0]["outcome"] == "fail" + + def test_check_list_reads_recorded_checks_by_commit_and_pr(tmp_path): checks = CheckSurface(f"sqlite:///{tmp_path / 'checks.db'}") checks.record( @@ -1339,6 +1531,20 @@ def test_build_runtime_loads_policy_cells_from_configured_path(tmp_path, monkeyp assert runtime.cell_registry.cell_for("ordinary.policy") == "chill" +def test_tool_registries_are_in_sync(): + # mcp.py hand-maintains three parallel name registries: the public schema + # (tool_definitions), the dispatch table (_TOOL_HANDLERS), and the agent- + # exposed set (_AGENT_TOOLS). They MUST agree. A handler without a schema + # entry is reachable-but-unvalidated (it accepts arbitrary arg keys); a + # schema entry without a handler advertises a tool that errors UNKNOWN_TOOL. 
+ # The table-driven dispatch makes exactly this drift easy to introduce, so + # pin it directly rather than inferring it from per-tool listing tests. + from legis.mcp import _AGENT_TOOLS, _TOOL_HANDLERS, tool_definitions + + defined = {t["name"] for t in tool_definitions()} + assert defined == set(_TOOL_HANDLERS) == set(_AGENT_TOOLS) + + def test_git_rename_feed_get_is_listed(): from legis.mcp import tool_definitions @@ -1376,6 +1582,11 @@ def test_filigree_closure_gate_get_not_enabled_without_ledger(monkeypatch): # NotEnabledError is mapped to an error envelope, not raised. assert result["isError"] is True assert result["structuredContent"]["error_code"] == "CELL_NOT_ENABLED" + # Le1 (weft-f506e5f845): the recovery hint must name the concrete + # enablement path, not a vague "ask the operator". Every governance cell + # is wired behind LEGIS_HMAC_KEY in build_runtime. + next_action = result["structuredContent"]["next_action"] + assert "LEGIS_HMAC_KEY" in next_action def test_filigree_closure_gate_get_surfaces_integrity_failure(monkeypatch, tmp_path): @@ -1393,3 +1604,126 @@ def get_by_issue_id(self, issue_id): assert result["isError"] is True assert result["structuredContent"]["error_code"] == "AUDIT_INTEGRITY_FAILURE" + + +# --- roadmap 14: stdin JSON-RPC line-size bound --- + +def test_run_jsonrpc_rejects_oversized_line_and_stays_framed(tmp_path, monkeypatch): + # A single line over the bound is rejected with -32700 and does not consume + # the following request — framing realigns at the next newline. 
+ monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", "400") + runtime, _store = _runtime(tmp_path) + runtime.initialized = False + oversized = { + "jsonrpc": "2.0", "id": 99, "method": "tools/list", + "params": {"pad": "A" * 2000}, + } + responses = _run( + _messages( + {"jsonrpc": "2.0", "id": 1, "method": "initialize", + "params": {"protocolVersion": "2025-03-26"}}, + oversized, + {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}, + ), + runtime, + ) + + assert responses[0]["id"] == 1 and "result" in responses[0] + assert responses[1]["id"] is None + assert responses[1]["error"]["code"] == -32700 + assert "maximum size" in responses[1]["error"]["message"] + # The request AFTER the oversized line is still parsed and answered. + assert responses[2]["id"] == 2 and "result" in responses[2] + + +def test_max_request_bytes_env_override_and_fallback(monkeypatch, caplog): + from legis.mcp import _DEFAULT_MAX_REQUEST_BYTES, _max_request_bytes + + monkeypatch.delenv("LEGIS_MCP_MAX_REQUEST_BYTES", raising=False) + assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES + monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", "4096") + assert _max_request_bytes() == 4096 + # Both the unparseable and the non-positive fat-finger fall back, but neither + # may do so silently — an operator lowering the bound must see why it was + # ignored. + for bad in ("not-an-int", "0", "-5"): + caplog.clear() + monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", bad) + with caplog.at_level(logging.WARNING, logger="legis.mcp"): + assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES + assert "LEGIS_MCP_MAX_REQUEST_BYTES" in caplog.text + + +def test_read_bounded_line_enforces_bytes_not_chars(): + # The bound is named in BYTES; readline() counts characters. A record that + # fits the char count but whose UTF-8 encoding exceeds the cap (multibyte + # content) must still overflow — otherwise the byte limit could be exceeded + # ~4×. The record AFTER it must stay framed. 
+ from legis.mcp import _read_bounded_line + + multibyte = "中" * 200 # 200 chars, 600 UTF-8 bytes — under 400 chars, over 400 bytes + stream = io.StringIO(f"{multibyte}\n" + '{"next":true}\n') + + line, overflow = _read_bounded_line(stream, 400) + assert overflow is True + assert line.startswith("中") + + nxt, nxt_overflow = _read_bounded_line(stream, 400) + assert nxt_overflow is False + assert nxt == '{"next":true}\n' + + +def test_read_bounded_line_at_byte_boundary(): + # The bound counts the trailing newline (fail-safe off-by-one): a 399-byte + # data record + "\n" == 400 bytes passes; one more byte overflows. + from legis.mcp import _read_bounded_line + + ok_line, ok_overflow = _read_bounded_line(io.StringIO("x" * 399 + "\n"), 400) + assert ok_overflow is False + assert ok_line == "x" * 399 + "\n" + + _, over_overflow = _read_bounded_line(io.StringIO("x" * 400 + "\n"), 400) + assert over_overflow is True + + +def test_read_bounded_line_drains_oversized_multibyte_record(): + # A record longer than the *character* cap forces the drain loop (first + # branch) — exercise it with multibyte content and assert the next record + # stays framed (the existing multibyte test stays under the char cap and + # hits the second branch instead). + from legis.mcp import _read_bounded_line + + stream = io.StringIO("中" * 20 + "\n" + "{}\n") # 20 chars > 10-char cap + line, overflow = _read_bounded_line(stream, 10) + assert overflow is True + assert line.startswith("中") + + nxt, nxt_overflow = _read_bounded_line(stream, 10) + assert nxt == "{}\n" + assert nxt_overflow is False + + +def test_service_error_logs_unexpected_internal_error(caplog): + # An unexpected exception is surfaced to the caller as INTERNAL_ERROR; it must + # also be logged server-side (with the exception) so the operator/Sentry sees + # what the agent caller's payload alone would hide. 
+ from legis.mcp import _service_error + + with caplog.at_level(logging.ERROR, logger="legis.mcp"): + result = _service_error(RuntimeError("kaboom")) + + assert result["structuredContent"]["error_code"] == "INTERNAL_ERROR" + assert any(r.levelno == logging.ERROR and r.exc_info for r in caplog.records) + + +def test_service_error_does_not_log_expected_typed_errors(caplog): + # Expected, typed service errors map to typed codes and must NOT spam the + # server log — only the unexpected INTERNAL_ERROR fall-through logs. + from legis.mcp import _service_error + from legis.service.errors import NotFoundError + + with caplog.at_level(logging.ERROR, logger="legis.mcp"): + result = _service_error(NotFoundError("nope")) + + assert result["structuredContent"]["error_code"] == "NOT_FOUND" + assert not caplog.records diff --git a/tests/policy/test_boundary_scan.py b/tests/policy/test_boundary_scan.py index aad91da..c2f58b9 100644 --- a/tests/policy/test_boundary_scan.py +++ b/tests/policy/test_boundary_scan.py @@ -1,12 +1,12 @@ from pathlib import Path -from legis.canonical import content_hash from legis.policy.boundary_scan import scan_policy_boundaries -from legis.policy.decorator import get_normalized_ast_str +from legis.policy.decorator import fingerprint_source def _test_fingerprint(source: str) -> str: - return content_hash(get_normalized_ast_str(source)) + # The canonical fingerprint both the gate and scanner compute (Q-L5). 
+ return fingerprint_source(source) def _write_boundary_subject( diff --git a/tests/policy/test_decorator.py b/tests/policy/test_decorator.py index c95a087..a99eec1 100644 --- a/tests/policy/test_decorator.py +++ b/tests/policy/test_decorator.py @@ -1,6 +1,102 @@ +import ast +import importlib.util + import pytest -from legis.policy.decorator import PolicyBoundaryMetadata, policy_boundary +from legis.policy.decorator import ( + PolicyBoundaryMetadata, + fingerprint, + fingerprint_source, + policy_boundary, +) + + +# --- Q-L5: the runtime gate and the static scanner must agree --- + +def _static_fingerprint(module_source: str, name: str) -> str: + """Reproduce the static scanner's extraction: the FunctionDef segment + (decorators excluded) run through the shared canonicalization.""" + tree = ast.parse(module_source) + node = next( + n + for n in ast.walk(tree) + if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) and n.name == name + ) + segment = ast.get_source_segment(module_source, node) or "" + return fingerprint_source(segment) + + +def _runtime_fingerprint(tmp_path, module_source: str, name: str) -> str: + """Reproduce the runtime gate's extraction: inspect.getsource of the live + function (decorators included).""" + path = tmp_path / "refmod.py" + path.write_text(module_source, encoding="utf-8") + spec = importlib.util.spec_from_file_location("refmod_ql5", path) + mod = importlib.util.module_from_spec(spec) + spec.loader.exec_module(mod) + return fingerprint(getattr(mod, name)) + + +_DECORATED_TEST_MODULE = ( + "import functools\n" + "\n" + "def deco(f):\n" + " @functools.wraps(f)\n" + " def w(*a, **k):\n" + " return f(*a, **k)\n" + " return w\n" + "\n" + "@deco\n" + "def referenced_test():\n" + ' """exercises the boundary"""\n' + " assert True\n" +) + + +def test_runtime_and_static_fingerprints_agree_for_decorated_test(tmp_path): + # The crux of Q-L5: inspect.getsource includes the @deco line, while + # ast.get_source_segment of the FunctionDef does not 
— decorator-insensitive + # normalization makes the two paths converge. + runtime = _runtime_fingerprint(tmp_path, _DECORATED_TEST_MODULE, "referenced_test") + static = _static_fingerprint(_DECORATED_TEST_MODULE, "referenced_test") + assert runtime == static + + +def test_runtime_and_static_fingerprints_agree_for_class_method(tmp_path): + # Class methods are indented and may be decorated; dedent + decorator strip + # must still make the two extraction paths agree. + module = ( + "import functools\n" + "\n" + "def deco(f):\n" + " return f\n" + "\n" + "class TestThing:\n" + " @deco\n" + " def referenced_test(self):\n" + " assert 1 + 1 == 2\n" + ) + path = tmp_path / "refmod.py" + path.write_text(module, encoding="utf-8") + spec = importlib.util.spec_from_file_location("refmod_ql5_cls", path) + mod = importlib.util.module_from_spec(spec) + spec.loader.exec_module(mod) + runtime = fingerprint(mod.TestThing.referenced_test) + static = _static_fingerprint(module, "referenced_test") + assert runtime == static + + +def test_fingerprint_source_is_crlf_invariant(): + lf = "def t():\n assert True\n" + crlf = lf.replace("\n", "\r\n") + assert fingerprint_source(lf) == fingerprint_source(crlf) + + +def test_fingerprint_source_unparsable_fragment_falls_back(): + # A non-parseable fragment hashes the normalized text rather than raising — + # both paths share this fallback, so they still agree. 
+ frag = " assert broken(:\n" + assert isinstance(fingerprint_source(frag), str) def test_decorator_is_passthrough_and_attaches_metadata(): diff --git a/tests/policy/test_exemptions.py b/tests/policy/test_exemptions.py index 2ae7283..c9f576d 100644 --- a/tests/policy/test_exemptions.py +++ b/tests/policy/test_exemptions.py @@ -6,7 +6,6 @@ Exemption, ExemptionAllowlist, ExemptionError, - ExemptionRegistry, load_exemptions, ) diff --git a/tests/policy/test_honesty_gate.py b/tests/policy/test_honesty_gate.py index 58516f0..8dac7a1 100644 --- a/tests/policy/test_honesty_gate.py +++ b/tests/policy/test_honesty_gate.py @@ -7,9 +7,11 @@ ) -# A real, resolvable "test" function the gate will fingerprint. +# Fixture functions the gate fingerprints BY SOURCE — they are never executed, +# so the free `handler` name is intentional (it stands for the real boundary +# call the gate looks for); noqa keeps that deliberate undefined name. def fake_boundary_test(): - result = handler("payload") + result = handler("payload") # noqa: F821 assert result == "payload", "no-eval" @@ -20,7 +22,7 @@ def string_only_boundary_test(): def weak_policy_boundary_test(): - assert handler("payload") == "payload" + assert handler("payload") == "payload" # noqa: F821 assert "no-eval" == "no-eval" @@ -57,7 +59,7 @@ def test_gate_passes_with_a_pinned_unmodified_test(): def test_gate_parses_nested_test_sources_consistently(): def nested_boundary_test(): - result = handler("payload") + result = handler("payload") # noqa: F821 assert result == "payload", "no-eval" good = fingerprint(nested_boundary_test) diff --git a/tests/service/test_governance.py b/tests/service/test_governance.py index f3a22e4..10766cf 100644 --- a/tests/service/test_governance.py +++ b/tests/service/test_governance.py @@ -6,6 +6,7 @@ from legis.enforcement.protected import ProtectedGate, TamperError from legis.enforcement.verdict import JudgeOpinion, Verdict from legis.identity.entity_key import EntityKey +from legis.identity.resolver 
import IdentityResolutionStatus, LineageSnapshotStatus from legis.service.errors import AuditIntegrityError, InvalidArgumentError from legis.service.governance import ( compute_override_rate, @@ -18,11 +19,27 @@ class _FakeResult: + # Mirrors IdentityResolution, including the two mandatory str,Enum status + # axes. Defaults derive from ``alive`` via the same bijection the real type + # now enforces in __post_init__, so a contradictory fake can't sneak through. def __init__(self, entity_key, alive, content_hash, lineage_snapshot): self.entity_key = entity_key self.alive = alive self.content_hash = content_hash self.lineage_snapshot = lineage_snapshot + self.identity_resolution_status = { + True: IdentityResolutionStatus.RESOLVED, + False: IdentityResolutionStatus.NOT_ALIVE, + None: IdentityResolutionStatus.UNAVAILABLE, + }[alive] + if alive: + self.lineage_snapshot_status = ( + LineageSnapshotStatus.VERIFIED + if lineage_snapshot is not None + else LineageSnapshotStatus.UNAVAILABLE + ) + else: + self.lineage_snapshot_status = LineageSnapshotStatus.NOT_APPLICABLE class _FakeIdentity: @@ -265,6 +282,27 @@ def test_evaluate_override_rate_gate_scores_with_key(tmp_path): assert res.status in {GateStatus.PASS, GateStatus.PASS_WITH_NOTICE, GateStatus.FAIL} +def test_evaluate_override_rate_gate_ignores_soft_sniffs_on_simple_records(tmp_path): + # A chill/coached record can carry an arbitrary extra_extensions dict through + # the simple-tier engine. Such a record holding file_fingerprint/ast_path is + # NOT protected (the engine never writes protected_cell or a signature), so a + # keyless, non-protected deployment must score it rather than fail closed. 
+ from legis.service.governance import evaluate_override_rate_gate + + store = AuditStore(f"sqlite:///{tmp_path / 'gov.db'}") + engine = EnforcementEngine(store, SystemClock()) # chill: no judge + engine.submit_override( + policy="some-policy", + entity_key=EntityKey.from_locator("src/x.py:f"), + rationale="r", + agent_id="a", + extensions={"file_fingerprint": "fp", "ast_path": "ap"}, + ) + records = store.read_all() + res = evaluate_override_rate_gate(records, hmac_key=None, protected_policies=frozenset()) + assert res.status in {GateStatus.PASS, GateStatus.PASS_WITH_NOTICE, GateStatus.FAIL} + + def test_sign_off_raises_not_enabled_when_gate_absent(): from legis.service.errors import NotEnabledError from legis.service.governance import sign_off diff --git a/tests/service/test_wardline.py b/tests/service/test_wardline.py new file mode 100644 index 0000000..9859e61 --- /dev/null +++ b/tests/service/test_wardline.py @@ -0,0 +1,140 @@ +"""Transport-agnostic Wardline scan-routing resolution. + +These pin the single governance decision — "is request-side routing allowed, +and is the cell-spec well-formed?" — that both the HTTP and MCP adapters now +delegate to instead of hand-copying (the duplication this resolver removed). 
+""" + +from __future__ import annotations + +import pytest + +from legis.service.errors import WardlineRoutingError +from legis.service.wardline import resolve_scan_routing +from legis.wardline.governor import WardlineCellPolicy +from legis.wardline.ingest import WardlineSeverity + + +def _resolve(**overrides): + base = dict( + server_cell=None, + server_cell_by_severity=None, + request_cell=None, + request_severity_map=None, + request_fail_on=None, + allow_request_routing=False, + ) + base.update(overrides) + return resolve_scan_routing(**base) + + +def test_server_cell_resolves_to_single_policy(): + r = _resolve(server_cell="surface_override") + assert r.policy is WardlineCellPolicy.SURFACE_OVERRIDE + assert r.cell_map is None and r.fail_on is None + assert r.cells == frozenset({WardlineCellPolicy.SURFACE_OVERRIDE}) + + +def test_server_cell_by_severity_resolves_to_cell_map(): + r = _resolve(server_cell_by_severity="CRITICAL=surface_override,INFO=surface_only") + assert r.policy is None + assert r.cell_map == { + WardlineSeverity.CRITICAL: WardlineCellPolicy.SURFACE_OVERRIDE, + WardlineSeverity.INFO: WardlineCellPolicy.SURFACE_ONLY, + } + + +def test_both_server_env_set_is_server_misconfigured(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(server_cell="surface_only", server_cell_by_severity="INFO=surface_only") + assert exc.value.kind == WardlineRoutingError.SERVER_MISCONFIGURED + + +def test_request_routing_under_server_ownership_is_rejected(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(server_cell="surface_only", request_cell="surface_override") + assert exc.value.kind == WardlineRoutingError.SERVER_OWNED + assert "server-owned" in str(exc.value) + + +def test_request_routing_without_optin_is_server_owned(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(request_cell="surface_override", allow_request_routing=False) + assert exc.value.kind == WardlineRoutingError.SERVER_OWNED + assert "server-owned" in 
str(exc.value) + + +def test_request_cell_resolves_when_optedin(): + r = _resolve(request_cell="surface_override", allow_request_routing=True) + assert r.policy is WardlineCellPolicy.SURFACE_OVERRIDE + + +def test_request_severity_map_resolves_when_optedin(): + r = _resolve( + request_severity_map={"CRITICAL": "surface_override"}, + allow_request_routing=True, + ) + assert r.cell_map == {WardlineSeverity.CRITICAL: WardlineCellPolicy.SURFACE_OVERRIDE} + + +def test_request_fail_on_with_cell_resolves_and_exposes_surface_only(): + r = _resolve( + request_cell="surface_override", request_fail_on="ERROR", + allow_request_routing=True, + ) + assert r.policy is WardlineCellPolicy.SURFACE_OVERRIDE + assert r.fail_on is WardlineSeverity.ERROR + # fail_on resolves per-finding to the gate cell or surface_only, so both may run. + assert r.cells == frozenset( + {WardlineCellPolicy.SURFACE_OVERRIDE, WardlineCellPolicy.SURFACE_ONLY} + ) + + +def test_fail_on_without_cell_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve( + request_fail_on="ERROR", + request_severity_map={"ERROR": "surface_only"}, + allow_request_routing=True, + ) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_both_cell_and_map_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve( + request_cell="surface_only", + request_severity_map={"INFO": "surface_only"}, + allow_request_routing=True, + ) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_neither_cell_nor_map_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(allow_request_routing=True) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_empty_request_severity_map_is_malformed(): + # The drift fix: HTTP already rejected an empty cell_by_severity; MCP silently + # accepted an empty severity_map (routed nothing). The resolver rejects it for + # both transports. 
+ with pytest.raises(WardlineRoutingError) as exc: + _resolve(request_severity_map={}, allow_request_routing=True) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_unknown_cell_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(request_cell="not_a_cell", allow_request_routing=True) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_unknown_fail_on_severity_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve( + request_cell="surface_only", request_fail_on="SEVERE", + allow_request_routing=True, + ) + assert exc.value.kind == WardlineRoutingError.MALFORMED diff --git a/tests/store/test_audit_store.py b/tests/store/test_audit_store.py index 7c9fa85..6e8362c 100644 --- a/tests/store/test_audit_store.py +++ b/tests/store/test_audit_store.py @@ -1,8 +1,9 @@ +import logging import sqlite3 import pytest -from legis.store.audit_store import AuditStore +from legis.store.audit_store import AuditStore, _apply_sqlite_pragmas def db_path(tmp_path): @@ -69,7 +70,7 @@ def test_verify_integrity_passes_on_clean_chain(tmp_path): assert s.verify_integrity() is True -def test_verify_integrity_detects_out_of_band_tamper(tmp_path): +def test_verify_integrity_detects_out_of_band_tamper(tmp_path, caplog): s = make_store(tmp_path) s.append({"k": "a"}) s.append({"k": "b"}) @@ -84,10 +85,13 @@ def test_verify_integrity_detects_out_of_band_tamper(tmp_path): conn.commit() finally: conn.close() - assert s.verify_integrity() is False + with caplog.at_level(logging.ERROR, logger="legis.store.audit_store"): + assert s.verify_integrity() is False + # An investigator needs the offending seq, not a bare False. 
+ assert "integrity check failed at seq=1" in caplog.text -def test_verify_integrity_handles_malformed_json_as_integrity_failure(tmp_path): +def test_verify_integrity_handles_malformed_json_as_integrity_failure(tmp_path, caplog): s = make_store(tmp_path) s.append({"k": "a"}) conn = raw_conn(tmp_path) @@ -101,7 +105,9 @@ def test_verify_integrity_handles_malformed_json_as_integrity_failure(tmp_path): finally: conn.close() - assert s.verify_integrity() is False + with caplog.at_level(logging.ERROR, logger="legis.store.audit_store"): + assert s.verify_integrity() is False + assert "integrity check failed" in caplog.text def test_audit_store_concurrent_writes(tmp_path): @@ -128,6 +134,96 @@ def run_appends(tid, count): assert s.verify_integrity() is True +def test_pragma_wal_actually_applied_on_file(tmp_path): + # The connect listener must put the on-disk DB into WAL mode. journal_mode is + # a persistent file-header property, so an *external* connection that never + # ran our listener still observes it — proof WAL truly applied to the file. + make_store(tmp_path) + conn = raw_conn(tmp_path) + try: + mode = conn.execute("PRAGMA journal_mode").fetchone()[0] + finally: + conn.close() + assert mode.lower() == "wal" + + +def test_pragma_busy_timeout_set_on_listener_connection(tmp_path): + # busy_timeout is per-connection (not persistent), so it must be read on a + # connection that went through the listener — i.e. one from the store engine. 
+ s = make_store(tmp_path) + with s._engine.connect() as conn: + timeout = conn.exec_driver_sql("PRAGMA busy_timeout").scalar() + assert timeout == 5000 + + +class _FakeCursor: + def __init__(self, journal_mode): + self._journal_mode = journal_mode + self.closed = False + + def execute(self, _sql): + return self + + def fetchone(self): + return (self._journal_mode,) + + def close(self): + self.closed = True + + +class _FakeConn: + def __init__(self, journal_mode): + self.cursor_obj = _FakeCursor(journal_mode) + + def cursor(self): + return self.cursor_obj + + +class _RaisingCursor: + def __init__(self): + self.closed = False + + def execute(self, _sql): + raise sqlite3.OperationalError("PRAGMA rejected") + + def close(self): + self.closed = True + + +class _RaisingConn: + def __init__(self): + self.cursor_obj = _RaisingCursor() + + def cursor(self): + return self.cursor_obj + + +def test_apply_pragmas_warns_when_wal_not_applied(caplog): + # The silent failure the bare `except` never caught: PRAGMA journal_mode=WAL + # does NOT raise when WAL is unavailable — it returns the mode actually in + # force (e.g. 'delete'/'memory'). That must surface as a warning. + conn = _FakeConn("delete") + with caplog.at_level(logging.WARNING, logger="legis.store.audit_store"): + _apply_sqlite_pragmas(conn, "sqlite:///some.db") + assert any( + "wal" in r.getMessage().lower() for r in caplog.records + ), f"expected a WAL-not-applied warning; got {[r.getMessage() for r in caplog.records]}" + assert conn.cursor_obj.closed is True + + +def test_apply_pragmas_warns_with_exc_info_on_pragma_exception(caplog): + # A PRAGMA that genuinely raises must be logged (with exc_info), not swallowed, + # and the connection setup must still complete (cursor closed, no re-raise). 
+ conn = _RaisingConn() + with caplog.at_level(logging.WARNING, logger="legis.store.audit_store"): + _apply_sqlite_pragmas(conn, "sqlite:///some.db") + assert caplog.records, "expected a warning when PRAGMA application raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + assert conn.cursor_obj.closed is True + + def test_verify_integrity_handles_non_finite_float_as_integrity_failure(tmp_path): # json.loads accepts Infinity/NaN, so the payload survives read_all's # decode guard, but content_hash -> canonical_json(allow_nan=False) raises diff --git a/tests/store/test_batch_read_free_invariant.py b/tests/store/test_batch_read_free_invariant.py new file mode 100644 index 0000000..5d84eef --- /dev/null +++ b/tests/store/test_batch_read_free_invariant.py @@ -0,0 +1,134 @@ +"""The transaction() read-free invariant is enforced and gate-path-proven (Q-M5). + +`AuditStore.transaction()` groups appends into one all-or-nothing batch behind a +held `BEGIN IMMEDIATE` write lock. Its contract is appends-only: a fresh- +connection read inside the batch would miss the uncommitted appends and contend +with the lock (`SQLITE_BUSY`). These tests pin that the store now *enforces* the +invariant (turning silent contention into a loud error), that the real gate +append paths driven through `route_findings` honour it, and that the batch is +genuinely all-or-nothing on a real on-disk SQLite file. 
+""" + +from __future__ import annotations + +import pytest + +from legis.clock import FixedClock +from legis.enforcement.engine import EnforcementEngine +from legis.enforcement.signoff import SignoffGate +from legis.identity.entity_key import EntityKey +from legis.store.audit_store import AuditStore +from legis.wardline.governor import WardlineCellPolicy, route_findings +from legis.wardline.ingest import active_defects + +_CLOCK = "2026-06-02T12:00:00+00:00" + + +def _on_disk_store(tmp_path, name="g.db") -> AuditStore: + # A real file, NOT sqlite:///:memory: and NOT shared-cache — so the held + # BEGIN IMMEDIATE genuinely locks a second connection out (the condition the + # invariant protects against). + return AuditStore(f"sqlite:///{tmp_path / name}") + + +def _scan(n: int) -> dict: + return { + "findings": [ + { + "rule_id": f"PY-WL-{100 + i}", + "message": f"untrusted reaches trusted #{i}", + "severity": "ERROR", + "kind": "defect", + "fingerprint": f"fp{i}", + "qualname": f"m.f{i}", + "properties": {"actual_return": "UNKNOWN_RAW"}, + "suppressed": "active", + } + for i in range(n) + ] + } + + +# --- the guard itself: a read inside a held batch raises, not contends --- + +@pytest.mark.parametrize( + "call", + [ + lambda s: s.read_all(), + lambda s: s.read_by_seq(1), + lambda s: s.verify_integrity(), + lambda s: s.get_latest_sequence_and_hash(), + ], +) +def test_read_inside_batch_raises_runtime_error(tmp_path, call): + store = _on_disk_store(tmp_path) + store.append({"event": "before"}) + with pytest.raises(RuntimeError, match="active transaction"): + with store.transaction(): + store.append({"event": "in-batch"}) + call(store) + + +def test_reads_work_again_after_batch_exits(tmp_path): + store = _on_disk_store(tmp_path) + with store.transaction(): + store.append({"event": "a"}) + store.append({"event": "b"}) + # Once the batch commits and the thread-local clears, reads are fine again. 
+ assert len(store.read_all()) == 2 + assert store.verify_integrity() is True + + +# --- the real gate append paths, driven through route_findings' batch --- + +def test_surface_override_batch_is_read_free_on_disk(tmp_path): + # EnforcementEngine.submit_override is the append path here. If it (or + # anything it calls) issued a fresh-connection read inside the batch, the + # guard would raise; a clean completion proves the path is read-free. + engine = EnforcementEngine(_on_disk_store(tmp_path), FixedClock(_CLOCK)) + results = route_findings( + active_defects(_scan(3)), + policy=WardlineCellPolicy.SURFACE_OVERRIDE, + agent_id="agent-1", + resolve=lambda q: (EntityKey.from_locator(q or "unknown"), {}), + engine=engine, + ) + assert len(results) == 3 + # All three landed atomically and the chain is intact (reads outside batch). + assert len(engine.records()) == 3 + assert engine._store.verify_integrity() is True + + +def test_block_escalate_batch_is_read_free_on_disk(tmp_path): + # SignoffGate.request is the append path here. 
+ gate = SignoffGate(_on_disk_store(tmp_path), FixedClock(_CLOCK)) + results = route_findings( + active_defects(_scan(3)), + policy=WardlineCellPolicy.BLOCK_ESCALATE, + agent_id="agent-1", + resolve=lambda q: (EntityKey.from_locator(q or "unknown"), {}), + signoff=gate, + ) + assert len(results) == 3 + assert len(gate.records()) == 3 + assert gate._store.verify_integrity() is True + + +# --- all-or-nothing on a real file: a mid-batch failure rolls everything back --- + +def test_batch_rolls_back_atomically_on_disk(tmp_path): + store = _on_disk_store(tmp_path) + store.append({"event": "committed-before-batch"}) + + with pytest.raises(RuntimeError, match="boom"): + with store.transaction(): + store.append({"event": "batch-1"}) + store.append({"event": "batch-2"}) + raise RuntimeError("boom") # mid-loop failure + + # The two in-batch appends rolled back; only the pre-batch record survives, + # and the hash chain is unbroken — proving real on-disk atomicity, not a + # half-written batch. + records = store.read_all() + assert [r.payload["event"] for r in records] == ["committed-before-batch"] + assert store.verify_integrity() is True diff --git a/tests/test_canonical.py b/tests/test_canonical.py index 100c2c2..e88cfb0 100644 --- a/tests/test_canonical.py +++ b/tests/test_canonical.py @@ -17,3 +17,19 @@ def test_content_hash_is_stable_and_hex(): def test_canonical_json_rejects_non_standard_float_values(): with pytest.raises(ValueError): canonical_json({"bad": float("nan")}) + + +def test_canonical_json_preserves_non_ascii(): + # ``ensure_ascii=False`` is a deliberate, load-bearing choice: a Wardline + # ``artifact_signature`` is an HMAC over these exact bytes, and Wardline's + # signer (wardline/core/legis.py) is a byte-for-byte Python replica using the + # same params. A non-ASCII finding message must therefore serialise to the + # literal character, not a ``\\uXXXX`` escape, or the cross-tool signature + # would diverge. 
This locks legis's own output; the cross-impl pin lives in + # Wardline's golden HMAC vector. Mirrors Wardline's + # ``test_canonical_json_is_sorted_tight_unicode``. + assert canonical_json({"b": 1, "a": "é"}) == '{"a":"é","b":1}' + # Round-trips through the UTF-8 encode content_hash uses. + assert canonical_json({"msg": "café—naïve"}).encode("utf-8").decode("utf-8") == ( + '{"msg":"café—naïve"}' + ) diff --git a/tests/test_cli.py b/tests/test_cli.py index a23a153..95e092f 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -39,6 +39,20 @@ def test_main_no_command_returns_2(): assert rc == 2 +def test_version_flag_prints_version_and_exits_zero(capsys): + import pytest + + from legis import __version__ + + with pytest.raises(SystemExit) as excinfo: + build_parser().parse_args(["--version"]) + # argparse's version action exits 0 after printing. + assert excinfo.value.code == 0 + out = capsys.readouterr().out + assert __version__ in out + assert "legis" in out + + def test_check_override_rate_exits_1_on_fail(tmp_path, capsys): from legis.clock import FixedClock from legis.enforcement.engine import EnforcementEngine diff --git a/tests/test_cli_install.py b/tests/test_cli_install.py new file mode 100644 index 0000000..1ad799c --- /dev/null +++ b/tests/test_cli_install.py @@ -0,0 +1,159 @@ +"""Tests for the install / session-context CLI surfaces and MCP-boot refresh.""" + +from __future__ import annotations + +import json + +from legis import install +from legis.cli import build_parser, main +from legis.install import INSTRUCTIONS_MARKER, SKILL_NAME + + +def test_install_all_creates_every_artifact(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + rc = main(["install"]) + assert rc == 0 + + assert INSTRUCTIONS_MARKER in (tmp_path / "CLAUDE.md").read_text() + assert INSTRUCTIONS_MARKER in (tmp_path / "AGENTS.md").read_text() + assert (tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md").is_file() + assert (tmp_path / ".agents" / "skills" / 
SKILL_NAME / "SKILL.md").is_file() + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + assert "SessionStart" in settings["hooks"] + gitignore = (tmp_path / ".gitignore").read_text() + assert ".weft/legis/" in gitignore + + +def test_install_selective_gitignore_only(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = main(["install", "--gitignore"]) + assert rc == 0 + assert (tmp_path / ".gitignore").exists() + assert not (tmp_path / "CLAUDE.md").exists() + assert not (tmp_path / ".claude").exists() + + +def test_install_claude_md_only(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = main(["install", "--claude-md"]) + assert rc == 0 + assert (tmp_path / "CLAUDE.md").exists() + assert not (tmp_path / "AGENTS.md").exists() + + +def test_install_reports_failure_rc1_on_symlink(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + real = tmp_path / "real.md" + real.write_text("x") + (tmp_path / "CLAUDE.md").symlink_to(real) + rc = main(["install", "--claude-md"]) + assert rc == 1 + assert "FAIL" in capsys.readouterr().out + + +def test_install_renders_fail_and_continues_when_a_step_raises(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + + def boom(_root): + raise RuntimeError("step blew up") + + monkeypatch.setattr(install, "install_skills", boom) + rc = main(["install"]) + out = capsys.readouterr().out + # A raising step is rendered as a [FAIL] line, not a traceback that aborts + # the run and leaves the install half-applied... + assert "[FAIL] Claude Code skill: step blew up" in out + # ...and the steps after it still run. 
+ assert (tmp_path / ".gitignore").exists() + assert rc == 1 + + +def test_session_context_silent_when_fresh(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + install.inject_instructions(tmp_path / "CLAUDE.md") + rc = main(["session-context"]) + assert rc == 0 + assert capsys.readouterr().out == "" + + +def test_session_context_prints_on_drift(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + install.inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED\n") + rc = main(["session-context"]) + assert rc == 0 + assert "CLAUDE.md" in capsys.readouterr().out + + +def test_install_subcommand_parses_flags(): + args = build_parser().parse_args(["install", "--claude-md", "--hooks"]) + assert args.command == "install" + assert args.claude_md is True + assert args.hooks is True + assert args.agents_md is False + + +# --------------------------------------------------------------------------- +# MCP-boot refresh wiring +# --------------------------------------------------------------------------- + + +def test_mcp_boot_refreshes_drifted_instructions(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + install.inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED ON BOOT\n") + + import legis.mcp as mcp_module + + monkeypatch.setattr(mcp_module, "main", lambda agent_id: 0) + + rc = main(["mcp", "--agent-id", "agent-1"]) + assert rc == 0 + assert "DRIFTED ON BOOT" in (tmp_path / "CLAUDE.md").read_text() + + +def test_mcp_boot_refresh_failure_does_not_break_startup(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + + import legis.hooks as hooks_module + import legis.mcp as mcp_module + + calls = [] + + def boom(_root): + raise RuntimeError("refresh exploded") + + monkeypatch.setattr(hooks_module, "refresh_instructions", boom) + monkeypatch.setattr(mcp_module, "main", lambda agent_id: calls.append(agent_id) or 0) + + rc = 
main(["mcp", "--agent-id", "agent-1"]) + assert rc == 0 + assert calls == ["agent-1"] + + +def test_mcp_boot_refresh_failure_is_logged_with_exc_info(tmp_path, monkeypatch, caplog): + # The boot refresh is the ONLY refresh trigger in a Codex-only repo with no + # SessionStart hook. A persistently failing refresh must be visible at the + # default level (WARNING), not swallowed at DEBUG — otherwise agents run on + # drifted instructions with no signal. Mirrors hooks.generate_session_context. + monkeypatch.chdir(tmp_path) + + import logging + + import legis.hooks as hooks_module + import legis.mcp as mcp_module + + def boom(_root): + raise RuntimeError("refresh exploded") + + monkeypatch.setattr(hooks_module, "refresh_instructions", boom) + monkeypatch.setattr(mcp_module, "main", lambda agent_id: 0) + + with caplog.at_level(logging.WARNING, logger="legis.cli"): + rc = main(["mcp", "--agent-id", "agent-1"]) + + assert rc == 0 + assert caplog.records, "expected a warning when boot refresh raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None diff --git a/tests/test_config.py b/tests/test_config.py new file mode 100644 index 0000000..4d1f52e --- /dev/null +++ b/tests/test_config.py @@ -0,0 +1,149 @@ +"""Store-location resolver: the federated ``.weft/legis`` subtree. + +These pin the contract from the weft config/store consolidation: + * machine-written DBs default under ``.weft/legis/`` (cwd-anchored, the same + notion the installer uses for project root); + * the operator-authored ``weft.toml`` ``[legis]`` table may relocate the + subtree but is enrich-only — absent, section-less, or malformed weft.toml + must still boot on built-in defaults (never load-bearing); + * computing a URL is pure (creates nothing); the directory materialises only + when a DB is actually opened, via ``ensure_sqlite_parent``. 
+""" + +from __future__ import annotations + +import pytest + +from legis import config + + +@pytest.fixture +def _clear_db_env(monkeypatch): + """Clear the per-DB ``LEGIS_*_DB`` overrides so a test can probe the lower + weft.toml / built-in-default precedence layers. The autouse suite fixture + (tests/conftest.py) sets these to isolate stores, and the resolvers now honour + them (highest precedence), so a default-layer assertion must drop them first. + A test's own monkeypatch runs after the autouse fixture, so this wins. + """ + for var in ( + "LEGIS_CHECK_DB", + "LEGIS_GOVERNANCE_DB", + "LEGIS_BINDING_DB", + "LEGIS_PULL_DB", + ): + monkeypatch.delenv(var, raising=False) + + +def test_all_four_db_urls_default_under_weft_legis(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + assert config.check_db_url() == "sqlite:///.weft/legis/legis-checks.db" + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + assert config.binding_db_url() == "sqlite:///.weft/legis/legis-binding.db" + assert config.pull_db_url() == "sqlite:///.weft/legis/legis-pulls.db" + + +def test_legis_db_env_var_takes_precedence_over_weft_toml_and_default(tmp_path, monkeypatch): + # The documented precedence (module docstring): a per-DB LEGIS_*_DB override + # wins over both the weft.toml store_dir and the built-in default. The + # resolvers must implement this themselves, so a bare call honours the env. + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text( + '[legis]\nstore_dir = "var/legis-state"\n', encoding="utf-8" + ) + monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "sqlite:///explicit-gov.db") + monkeypatch.setenv("LEGIS_CHECK_DB", "sqlite:///explicit-check.db") + assert config.governance_db_url() == "sqlite:///explicit-gov.db" + assert config.check_db_url() == "sqlite:///explicit-check.db" + # An unset var still falls through to weft.toml store_dir for that DB. 
+ monkeypatch.delenv("LEGIS_BINDING_DB", raising=False) + assert config.binding_db_url() == "sqlite:///var/legis-state/legis-binding.db" + + +def test_db_urls_use_builtin_defaults_with_no_weft_toml(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + assert not (tmp_path / "weft.toml").exists() + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + + +def test_weft_toml_store_dir_relocates_the_subtree(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text( + '[legis]\nstore_dir = "var/legis-state"\n', encoding="utf-8" + ) + assert config.governance_db_url() == "sqlite:///var/legis-state/legis-governance.db" + assert config.check_db_url() == "sqlite:///var/legis-state/legis-checks.db" + + +def test_weft_toml_absolute_store_dir_yields_absolute_url(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + abs_dir = tmp_path / "srv" / "legis" + (tmp_path / "weft.toml").write_text( + f'[legis]\nstore_dir = "{abs_dir.as_posix()}"\n', encoding="utf-8" + ) + assert config.governance_db_url() == f"sqlite:///{abs_dir.as_posix()}/legis-governance.db" + + +def test_weft_toml_without_legis_section_uses_defaults(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text('[filigree]\ndb = "x"\n', encoding="utf-8") + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + + +def test_malformed_weft_toml_is_not_load_bearing(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text("this is = = not valid toml [[[", encoding="utf-8") + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + + +def test_computing_db_url_creates_no_directories(_clear_db_env, tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + _ = config.governance_db_url() + _ = config.check_db_url() + _ = config.binding_db_url() + _ = config.pull_db_url() 
+ assert not (tmp_path / ".weft").exists() + + +def test_ensure_sqlite_parent_creates_dir_for_relative_file_url(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") + assert (tmp_path / ".weft" / "legis").is_dir() + + +def test_ensure_sqlite_parent_creates_dir_for_absolute_file_url(tmp_path): + target = tmp_path / "a" / "b" / "x.db" + config.ensure_sqlite_parent(f"sqlite:///{target.as_posix()}") + assert (tmp_path / "a" / "b").is_dir() + + +def test_ensure_sqlite_parent_is_noop_for_in_memory_and_non_sqlite(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + config.ensure_sqlite_parent("sqlite://") + config.ensure_sqlite_parent("sqlite:///:memory:") + config.ensure_sqlite_parent("postgresql://localhost/x") + assert list(tmp_path.iterdir()) == [] + + +def test_ensure_sqlite_parent_is_idempotent(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") + config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") + assert (tmp_path / ".weft" / "legis").is_dir() + + +def test_suite_isolates_store_locations_to_tmp(): + """Regression guard for legis-3d295a6f7f: the autouse conftest fixture must + redirect every store env var off the repo-relative `.weft/legis/` default, + so a test that builds a default-path store can't leak a subtree into the + working tree.""" + import os + + for var in ( + "LEGIS_CHECK_DB", + "LEGIS_GOVERNANCE_DB", + "LEGIS_BINDING_DB", + "LEGIS_PULL_DB", + ): + val = os.environ.get(var, "") + assert val.startswith("sqlite:"), f"{var} not redirected: {val!r}" + assert "legis-store" in val, f"{var} not pointed at the isolated tmp dir: {val!r}" diff --git a/tests/test_doctor.py b/tests/test_doctor.py new file mode 100644 index 0000000..26b4003 --- /dev/null +++ b/tests/test_doctor.py @@ -0,0 +1,466 @@ +from __future__ import annotations + +import json + +from legis.cli import main as cli_main +from 
legis.doctor import ( + DoctorCheck, + check_gitignore, + check_hook, + check_instruction_block, + check_mcp_json, + check_skill_pack, + render_json, + render_text, + run_doctor, +) +from legis import install as legis_install + + +def test_doctorcheck_to_dict_omits_empty_message(): + assert DoctorCheck("a.b", "ok").to_dict() == {"id": "a.b", "status": "ok", "fixed": False} + assert DoctorCheck("a.b", "error", message="boom").to_dict() == { + "id": "a.b", + "status": "error", + "fixed": False, + "message": "boom", + } + + +def test_render_json_shape(): + checks = [DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")] + payload = json.loads(render_json(checks)) + assert payload["ok"] is False + assert payload["checks"][0] == {"id": "a", "status": "ok", "fixed": False} + assert payload["next_actions"] == ["b: bad"] + + +def test_render_text_lists_only_problems_when_healthy_says_ok(): + # all-ok: banner present, no problem lines + assert render_text([DoctorCheck("a", "ok")]) == "legis doctor: ok" + + # error present: no "ok" in headline, error listed + out = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")]) + assert "b: error" in out + assert "legis doctor: ok" not in out + + # warn-only: banner present with warning count AND warn check is listed + out_warn = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "warn", message="heads up")]) + assert "legis doctor: ok" in out_warn + assert "b: warn" in out_warn + + +def test_run_doctor_healthy_after_repair(tmp_path, capsys): + # A project repaired via run_doctor renders healthy on re-check, exit 0. 
+ run_doctor(tmp_path, repair=True, fmt="text") + capsys.readouterr() # discard repair output + rc = run_doctor(tmp_path, repair=False, fmt="text") + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_run_doctor_json_format(tmp_path, capsys): + run_doctor(tmp_path, repair=True, fmt="json") + capsys.readouterr() # discard repair output + rc = run_doctor(tmp_path, repair=False, fmt="json") + assert rc == 0 + payload = json.loads(capsys.readouterr().out) + assert payload["ok"] is True + assert payload["next_actions"] == [] + + +def test_cli_doctor_runs_and_exits_zero(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor", "--repair"]) + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_cli_doctor_json(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor", "--repair", "--format", "json"]) + assert rc == 0 + assert json.loads(capsys.readouterr().out)["ok"] is True + + +# --------------------------------------------------------------------------- +# check_mcp_json +# --------------------------------------------------------------------------- + + +def test_mcp_json_absent_is_error(tmp_path): + c = check_mcp_json(tmp_path, repair=False) + assert c.id == "install.mcp_json" + assert c.status == "error" + assert c.fixed is False + + +def test_mcp_json_repair_fixes_it(tmp_path): + c = check_mcp_json(tmp_path, repair=True) + assert c.status == "ok" + assert c.fixed is True + assert (tmp_path / ".mcp.json").exists() + + +def test_mcp_json_present_is_ok(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path) + c = check_mcp_json(tmp_path, repair=False) + assert c.status == "ok" + assert c.fixed is False + + +def test_mcp_json_stale_command_is_error_then_repaired(tmp_path): + """An entry with a dead command path is stale and must trigger repair.""" + stale_entry = { + "mcpServers": { + "legis": { + "type": "stdio", + 
"command": "/nonexistent/legis-xyz", + "args": ["mcp", "--agent-id", "claude-code"], + "env": {}, + } + } + } + (tmp_path / ".mcp.json").write_text(json.dumps(stale_entry)) + c = check_mcp_json(tmp_path, repair=False) + assert c.id == "install.mcp_json" + assert c.status == "error" + + fixed = check_mcp_json(tmp_path, repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + + +# --------------------------------------------------------------------------- +# Direct unit tests for mcp_entry_is_current predicate +# --------------------------------------------------------------------------- + + +from legis.install import mcp_entry_is_current, register_mcp_json as _register_mcp_json + + +def test_mcp_entry_is_current_absent_file(tmp_path): + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_malformed_json(tmp_path): + (tmp_path / ".mcp.json").write_text("{not valid json") + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_non_dict_top_level(tmp_path): + (tmp_path / ".mcp.json").write_text('["just", "an", "array"]') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_missing_mcp_servers(tmp_path): + (tmp_path / ".mcp.json").write_text('{"other": {}}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_mcp_servers_not_dict(tmp_path): + (tmp_path / ".mcp.json").write_text('{"mcpServers": "not a dict"}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_no_legis_entry(tmp_path): + (tmp_path / ".mcp.json").write_text('{"mcpServers": {"other": {}}}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_legis_entry_not_dict(tmp_path): + (tmp_path / ".mcp.json").write_text('{"mcpServers": {"legis": "string"}}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_args_without_mcp(tmp_path): + entry = {"mcpServers": {"legis": {"command": 
"legis", "args": ["serve"]}}} + (tmp_path / ".mcp.json").write_text(json.dumps(entry)) + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_empty_command(tmp_path): + entry = {"mcpServers": {"legis": {"command": "", "args": ["mcp"]}}} + (tmp_path / ".mcp.json").write_text(json.dumps(entry)) + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_dead_command_path(tmp_path): + entry = { + "mcpServers": { + "legis": { + "command": "/nonexistent/legis-xyz", + "args": ["mcp", "--agent-id", "claude-code"], + } + } + } + (tmp_path / ".mcp.json").write_text(json.dumps(entry)) + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_fresh_registered_entry(tmp_path): + """A freshly registered entry must read as current.""" + _register_mcp_json(tmp_path) + assert mcp_entry_is_current(tmp_path) is True + + +# --------------------------------------------------------------------------- +# Task 6: install-wiring checks (blocks, skills, hook, gitignore) +# --------------------------------------------------------------------------- + + +def test_instruction_block_absent_is_error(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=False) + assert c.id == "install.claude_md" + assert c.status == "error" + + +def test_instruction_block_repair_creates_it(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=True) + assert c.status == "ok" + assert c.fixed is True + assert legis_install.INSTRUCTIONS_MARKER in (tmp_path / "CLAUDE.md").read_text() + + +def test_gitignore_absent_is_error_then_repaired(tmp_path): + assert check_gitignore(tmp_path, repair=False).status == "error" + fixed = check_gitignore(tmp_path, repair=True) + assert fixed.status == "ok" and fixed.fixed is True + assert ".weft/legis/" in (tmp_path / ".gitignore").read_text() + + +def test_skill_pack_absent_is_error(tmp_path): + assert check_skill_pack(tmp_path, ".claude", repair=False).status == 
"error" + + +def test_skill_pack_repair_installs(tmp_path): + c = check_skill_pack(tmp_path, ".claude", repair=True) + assert c.status == "ok" and c.fixed is True + + +# --------------------------------------------------------------------------- +# Task 6 (drift): stale block / stale skill pack are the headline behavior +# --------------------------------------------------------------------------- + + +def test_instruction_block_stale_token_is_error_then_repaired(tmp_path): + # A real block with a mutated marker token: marker present, token mismatch. + legis_install.inject_instructions(tmp_path / "CLAUDE.md") + path = tmp_path / "CLAUDE.md" + content = path.read_text() + fresh_token = legis_install._marker_token() + stale = content.replace(f":{fresh_token} -->", ":v0:deadbeef -->", 1) + assert stale != content # the token really was rewritten + path.write_text(stale) + assert legis_install._extract_marker_token(stale) != fresh_token + + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=False) + assert c.status == "error" + + fixed = check_instruction_block(tmp_path, "CLAUDE.md", repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + assert legis_install._extract_marker_token((tmp_path / "CLAUDE.md").read_text()) == fresh_token + + +def test_skill_pack_stale_fingerprint_is_error_then_repaired(tmp_path): + legis_install.install_skills(tmp_path) + pack = tmp_path / ".claude" / "skills" / legis_install.SKILL_NAME + # Mutate a file under the installed pack so its fingerprint diverges from source. 
+ skill_md = pack / "SKILL.md" + skill_md.write_text(skill_md.read_text() + "\n\n") + + c = check_skill_pack(tmp_path, ".claude", repair=False) + assert c.status == "error" + + fixed = check_skill_pack(tmp_path, ".claude", repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + + +# --------------------------------------------------------------------------- +# Task 6: hook check +# --------------------------------------------------------------------------- + + +def test_hook_absent_is_error_then_repaired(tmp_path): + c = check_hook(tmp_path, repair=False) + assert c.id == "install.hook" + assert c.status == "error" + + fixed = check_hook(tmp_path, repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + + +# --------------------------------------------------------------------------- +# Task 7: config & store checks (weft.toml report-only, store dir, db overrides, legacy) +# --------------------------------------------------------------------------- + + +from legis.doctor import check_weft_toml, check_store_dir, check_db_overrides, check_legacy_stray_db + + +def test_weft_toml_absent_is_ok(tmp_path): + assert check_weft_toml(tmp_path).status == "ok" + + +def test_weft_toml_valid_legis_table_is_ok(tmp_path): + (tmp_path / "weft.toml").write_text('[legis]\nstore_dir = ".weft/legis"\n') + assert check_weft_toml(tmp_path).status == "ok" + + +def test_weft_toml_malformed_is_error_and_unchanged(tmp_path): + wt = tmp_path / "weft.toml" + wt.write_text("[legis]\nstore_dir = \n") # malformed TOML + before = wt.read_text() + c = check_weft_toml(tmp_path) + assert c.status == "error" + assert wt.read_text() == before # C-9(b): never written + + +def test_weft_toml_legis_not_a_table_is_error(tmp_path): + (tmp_path / "weft.toml").write_text('legis = "oops"\n') + assert check_weft_toml(tmp_path).status == "error" + + +def test_store_dir_writable_parent_is_ok(tmp_path): + assert check_store_dir(tmp_path).status == "ok" + + +def 
test_db_override_bad_url_is_error(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "::not a url::") + assert check_db_overrides(tmp_path).status == "error" + + +def test_legacy_stray_db_is_warn(tmp_path): + (tmp_path / "legis-governance.db").write_text("x") + assert check_legacy_stray_db(tmp_path).status == "warn" + + +# --------------------------------------------------------------------------- +# Task 8: governance integrity + runtime/sibling checks +# --------------------------------------------------------------------------- + + +from legis.doctor import check_audit_chain, check_hmac_key, check_sibling_url + + +def test_audit_chain_absent_db_is_ok(tmp_path): + c = check_audit_chain("store.governance_chain", "sqlite:///" + str(tmp_path / "nope.db")) + assert c.status == "ok" + # No-leak invariant: must NOT create the file + assert not (tmp_path / "nope.db").exists() + + +def test_audit_chain_intact_db_is_ok(tmp_path): + from legis.store.audit_store import AuditStore + + url = "sqlite:///" + str(tmp_path / "gov.db") + AuditStore(url) # creates schema + assert check_audit_chain("store.governance_chain", url).status == "ok" + + +def test_hmac_key_warn_when_protected_set_without_key(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.delenv("LEGIS_HMAC_KEY", raising=False) + c = check_hmac_key(tmp_path) + assert c.status == "warn" + + +def test_hmac_key_never_prints_value(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.setenv("LEGIS_HMAC_KEY", "super-secret-value") + c = check_hmac_key(tmp_path) + assert c.status == "ok" + assert "super-secret-value" not in (c.message or "") + + +def test_sibling_url_invalid_is_error(tmp_path, monkeypatch): + monkeypatch.setenv("LOOMWEAVE_API_URL", "localhost:9620") # no scheme + c = check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL") + assert c.status == "error" + + +# 
--------------------------------------------------------------------------- +# Review follow-ups: root-anchored store_dir + empty-override precedence +# --------------------------------------------------------------------------- + + +from legis.doctor import _store_url + + +def test_store_dir_root_anchored_via_weft_toml(tmp_path, monkeypatch): + # --root != cwd, with a weft.toml that relocates the store. Resolution must + # honor root/weft.toml, not cwd's, and stay under root (review #1). + monkeypatch.chdir(tmp_path) # cwd has no weft.toml + # Clear the conftest store override so weft.toml resolution is exercised. + monkeypatch.delenv("LEGIS_GOVERNANCE_DB", raising=False) + root = tmp_path / "proj" + (root / "custom_store").mkdir(parents=True) + (root / "weft.toml").write_text('[legis]\nstore_dir = "custom_store"\n') + + c = check_store_dir(root) + assert c.status == "ok" + + # The audit-chain URL must point under root/custom_store, not cwd/.weft. + url = _store_url(root, "legis-governance.db", "LEGIS_GOVERNANCE_DB") + assert (root / "custom_store" / "legis-governance.db").as_posix() in url + assert ".weft" not in url + + +def test_db_override_empty_string_is_error(tmp_path, monkeypatch): + # Present-but-empty override is a verbatim broken override, not "unset" + # (matches config precedence; review #3). + monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "") + assert check_db_overrides(tmp_path).status == "error" + + +# --------------------------------------------------------------------------- +# Task 9: end-to-end --repair pipeline + invariant tests +# --------------------------------------------------------------------------- + + +def test_repair_makes_fresh_project_healthy(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + # Hermetic: an inherited sibling URL env var (valid or not) would otherwise + # leak into the repair → exit 0 assertion. Unset both so the check is "not + # configured" (ok), never a non-repairable error. 
+ monkeypatch.delenv("LOOMWEAVE_API_URL", raising=False) + monkeypatch.delenv("FILIGREE_API_URL", raising=False) + # First run: unhealthy (no install artifacts, no .mcp.json). + assert run_doctor(tmp_path, repair=False, fmt="text") == 1 + # Repair run: install-wiring + .mcp.json get fixed; re-check is healthy. + assert run_doctor(tmp_path, repair=True, fmt="text") == 0 + # Third run, no repair: stays healthy. + assert run_doctor(tmp_path, repair=False, fmt="text") == 0 + + +def test_repair_never_writes_weft_toml(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text("[legis]\nstore_dir = \n") # malformed + before = (tmp_path / "weft.toml").read_text() + run_doctor(tmp_path, repair=True, fmt="json") + assert (tmp_path / "weft.toml").read_text() == before + + +def test_json_output_has_no_secret(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.setenv("LEGIS_HMAC_KEY", "TOP-SECRET") + import contextlib + import io + buf = io.StringIO() + with contextlib.redirect_stdout(buf): + run_doctor(tmp_path, repair=False, fmt="json") + out = buf.getvalue() + assert "TOP-SECRET" not in out + # Prove the secret-bearing path actually ran: with both the protected policy + # and the key set, check_hmac_key reads the key and reports ok. Asserting the + # check is present (and ok) keeps this guard from passing vacuously if the + # key-reading check were ever removed. 
+ payload = json.loads(out) + hmac_checks = [c for c in payload["checks"] if c["id"] == "runtime.hmac_key"] + assert hmac_checks and hmac_checks[0]["status"] == "ok" diff --git a/tests/test_hooks.py b/tests/test_hooks.py new file mode 100644 index 0000000..18d82ec --- /dev/null +++ b/tests/test_hooks.py @@ -0,0 +1,182 @@ +"""Tests for legis.hooks — drift refresh and SessionStart context.""" + +from __future__ import annotations + +import logging + +from legis import hooks, install +from legis.hooks import ( + generate_session_context, + refresh_instructions, +) +from legis.install import ( + SKILL_NAME, + inject_instructions, + install_codex_skills, + install_skills, +) + + +def test_refresh_noop_when_fresh(tmp_path): + inject_instructions(tmp_path / "CLAUDE.md") + inject_instructions(tmp_path / "AGENTS.md") + assert refresh_instructions(tmp_path) == [] + + +def test_refresh_updates_drifted_block_in_both_files(tmp_path, monkeypatch): + inject_instructions(tmp_path / "CLAUDE.md") + inject_instructions(tmp_path / "AGENTS.md") + + # Simulate drift: the bundled content now hashes differently. + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED BODY\n") + messages = refresh_instructions(tmp_path) + + assert any("CLAUDE.md" in m for m in messages) + assert any("AGENTS.md" in m for m in messages) + assert "DRIFTED BODY" in (tmp_path / "CLAUDE.md").read_text() + assert "DRIFTED BODY" in (tmp_path / "AGENTS.md").read_text() + + +def test_refresh_updates_on_version_bump_with_identical_content(tmp_path, monkeypatch): + # Pins the documented "automatic versioning" contract: a package-version + # bump re-injects even when instructions.md is byte-identical. This is the + # only test that would catch a regression collapsing freshness to hash-only. 
+ inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_version", lambda: "9.9.9") + messages = refresh_instructions(tmp_path) + assert any("CLAUDE.md" in m for m in messages) + assert "v9.9.9:" in (tmp_path / "CLAUDE.md").read_text() + + +def test_refresh_reinstalls_drifted_codex_skill_pack(tmp_path): + install_codex_skills(tmp_path) + skill = tmp_path / ".agents" / "skills" / SKILL_NAME / "SKILL.md" + source = skill.read_text() + skill.write_text(source + "\nLOCAL EDIT\n") + + messages = refresh_instructions(tmp_path) + + assert any("Codex skill pack" in m for m in messages) + assert skill.read_text() == source + + +def test_refresh_skips_file_without_marker(tmp_path): + (tmp_path / "CLAUDE.md").write_text("# plain file, no legis marker\n") + assert refresh_instructions(tmp_path) == [] + assert "legis:instructions" not in (tmp_path / "CLAUDE.md").read_text() + + +def test_refresh_skips_absent_files(tmp_path): + # Neither CLAUDE.md nor AGENTS.md exists and no skills installed. + assert refresh_instructions(tmp_path) == [] + + +def test_refresh_reinstalls_drifted_skill_pack(tmp_path): + install_skills(tmp_path) + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + source = skill.read_text() + # Corrupt the installed copy so its fingerprint diverges from source. + skill.write_text(source + "\nLOCAL EDIT THAT MUST BE OVERWRITTEN\n") + + messages = refresh_instructions(tmp_path) + + assert any("skill pack" in m for m in messages) + assert skill.read_text() == source + + +def test_refresh_does_not_create_skill_pack_when_absent(tmp_path): + # No skill installed → refresh must not create one. 
+ refresh_instructions(tmp_path) + assert not (tmp_path / ".claude" / "skills" / SKILL_NAME).exists() + + +def test_generate_session_context_returns_none_when_fresh(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + inject_instructions(tmp_path / "CLAUDE.md") + assert generate_session_context() is None + + +def test_generate_session_context_returns_messages_on_drift(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED\n") + context = generate_session_context() + assert context is not None + assert "CLAUDE.md" in context + + +def test_refresh_auto_fire_preserves_coresident_foreign_block(tmp_path): + """SessionStart drift-refresh must not wipe a co-resident sibling block. + + This is the "deletes with no user action" path (hooks.py refresh → + inject_instructions): a stale/unclosed legis block whose token has drifted + triggers re-injection, and the bounded scan must spare the wardline block. + """ + md = tmp_path / "CLAUDE.md" + # Open marker carries a stale token (drift), but the block is NOT closed — + # so the legacy truncate-to-EOF path would delete the wardline block below. + md.write_text( + "\n" + "legis body, block NOT closed\n" + "\n" + "wardline body\n" + "\n" + ) + messages = refresh_instructions(tmp_path) + content = md.read_text() + assert any("CLAUDE.md" in m for m in messages) # drift was acted on + assert "wardline body" in content + assert "" in content + + +def test_refresh_warns_when_drift_reinjection_fails(tmp_path, monkeypatch, caplog): + """A *detected-drift* re-injection that fails must not be dropped silently. + + ``inject_instructions`` returns ``(False, reason)`` (it does not raise) for a + recoverable refusal such as a symlinked target, so the upstream ``except`` in + the session-context path never sees it. If the refresh swallows the ``False``, + agents run on drifted instructions with zero operator signal. 
+ """ + real = tmp_path / "real.md" + inject_instructions(real) + link = tmp_path / "CLAUDE.md" + link.symlink_to(real) + # Drift so the refresh attempts a re-injection (which then fails on the symlink). + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED BODY\n") + + with caplog.at_level(logging.WARNING, logger="legis.hooks"): + messages = refresh_instructions(tmp_path) + + assert not any("CLAUDE.md" in m for m in messages) # no false success + assert "CLAUDE.md" in caplog.text + assert "symlink" in caplog.text.lower() + + +def test_refresh_warns_when_skill_reinstall_fails(tmp_path, monkeypatch, caplog): + """A failed skill-pack re-install on drift must warn, not silently no-op.""" + install.install_skills(tmp_path) + # Drift the installed pack so the refresh attempts a reinstall. + next( + (tmp_path / ".claude" / "skills" / install.SKILL_NAME).rglob("*.md") + ).write_text("DRIFTED\n") + monkeypatch.setattr(hooks, "install_skills", lambda _root: (False, "swap failed")) + + with caplog.at_level(logging.WARNING, logger="legis.hooks"): + messages = refresh_instructions(tmp_path) + + assert not any("skill" in m.lower() for m in messages) # no false success + assert "swap failed" in caplog.text + + +def test_generate_session_context_swallows_errors(tmp_path, monkeypatch, caplog): + monkeypatch.chdir(tmp_path) + + def boom(_root): + raise OSError("disk gone") + + monkeypatch.setattr(hooks, "refresh_instructions", boom) + with caplog.at_level(logging.WARNING, logger="legis.hooks"): + assert generate_session_context() is None + # Swallowing must not be silent — a regression dropping the warning would + # hide a broken freshness check. 
+ assert "Instruction freshness check failed" in caplog.text diff --git a/tests/test_install.py b/tests/test_install.py new file mode 100644 index 0000000..19e0ed4 --- /dev/null +++ b/tests/test_install.py @@ -0,0 +1,823 @@ +"""Tests for legis.install — instruction injection, skills, hooks, gitignore.""" + +from __future__ import annotations + +import json +import logging +import os +import stat + +import pytest + +from legis import install +from legis.install import ( + INSTRUCTIONS_MARKER, + SKILL_NAME, + UnsafeInstallPathError, + _build_instructions_block, + _extract_marker_token, + _instructions_hash, + _instructions_text, + _instructions_version, + _marker_token, + _skill_tree_fingerprint, + ensure_gitignore, + inject_instructions, + install_claude_code_hooks, + install_codex_skills, + install_skills, + reject_symlink, +) + + +# --------------------------------------------------------------------------- +# Instructions block primitives +# --------------------------------------------------------------------------- + + +def test_instructions_text_is_nonempty_and_marker_free(): + text = _instructions_text() + assert text.strip() + # The body must not contain markers; they are added programmatically. + assert INSTRUCTIONS_MARKER not in text + assert "/legis:instructions" not in text + + +def test_instructions_hash_is_stable_8_hex(): + h = _instructions_hash() + assert len(h) == 8 + assert all(c in "0123456789abcdef" for c in h) + assert h == _instructions_hash() + + +def test_instructions_version_prefers_dist_metadata(): + import importlib.metadata + + # Prefers installed distribution metadata; falls back to legis.__version__. + # (In a dev venv the editable dist metadata can lag the source __version__; + # in a real release they agree. Assert the documented preference, not a + # hardcoded string.) 
+ try: + expected = importlib.metadata.version("legis") + except importlib.metadata.PackageNotFoundError: + from legis import __version__ + + expected = __version__ + assert _instructions_version() == expected + assert _instructions_version() # non-empty + + +def test_instructions_version_falls_back_to_dunder(monkeypatch): + import importlib.metadata + + def _raise(_name): + raise importlib.metadata.PackageNotFoundError("legis") + + monkeypatch.setattr(install.importlib.metadata, "version", _raise) + from legis import __version__ + + assert _instructions_version() == __version__ + + +def test_build_block_has_open_and_close_markers(): + block = _build_instructions_block() + assert block.startswith(f"{INSTRUCTIONS_MARKER}:{_marker_token()} -->") + assert block.rstrip().endswith("") + assert _instructions_text() in block + + +def test_extract_marker_token_round_trips_the_writer(): + # The freshness check's reader must parse the exact marker the writer emits. + # Driving it off the real `_build_instructions_block()` output (not a + # hand-written marker) is what keeps the reader from silently desyncing if + # the marker format ever changes — both live in install.py now. + assert _extract_marker_token(_build_instructions_block()) == _marker_token() + + +def test_extract_marker_token_ignores_the_close_marker_and_absence(): + # The close marker (``) carries no token and must + # not be mistaken for the open marker; absent any marker yields None. 
+ assert _extract_marker_token("") is None + assert _extract_marker_token("no marker here") is None + + +# --------------------------------------------------------------------------- +# inject_instructions +# --------------------------------------------------------------------------- + + +def test_inject_creates_missing_file(tmp_path): + target = tmp_path / "CLAUDE.md" + ok, msg = inject_instructions(target) + assert ok + assert "Created" in msg + content = target.read_text() + assert INSTRUCTIONS_MARKER in content + assert "" in content + + +def test_inject_appends_to_existing_file_without_marker(tmp_path): + target = tmp_path / "AGENTS.md" + target.write_text("# My project\n\nExisting guidance.\n") + ok, msg = inject_instructions(target) + assert ok + assert "Appended" in msg + content = target.read_text() + assert "Existing guidance." in content + assert content.index("Existing guidance.") < content.index(INSTRUCTIONS_MARKER) + + +def test_inject_replaces_existing_block_preserving_surrounding_text(tmp_path, monkeypatch): + target = tmp_path / "CLAUDE.md" + target.write_text("TOP\n\n") + inject_instructions(target) + # Append trailing user content after the block. + target.write_text(target.read_text() + "\nBOTTOM\n") + + monkeypatch.setattr(install, "_instructions_text", lambda: "NEW BODY CONTENT\n") + ok, msg = inject_instructions(target) + assert ok + assert "Updated" in msg + content = target.read_text() + assert "TOP" in content + assert "BOTTOM" in content + assert "NEW BODY CONTENT" in content + # Exactly one block remains. 
+ assert content.count(INSTRUCTIONS_MARKER) == 1 + assert content.count("") == 1 + + +def test_inject_idempotent_when_content_unchanged(tmp_path): + target = tmp_path / "CLAUDE.md" + inject_instructions(target) + first = target.read_text() + inject_instructions(target) + assert target.read_text() == first + + +def test_inject_repairs_block_with_missing_end_marker(tmp_path): + target = tmp_path / "CLAUDE.md" + # Open marker but no close marker, plus trailing junk. + target.write_text(f"HEAD\n{INSTRUCTIONS_MARKER}:vX:dead -->\norphan body no close\n") + ok, msg = inject_instructions(target) + assert ok + content = target.read_text() + assert "HEAD" in content + assert "orphan body no close" not in content + assert content.count(INSTRUCTIONS_MARKER) == 1 + assert "" in content + + +def test_inject_rejects_symlink_target(tmp_path): + real = tmp_path / "real.md" + real.write_text("x") + link = tmp_path / "CLAUDE.md" + link.symlink_to(real) + ok, msg = inject_instructions(link) + assert ok is False + assert "symlink" in msg.lower() + + +# --------------------------------------------------------------------------- +# inject_instructions — foreign-block safety (peer of filigree-bcbd4d66fd) +# --------------------------------------------------------------------------- + +_WARDLINE_BLOCK = ( + "\n" + "wardline body\n" + "\n" +) + + +def test_inject_malformed_block_preserves_coresident_foreign_block(tmp_path): + """An unclosed legis block must NOT truncate a sibling block that follows it.""" + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "legis body, block NOT closed\n" + + _WARDLINE_BLOCK + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + # The foreign block survives intact. + assert "wardline body" in content + assert "" in content + assert "" in content + # Exactly one well-formed legis block remains; the orphan body is gone. 
+ assert content.count(INSTRUCTIONS_MARKER) == 1 + assert "block NOT closed" not in content + assert content.count("") == 1 + + +def test_inject_shape2_sandwich_preserves_foreign_block(tmp_path, caplog): + """Unclosed-first / closed-later legis must not splice over a sandwiched sibling. + + The stale second legis block surviving beyond the foreign fence must also be + surfaced as a warning (refinement 4), not silently shipped as a split brain. + """ + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "first legis body (unclosed)\n" + + _WARDLINE_BLOCK + + f"{INSTRUCTIONS_MARKER}:vY:beef -->\n" + "second legis body\n" + "\n" + ) + with caplog.at_level(logging.WARNING, logger="legis.install"): + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + # Stale duplicate beyond the foreign fence is surfaced, not silent. + assert "duplicate that could not be canonicalised" in caplog.text + + +def test_inject_uppercase_namespace_sibling_survives(tmp_path): + """A sibling block with an upper-cased namespace is still a boundary (refinement 1).""" + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "legis body no close\n" + "\n" + "wardline body\n" + "\n" + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + + +def test_instructions_body_has_no_fence_token(): + """Pin: the shipped body must not contain a ``:instructions`` fence (refinement 2). + + The bounded scan runs across legis's own body; a fence token there would + misroute the common well-formed path into bounded recovery. 
+ """ + assert ":instructions" not in _instructions_text() + + +def test_inject_marker_text_inside_foreign_block_not_mistaken_for_own(tmp_path): + """A legis marker quoted *inside* a sibling block is not legis's own anchor. + + The literal ```` can legitimately appear inside + another tool's block (a quoted example, documentation). A bare substring anchor + would splice there and gut the sibling. The anchor must respect foreign block + spans, so this file has *no* legis block of its own → append, sibling untouched. + """ + target = tmp_path / "CLAUDE.md" + foreign_block = ( + "\n" + f"See example: {INSTRUCTIONS_MARKER}:v0:0000 -->\n" + "WARDLINE BODY MUST SURVIVE\n" + "\n" + ) + target.write_text("HEAD\n" + foreign_block) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + # The sibling block is preserved verbatim — not gutted, not spliced into. + assert foreign_block in content + assert "WARDLINE BODY MUST SURVIVE" in content + # Exactly one well-formed legis block was appended, after the sibling close. + assert content.count("") == 1 + assert content.rindex(INSTRUCTIONS_MARKER) > content.index( + "" + ) + + +def test_inject_reinject_preserves_foreign_block_placed_before_legis(tmp_path): + """A sibling block *before* the legis block survives re-injection on drift. + + The shared-file layout where wardline installs before legis is realistic; the + in-place replace must not reach backwards past ``start`` into a preceding block. + """ + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + + _WARDLINE_BLOCK + + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "stale legis body\n" + "\n" + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + assert "" in content + # The legis block was replaced in place (stale body gone), exactly one remains. 
+ assert content.count(INSTRUCTIONS_MARKER) == 1 + assert "stale legis body" not in content + # The sibling still precedes the legis block. + assert content.index("") < content.index( + INSTRUCTIONS_MARKER + ) + + +def test_inject_bounded_recovery_is_idempotent(tmp_path): + """Repairing a malformed block next to a foreign one is byte-stable on re-run (refinement 3).""" + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "legis body no close\n" + + _WARDLINE_BLOCK + ) + inject_instructions(target) + first = target.read_text() + inject_instructions(target) + second = target.read_text() + assert first == second + assert "wardline body" in second + + +def test_inject_into_empty_file_produces_clean_single_block(tmp_path): + """An existing zero-byte file gets a clean block, not leading blank lines.""" + target = tmp_path / "CLAUDE.md" + target.write_text("") + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert content.count(INSTRUCTIONS_MARKER) == 1 + # No leading blank-line artifact: the block starts at byte 0. + assert content.startswith(INSTRUCTIONS_MARKER) + + +def test_inject_crlf_file_preserves_foreign_block(tmp_path): + """A CRLF-terminated shared file: the sibling block still survives recovery.""" + target = tmp_path / "CLAUDE.md" + target.write_bytes( + ( + "HEAD\r\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\r\n" + "legis body, block NOT closed\r\n" + "\r\n" + "wardline body\r\n" + "\r\n" + ).encode("utf-8") + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + assert content.count(INSTRUCTIONS_MARKER) == 1 + + +def test_inject_two_clean_legis_blocks_canonicalises_first_keeps_second(tmp_path, caplog): + """Two well-formed legis blocks: the first is canonicalised, the second is kept. 
+ + Bounding at the first own close (not EOF) is deliberate — it preserves any + trailing content legis does not own, so a second block in the tail is surfaced + via a warning rather than silently deleted. Collapsing would require a deletion + window over the bytes between the two blocks, which may be user content. + """ + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "first legis body\n" + "\n" + f"{INSTRUCTIONS_MARKER}:vY:beef -->\n" + "second legis body\n" + "\n" + ) + with caplog.at_level(logging.WARNING, logger="legis.install"): + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + # First block canonicalised (stale body gone); second block NOT deleted. + assert "first legis body" not in content + assert "second legis body" in content + # The surviving duplicate is surfaced, not silent. + assert caplog.records + + +# --------------------------------------------------------------------------- +# _atomic_write_text +# --------------------------------------------------------------------------- + + +def test_atomic_write_preserves_existing_mode(tmp_path): + target = tmp_path / "CLAUDE.md" + target.write_text("seed") + os.chmod(target, 0o640) + inject_instructions(target) + mode = stat.S_IMODE(target.stat().st_mode) + assert mode == 0o640 + + +def test_reject_symlink_raises_on_symlink(tmp_path): + real = tmp_path / "r" + real.write_text("x") + link = tmp_path / "l" + link.symlink_to(real) + with pytest.raises(UnsafeInstallPathError): + reject_symlink(link) + + +@pytest.mark.parametrize("payload", ["", " \n\t \n"]) +def test_atomic_write_refuses_empty_content(tmp_path, payload): + """Refuse-to-empty guard (filigree-04bad2a2bf parity): never truncate a file to nothing.""" + target = tmp_path / "CLAUDE.md" + target.write_text("populated content\n") + with pytest.raises(ValueError, match="empty"): + install._atomic_write_text(target, payload) + # The populated file is left 
untouched. + assert target.read_text() == "populated content\n" + + +# --------------------------------------------------------------------------- +# Skill pack +# --------------------------------------------------------------------------- + + +def test_install_skills_copies_pack(tmp_path): + ok, msg = install_skills(tmp_path) + assert ok + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + assert skill.is_file() + assert "legis-workflow" in skill.read_text() + + +def test_install_codex_skills_targets_agents_dir(tmp_path): + ok, _ = install_codex_skills(tmp_path) + assert ok + assert (tmp_path / ".agents" / "skills" / SKILL_NAME / "SKILL.md").is_file() + + +def test_install_skills_idempotent(tmp_path): + install_skills(tmp_path) + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + first = skill.read_text() + ok, _ = install_skills(tmp_path) + assert ok + assert skill.read_text() == first + + +def test_skill_tree_fingerprint_changes_with_content(tmp_path): + root = tmp_path / "pack" + root.mkdir() + (root / "a.md").write_text("one") + fp1 = _skill_tree_fingerprint(root) + (root / "a.md").write_text("two") + fp2 = _skill_tree_fingerprint(root) + assert fp1 != fp2 + + +# --------------------------------------------------------------------------- +# Hook registration +# --------------------------------------------------------------------------- + + +def _session_commands(settings: dict) -> list[str]: + cmds: list[str] = [] + for block in settings.get("hooks", {}).get("SessionStart", []): + for hook in block.get("hooks", []): + cmds.append(hook.get("command", "")) + return cmds + + +def test_install_hooks_fresh(tmp_path): + ok, msg = install_claude_code_hooks(tmp_path) + assert ok + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + cmds = _session_commands(settings) + assert any(c.endswith("session-context") for c in cmds) + + +def test_install_hooks_idempotent_no_duplicate(tmp_path): + 
install_claude_code_hooks(tmp_path) + install_claude_code_hooks(tmp_path) + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + cmds = [c for c in _session_commands(settings) if c.endswith("session-context")] + assert len(cmds) == 1 + + +def test_install_hooks_upgrades_bare_command(tmp_path, monkeypatch): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text( + json.dumps( + {"hooks": {"SessionStart": [{"hooks": [{"type": "command", "command": "legis session-context"}]}]}} + ) + ) + # Force a resolved binary path so the bare command must be upgraded. + monkeypatch.setattr(install, "_find_legis_command", lambda: ["/opt/bin/legis"]) + ok, msg = install_claude_code_hooks(tmp_path) + assert ok + settings = json.loads((claude / "settings.json").read_text()) + cmds = _session_commands(settings) + assert "/opt/bin/legis session-context" in cmds + assert cmds.count("/opt/bin/legis session-context") == 1 + + +def test_install_hooks_backs_up_malformed_settings(tmp_path, caplog): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text("{ this is not json") + with caplog.at_level(logging.WARNING, logger="legis.install"): + ok, msg = install_claude_code_hooks(tmp_path) + assert ok + assert (claude / "settings.json.bak").is_file() + settings = json.loads((claude / "settings.json").read_text()) + assert any(c.endswith("session-context") for c in _session_commands(settings)) + # The reset is not silent: the user is told a backup was written. 
+ assert ".bak" in msg + assert ".bak" in caplog.text + + +def test_install_hooks_does_not_reuse_scoped_block(tmp_path): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text( + json.dumps( + { + "hooks": { + "SessionStart": [ + {"matcher": "resume", "hooks": [{"type": "command", "command": "legis session-context"}]} + ] + } + } + ) + ) + install_claude_code_hooks(tmp_path) + settings = json.loads((claude / "settings.json").read_text()) + # A new unscoped block must be added — the scoped one does not cover cold start. + blocks = settings["hooks"]["SessionStart"] + unscoped = [b for b in blocks if "matcher" not in b or b.get("matcher") in (None, "*")] + assert unscoped + assert any(h["command"].endswith("session-context") for b in unscoped for h in b["hooks"]) + + +# --------------------------------------------------------------------------- +# _hook_cmd_matches +# --------------------------------------------------------------------------- + + +@pytest.mark.parametrize( + "command,expected", + [ + ("legis session-context", True), + ("/usr/local/bin/legis session-context", True), + ("/path/python -P -m legis session-context", True), + ("/path/python -m legis session-context", True), + ("echo legis session-context", False), + ("legis serve", False), + ], +) +def test_hook_cmd_matches(command, expected): + assert install._hook_cmd_matches(command, "legis session-context") is expected + + +# --------------------------------------------------------------------------- +# register_mcp_json +# --------------------------------------------------------------------------- + + +def test_register_mcp_json_creates_file_with_legis_entry(tmp_path): + from legis.install import register_mcp_json, _legis_mcp_entry + + ok, msg = register_mcp_json(tmp_path) + assert ok, msg + data = json.loads((tmp_path / ".mcp.json").read_text()) + entry = data["mcpServers"]["legis"] + assert entry["type"] == "stdio" + assert entry["args"][0] == "mcp" + assert 
"--agent-id" in entry["args"] + + +def test_register_mcp_json_preserves_sibling_entries(tmp_path): + from legis.install import register_mcp_json + + (tmp_path / ".mcp.json").write_text( + json.dumps({"mcpServers": {"filigree": {"command": "x", "type": "stdio"}}}) + ) + ok, _ = register_mcp_json(tmp_path) + assert ok + data = json.loads((tmp_path / ".mcp.json").read_text()) + assert "filigree" in data["mcpServers"] + assert "legis" in data["mcpServers"] + + +def test_register_mcp_json_idempotent(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path) + first = (tmp_path / ".mcp.json").read_text() + register_mcp_json(tmp_path) + assert (tmp_path / ".mcp.json").read_text() == first + + +def test_legis_mcp_entry_module_fallback_splits_command_and_args(monkeypatch): + monkeypatch.setattr(install, "_find_legis_command", lambda: ["/usr/bin/python3", "-P", "-m", "legis"]) + entry = install._legis_mcp_entry("claude-code") + assert entry["command"] == "/usr/bin/python3" + assert entry["args"] == ["-P", "-m", "legis", "mcp", "--agent-id", "claude-code"] + + +def test_register_mcp_json_explicit_agent_id_wins_over_existing(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path, "claude-code") + register_mcp_json(tmp_path, "new-bot") + data = json.loads((tmp_path / ".mcp.json").read_text()) + args = data["mcpServers"]["legis"]["args"] + i = args.index("--agent-id") + assert args[i + 1] == "new-bot" + + +def test_register_mcp_json_default_preserves_existing_agent_id(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path, "operator-pick") + register_mcp_json(tmp_path) # default (None) → preserve operator choice + data = json.loads((tmp_path / ".mcp.json").read_text()) + args = data["mcpServers"]["legis"]["args"] + i = args.index("--agent-id") + assert args[i + 1] == "operator-pick" + + +def test_register_mcp_json_non_dict_top_level_is_rejected_unchanged(tmp_path): + from 
legis.install import register_mcp_json + + mcp = tmp_path / ".mcp.json" + mcp.write_text("[]") + ok, msg = register_mcp_json(tmp_path) + assert ok is False + assert "not a JSON object" in msg + assert mcp.read_text() == "[]" + + +# --------------------------------------------------------------------------- +# .gitignore +# --------------------------------------------------------------------------- + + +def test_ensure_gitignore_creates_file(tmp_path): + ok, msg = ensure_gitignore(tmp_path) + assert ok + content = (tmp_path / ".gitignore").read_text() + assert ".weft/legis/" in content + + +def test_ensure_gitignore_appends_missing_rules(tmp_path): + (tmp_path / ".gitignore").write_text("*.db\n") + ok, msg = ensure_gitignore(tmp_path) + assert ok + content = (tmp_path / ".gitignore").read_text() + assert "*.db" in content + assert ".weft/legis/" in content + + +def test_ensure_gitignore_idempotent(tmp_path): + ensure_gitignore(tmp_path) + first = (tmp_path / ".gitignore").read_text() + ok, msg = ensure_gitignore(tmp_path) + assert ok + assert "already" in msg + assert (tmp_path / ".gitignore").read_text() == first + + +# --------------------------------------------------------------------------- +# Command resolution and safe-path edges +# --------------------------------------------------------------------------- + + +def test_find_legis_command_prefers_binary_on_path(monkeypatch): + monkeypatch.setattr(install.shutil, "which", lambda _name: "/opt/bin/legis") + assert install._find_legis_command() == ["/opt/bin/legis"] + + +def test_find_legis_command_module_fallback(monkeypatch): + monkeypatch.setattr(install.shutil, "which", lambda _name: None) + cmd = install._find_legis_command() + assert cmd[-3:] == ["-P", "-m", "legis"] + + +def test_project_path_rejects_symlinked_component(tmp_path): + real_dir = tmp_path / "real_dir" + real_dir.mkdir() + link_dir = tmp_path / ".claude" + link_dir.symlink_to(real_dir, target_is_directory=True) + with 
pytest.raises(UnsafeInstallPathError): + install.project_path(tmp_path, ".claude", "settings.json") + + +def test_ensure_project_dir_creates_and_returns_dir(tmp_path): + created = install.ensure_project_dir(tmp_path, ".claude", "skills") + assert created.is_dir() + assert created == tmp_path / ".claude" / "skills" + + +def test_install_skills_reports_missing_source(tmp_path, monkeypatch): + empty = tmp_path / "no_skills_here" + empty.mkdir() + monkeypatch.setattr(install, "_get_skills_source_dir", lambda: empty) + ok, msg = install_skills(tmp_path) + assert ok is False + assert "not found" in msg + + +def test_upgrade_hook_commands_tolerates_non_dict_settings(): + assert install._upgrade_hook_commands({"hooks": []}, "legis session-context", "x") is False + assert install._upgrade_hook_commands({}, "legis session-context", "x") is False + + +def test_has_unscoped_session_start_hook_tolerates_non_dict(): + assert install._has_unscoped_session_start_hook({"hooks": "nope"}, "legis session-context") is False + assert install._has_unscoped_session_start_hook({}, "legis session-context") is False + + +def test_install_hooks_leaves_user_scoped_block_command_untouched(tmp_path, monkeypatch): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text( + json.dumps( + { + "hooks": { + "SessionStart": [ + {"matcher": "resume", "hooks": [{"type": "command", "command": "legis session-context"}]} + ] + } + } + ) + ) + monkeypatch.setattr(install, "_find_legis_command", lambda: ["/opt/bin/legis"]) + install_claude_code_hooks(tmp_path) + blocks = json.loads((claude / "settings.json").read_text())["hooks"]["SessionStart"] + + scoped = [b for b in blocks if b.get("matcher") == "resume"][0] + # The user's portable bare command must NOT be pinned to a venv path. + assert scoped["hooks"][0]["command"] == "legis session-context" + # legis still adds its own unscoped block with the resolved command. 
+ unscoped = [b for b in blocks if "matcher" not in b or b.get("matcher") in (None, "*")] + assert any(h["command"] == "/opt/bin/legis session-context" for b in unscoped for h in b["hooks"]) + + +def test_install_hooks_backs_up_nested_corrupt_structure(tmp_path): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text(json.dumps({"hooks": "important user data", "keep": 1})) + ok, msg = install_claude_code_hooks(tmp_path) + assert ok + bak = claude / "settings.json.bak" + assert bak.is_file() + assert "important user data" in bak.read_text() + settings = json.loads((claude / "settings.json").read_text()) + assert settings.get("keep") == 1 # sibling key preserved + assert any(c.endswith("session-context") for c in _session_commands(settings)) + # The recovery of the corrupt nested structure is surfaced, not silent. + assert ".bak" in msg + + +def test_install_skills_restores_original_on_genuine_swap_failure(tmp_path, monkeypatch): + install_skills(tmp_path) + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + original = skill.read_text() + + real_rename = os.rename + calls = {"n": 0} + + def flaky_rename(src, dst): + calls["n"] += 1 + if calls["n"] == 2: # the staging -> target swap + raise OSError("simulated swap failure") + return real_rename(src, dst) + + monkeypatch.setattr(install.os, "rename", flaky_rename) + ok, msg = install_skills(tmp_path) + + assert ok is False + assert "swap failed" in msg + # The previously installed pack must survive a genuine swap failure. 
+ assert skill.is_file() + assert skill.read_text() == original + + +def test_inject_append_keeps_marker_off_users_last_line(tmp_path): + target = tmp_path / "CLAUDE.md" + target.write_text("# Project\nlast line no newline") # no trailing newline + inject_instructions(target) + content = target.read_text() + assert "last line no newline\n" in content + idx = content.index(INSTRUCTIONS_MARKER) + assert content[idx - 1] == "\n" + + +def test_ensure_gitignore_present_among_other_rules_not_duplicated(tmp_path): + # legis's rule already present alongside unrelated rules → nothing to add. + (tmp_path / ".gitignore").write_text("*.db\n.weft/legis/\n") + ok, msg = ensure_gitignore(tmp_path) + assert ok + assert "already" in msg # detected as present, not re-appended + content = (tmp_path / ".gitignore").read_text() + assert content.count(".weft/legis/") == 1 # not duplicated diff --git a/tests/test_weft_signing.py b/tests/test_weft_signing.py new file mode 100644 index 0000000..a69163b --- /dev/null +++ b/tests/test_weft_signing.py @@ -0,0 +1,83 @@ +"""The shared Weft-component transport-HMAC seam. + +These pin the single wire definition that ``identity/loomweave_client`` and +``filigree/client`` both delegate to, and guard against the two channels +silently re-diverging (the duplication this module was extracted to remove). +""" + +from __future__ import annotations + +import hashlib +import hmac + +from legis.filigree.client import sign_filigree_request +from legis.identity.loomweave_client import sign_loomweave_request +from legis.weft_signing import ( + sign_weft_request, + weft_body_bytes, + weft_hmac_key_from_env, + weft_path_and_query, +) + + +def test_weft_body_bytes_is_compact_sorted_ascii(): + # The signed bytes are compact, key-sorted, and ASCII-escaped — deliberately + # NOT canonical.canonical_json (ensure_ascii=False), which would change the + # signed bytes and break the cross-tool HMAC contract. 
+    assert weft_body_bytes({"b": 1, "a": "x"}) == b'{"a":"x","b":1}'
+    assert weft_body_bytes({"k": "é"}) == b'{"k":"\\u00e9"}'  # escaped, not raw utf-8
+    assert weft_body_bytes(None) == b""
+
+
+def test_weft_path_and_query_carries_query_and_defaults_root():
+    assert weft_path_and_query("https://h/api/x?e=1") == "/api/x?e=1"
+    assert weft_path_and_query("https://h/api/x") == "/api/x"
+    assert weft_path_and_query("https://h") == "/"
+
+
+def test_sign_weft_request_matches_explicit_hmac_contract():
+    key = b"weft-key"
+    body = {"locator": "python:function:m.f"}
+    headers = sign_weft_request(
+        "loomweave", key, "POST", "https://h/api/v1/identity/resolve", body,
+        timestamp=1_900_000_000, nonce="nonce-1",
+    )
+    body_hash = hashlib.sha256(weft_body_bytes(body)).hexdigest()
+    message = (
+        f"POST\n/api/v1/identity/resolve\n{body_hash}\n1900000000\nnonce-1"
+    ).encode("utf-8")
+    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
+    assert headers == {
+        "X-Weft-Component": f"loomweave:{expected}",
+        "X-Weft-Timestamp": "1900000000",
+        "X-Weft-Nonce": "nonce-1",
+    }
+
+
+def test_both_channels_share_one_seam_differing_only_by_component():
+    # Anti-drift guard: for identical inputs the Loomweave and Filigree channels
+    # must produce the SAME signature — only the component namespace differs. If
+    # a future change to one channel's canonicalization slips in, this fails.
+    key, method, url = b"weft-key", "POST", "https://h/api/issue/I-1/x?q=1"
+    body = {"entity_id": "loomweave:eid:abc", "content_hash": "h"}
+    kwargs = dict(timestamp=1_700_000_000, nonce="cafef00d")
+
+    loom = sign_loomweave_request(key, method, url, body, **kwargs)
+    fil = sign_filigree_request(key, method, url, body, **kwargs)
+
+    assert loom["X-Weft-Component"].startswith("loomweave:")
+    assert fil["X-Weft-Component"].startswith("filigree:")
+    # Strip the namespace prefix -> the HMACs are byte-identical.
+    assert loom["X-Weft-Component"].split(":", 1)[1] == fil["X-Weft-Component"].split(":", 1)[1]
+    assert loom["X-Weft-Timestamp"] == fil["X-Weft-Timestamp"]
+    assert loom["X-Weft-Nonce"] == fil["X-Weft-Nonce"]
+
+
+def test_weft_hmac_key_from_env_prefers_channel_then_shared(monkeypatch):
+    monkeypatch.delenv("LEGIS_CHAN_KEY", raising=False)
+    monkeypatch.delenv("LEGIS_HMAC_KEY", raising=False)
+    assert weft_hmac_key_from_env("LEGIS_CHAN_KEY") is None
+    monkeypatch.setenv("LEGIS_HMAC_KEY", "shared")
+    assert weft_hmac_key_from_env("LEGIS_CHAN_KEY") == b"shared"
+    monkeypatch.setenv("LEGIS_CHAN_KEY", "channel")
+    assert weft_hmac_key_from_env("LEGIS_CHAN_KEY") == b"channel"  # channel-specific wins
diff --git a/tests/wardline/test_ingest.py b/tests/wardline/test_ingest.py
index 75572ca..bcddfb5 100644
--- a/tests/wardline/test_ingest.py
+++ b/tests/wardline/test_ingest.py
@@ -1,7 +1,13 @@
+import json
+
 import pytest
 
+from legis.canonical import canonical_json, content_hash
 from legis.wardline.ingest import (
     TRUST_TIERS,
+    ArtifactStatus,
+    ScanOutcome,
+    Suppressed,
     WardlineFinding,
     WardlinePayloadError,
     WardlineSeverity,
@@ -9,6 +15,36 @@
 )
 
 
+def test_str_enum_axes_are_byte_identical_to_bare_strings_on_the_wire():
+    # The load-bearing compat contract: a str,Enum serializes EXACTLY like its
+    # bare string through json.dumps and canonical_json (so wire payloads and the
+    # content-hashed audit chain are unchanged). Pin it directly so a future
+    # Python/enum change that alters str,Enum serialization fails here loudly,
+    # not silently downstream.
+    cases = [
+        (ScanOutcome.ROUTED, "ROUTED"),
+        (ScanOutcome.SKIPPED_DIRTY_TREE, "SKIPPED_DIRTY_TREE"),
+        (ArtifactStatus.VERIFIED, "verified"),
+        (ArtifactStatus.DIRTY, "dirty"),
+        (ArtifactStatus.UNVERIFIED, "unverified"),
+        (Suppressed.ACTIVE, "active"),
+        (Suppressed.WAIVED, "waived"),
+        (Suppressed.SUPPRESSED, "suppressed"),
+        (Suppressed.BASELINED, "baselined"),
+        (Suppressed.JUDGED, "judged"),
+    ]
+    for member, raw in cases:
+        assert member == raw
+        assert json.dumps({"k": member}) == json.dumps({"k": raw})
+        assert canonical_json({"k": member}) == canonical_json({"k": raw})
+        assert content_hash({"k": member}) == content_hash({"k": raw})
+    # The back-compat alias and the error's reason still equal the bare string
+    # that callers/boundaries imported and serialized before the enum existed
+    # (both are bound by the module-level import block below).
+    assert SKIPPED_DIRTY_TREE == "SKIPPED_DIRTY_TREE"
+    assert WardlineDirtyTreeError.reason == "SKIPPED_DIRTY_TREE"
+
+
 def _finding(**over):
     base = {"rule_id": "PY-WL-101", "message": "m", "severity": "ERROR",
             "kind": "defect", "fingerprint": "fp1", "qualname": "m.f",
@@ -109,3 +145,128 @@ def test_unknown_suppression_state_is_still_rejected():
     scan = {"findings": [_finding(fingerprint="x", suppressed="haunted")]}
     with pytest.raises(WardlinePayloadError, match="unsupported suppression state"):
         active_defects(scan)
+
+
+# --- dirty-tree dev artifact (P0 dev path + P1 typed amber SKIPPED_DIRTY_TREE) ---
+#
+# wardline `scan --format legis --allow-dirty` emits an UNSIGNED dev artifact
+# marked `dirty: true` (signing stays clean-tree-only). legis must:
+#   - keyless dev: govern it, but record the dirty marker honestly;
+#   - CI posture (key configured): NOT conflate "dirty dev tree" with a
+#     tampered/malformed payload (a generic red). Default to a typed amber
+#     SKIPPED_DIRTY_TREE; govern unsigned only under an explicit dev-mode opt-in.
+# The relaxation is scoped to exactly `dirty is True AND signature absent` — a
+# signed (or clean) payload still verifies normally, so a real tamper stays red.
+
+from legis.enforcement.signing import sign  # noqa: E402
+from legis.wardline.ingest import (  # noqa: E402
+    SKIPPED_DIRTY_TREE,
+    WardlineDirtyTreeError,
+    verify_wardline_artifact,
+    wardline_artifact_fields,
+)
+
+_KEY = b"wardline-artifact-key"
+
+
+def _artifact(*, dirty=None, signed=False, key=_KEY, **over):
+    scan = {
+        "scanner_identity": "wardline@1.0.0rc1",
+        "rule_set_version": "rules@abc123",
+        "commit_sha": "a" * 40,
+        "tree_sha": "b" * 40,
+        "findings": [],
+    }
+    if dirty is not None:
+        scan["dirty"] = dirty
+    scan.update(over)
+    if signed:
+        scan["artifact_signature"] = sign(wardline_artifact_fields(scan), key)
+    return scan
+
+
+def test_dirty_error_is_not_a_generic_payload_error():
+    # The amber skip must be DISTINGUISHABLE from the generic red at the
+    # boundary — so it is not a WardlinePayloadError (which maps to 422 /
+    # INVALID_ARGUMENT). It carries a typed reason instead.
+    assert not issubclass(WardlineDirtyTreeError, WardlinePayloadError)
+    assert WardlineDirtyTreeError.reason == SKIPPED_DIRTY_TREE
+
+
+def test_keyless_dirty_artifact_governs_with_honest_dirty_status():
+    # Keyless local dev is already permissive; the only change is that the
+    # dirty marker is recorded honestly so a dirty dev scan is distinguishable
+    # from a clean unsigned one.
+    prov = verify_wardline_artifact(_artifact(dirty=True), None)
+    assert prov["artifact_status"] == "dirty"
+    assert prov["commit_sha"] == "a" * 40
+
+
+def test_keyless_clean_unsigned_artifact_stays_unverified():
+    prov = verify_wardline_artifact(_artifact(), None)
+    assert prov["artifact_status"] == "unverified"
+
+
+def test_ci_dirty_without_devmode_is_typed_amber_skip_not_red():
+    # P1: key configured, dirty + unsigned, dev-mode OFF -> typed amber skip,
+    # NOT a generic WardlinePayloadError red.
+    with pytest.raises(WardlineDirtyTreeError) as exc:
+        verify_wardline_artifact(_artifact(dirty=True), _KEY, allow_dirty=False)
+    assert exc.value.reason == SKIPPED_DIRTY_TREE
+
+
+def test_ci_dirty_with_devmode_governs_unsigned_as_dirty():
+    # P0: key configured, dirty + unsigned, dev-mode ON -> govern unsigned,
+    # recorded honestly as dirty (never "verified").
+    prov = verify_wardline_artifact(_artifact(dirty=True), _KEY, allow_dirty=True)
+    assert prov["artifact_status"] == "dirty"
+    assert "artifact_signature" not in prov
+    assert prov["scanner_identity"] == "wardline@1.0.0rc1"
+
+
+def test_devmode_does_not_relax_a_tampered_signature():
+    # Security row: dirty + a PRESENT-but-invalid signature is tampering, not a
+    # dev tree. Relaxation is scoped to UNSIGNED only, so this stays red even
+    # with dev-mode on.
+    scan = _artifact(dirty=True)
+    scan["artifact_signature"] = "hmac-sha256:v2:" + "0" * 64  # forged
+    with pytest.raises(WardlinePayloadError, match="does not verify"):
+        verify_wardline_artifact(scan, _KEY, allow_dirty=True)
+
+
+def test_devmode_does_not_relax_a_clean_unsigned_artifact():
+    # Security row: dev-mode relaxes ONLY dirty+unsigned, never "any unsigned".
+    # A clean (dirty absent/false) unsigned artifact still requires a signature.
+    with pytest.raises(WardlinePayloadError, match="signature is required"):
+        verify_wardline_artifact(_artifact(dirty=False), _KEY, allow_dirty=True)
+    with pytest.raises(WardlinePayloadError, match="signature is required"):
+        verify_wardline_artifact(_artifact(), _KEY, allow_dirty=True)
+
+
+def test_dirty_marker_must_be_strict_boolean_true():
+    # The scan dict is attacker-controlled. A truthy non-True dirty value
+    # (string "true", 1) must NOT trip the dev relaxation — it falls through to
+    # normal verification (red when a key is configured and it is unsigned).
+    for bogus in ("true", "True", 1, [1]):
+        with pytest.raises(WardlinePayloadError, match="signature is required"):
+            verify_wardline_artifact(_artifact(dirty=bogus), _KEY, allow_dirty=True)
+
+
+def test_signed_dirty_artifact_verifies_normally():
+    # A validly-signed payload that also carries dirty:true is trusted via its
+    # signature (only the key-holder can produce it); the dirty marker does not
+    # hijack the signed path into a skip.
+    scan = _artifact(dirty=True, signed=True)
+    prov = verify_wardline_artifact(scan, _KEY, allow_dirty=False)
+    assert prov["artifact_status"] == "verified"
+
+
+def test_ci_posture_missing_provenance_field_is_red():
+    # Key configured, clean (not dirty), but a required provenance field is
+    # absent -> generic red BEFORE signature verification is even attempted. This
+    # is the non-dirty CI branch that demands signed scanner/rule-set/commit/tree
+    # provenance; a scan missing any of them is malformed, not an amber skip.
+    scan = _artifact()  # all four provenance fields present, unsigned
+    del scan["tree_sha"]
+    with pytest.raises(WardlinePayloadError, match="missing required field"):
+        verify_wardline_artifact(scan, _KEY)
diff --git a/uv.lock b/uv.lock
index f8f4e34..c1797e1 100644
--- a/uv.lock
+++ b/uv.lock
@@ -355,7 +355,7 @@ wheels = [
 
 [[package]]
 name = "legis"
-version = "1.0.0rc3"
+version = "1.0.0rc4"
 source = { editable = "."
} dependencies = [ { name = "fastapi" }, @@ -371,6 +371,7 @@ dev = [ { name = "mypy" }, { name = "pytest" }, { name = "pytest-cov" }, + { name = "ruff" }, { name = "types-pyyaml" }, ] @@ -389,6 +390,7 @@ dev = [ { name = "mypy", specifier = ">=1.19" }, { name = "pytest", specifier = ">=8.0" }, { name = "pytest-cov", specifier = ">=5.0" }, + { name = "ruff", specifier = ">=0.8" }, { name = "types-pyyaml", specifier = ">=6.0" }, ] @@ -716,6 +718,31 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/f1/12/de94a39c2ef588c7e6455cfbe7343d3b2dc9d6b6b2f40c4c6565744c873d/pyyaml-6.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b", size = 149341, upload-time = "2025-09-25T21:32:56.828Z" }, ] +[[package]] +name = "ruff" +version = "0.15.16" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/a6/bd/5f7ec371001337d8fa61701c186ff8b613ecac1651848c5950f4c4d5f2e9/ruff-0.15.16.tar.gz", hash = "sha256:d05e78d38c78caf020b03789e25106c93017db5a0cb6e2819885018c61343b78", size = 4714267, upload-time = "2026-06-04T16:33:09.974Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/0c/42/53ef1c3953f157956db9bf7861e3bc50b9b887ce93300aa48cdba8336fe6/ruff-0.15.16-py3-none-linux_armv6l.whl", hash = "sha256:6ac3c0b3969cc6cf6b158c4e2f8f682acb58e7d700d8a44b65ecdc72d66ab0b2", size = 10709025, upload-time = "2026-06-04T16:32:51.935Z" }, + { url = "https://files.pythonhosted.org/packages/93/9a/a79159346f19134a956607754e57d8d128f7a4c00f4ad2f7514d224c172c/ruff-0.15.16-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:197c207ed75ffba54a0dec23db4aa939a27a3053073e085e0042433cbdc58e4a", size = 11063550, upload-time = "2026-06-04T16:32:42.24Z" }, + { url = "https://files.pythonhosted.org/packages/bc/72/3ce2ac000a5299ec238e01f51397b3b653c93b077d9b1bfe8715bb895f20/ruff-0.15.16-py3-none-macosx_11_0_arm64.whl", hash = 
"sha256:3a39fec45ab316cc23e7558f23fea4a70403ddb5648ea9a4a3854a16973d0071", size = 10421345, upload-time = "2026-06-04T16:32:37.251Z" }, + { url = "https://files.pythonhosted.org/packages/b0/c2/cc7fad3ec9169373f5b6a18f1917b91080feec40c3f9658334a1d28e2f03/ruff-0.15.16-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ba93191d79003116b95128c9d306e045200fdbd0bccb782b110f3cd1d4abc5cf", size = 10757217, upload-time = "2026-06-04T16:32:54.722Z" }, + { url = "https://files.pythonhosted.org/packages/69/d2/3474009eaa0a65b31fa7152a2fad5e2f050c640ceb1e6b02ee6922e94c82/ruff-0.15.16-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c6ee4b90520630120ef032aa5cc10db483852dff950e78b1d717e2993a61ac8d", size = 10507035, upload-time = "2026-06-04T16:33:05.343Z" }, + { url = "https://files.pythonhosted.org/packages/ca/81/b7ae6ccbd11f0c8dc3d5d67fc4be9b57ff57ca86ba56152021378e1277f2/ruff-0.15.16-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4e4215bc938bc3c8215c1472c1aa437e310fee20cd427335fec9d7e609563628", size = 11255291, upload-time = "2026-06-04T16:32:49.49Z" }, + { url = "https://files.pythonhosted.org/packages/d9/e1/46e526f1a7cc90857ce6ddf25fbb77eb6568651ac38d71b033af07076dd5/ruff-0.15.16-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:7c8d26be963b090f10e29abc8b3e74a2a321f6fa34e02424e30b5af89350ecbb", size = 12124922, upload-time = "2026-06-04T16:33:07.821Z" }, + { url = "https://files.pythonhosted.org/packages/1a/da/5c791b088b596b24d0deb967fa28ae02ad751a140c0b9ea81c5ab915d6c0/ruff-0.15.16-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f198cf4123602a2280ed46c307bcbafe41758d6fee5b456b6b6058ca1514b3b4", size = 11332186, upload-time = "2026-06-04T16:33:02.971Z" }, + { url = "https://files.pythonhosted.org/packages/72/11/5da87abe20047c8962361473923ebb2f62b595250126aadfad8c20649c1e/ruff-0.15.16-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = 
"sha256:bb27515fa6240fb586ae82b901a59e67d24acff86f2190b433dc542fe0435aeb", size = 11373541, upload-time = "2026-06-04T16:32:47.007Z" }, + { url = "https://files.pythonhosted.org/packages/fe/2a/8554754c23a854ae3fd6b507e36ad61ddb121e298c6d5d617dec94ed0f14/ruff-0.15.16-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:a267c46ba1593fc26b8eecbea050b39d40c0b6bb7781ee11c90a02cd10032951", size = 11353014, upload-time = "2026-06-04T16:32:34.795Z" }, + { url = "https://files.pythonhosted.org/packages/62/25/62ea41529ec89f742ea3fed9cb1059c72877ec7cf9b9e99ac9cf3294d1d9/ruff-0.15.16-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:528c68f39a91498a8d50e91ff5985df3d105782bab49cc378e73ac26bff083e8", size = 10737467, upload-time = "2026-06-04T16:32:26.348Z" }, + { url = "https://files.pythonhosted.org/packages/90/17/334d3ad9de4d40f9dd58fdd09e35ce64553bb501e2f19a839e2fb6be14fc/ruff-0.15.16-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:7ed55c58950df60589a9a7a5d2f8fa5f54ebd287163be805adfe6ee95a9de123", size = 10521910, upload-time = "2026-06-04T16:32:32.54Z" }, + { url = "https://files.pythonhosted.org/packages/4d/bd/3ac7c6ae77a885c1004b3dda2446ea401768d24f851c14b4ad4b24f6639c/ruff-0.15.16-py3-none-musllinux_1_2_i686.whl", hash = "sha256:d482feaf51512b50f9790ceb417a56a61dd1e9d9bf967662b9ed27c01b34f53a", size = 10979190, upload-time = "2026-06-04T16:32:57.492Z" }, + { url = "https://files.pythonhosted.org/packages/33/d7/609546e6a413c3f216fbf2a50c928f97c80939154f6a0503114094a86191/ruff-0.15.16-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:1e15bc8c94513dae2a40cc9ef07c94fdd4ecc9e29dabebeebe170f952322c9e3", size = 11477014, upload-time = "2026-06-04T16:32:44.687Z" }, + { url = "https://files.pythonhosted.org/packages/74/0d/f2cd247ad32633a5c36e97141a2c21b11c6279f7957bc2ff360b1e08fddd/ruff-0.15.16-py3-none-win32.whl", hash = "sha256:580378f7bd4aa25f72e74aa54948a9622f142b1e509521dd10902e886681cc1e", size = 10735541, upload-time = "2026-06-04T16:32:30.145Z" }, + { url = 
"https://files.pythonhosted.org/packages/8b/9e/02e845ef151b1dee585e55c4739f8e1734ae1d9f1221dff65761c162208b/ruff-0.15.16-py3-none-win_amd64.whl", hash = "sha256:408256017284eddf98fff77b29aa4fb30f586042d535b2d9befc6512f400aaec", size = 11843403, upload-time = "2026-06-04T16:32:39.76Z" }, + { url = "https://files.pythonhosted.org/packages/15/19/016553f86f207450aebebc2b2b5088d086b901cc8186c02ac4284db3bd88/ruff-0.15.16-py3-none-win_arm64.whl", hash = "sha256:8cd61783afb39638a7133ef0d2dfb1e91277593962f81b5a8423eb0b888a6121", size = 11134555, upload-time = "2026-06-04T16:33:00.136Z" }, +] + [[package]] name = "sqlalchemy" version = "2.0.50"