fix(scan): require explicit trust for .wardline/judged.yaml suppressions#24

Closed
tachyon-beep wants to merge 1 commit into main from codex/propose-fix-for-judged.yaml-vulnerability

@tachyon-beep

Collaborator

Motivation

  • Repository-controlled .wardline/judged.yaml was being applied by scan before gating, allowing untrusted judged records to suppress real defects and clear --fail-on in CI.
  • The change aims to preserve the local trusted workflow (where judged suppressions are useful) while protecting the CI/trust boundary by default.

Description

  • Make run_scan() ignore load_judged(root / ".wardline" / "judged.yaml") unless the caller sets trust_judged_suppressions=True (default False), so judged suppressions are opt-in for trusted checkouts via an explicit operator decision.
  • Add a new CLI option --trust-judged-suppressions to wardline scan and thread it through the initial scan and the post-autofix rescan so local/trusted workflows are preserved when requested.
  • Keep judge workflows unchanged for triage/persist paths by calling run_scan(..., trust_judged_suppressions=True) from run_judge() so judge --write and triage flows still consult/write judged records as expected.
  • Strengthen load_judged() to require a verdict field and reject any record where verdict != "FALSE_POSITIVE" to avoid accepting forged/non-FP entries as suppressions.
  • Update unit tests to reflect the trust-boundary semantics and to add coverage for non-FALSE_POSITIVE judged records.
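
The opt-in trust flag and the stricter verdict check described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `Finding`, `validate_judged`, and `run_scan_sketch` are hypothetical names, records are plain dicts instead of parsed YAML, and raising on a bad verdict is an assumption (the real `load_judged()` may reject records differently).

```python
from dataclasses import dataclass

ALLOWED_VERDICT = "FALSE_POSITIVE"


@dataclass(frozen=True)
class Finding:
    # Hypothetical stand-in for a scanner finding.
    fingerprint: str
    severity: str
    suppressed: bool = False


def validate_judged(records):
    """Return fingerprints of usable judged records.

    Assumption: a missing or non-FALSE_POSITIVE verdict raises ValueError.
    """
    fingerprints = set()
    for record in records:
        verdict = record.get("verdict")
        if verdict != ALLOWED_VERDICT:
            raise ValueError(
                f"judged record {record.get('fingerprint')!r} "
                f"has unusable verdict {verdict!r}"
            )
        fingerprints.add(record["fingerprint"])
    return fingerprints


def run_scan_sketch(findings, judged_records, trust_judged_suppressions=False):
    # Untrusted by default: the judged records are not consulted at all.
    judged = validate_judged(judged_records) if trust_judged_suppressions else set()
    return [
        Finding(f.fingerprint, f.severity, suppressed=f.fingerprint in judged)
        for f in findings
    ]
```

With the default, a committed record keyed to a finding's fingerprint has no effect; only a caller that passes `trust_judged_suppressions=True` sees the finding marked as suppressed.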

Testing

  • Ran byte-compile checks with PYTHONPATH=src python -m py_compile ... on the modified modules, which succeeded.
  • Ran style/lint checks with uv run ruff check ..., which passed.
  • Attempted targeted pytest runs (e.g. uv run --extra scanner pytest tests/unit/core/test_judged.py tests/unit/cli/test_cli.py::test_judge_write_then_scan_gate_requires_trust_flag tests/unit/cli/test_cli.py::test_scan_with_fix_rescan_preserves_strict_defaults -q) but the test run was blocked by environment/network limitations while fetching extras (jsonschema) and missing runtime packages (yaml), so full automated tests could not be completed in this environment.

Codex Task

tachyon-beep added a commit that referenced this pull request Jun 5, 2026
…ail-on) (#28)

Close a HIGH-severity CI-gate bypass. `wardline scan --fail-on` applied
repository-controlled suppressions (`.wardline/baseline.yaml`, `wardline.yaml`
waivers, `.wardline/judged.yaml`) to findings BEFORE evaluating the gate, so a
malicious PR could commit a suppression keyed to its own new defect's
fingerprint and clear the gate. All three sources are committed repo content
and equally exploitable. Reproduced live (baselining the sole ERROR zeroed the
gate).

Secure-by-default model (combines #24 + #25):
- `gate_decision` now evaluates a separate UNSUPPRESSED population
  (`ScanResult.gate_findings`). baseline/waiver/judged still ANNOTATE the
  emitted findings (`suppressed=…` stays visible) but no longer clear the gate.
- The gate population is built with apply_suppressions over EMPTY baseline +
  waivers + judged, NOT `list(raw)`, so the lineless-DEFECT→non-gating-FACT
  downgrade is preserved (no spurious gate trips).
- `--new-since <ref>` (operator-supplied, unforgeable) scopes BOTH the emitted
  and gate populations — the secure CI ratchet.
- `--trust-suppressions` (CLI) / `trust_suppressions` (run_scan, MCP scan tool),
  default False, restores the local ratchet / judge DX for trusted checkouts
  (None sentinel → gate falls back to the suppressed findings). `run_judge`
  passes True so judge/triage/persist are unchanged.
- `load_judged` now requires `verdict: FALSE_POSITIVE` (rejects a hand-edited
  TRUE_POSITIVE / missing verdict smuggled in as a silent suppression).
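
A rough sketch of the two-population model described in the bullets above, under stated assumptions: `Finding`, `annotate`, and `scan` are hypothetical, severity matching is simplified to equality, and only the names `ScanResult.gate_findings`, `trust_suppressions`, and the None-sentinel fallback come from this message.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Finding:
    # Hypothetical finding; `suppressed` names the source, e.g. "baseline".
    fingerprint: str
    severity: str
    suppressed: Optional[str] = None


@dataclass
class ScanResult:
    findings: list                 # emitted, annotated population
    gate_findings: Optional[list]  # None => gate falls back to `findings`


def annotate(raw, suppressions):
    # Mark each finding with the suppression source that matches it, if any.
    return [replace(f, suppressed=suppressions.get(f.fingerprint)) for f in raw]


def scan(raw, suppressions, trust_suppressions=False):
    emitted = annotate(raw, suppressions)
    # Default: build the gate population through the same path, but with
    # empty suppressions, so repo-committed entries cannot clear the gate.
    gate = None if trust_suppressions else annotate(raw, {})
    return ScanResult(emitted, gate)


def gate_trips(result, fail_on="ERROR"):
    population = (
        result.findings if result.gate_findings is None else result.gate_findings
    )
    return any(f.severity == fail_on and f.suppressed is None for f in population)
```

In this sketch a baselined ERROR is still emitted with `suppressed="baseline"`, yet `gate_trips` returns True unless the caller opted in with `trust_suppressions=True`.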

BREAKING (noted in CHANGELOG, acceptable at 0.x): baseline-gated CI goes
green→red on upgrade until `--new-since` or `--trust-suppressions` is added.
Docs updated (suppression.md): the secure CI ratchet is `--new-since`.

Combines and supersedes #24 (judged-only) and #25 (no escape hatch + a
lineless-DEFECT gate bug). Full suite green (2394 passed), ruff + mypy clean.

Co-authored-by: John Morrissey <john@wardline.dev>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tachyon-beep

Collaborator Author

Fixed ourselves in 16a4d005 (PR #28, merged to main) — combined with #25.

The vulnerability is real: a repo-committed .wardline/judged.yaml FALSE_POSITIVE record was applied before the --fail-on gate, so it could suppress a real defect and clear CI. But judged is one of three identical repo-controlled suppression vectors: .wardline/baseline.yaml and wardline.yaml waivers flow through the exact same path (run_scan → apply_suppressions → gate_trips skips non-ACTIVE) and were left fully exploitable by this PR's judged-only scope. So we couldn't merge this as the fix.

#28 closes all three vectors: by default the gate evaluates the unsuppressed population; baseline/waiver/judged still annotate findings but don't clear the gate. The secure CI ratchet is the operator-supplied --new-since <merge-base>; --trust-suppressions (default off) restores the local ratchet for trusted checkouts.

Your contributions were kept, not discarded:

Full suite 2406 passed, ruff + mypy clean, repro confirms the gate trips on a suppressed defect by default. Thanks — closing in favour of #28.
