compat: wire the 42-module parity matrix into CI with per-module trend

Part of #793 (axis 3 — Node.js platform built-in modules with real semantics).

## What already exists (committed, not yet CI-wired)

A behavioral parity suite was built and committed over the last week:

- **`test-files/test_parity_<module>.ts`** — 42 files, one per supported `node:*` module. Every API in the parity reference becomes a `console.log("label:", expr)` line under per-section try/catch. The byte-for-byte diff against `node --experimental-strip-types` IS the per-module gap report. Commits: `d0e84724`, `08c9004a`, `1f2f44e5`.
- **`docs/runtime-parity.md`** — leaf-level inventory of the entire Node + Bun stdlib API surface (~50 modules, ~4,200 rows: `| API | Node.js | Bun | Notes |`), sourced from the authoritative Node docs + Bun compat table.
- **`docs/runtime-parity-gaps.md`** — that inventory diffed against Perry's compile-time coverage (manifest + `Expr::*` HIR variants + `js_*` FFI exports across `perry-runtime` / `perry-stdlib` / 35 `perry-ext-*` crates). 2,417 true gaps.
- **`scripts/gen_parity_tests.py`** + **`scripts/parity-skiplist.toml`** — regenerate skeletons from the inventory; skiplist documents architecturally-out-of-scope modules (vm/inspector/v8/etc.) with rationale.
- **`run_parity_tests.sh --filter parity_`** — runs just the parity matrix; auto-`PERRY_ALLOW_UNIMPLEMENTED=1` so unimplemented APIs surface as runtime divergence instead of hard compile errors; strips DEP/Experimental Node warnings from the diff.

This is ~70% of the axis-3 instrumentation #793 is asking for. It is currently run **ad-hoc locally** and has repeatedly been cleaned out of the working tree between sessions (now committed, so durable).

## What this sub-issue adds

Wire it into CI as a tracked, per-module signal — distinct from #794 (per-category thresholds) and #800 (Node's own test corpus):

1. **CI job** runs `./run_parity_tests.sh --filter parity_` on every push to main.
2. **Per-module trend artifact** — emit a table of `module | node_lines | perry_lines | diff_lines | status` so a regression in any single module is visible (not buried in a global %). The diff-line count per module is already computed locally; just persist it as a CI artifact + job annotation.
3. **Baseline lock** — store the current per-module diff counts as `test-parity/parity_matrix_baseline.json`; CI fails if any module's diff grows (regression) and prints a celebratory note when one reaches 0 (full byte-for-byte PASS).
4. **Hard per-test output cap** — see the robustness note on #796: one test (`test_parity_timers_promises`, root cause #712) emitted 5.7M identical lines and the runner's O(n²) bash `normalize_output` burned ~3 hours. The CI job MUST cap per-test stdout (e.g. 50k lines) and the normalizer must be linear, or pathological output DOSes the run.

## Current baseline (v0.5.910 sweep, for reference)

- **PASS (diff = 0):** `os`, `tty`, `async_local_storage` (+ `argon2`, `ethers` as expected-output).
- **One fix from PASS:** `path` (diff = 4, blocked only on #810 — `path.posix`/`win32` sub-namespace).
- **Near-pass (diff < 35):** `string_decoder`, `async_hooks`, `tls`, `readline`.
- **18 compile-fails:** modules with `import * as X from "node:X"` where Perry has no stdlib backing — strict-import policy (#629 / v0.5.835) correctly rejects them; these need real native bindings, not test changes.
- **Large surface gaps:** `http` (307), `fs` (207), `stream` (141), `zlib` (130), `net` (115), `buffer` (93).

Trend across sweeps v0.5.713 → v0.5.910: 0 → 5 PASS as #623/#629/#630/#631/#648/#649/#650/#681/#682/#712/#741/#742 landed.

## Done when

- A CI job produces the per-module diff table as an annotation on every main push.
- `parity_matrix_baseline.json` is committed and regressions fail CI.
- Per-test output is hard-capped and `normalize_output` is linear-time.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

compat: wire the 42-module parity matrix into CI with per-module trend #812

What already exists (committed, not yet CI-wired)

What this sub-issue adds

Current baseline (v0.5.910 sweep, for reference)

Done when

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

compat: wire the 42-module parity matrix into CI with per-module trend #812

Description

What already exists (committed, not yet CI-wired)

What this sub-issue adds

Current baseline (v0.5.910 sweep, for reference)

Done when

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions