fix(equiv): Linux-canonical skip on non-Linux (#213) #216
Merged
Equivalence tests are now declared Linux-canonical per
docs/superpowers/specs/2026-05-26-linux-canonical-equiv-policy-design.md.
Each of the 7 test_equivalence.py files (eqlib/trlib/tilib/fplib/
wrlib/wrxlib/totlib) gains:
    IS_LINUX = sys.platform.startswith("linux")

    @unittest.skipUnless(
        IS_LINUX,
        "Equivalence tests are Linux-canonical. ... See "
        "docs/baseline-policy.md.",
    )
    class TestEquivalence(...):
placed outside the existing decorator chain (the previous lib-so-
exists / importable / RUN_OK decorators stay as-is, after the
Linux check).
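A minimal sketch of why the Linux check goes outermost, using only the standard `unittest` module. The `make_suite` factory, the `lib_present` flag, and the inner skip message are hypothetical stand-ins for the real lib-so-exists / importable / RUN_OK decorators; the platform string is passed in explicitly so the behavior can be exercised on any host.

```python
import unittest


def is_linux(platform: str) -> bool:
    """Same predicate as the commit: prefix match on the platform string."""
    return platform.startswith("linux")


def make_suite(platform: str, lib_present: bool = True):
    """Build a TestEquivalence class as it would look on the given platform.

    `lib_present` is a hypothetical stand-in for the pre-existing
    lib-so-exists / importable / RUN_OK checks.
    """

    # Decorators apply bottom-up, so the Linux check runs last and,
    # when it triggers, its reason replaces any inner skip reason.
    @unittest.skipUnless(
        is_linux(platform),
        "Equivalence tests are Linux-canonical. See docs/baseline-policy.md.",
    )
    @unittest.skipUnless(lib_present, "shared library not built")
    class TestEquivalence(unittest.TestCase):
        def test_case(self):
            self.assertTrue(True)

    return TestEquivalence


def run(cls) -> unittest.TestResult:
    """Run one TestCase class and return the raw result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(cls)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    for platform in ("linux", "darwin"):
        result = run(make_suite(platform, lib_present=False))
        print(platform, [reason for _test, reason in result.skipped])
```

With both conditions failing on a non-Linux host, the reported skip reason is the policy message rather than the inner one, which is the point of placing the Linux check outside the existing chain.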
Verification on the dev rig (macOS Homebrew GCC 15.2.0):
    $ pytest -p no:cacheprovider --forked --timeout=120 \
        --timeout-method=signal \
        python/{eqlib,trlib,tilib,fplib,wrlib,wrxlib,totlib}/tests/test_equivalence.py
    15 skipped in 0.72s
The 4 #213 failures (fp_dt1 RPCT, fp_iter01 RPCT, wrx_demo,
wrx_iter01 pwr_tot) are gone — replaced with documented skips
per the policy.
Linux CI (python-tests.yml:323 whole-tree pytest) continues to run
all 7 suites on Ubuntu gfortran 13.x; the IS_LINUX predicate is
True there so behavior is unchanged.
docs/baseline-policy.md and the memory update follow in the next
two commits.
Spec: docs/superpowers/specs/2026-05-26-linux-canonical-equiv-policy-design.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New top-level docs/baseline-policy.md describes:
- What the 1e-10 contract asserts (within-platform reproducibility)
and what it doesn't (cross-platform bit-equivalence).
- The canonical platform: Ubuntu CI runner with gfortran 13.x.
- Non-Linux behavior (skip; WSL + Linux Docker count as Linux per
sys.platform).
- How macOS dev verifies locally (Linux container).
- How a future contributor promotes a new platform to canonical,
explicitly enumerating the two structural prerequisites
(graphics-stubs gap + Python-fixture-parity gap) that would
need to close first.
- Cross-references to memory + the SUPERSEDED platform-keyed
design history.
The 7 test_equivalence.py skipUnless messages (prior commit) point
here.
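To make the within-platform versus cross-platform distinction concrete, here is a small sketch of a `1e-10` comparison. Whether the real contract is relative, absolute, or both is not stated here, so the `matches_baseline` helper and its use of `math.isclose` with both tolerances are assumptions; the numbers are made up for illustration.

```python
import math

TOL = 1e-10  # tolerance named in the policy; rel-vs-abs semantics assumed


def matches_baseline(value: float, baseline: float, tol: float = TOL) -> bool:
    """Hypothetical check: value agrees with baseline to within tol."""
    return math.isclose(value, baseline, rel_tol=tol, abs_tol=tol)


baseline = 1234.56789012345

# Same platform, same toolchain: differences stay far below the tolerance.
same_platform = baseline * (1 + 1e-13)

# Different libm / compiler: a relative drift of a few 1e-9 is enough to fail.
other_platform = baseline * (1 + 5e-9)

if __name__ == "__main__":
    print("same platform :", matches_baseline(same_platform, baseline))
    print("other platform:", matches_baseline(other_platform, baseline))
```

A failure of the second kind says nothing about correctness on the canonical platform, which is why the policy treats it as out of scope for the contract.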
Spec: docs/superpowers/specs/2026-05-26-linux-canonical-equiv-policy-design.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex pre-push review LOW: prose said `sys.platform == 'linux'`
but the actual code uses `sys.platform.startswith("linux")`.
Tightened to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
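The difference between the two predicates only shows up for platform strings that begin with `linux` but are not exactly `linux`. A short sketch follows; the helper names are hypothetical. On Python 3.3+ `sys.platform` is `"linux"` on Linux, while Python 2 reported `"linux2"`, so the prefix form is the more tolerant of the two.

```python
import sys


def is_linux_exact(platform: str) -> bool:
    """What the prose originally described."""
    return platform == "linux"


def is_linux_prefix(platform: str) -> bool:
    """What the test modules actually use."""
    return platform.startswith("linux")


if __name__ == "__main__":
    for platform in ("linux", "linux2", "darwin", "win32", sys.platform):
        print(platform, is_linux_exact(platform), is_linux_prefix(platform))
```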
Codex 2026-05-26 PR-216 review MED: line 45 referenced
`memory/reference_clavius_baseline_regen.md` as if it were a
repo-relative path. Memory files actually live outside the repo at
$CLAUDE_MEMORY_DIR; the convention in this repo's other docs is to
use the slug alone (e.g. lines 111/128/131 of this same file).
Aligned line 45 to that convention to avoid Bugbot dead-link false
positives.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit adb5af7.
Summary
Closes #213. Ends the macOS-only `1e-10` equivalence false-positive class without skipping for invisibility reasons. Equivalence tests are declared Linux-canonical: Ubuntu CI runner with gfortran 13.x is the authoritative environment; non-Linux platforms skip with an explicit, policy-documented reason.
What's in this PR (4 commits)
- `test(equiv): Linux-canonical skipUnless on all 7 modules`: `IS_LINUX = sys.platform.startswith("linux")` + `@unittest.skipUnless(IS_LINUX, ...)` as the outermost decorator on `TestEquivalence` in eqlib/trlib/tilib/fplib/wrlib/wrxlib/totlib. Skip message references `docs/baseline-policy.md`.
- `docs: baseline-policy.md — Linux-canonical equivalence`: new top-level policy doc (~140 lines) covering what the `1e-10` contract asserts and does not assert, the canonical platform, non-Linux behavior, local verification via Linux container, and the prerequisites to promote a new platform to canonical.
- `docs(baseline-policy): link follow-up issue #215`: replaces the gh-search placeholder with a link to #215 (Python fixture parity for the 6-8 dead-baseline equivalence cases).
- `docs(baseline-policy): align predicate prose with code`: Codex pre-push LOW fix; prose now says `sys.platform.startswith("linux")`, matching the actual code.
Why Linux-canonical, not platform-keyed
The platform-keyed approach (committed as design history in `docs/superpowers/specs/2026-05-26-platform-keyed-baselines-design.md`, marked SUPERSEDED) was rejected after Codex execution-blocker review surfaced two structural prerequisites:
- Graphics-stubs gap: […] `tot` has `tot_static_stubs.f90`). Memory `reference_clavius_baseline_regen.md` already documents this.
- Python-fixture-parity gap: […] `<case>_params.py`. Filed as #215 (Python fixture parity for the 6-8 dead-baseline equivalence cases).
Both gaps would need to close before per-platform baseline regen is feasible. Until then, Linux-canonical is the principled position.
Verification
macOS dev rig (Homebrew GCC 15.2.0):
The 4 #213 failures (`fp_dt1` RPCT, `fp_iter01` RPCT, `wrx_demo`, `wrx_iter01.pwr_tot`) are gone, replaced with documented skips per the policy.
Linux CI (`python-tests.yml:323` whole-tree pytest) continues to exercise all 7 equivalence suites on Ubuntu gfortran 13.x; `IS_LINUX` is True there, so behavior is unchanged.
Memory + index
`feedback_equivalence_must_pass.md` (controller-side) is extended with a "Two skip classes" section distinguishing the invisibility skip (FORBIDDEN, the existing rule) from the principled platform-scoped skip (ALLOWED when canonical-env CI is documented). This PR is the reference example. Cargo-cult guardrail included.
Acceptance criteria (spec §8)
- All 7 `test_equivalence.py` modules have `@skipUnless(IS_LINUX, ...)`.
- `docs/baseline-policy.md` exists.
Issue references
Closes #213.
Follow-up: #215.
Spec
`docs/superpowers/specs/2026-05-26-linux-canonical-equiv-policy-design.md` (Codex 3-round SHIP IT).
🤖 Generated with Claude Code
Note
Low Risk
Test gating and documentation only; Linux CI still runs full equivalence; no production or auth paths change.
Overview
Linux-canonical equivalence replaces macOS `1e-10` baseline failures with an explicit, documented skip on non-Linux hosts.
All seven `python/*/tests/test_equivalence.py` modules now gate `TestEquivalence` with `IS_LINUX = sys.platform.startswith("linux")` and an outermost `@unittest.skipUnless(IS_LINUX, …)` that points reviewers to `docs/baseline-policy.md`. On Ubuntu CI (gfortran 13.x), behavior is unchanged; on macOS and other non-Linux dev machines, those suites skip instead of failing on libm/compiler drift.
The new `docs/baseline-policy.md` states what the `1e-10` contract does and does not cover, names Linux CI as canonical, explains WSL/Docker vs host OS, and documents how to verify locally and what must happen before promoting another platform (graphics stubs, Python fixtures, #215).
Reviewed by Cursor Bugbot for commit adb5af7. Bugbot is set up for automated code reviews on this repo.