test(mutants): bounded catches for repo-wide survivors + run the lane under nextest #151
Merged
… under nextest
Repo-wide ratchet shard 0 failed on 5 TIMEOUT mutants while scoring 94% (above
the 75% floor): genuine livelock mutants in the writer / visibility / frontier
paths. Root cause is that the mutation lane was the ONLY test lane in the house
still on raw `cargo test`, where one hung test never lets the shared test binary
exit, so it masks every fast-failing assertion and killable mutants read as
TIMEOUT survivors (our policy correctly treats a timeout as a failure, not as
caught).
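The ratchet rule described above (a shard can clear the score floor and still fail because of timeouts) can be sketched as follows. This is a minimal illustration, not the project's xtask code: `ShardTally`, `RatchetVerdict`, the field names, the choice to count timeouts in the denominator, and the example counts are all assumptions made for the sketch.

```rust
/// Hypothetical tally of one mutation shard; names are illustrative only.
#[derive(Debug, Clone, Copy)]
struct ShardTally {
    caught: u32,
    missed: u32,
    timed_out: u32,
}

#[derive(Debug, PartialEq)]
enum RatchetVerdict {
    Pass,
    BelowFloor { score_pct: u32 },
    TimedOut { count: u32 },
}

impl ShardTally {
    /// Kill score in whole percent. A timeout is NOT counted as caught
    /// (assumption: timeouts stay in the denominator).
    fn score_pct(&self) -> u32 {
        let total = self.caught + self.missed + self.timed_out;
        if total == 0 {
            return 100; // assumption: an empty shard trivially passes
        }
        self.caught * 100 / total
    }

    /// Any timeout is a hard failure, checked before the score floor.
    fn verdict(&self, floor_pct: u32) -> RatchetVerdict {
        if self.timed_out > 0 {
            return RatchetVerdict::TimedOut { count: self.timed_out };
        }
        let score_pct = self.score_pct();
        if score_pct < floor_pct {
            RatchetVerdict::BelowFloor { score_pct }
        } else {
            RatchetVerdict::Pass
        }
    }
}
```

With illustrative counts of 94 caught, 1 missed and 5 timed out, the score is 94% against a 75% floor, yet the verdict is `TimedOut { count: 5 }`, which matches the failure mode described above.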
Two-part fix.
1. Run mutants under `--test-tool nextest` with the `ci` profile (per-test
process isolation + terminate-after), aligning the lane with every other
test lane (run_nextest_ci). A mutation-induced livelock is now reaped as a
bounded per-test timeout and the fast assertion convicts the mutant first.
- .cargo/mutants.toml: test_tool = nextest
- plan.rs / mod.rs fixtures: --test-tool nextest
- run.rs: NEXTEST_PROFILE=ci (its slow-timeout overrides keep the unmutated
baseline from tripping terminate-after)
- lanes.rs: correct the stale "cargo-mutants treats a timeout as caught" note
2. Bounded assertion catches converting each survivor TIMEOUT/MISSED -> CAUGHT
by a sub-millisecond assertion instead of a 203s hang:
- writer_queue_len Some(0): a NON-gated direct-read test so the
--no-default-features lane (where the dangerous-test-hooks catcher is not
compiled) still kills it
- SequenceGate publish-at-frontier (`>` vs `>=`)
- unfenced single-append global visibility frontier (`global_seq + 1` vs `* 1`)
- recreate_restart_segment -> None
- MonotonicClock::process_boot_ns delegation (MISSED)
- sim workload next_seq `+=` vs `*=` op-trace digest pin (MISSED)
- cursor stop_and_join error propagation vs Ok(()) (MISSED)
prepared-batch-items (empty-slice) needs no new test: the existing
prepared_batch_dedupes_entity_and_scope_strings already asserts on items() and
was only masked by a sibling hang, which nextest now unmasks.
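To illustrate what a bounded assertion catch looks like, here is a minimal sketch for the `global_seq + 1` vs `* 1` survivor. The functions below are hypothetical stand-ins, not the real `append.rs` code, and the sketch assumes the first event carries sequence 0.

```rust
/// Hypothetical stand-in for the unfenced single-append frontier update.
fn frontier_after_append(global_seq: u64) -> u64 {
    global_seq + 1
}

/// The mutant under discussion: `+ 1` replaced by `* 1`.
fn frontier_after_append_mutant(global_seq: u64) -> u64 {
    global_seq * 1
}

/// Bounded catch: one pure comparison, no waiting on visibility.
/// Returns true when the given implementation would be convicted.
fn catches(frontier: fn(u64) -> u64) -> bool {
    // Assumption: the first event has sequence 0, so the visible
    // frontier after it must be exactly 1.
    frontier(0) != 1
}
```

The point of the design is that the assertion reads the value directly. A test that instead waits for the frontier to advance would spin under the mutant, since `0 * 1` never moves past 0, and would surface as a TIMEOUT rather than a catch.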
Verified locally: all 7 new tests pass on real code (cargo test -p batpak
--all-features), the writer-queue-len test compiles under --no-default-features,
and the 34 xtask mutants fixture/policy tests stay green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NHio8XCrH89gdEcycCumr6
…of timing out
The first nextest run of the cured lane still hit 7 timeouts (5 -> 7) and took
2h. Root cause: I pinned NEXTEST_PROFILE=ci, whose `fail-fast = false` is the
exact anti-pattern cargo-mutants warns against for mutation. With fail-fast off,
nextest runs the WHOLE suite per mutant, so a sibling test the mutation
livelocked keeps the run alive until cargo-mutants' outer timeout — re-creating
the cargo-test masking one level up, and the fast assertion that already caught
the mutant is never seen.
Local proof: applying `append.rs global_seq + 1 -> * 1` by hand and running
`cargo nextest run` with `fail-fast = true` convicts the mutant in 0.158s of
test time (exit non-zero after 48/1585 tests, cancelling the rest) — it never
even reaches the livelock test. cargo-mutants' docs confirm it honors the
profile's fail-fast and recommend keeping it on for mutation.
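The interaction between fail-fast, per-test reaping and the outer mutation timeout can be modelled with a small sketch. It is a deliberate simplification and not how nextest or cargo-mutants are implemented: tests run sequentially, only hung tests cost time, and every name and number here (`Runner`, `reap_after_secs`, `outer_timeout_secs`) is hypothetical.

```rust
#[derive(Clone, Copy)]
enum Outcome {
    Pass,
    Fail,
    Hang,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Caught,
    Missed,
    Timeout,
}

/// Hypothetical runner settings for one mutant.
struct Runner {
    /// Stop at the first failing test.
    fail_fast: bool,
    /// Per-test bound after which a hung test is killed; None = never reaped.
    reap_after_secs: Option<u64>,
    /// Outer per-mutant timeout imposed by the mutation tool.
    outer_timeout_secs: u64,
}

fn run_mutant(runner: &Runner, tests: &[Outcome]) -> Verdict {
    let mut elapsed: u64 = 0;
    let mut failed = false;
    for &outcome in tests {
        match outcome {
            Outcome::Pass => {}
            Outcome::Fail => failed = true,
            Outcome::Hang => match runner.reap_after_secs {
                // The hang is reaped inside the outer budget: a timeout-failure.
                Some(s) if elapsed.saturating_add(s) <= runner.outer_timeout_secs => {
                    elapsed += s;
                    failed = true;
                }
                // Never reaped, or reaped too late: the outer timeout fires first.
                _ => return Verdict::Timeout,
            },
        }
        if failed && runner.fail_fast {
            return Verdict::Caught;
        }
    }
    if failed {
        Verdict::Caught
    } else {
        Verdict::Missed
    }
}
```

In this model, a suite of `[Pass, Fail, Hang]` with fail-fast off reaches the hang and reads as `Timeout` even though an assertion already failed, while the same suite with fail-fast on returns `Caught` at the second test and never touches the hang.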
Fix: a dedicated `[profile.mutants]` in .config/nextest.toml —
- fail-fast = true (the lever: convict on the first failing test)
- slow-timeout terminate-after (backstop for a pure-hang mutant no assertion
catches: the hung test is reaped as a per-test timeout-failure)
- the known-slow-surface overrides mirrored from the ci profile so the
unmutated baseline (which runs every test, fail-fast never triggering) cannot
trip terminate-after
pinned via NEXTEST_PROFILE=mutants in the mutants runner.
This converts the TIMEOUT survivors to fast assertion catches AND fixes the 2h
wall-clock (caught mutants exit at first failure, in ms). The MISSED cures from
the prior commit already took (94% -> 100%, 0 missed); this closes the TIMEOUT
side.
Validated locally: the `mutants` profile parses and runs
(`cargo nextest run --profile mutants` => "nextest profile: mutants", test PASS).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NHio8XCrH89gdEcycCumr6
What & why
The repo-wide ratchet smoke shard 0 went red on its first complete cloud run (now that Blacksmith is fast enough to finish the 85-min lane). It scored 94% — well above the 75% floor — but hard-failed on 5 TIMEOUT mutants: genuine livelock mutants in the writer / visibility / frontier paths.
Root cause (the deeper issue)
The bounded tests alone would not have fixed this. The mutation lane was the only test lane in the house still on raw `cargo test`. Under `cargo test`, one test hanging on a mutation never lets the shared lib-unittests binary exit, so it masks every fast-failing assertion in that binary, and a killable mutant reads as a TIMEOUT survivor (our policy correctly treats `timed_out > 0` as a failure, not as caught). `lanes.rs` even carried a stale comment betting "cargo-mutants treats a timeout as caught", which is false under that policy.
The fix (two parts)
1. Run mutants under nextest.
`--test-tool nextest` + the `ci` profile's `terminate-after` (pinned via `NEXTEST_PROFILE=ci` in the runner). Per-test process isolation means a livelock is reaped as a bounded per-test timeout and the fast assertion convicts the mutant first. This aligns the mutation lane with every other test lane (`run_nextest_ci`) and fixes hang-masking for the other 47 shards too.
- `.cargo/mutants.toml`, `plan.rs`, `mod.rs` fixtures: `--test-tool nextest`
- `run.rs`: `NEXTEST_PROFILE=ci` (its slow-timeout overrides keep the unmutated baseline from tripping terminate-after)
- `lanes.rs`: corrected the stale timeout-vs-caught note
2. Seven bounded assertion catches: each survivor flips TIMEOUT/MISSED → CAUGHT by a sub-millisecond assertion instead of a 203s hang:
- `writer_queue_len` → `Some(0)`: … the `--no-default-features` lane (where the existing catcher is `dangerous-test-hooks`-gated) kills it too
- `SequenceGate::publish_on_lanes`, `>` → `>=`: … `visible` …
- `global_seq + 1` → `* 1`: … `visible_sequence()` == 1 after first event
- `recreate_restart_segment` → `None`: … `Some` …
- `MonotonicClock::process_boot_ns`: …
- `next_seq += 1` → `*= 1`: …
- `stop_and_join` → `Ok(())`: …
`prepared-batch-items` (empty-slice) needs no new test: the existing `prepared_batch_dedupes_entity_and_scope_strings` already asserts on `items()`; it was only masked by a sibling hang, which nextest now unmasks.
Verification (local, before push)
- All 7 new tests pass on real code under `cargo test -p batpak --all-features` (incl. the sim op-trace golden constant).
- The writer-queue-len test compiles under `--no-default-features`.
- `traceability-check: ok`, `structural-check: ok` (overclaim, triangulation, capability-snapshot, …).
What proves it in CI
The four required gates (ci-fast, meta-gate, gauntlet, Windows) gate the merge. The thing that actually proves the 5 TIMEOUTs are gone is the non-required Mutation smoke (repo-wide ratchet) lane — watch that one go green (and faster, with the 5×203s timeouts eliminated).
🤖 Generated with Claude Code