test(mutation): round 1 — kill surviving mutants across 0.9.0 diff-scoped seams#149
Conversation
…oped seams The diff-scoped mutation-smoke gate (now running after the harness + fail-closed fixes) found 334 surviving mutants across 13 seams: 0.9.0's cumulative mutation-coverage debt, exposed once the gate could run on this branch. Round 1 adds targeted mutation-killing tests per seam (written multi-agent), each verified to compile and pass locally with spot-checked kills via manual mutation; full workspace clippy -D warnings is clean. - syncbat: subscription_runtime/registry, subscription streams, builder/receipt/ operation_status surfaces (new tests/mutation_kill_*.rs + inline modules). - netbat: transport stream_tcp/tcp (+ rigor-bound expect_err assertions). - integrity: structural/assurance/meta_gate/public_surface/typed_waivers. - core: store/mod, write/writer (via #[path] sidecar to stay under the 850 cap), platform alloc/fs, delivery/cursor. A few equivalent/unkillable mutants are documented in agent notes and fall under the gate's 15% tolerance. CI's mutation re-run confirms the kills; residual survivors are cured in round 2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Warning Review limit reached
Next review available in: 31 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (18)
WalkthroughThis PR adds mutation-kill tests and a few equivalent-mutant comments across core store, netbat transport, syncbat runtime, and integrity tooling. It also extracts accept-error classification in netbat’s TCP stream transport. ChangesCore Store Coverage
Netbat Transport Coverage
Syncbat Runtime Coverage
Integrity Tool Coverage
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| assert!( | ||
| failures.is_empty(), | ||
| "visibility_epoch mismatches: {failures:?}" | ||
| ); | ||
| } | ||
|
|
||
| #[test] | ||
| fn park_for_data_blocks_until_the_timeout_when_no_publish_races() { | ||
| // The `park_for_data -> ()` no-op mutant returns instantly; the real | ||
| // implementation must block (roughly) for the supplied timeout when the | ||
| // epoch never advances. | ||
| let index = fresh_index(); | ||
| let cursor = Cursor::new(Region::all(), index); | ||
| let epoch = cursor.visibility_epoch(); | ||
|
|
||
| let timeout = Duration::from_millis(200); | ||
| let start = Instant::now(); | ||
| cursor.park_for_data(epoch, timeout); | ||
| let elapsed = start.elapsed(); | ||
|
|
There was a problem hiding this comment.
Unconditional 200 ms wall-clock block in the normal test suite
park_for_data is called with a 200 ms timeout and the test waits for the full call to return before asserting. This isn't gated behind any feature flag, so it runs on every cargo test invocation — not just during the mutation-testing harness cycle. On a test suite that runs this file many times (e.g., different filter combinations, parallel shards) the 200 ms floor accumulates. Consider reducing the timeout to 50 ms (assert >= 20 ms) or guarding this test with a dedicated slow-mutation-tests feature flag so it can be opted out of during rapid iteration.
Prompt To Fix With AI
This is a comment left during a code review.
Path: bpk-lib/crates/core/src/store/delivery/cursor.rs
Line: 349-368
Comment:
**Unconditional 200 ms wall-clock block in the normal test suite**
`park_for_data` is called with a 200 ms timeout and the test waits for the full call to return before asserting. This isn't gated behind any feature flag, so it runs on every `cargo test` invocation — not just during the mutation-testing harness cycle. On a test suite that runs this file many times (e.g., different filter combinations, parallel shards) the 200 ms floor accumulates. Consider reducing the timeout to 50 ms (assert `>= 20 ms`) or guarding this test with a dedicated `slow-mutation-tests` feature flag so it can be opted out of during rapid iteration.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| mod cursor_helper_tests { | ||
| use super::{ | ||
| operation_status_subscription_id_hash, projection_subscription_id_hash, | ||
| receipt_stream_subscription_id_hash, subscription_id_hash, EventStreamCursorV1, | ||
| ReceiptStreamCursorV1, SubscriptionRuntimeError, | ||
| }; | ||
|
|
||
| #[test] | ||
| fn event_cursor_decode_rejects_beginning_with_nonzero_hlc_only() { | ||
| // Beginning cursor (kind byte 0x00) with hlc_wall_ms != 0 but global_sequence == 0. | ||
| // Real `||` rejects when either numeric field is nonzero; a `&&` mutant would accept. |
There was a problem hiding this comment.
Hardcoded byte offsets are silently wrong if the cursor format changes
bytes[24..32] is written directly into the encoded beginning cursor to corrupt hlc_wall_ms. If the EventStreamCursorV1 encoding ever changes (field reordering, an extra header byte, a different hash width), the index will silently target the wrong bytes and the test could stop killing the intended mutant without any compilation error. The same pattern repeats at bytes[40..48] in the ReceiptStreamCursorV1 beginning test and the EntityStreamCursorV1 test in entity_cursor.rs. A thin helper that accepts a field name and computes its offset — or at minimum a constant named BEGINNING_HLC_WALL_MS_OFFSET — would make the coupling to the wire format explicit and catch regressions at a glance.
Prompt To Fix With AI
This is a comment left during a code review.
Path: bpk-lib/crates/syncbat/src/subscription_runtime/cursor.rs
Line: 821-831
Comment:
**Hardcoded byte offsets are silently wrong if the cursor format changes**
`bytes[24..32]` is written directly into the encoded beginning cursor to corrupt `hlc_wall_ms`. If the `EventStreamCursorV1` encoding ever changes (field reordering, an extra header byte, a different hash width), the index will silently target the wrong bytes and the test could stop killing the intended mutant without any compilation error. The same pattern repeats at `bytes[40..48]` in the `ReceiptStreamCursorV1` beginning test and the `EntityStreamCursorV1` test in `entity_cursor.rs`. A thin helper that accepts a field name and computes its offset — or at minimum a constant named `BEGINNING_HLC_WALL_MS_OFFSET` — would make the coupling to the wire format explicit and catch regressions at a glance.
How can I resolve this? If you propose a fix, please make it concise.| fn real_fs_sync_file_all_surfaces_platform_errors() -> Result<(), Box<dyn Error>> { | ||
| let file = std::fs::File::open("/dev/null")?; | ||
| let fs = RealFs; | ||
| assert!( | ||
| fs.sync_file_all(&file, std::path::Path::new("/dev/null")) | ||
| .is_err(), | ||
| "RealFs::sync_file_all must surface the platform sync error, not report success" | ||
| ); | ||
| Ok(()) | ||
| } | ||
|
|
||
| #[test] | ||
| fn real_fs_sync_parent_dir_surfaces_missing_parent() -> Result<(), Box<dyn Error>> { | ||
| // A path whose parent directory does not exist cannot have its directory | ||
| // entry fsynced; the wrapper must error rather than report success. | ||
| let fs = RealFs; | ||
| let missing = std::path::Path::new("/batpak-mutation-kill-nonexistent-parent-xyz/leaf"); | ||
| assert!( | ||
| fs.sync_parent_dir(missing).is_err(), | ||
| "RealFs::sync_parent_dir must surface an error when the parent dir is absent" | ||
| ); | ||
| Ok(()) | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
Missing
#[cfg(unix)] guard on sync_parent_dir error-path test
real_fs_sync_file_with_mode_surfaces_platform_errors and real_fs_sync_file_all_surfaces_platform_errors both carry #[cfg(unix)], but the real_fs_sync_parent_dir_surfaces_missing_parent test immediately below does not, despite relying on a POSIX-style absolute path (/batpak-mutation-kill-nonexistent-parent-xyz/leaf). On Windows that path is interpreted as drive-relative, which happens to still produce an error, but the platform intent is inconsistent. Adding the same #[cfg(unix)] guard would make the scoping explicit and match the surrounding tests.
Prompt To Fix With AI
This is a comment left during a code review.
Path: bpk-lib/crates/core/src/store/platform/fs.rs
Line: 607-631
Comment:
**Missing `#[cfg(unix)]` guard on `sync_parent_dir` error-path test**
`real_fs_sync_file_with_mode_surfaces_platform_errors` and `real_fs_sync_file_all_surfaces_platform_errors` both carry `#[cfg(unix)]`, but the `real_fs_sync_parent_dir_surfaces_missing_parent` test immediately below does not, despite relying on a POSIX-style absolute path (`/batpak-mutation-kill-nonexistent-parent-xyz/leaf`). On Windows that path is interpreted as drive-relative, which happens to still produce an error, but the platform intent is inconsistent. Adding the same `#[cfg(unix)]` guard would make the scoping explicit and match the surrounding tests.
How can I resolve this? If you propose a fix, please make it concise.…ents across all 13 seams Round 1 already killed the large majority of the 334 diff-scoped survivors; round 2 verifies every remaining mutant in the saved per-seam worklists and closes the gaps (~16 new kill-tests, ~8 documented equivalents): - netbat: extract classify_accept_error helper + test killing the accept-loop error-classification mutants (4). - syncbat subscription-runtime: assert SessionError.code (not just message) for the ack-invalid match arm (2 tests). - core long tail: 11 new kill-tests — platform-backend (reflink_impl/is_finished), frontier-append-gate (timeout accessor), hash-chain-replay (source_refs), lane-branch (RegionFilterError display), repo-wide ratchet (approver, topology not-found propagate, fork-frame short-read, index hlc-for-sequence, restore single-violated fallback, emitted-frontier timeout, footer trailer-only, simclock boot marker). The reflink_impl mutation test is a #[path] sidecar (fs_reflink_mutation_tests.rs) to keep fs.rs's inline test island under the non-overridable 200-nonblank cap. Genuinely-equivalent mutants (unreachable Store<Closed>, empty-batch flush arm, restore partition count, recovery-manifest header boundary, schema_version const) are documented inline and fall under the gate's 15% per-seam tolerance. Every seam's survivor list is retired; authoritative proof via a proof=mutants seam-smoke re-run on this branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 04bc3c32e6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| #[test] | ||
| fn real_fs_sync_parent_dir_surfaces_missing_parent() -> Result<(), Box<dyn Error>> { |
There was a problem hiding this comment.
Gate the missing-parent sync test to Unix
On non-Unix targets the platform sync_parent_dir_io path intentionally skips directory fsync and returns Ok(()), so this new test's fs.sync_parent_dir(missing).is_err() assertion fails on Windows/non-Unix even though that is the expected production behavior. The two preceding /dev/null sync-error tests are already #[cfg(unix)]; this one needs the same guard or a platform-specific assertion to keep the cross-platform test lane green.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (7)
bpk-lib/tools/integrity/src/assurance.rs (1)
454-461: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winKeep mutation-rationale out of rustdoc.
These
///lines describe mutation-test reasoning rather than an invariant ofunleveled_files, so they become generated docs for production code. Please move this to non-doc//comments or keep the explanation next to the killing test instead. As per coding guidelines, "Encode behavior in types, tests, and explicit invariants ... rather than planning prose" and "Avoid repeating high-level arcs ... in docstrings, comments, or user-facing docs in place of real invariants."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/tools/integrity/src/assurance.rs` around lines 454 - 461, The mutation-test rationale currently lives in rustdoc-style `///` comments in `unleveled_files`, which will be exposed as production documentation instead of implementation notes. Move this explanation to plain `//` comments or relocate it near the relevant test that kills the mutant, keeping the rustdoc focused on the actual invariant and behavior of `unleveled_files`.Source: Coding guidelines
bpk-lib/crates/netbat/tests/mutation_kill_netbat-transport.rs (1)
1-5: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winReplace the file banner with a concrete invariant or drop it.
This header reads like planning/coverage prose, not a testable fact. Please either remove it or restate it as a concrete invariant the suite enforces. As per coding guidelines, "Avoid repeating high-level arcs (product names, transport stacks, scope statements) as ceremony in docstrings, comments, or user-facing docs in place of real invariants" and "Encode behavior in types, tests, and explicit invariants (hash rules, ordering, report fields, error cases) rather than planning prose".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/crates/netbat/tests/mutation_kill_netbat-transport.rs` around lines 1 - 5, The file banner in mutation_kill_netbat-transport is planning prose rather than an enforceable invariant. Remove the header comment entirely, or replace it with a concrete, testable statement that reflects what the suite actually asserts (for example, a specific limit, ordering, or error/report behavior) and keep it tied to the test’s public API checks instead of scope/coverage language.Source: Coding guidelines
bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-core-surfaces.rs (1)
1-6: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDrop the mutation-planning banner.
The test name and assertion already carry the invariant; this
PROVES/CATCHES/SEEDEDblock is PR-only context that will drift faster than the test itself. As per coding guidelines, "Avoid repeating high-level arcs (product names, transport stacks, scope statements) as ceremony in docstrings, comments, or user-facing docs in place of real invariants" and "Encode behavior in types, tests, and explicit invariants (hash rules, ordering, report fields, error cases) rather than planning prose".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-core-surfaces.rs` around lines 1 - 6, Remove the mutation-planning banner from this test module and keep only the invariant-bearing test content; the `mutation_kill_syncbat-core-surfaces` comment block with `PROVES/CATCHES/SEEDED` is PR-only ceremony that should be deleted. Update the top-of-file doc comment near `StoreOperationStatusSink::record_started` so it no longer repeats high-level scope or planning prose, and rely on the test name and assertions to express the behavior.Source: Coding guidelines
bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-streams.rs (2)
1-6: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winTrim the mutation-planning module docs.
This banner explains PR context instead of a lasting invariant; the tests below already encode the behavior being protected. As per coding guidelines, "Avoid repeating high-level arcs (product names, transport stacks, scope statements) as ceremony in docstrings, comments, or user-facing docs in place of real invariants" and "Encode behavior in types, tests, and explicit invariants (hash rules, ordering, report fields, error cases) rather than planning prose".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-streams.rs` around lines 1 - 6, Trim the module-level test banner in the syncbat mutation test file so it states only the lasting invariant being verified, not PR context or mutation-planning prose. Update the top comment near the test module entry to remove the “PROVES/CATCHES” narrative and any references to the review scope, while keeping a brief invariant description of the ACK range/regress checks and delivery index advancement covered by the tests.Source: Coding guidelines
153-271: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy liftCollapse the per-route ACK scenarios into a shared harness.
The receipt/entity/event/projection/operation-status sections all restate the same ACK-range, regression, and delivery-index invariants with only route setup differing. A small helper or macro would keep that contract in one place and make later session changes less likely to desync the suites.
Also applies to: 310-413, 448-539, 580-638, 697-755
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-streams.rs` around lines 153 - 271, Collapse the repeated ACK-range, regression, and delivery-index checks into one shared test harness used by the receipt/entity/event/projection/operation-status suites, keeping the route-specific setup separate. Extract the common assertions and session-driving flow from the existing test functions in mutation_kill_syncbat-streams.rs into a helper or macro that accepts the route/open setup and reuses the same invariant checks. Keep the unique route setup points and symbols like open, collect, send_ack, first_event, first_error_message, and event_indices so each suite can invoke the shared contract without duplicating logic.bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-subscription-runtime.rs (1)
1-18: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winRemove the missed-list from the module docs.
The long "Everything else ... NOT duplicated here" section is PR planning, not a stable invariant, and the bullets will drift as coverage moves. As per coding guidelines, "Do not include bulleted 'must not own X / Y / Z' lists in code comments or documentation that duplicate planning notes without adding a testable fact" and "Encode behavior in types, tests, and explicit invariants (hash rules, ordering, report fields, error cases) rather than planning prose".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-subscription-runtime.rs` around lines 1 - 18, Remove the PR-planning “missed list” section from the module docs in the mutation_kill_syncbat-subscription-runtime test file and keep only the stable, testable invariant about invalid-ACK terminal errors and the mutant it catches. Update the top-of-file comment near the doc block so it references the relevant session.rs behavior via ack_invalid_error without enumerating other suites or “NOT duplicated here” bullets, since that coverage map is expected to drift.Source: Coding guidelines
bpk-lib/crates/core/tests/mutation_kill_core-store.rs (1)
393-413: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winTighten this cold-start assertion to the scan-frame path.
Because the injector filter also matches
IndexFooterDecode,result.is_err()can pass even if reopen fails beforeinject_scan_frameis exercised. Please make the test prove the per-frame path was the reason for the failure, not just that some error occurred.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bpk-lib/crates/core/tests/mutation_kill_core-store.rs` around lines 393 - 413, The cold-start test currently passes if any matched injection point fails, including IndexFooterDecode, so it does not prove the scan-frame path was exercised. Update the mutation_kill_core-store test around Store::open and the fault injector setup to make the assertion specific to the per-frame scan path by isolating or verifying the inject_scan_frame-related failure instead of accepting any error. Keep the references to CountdownInjector::new, with_filter, and Store::open, and ensure the test fails only when the scan-frame path is skipped or wrong.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bpk-lib/crates/core/src/store/platform/fs.rs`:
- Around line 622-631: The RealFs::sync_parent_dir test is relying on a
hard-coded absolute path that may unexpectedly exist on some runners, making the
missing-parent precondition flaky. Update
real_fs_sync_parent_dir_surfaces_missing_parent to create a fresh temporary
directory and point the leaf path at a child whose parent has not been created,
so the test always exercises the missing-parent case regardless of host state.
Keep the assertion against RealFs::sync_parent_dir and ensure the setup is
self-contained within the test.
In `@bpk-lib/crates/core/tests/mutation_kill_core-store.rs`:
- Around line 105-106: The test in mutation_kill_core-store is discarding the
result of handle.join(), so a reactor/worker error can be missed and the test
still passes. Update the teardown after handle.stop() to assert that join
succeeds instead of ignoring it, using the existing handle variable in the test
body so failures from the reactor are surfaced.
In `@bpk-lib/crates/syncbat/src/builder.rs`:
- Around line 387-405: The test currently verifies duplicate detection and
descriptor insertion via CoreBuilder::register_boxed and
Core::contains_operation, but it does not confirm that the boxed handler is
actually stored in the built Core. Update the test to also assert the handler is
retained by either checking the built core’s handler map for the registered name
or by invoking the registered operation through the built core and expecting the
boxed handler to run, so a regression where only the descriptor is inserted will
be caught.
In `@bpk-lib/crates/syncbat/src/operation_status.rs`:
- Around line 311-335: The open-attempt test in has_open_attempt only validates
completed_count, so it can miss regressions if failed_count or denied_count stop
being counted as terminal. Extend
has_open_attempt_compares_started_against_terminal_counts to add one assertion
case for failed_count and one for denied_count in OperationStatusView, verifying
that when started_count matches each terminal bucket sum, has_open_attempt
returns false.
In `@bpk-lib/crates/syncbat/src/subscription_runtime/cursor.rs`:
- Around line 828-848: The decode rejection coverage in
EventStreamCursorV1::decode and ReceiptStreamCursorV1::decode only checks
beginning cursors with a nonzero hlc_wall_ms, so it misses the case where
global_sequence is nonzero. Add complementary tests for both cursor types that
keep hlc_wall_ms at zero while setting global_sequence to a nonzero value, and
assert the decoders still return SubscriptionRuntimeError::CursorInvalid to
ensure beginning-cursor validation checks both numeric fields.
In `@bpk-lib/crates/syncbat/src/subscription_runtime/entity_cursor.rs`:
- Around line 203-212: The test
entity_cursor_decode_rejects_beginning_with_nonzero_hlc_only only covers a
nonzero hlc_wall_ms case, so it can miss a decode bug where global_sequence is
no longer validated. Extend this test in EntityStreamCursorV1::decode by adding
a second invalid beginning cursor case with hlc_wall_ms set to zero and
global_sequence set to a nonzero value, and assert it still returns
SubscriptionRuntimeError::CursorInvalid. This should mirror the existing
beginning-cursor rejection check in entity_cursor.rs.
In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-registry.rs`:
- Around line 107-118: The event_category accessor test only covers some
non-EventCategory variants, so add assertions for the remaining variants that
should return None as well. Update
event_category_accessor_returns_real_category_and_none to exercise
OperationStatus and EntityStream in addition to Projection and ReceiptStream,
using the existing ec, proj, rcpt, and any other relevant constructors so
regressions in any variant-specific path are caught.
In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-subscription-runtime.rs`:
- Around line 167-171: The test is accessing a non-public error field directly;
update the assertions in the subscription-runtime mutation test to use the
public SessionError::code() accessor instead of error.code. Locate the
comparisons around the out-of-range ACK and cursor-invalid checks in the test
file, and keep the same expected codes while switching to the accessor exposed
by SessionError in the error module.
In `@bpk-lib/tools/integrity/src/typed_waivers.rs`:
- Around line 437-467: The test fixture in
check_with_today_fails_on_an_expired_committed_waiver is accidentally relying on
a missing ADR and may fail for the wrong reason. Update the temp repo setup so
the ADR referenced by the waiver’s adr field is actually present and resolvable
before calling check_with_today, keeping the assertion focused on the
expired-waiver path rather than ADR resolution. Use the existing
check_with_today and WAIVERS_REL setup to add the needed ADR anchor/file in the
temp repo.
---
Nitpick comments:
In `@bpk-lib/crates/core/tests/mutation_kill_core-store.rs`:
- Around line 393-413: The cold-start test currently passes if any matched
injection point fails, including IndexFooterDecode, so it does not prove the
scan-frame path was exercised. Update the mutation_kill_core-store test around
Store::open and the fault injector setup to make the assertion specific to the
per-frame scan path by isolating or verifying the inject_scan_frame-related
failure instead of accepting any error. Keep the references to
CountdownInjector::new, with_filter, and Store::open, and ensure the test fails
only when the scan-frame path is skipped or wrong.
In `@bpk-lib/crates/netbat/tests/mutation_kill_netbat-transport.rs`:
- Around line 1-5: The file banner in mutation_kill_netbat-transport is planning
prose rather than an enforceable invariant. Remove the header comment entirely,
or replace it with a concrete, testable statement that reflects what the suite
actually asserts (for example, a specific limit, ordering, or error/report
behavior) and keep it tied to the test’s public API checks instead of
scope/coverage language.
In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-core-surfaces.rs`:
- Around line 1-6: Remove the mutation-planning banner from this test module and
keep only the invariant-bearing test content; the
`mutation_kill_syncbat-core-surfaces` comment block with `PROVES/CATCHES/SEEDED`
is PR-only ceremony that should be deleted. Update the top-of-file doc comment
near `StoreOperationStatusSink::record_started` so it no longer repeats
high-level scope or planning prose, and rely on the test name and assertions to
express the behavior.
In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-streams.rs`:
- Around line 1-6: Trim the module-level test banner in the syncbat mutation
test file so it states only the lasting invariant being verified, not PR context
or mutation-planning prose. Update the top comment near the test module entry to
remove the “PROVES/CATCHES” narrative and any references to the review scope,
while keeping a brief invariant description of the ACK range/regress checks and
delivery index advancement covered by the tests.
- Around line 153-271: Collapse the repeated ACK-range, regression, and
delivery-index checks into one shared test harness used by the
receipt/entity/event/projection/operation-status suites, keeping the
route-specific setup separate. Extract the common assertions and session-driving
flow from the existing test functions in mutation_kill_syncbat-streams.rs into a
helper or macro that accepts the route/open setup and reuses the same invariant
checks. Keep the unique route setup points and symbols like open, collect,
send_ack, first_event, first_error_message, and event_indices so each suite can
invoke the shared contract without duplicating logic.
In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-subscription-runtime.rs`:
- Around line 1-18: Remove the PR-planning “missed list” section from the module
docs in the mutation_kill_syncbat-subscription-runtime test file and keep only
the stable, testable invariant about invalid-ACK terminal errors and the mutant
it catches. Update the top-of-file comment near the doc block so it references
the relevant session.rs behavior via ack_invalid_error without enumerating other
suites or “NOT duplicated here” bullets, since that coverage map is expected to
drift.
In `@bpk-lib/tools/integrity/src/assurance.rs`:
- Around line 454-461: The mutation-test rationale currently lives in
rustdoc-style `///` comments in `unleveled_files`, which will be exposed as
production documentation instead of implementation notes. Move this explanation
to plain `//` comments or relocate it near the relevant test that kills the
mutant, keeping the rustdoc focused on the actual invariant and behavior of
`unleveled_files`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dece69f8-5507-4e07-9a61-d6550ef96897
📒 Files selected for processing (41)
bpk-lib/crates/core/src/coordinate/mod.rsbpk-lib/crates/core/src/pipeline/bypass.rsbpk-lib/crates/core/src/store/cold_start/rebuild.rsbpk-lib/crates/core/src/store/cold_start/rebuild/topology.rsbpk-lib/crates/core/src/store/delivery/cursor.rsbpk-lib/crates/core/src/store/fork_report.rsbpk-lib/crates/core/src/store/gate.rsbpk-lib/crates/core/src/store/index/restore/tests.rsbpk-lib/crates/core/src/store/index/tests.rsbpk-lib/crates/core/src/store/mod.rsbpk-lib/crates/core/src/store/platform/alloc.rsbpk-lib/crates/core/src/store/platform/fs.rsbpk-lib/crates/core/src/store/platform/fs_reflink_mutation_tests.rsbpk-lib/crates/core/src/store/platform/spawn.rsbpk-lib/crates/core/src/store/reactor_typed.rsbpk-lib/crates/core/src/store/read_walk.rsbpk-lib/crates/core/src/store/segment/recovery_manifest.rsbpk-lib/crates/core/src/store/segment/sidx/footer.rsbpk-lib/crates/core/src/store/sim/clock.rsbpk-lib/crates/core/src/store/write/writer.rsbpk-lib/crates/core/src/store/write/writer_mutation_tests.rsbpk-lib/crates/core/tests/mutation_kill_core-store.rsbpk-lib/crates/netbat/src/transport/stream_tcp.rsbpk-lib/crates/netbat/src/transport/tcp.rsbpk-lib/crates/netbat/tests/mutation_kill_netbat-transport.rsbpk-lib/crates/syncbat/src/builder.rsbpk-lib/crates/syncbat/src/operation_status.rsbpk-lib/crates/syncbat/src/operation_status_sink.rsbpk-lib/crates/syncbat/src/receipt.rsbpk-lib/crates/syncbat/src/subscription_runtime/cursor.rsbpk-lib/crates/syncbat/src/subscription_runtime/entity_cursor.rsbpk-lib/crates/syncbat/src/subscription_runtime/envelope.rsbpk-lib/crates/syncbat/tests/mutation_kill_syncbat-core-surfaces.rsbpk-lib/crates/syncbat/tests/mutation_kill_syncbat-registry.rsbpk-lib/crates/syncbat/tests/mutation_kill_syncbat-streams.rsbpk-lib/crates/syncbat/tests/mutation_kill_syncbat-subscription-runtime.rsbpk-lib/tools/integrity/src/assurance.rsbpk-lib/tools/integrity/src/meta_gate_tests.rsbpk-lib/tools/integrity/src/public_surface.rsbpk-lib/tools/integrity/src/structural.rsbpk-lib/tools/integrity/src/typed_waivers.rs
| assert_eq!( | ||
| error.code, MALFORMED_STREAM_FRAME, | ||
| "out-of-range ACK must be a malformed-frame fault, not cursor_invalid" | ||
| ); | ||
| assert_ne!(error.code, CURSOR_INVALID); |
There was a problem hiding this comment.
🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win
Use SessionError::code() here; error.code is not the public API.
Line 168 and Line 210 compare error.code, but the public accessor exposed in bpk-lib/crates/syncbat/src/error.rs:162-164 is code(). In an integration test under tests/, direct field access will not compile if the field remains private.
Suggested fix
- error.code, MALFORMED_STREAM_FRAME,
+ error.code(), MALFORMED_STREAM_FRAME,
"out-of-range ACK must be a malformed-frame fault, not cursor_invalid"
);
- assert_ne!(error.code, CURSOR_INVALID);
+ assert_ne!(error.code(), CURSOR_INVALID);
@@
- error.code, MALFORMED_STREAM_FRAME,
+ error.code(), MALFORMED_STREAM_FRAME,
"cursor-mismatch ACK must be a malformed-frame fault, not cursor_invalid"
);
- assert_ne!(error.code, CURSOR_INVALID);
+ assert_ne!(error.code(), CURSOR_INVALID);Also applies to: 209-213
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@bpk-lib/crates/syncbat/tests/mutation_kill_syncbat-subscription-runtime.rs`
around lines 167 - 171, The test is accessing a non-public error field directly;
update the assertions in the subscription-runtime mutation test to use the
public SessionError::code() accessor instead of error.code. Locate the
comparisons around the out-of-range ACK and cursor-invalid checks in the test
file, and keep the same expected codes while switching to the accessor exposed
by SessionError in the error module.
…ests Applied 9 of 10 review comments (CodeRabbit/Greptile/Codex); comment 1 skipped as a mistaken premise — the bot pointed at error.rs's SessionError::code() accessor, but the test uses subscription_runtime::SessionError, which exposes a `code` field and no accessor (.code() does not compile; error.code is already the public API). - fs.rs missing-parent test: #[cfg(unix)] gate + tempdir (drop hardcoded path). - core mutation_kill: assert reactor stop_and_join() instead of discarding join. - builder dup-detection: also assert the boxed handler is retained. - operation_status open-attempt: cover failed_count + denied_count terminals. - cursor/entity_cursor decode-rejection: add nonzero-global_sequence mirror cases. - registry event_category: exercise OperationStatus + EntityStream variants too. - typed_waivers expiry fixture: plant a resolvable ADR so it isolates expiry. - delivery/cursor park test: 200ms->50ms timeout (still kills the no-op mutant). - subscription cursor tests: named wire-offset consts + put_u64_be helper. cargo test (batpak/syncbat/netbat/integrity) + clippy --workspace + structural-check all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
…false equivalence Round 2 labeled rebuild.rs:249 (`tail_count > 0` -> `>= 0`) an equivalent mutant; the seam smoke counted it MISSED and the 3-mutant hash-chain-replay seam failed. The claim was FALSE: when tail_count==0, the +1 to chunk_count is observable for any restore with >=2 entries — RoutingSummary::from_sorted_entries partitions the sorted entries into chunk_count slices (skipping only empties), so 4 entries yield 1 chunk (original) vs 2 (mutant), surfacing as RestorePlan.routing.chunk_count. The prior guard test used a 0-entry fixture, so it asserted 0==0 and passed under the mutation. Rewrote build_snapshot_plan_keeps_chunk_count_when_tail_is_empty to use 4 entries (receipt_extensions_hydrated, tail_count==0, input single-chunk) asserting routing.chunk_count==1 — verified to FAIL (left:2 right:1) under the applied mutation and pass on HEAD; the companion _adds_chunk_when_tail_is_present guards the +1 direction. Replaced the false "EQUIVALENT-MUTANT byte-identical" comment with the truthful observable-difference explanation. No lanes.rs exclusion existed for this seam. cargo test -p batpak + clippy -D warnings + structural-check all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
…ed), 2 genuine
After rebuild.rs:249 proved a false "equivalent" (degenerate 0-entry fixture), re-audited
all 6 remaining round-2 documented-equivalents with the same skepticism. 4 were
false/incomplete and are now killed with tests proven to fail under the mutation; 2 are
genuinely equivalent (comments confirmed airtight). The two false "unreachable" claims
were the same excuse pattern as rebuild.rs.
- operation_status.rs:166 schema_version()->1: GENUINE (nullary const fn == 1; existing
schema_version_reports_one self-guards a future bump). No change.
- assurance.rs: the commented !entry_matches_path term IS equivalent (matches_derived is
unconditionally true for all repo_roots), but the SIBLING !matches_derived mutant was
killable -> killed (unleveled_files_is_empty_over_committed_manifest); comment is now a
precise two-term mutation map.
- structural.rs:25 run()->Ok(()): FALSE ("no injectable root" excuse; the comment even
misstated process-CWD). run() writes target/gauntlet-receipts/structural-source-lints.json
which the Ok(()) mutant skips -> killed by run_executes_gates_and_writes_the_structural_lints_receipt.
- mod.rs:310 Closed::writer_queue_len: FALSE (Store<Closed> is never built, but Closed is a
public ZST and the sealed trait is callable on the marker) -> killed by
closed_state_reports_no_writer_queue (asserts None).
- reactor_typed.rs:598 flush-guard->true: GENUINE (ReactionBatch::flush early-returns the
empty batch with zero side effects). No change.
- recovery_manifest.rs:177 <8 -> ==8: return-equivalent for all inputs, but the <8 guard
short-circuits before consuming the source while the mutant consumes it -> killed by
undersized_frame_rejected_without_consuming_the_source (source cursor position pinned).
Every round-2 equivalence claim is now proven-genuine or killed. cargo test + clippy
-D warnings + structural-check all green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
|
Want your agent to iterate on Greptile's feedback? Try greploops. |
cargo-deny's advisory lane (run in ci-fast) started failing on RUSTSEC-2026-0190 — "Unsoundness in anyhow::Error::downcast_mut()" — a fresh advisory published against anyhow 1.0.102; the deny config rejects `unsound`. Unrelated to the mutation cure (it fails any PR opened today). 1.0.103 is the patched release; `cargo deny check advisories` is back to `advisories ok`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
…o-evidence failure The diff-scoped seam smoke mutates only production lines IN THE PR DIFF. On a tests-mostly PR, the one mutable line a diff incidentally touches can yield only an UNVIABLE (uncompilable) mutant — cargo-mutants' `x > 0 -> x >= 0` on a usize is a tautology that -D warnings rejects (unused_comparisons). The policy then bailed "unviable-only output cannot pass as evidence", a false red (it bit hash-chain-replay + platform-backend on #149) even though an uncompilable mutant is no coverage gap. Treat unviable-only the same as the existing zero-mutant case: a legitimate PASS, but ONLY for a real PR diff (diff_scoped && DiffScope::PrDiff). A manual workflow_dispatch /local run (DiffScope::None) or a non-diff-scoped lane still hard-fails on unviable-only, so a manual mutation proof cannot skip a critical-seam gate. Timeouts and real missed mutants are unaffected. Red fixture: diff_scoped_lane_with_unviable_only_manual_run_still_bails (None -> bails). Green fixture: diff_scoped_lane_with_unviable_only_pr_diff_passes (PrDiff -> ok). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
…ests Applied 9 of 10 review comments (CodeRabbit/Greptile/Codex); comment 1 skipped as a mistaken premise — the bot pointed at error.rs's SessionError::code() accessor, but the test uses subscription_runtime::SessionError, which exposes a `code` field and no accessor (.code() does not compile; error.code is already the public API). - fs.rs missing-parent test: #[cfg(unix)] gate + tempdir (drop hardcoded path). - core mutation_kill: assert reactor stop_and_join() instead of discarding join. - builder dup-detection: also assert the boxed handler is retained. - operation_status open-attempt: cover failed_count + denied_count terminals. - cursor/entity_cursor decode-rejection: add nonzero-global_sequence mirror cases. - registry event_category: exercise OperationStatus + EntityStream variants too. - typed_waivers expiry fixture: plant a resolvable ADR so it isolates expiry. - delivery/cursor park test: 200ms->50ms timeout (still kills the no-op mutant). - subscription cursor tests: named wire-offset consts + put_u64_be helper. cargo test (batpak/syncbat/netbat/integrity) + clippy --workspace + structural-check all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NqMCRfzv6DLtBoiQp61xAe
Splits the mutation-coverage cure out of the 0.9.0 integration PR (#136), as planned.
The diff-scoped mutation-smoke gate (running after the harness + fail-closed
detector fixes landed in #136) surfaced 334 surviving mutants across 13
seams — 0.9.0's cumulative mutation-coverage debt, exposed once the gate could
actually run. This is round 1: targeted mutation-killing tests per seam
(authored multi-agent), each verified to compile and pass locally with
spot-checked kills via manual mutation; full-workspace
clippy -D warningsclean.Seams covered:
builder/receipt/operation_status surfaces (new
tests/mutation_kill_*.rs+inline modules).
expect_errassertions).#[path]sidecar to stay under the850-line cap), platform alloc/fs, delivery/cursor.
A few equivalent/unkillable mutants are documented in-line and fall under the
gate's 15% tolerance. CI's mutation re-run confirms the kills; residual
survivors are cured in round 2+ on follow-up PRs until all 13 seams reach the
85% threshold.
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests
Greptile Summary
This PR adds mutation-kill test coverage across 13 seams to reduce the 334 surviving mutants surfaced after the 0.9.0 mutation-smoke gate became functional. It also contains one non-trivial production change: extracting
classify_accept_errorinstream_tcp.rsto make the previously untestableInterrupted/Fatalaccept-loop arms reachable from unit tests, and adds a policy relaxation inpolicy.rsallowing a diff-scoped lane to pass when all executed mutants are unviable (uncompilable under-D warnings) — gated strictly onDiffScope::PrDiff.core(store, write/writer, delivery/cursor, platform/alloc/fs/spawn, segment/footer/recovery-manifest, coordinate, pipeline),netbat(stream_tcp, tcp, limits, stream_frame),syncbat(builder, operation_status, receipt, subscription_runtime cursors, registry, streams), andintegrity(assurance, structural, public_surface, typed_waivers) — each test pins an observable behavior to kill a specific named surviving mutant.policy.rsrelaxation: unviable-only mutant executions are now allowed to pass on real PR diffs (DiffScope::PrDiff && lane.diff_scoped), correctly matching the existing zero-mutant pass gate; manualworkflow_dispatch/local runs still fail-closed on unviable-only output, with complementary tests pinning both legs of the new branch.Confidence Score: 5/5
Safe to merge. The only production-logic changes are a pure refactor (
classify_accept_errorextraction instream_tcp.rs) and a policy relaxation for unviable-only mutant runs on PR diffs — both narrow, well-tested, and scoped to the mutation-testing harness.The change is overwhelmingly test additions. The single production refactor (
classify_accept_error) is semantically equivalent to the original match guards. The policy relaxation is correctly gated on two independent conditions (diff_scoped && DiffScope::PrDiff) with the fail-closed path (manual runs, non-diff-scoped lanes) preserved and pinned by a complementary test. All three concerns from the previous review thread (hardcoded byte offsets, missing#[cfg(unix)]guard, 200ms wall-clock block) are addressed in this PR.No files require special attention.
Important Files Changed
diff_scoped && DiffScope::PrDiff; manual/non-diff-scoped runs still bail. Both legs pinned by new tests inmod.rs.classify_accept_errorpure function from match guards to makeInterrupted/Fatalarms unit-testable; semantically equivalent to original. Inlinetestsmodule covers all three outcomes plus control-loop retry/disconnect paths.ack_invalid_errorarm insession.rs; verifies both out-of-range index and cursor-mismatch ACK faults map tomalformed_stream_framenotcursor_invalid.<vs<=distinction and a test verifying a magic-bearing TRAILER_SIZE-exact file is parsed and rejected as corrupt (not short-circuited as Ok(None)).undersized_frame_rejected_without_consuming_the_sourcetest that pins cursor position to prove the< 8guard short-circuits I/O; kills the< 8to== 8mutant.#[path]) targetingfail_if_exitedandclose_channel_and_join; uses aFixedJobstub to avoid threading races.put_u64_behelper; tests cover `visibility_epoch,park_for_data, andpull_batchempty-arm behavior.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[assert_mutation_policy called] --> B{executed == 0?} B -- yes --> C{diff_scoped AND PrDiff?} C -- yes --> D[OK: no diff-touched mutants] C -- no --> E[bail: no mutants executed] B -- no --> F{timed_out > 0?} F -- yes --> G[bail: timeout hard-fail] F -- no --> H{score_pct is Some?} H -- yes --> I{score_pct >= min_catch_pct?} I -- yes --> J[OK: threshold met] I -- no --> K[bail: below threshold] H -- no: scored == 0 --> L{diff_scoped AND PrDiff? NEW} L -- yes --> M[OK: unviable-only on PR diff - new in this PR] L -- no --> N[bail: no scoreable mutants - manual run or non-diff-scoped lane]%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% flowchart TD A[assert_mutation_policy called] --> B{executed == 0?} B -- yes --> C{diff_scoped AND PrDiff?} C -- yes --> D[OK: no diff-touched mutants] C -- no --> E[bail: no mutants executed] B -- no --> F{timed_out > 0?} F -- yes --> G[bail: timeout hard-fail] F -- no --> H{score_pct is Some?} H -- yes --> I{score_pct >= min_catch_pct?} I -- yes --> J[OK: threshold met] I -- no --> K[bail: below threshold] H -- no: scored == 0 --> L{diff_scoped AND PrDiff? NEW} L -- yes --> M[OK: unviable-only on PR diff - new in this PR] L -- no --> N[bail: no scoreable mutants - manual run or non-diff-scoped lane]Reviews (7): Last reviewed commit: "fix(mutants-gate): unviable-only in a re..." | Re-trigger Greptile