T-304: regression test for T-300 dirty-set incremental rescore #248
Merged
Add an enduring Tier-2 guard for the EW/IW/NA/NA-IW dirty-set incremental rescore wired into the tbr_search SPR accept path (src/ts_tbr.cpp ~1138-1180). The DEBUG_RESCORE / DEBUG_NA_RESCORE / DEBUG_NNI_RESCORE cross-checks were removed (5b210fd, 44a4ebe, 2be8228), leaving no guard; an earlier incremental attempt regressed with a systematic delta=-3 (reverted b7303ee). The test drives many accepted SPR moves (small tips, weak signal, high maxHits) across all four scoring regimes and asserts the search-reported score equals an independent full recomputation (result$score == ts_score(result_tree, ds)). Fails if the dirty-set rescore ever drifts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ms609 added a commit that referenced this pull request Jun 15, 2026
T-304 (P2) — enduring regression test for the T-300 dirty-set rescore
The EW+NA dirty-set incremental rescores wired into the `tbr_search` SPR accept path (`src/ts_tbr.cpp:1138-1180`) had no enduring guard: the original `DEBUG_RESCORE` / `DEBUG_NA_RESCORE` / `DEBUG_NNI_RESCORE` cross-checks were removed (5b210fd, 44a4ebe, 2be8228), and an earlier incremental attempt regressed with a systematic `delta = -3` and had to be reverted (b7303ee).

What this adds (test-only)

`tests/testthat/test-ts-tbr-dirty-rescore.R`: a Tier-2 test that drives many accepted SPR moves (small tip counts, weak phylogenetic signal, high `maxHits`) and asserts the search-reported score equals an independent full recomputation: `result$score == ts_score(result_tree, ds)`. Covers all four scoring regimes exercised by the dirty-set code paths:

- EW (`is_spr && !has_na`, equal weights)
- IW (`is_spr && !has_na`, implied weights via `compute_weighted_score`)
- NA (`is_spr && has_na`, `fitch_na_dirty_*` + pass-3, includes `ew_offset`)
- NA-IW (`is_spr && has_na`, implied weights)

Modelled on `tests/testthat/test-ts-spr-state-restore.R`. Verified locally that the searches accept long chains of SPR moves (e.g. 36 → 29/30), so the dirty-set path is genuinely exercised; the test fails if the incremental rescore ever drifts from the authoritative full score.
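
For readers unfamiliar with the pattern, the invariant this test guards can be illustrated with a toy model. The sketch below is not TreeSearch code: `ToyState`, `local_cost`, `full_score`, `apply_move` and `count_drift` are hypothetical names invented for illustration, and the "score" is just a sum of neighbour differences rather than a parsimony score. It only shows the general shape of the check, namely that a cached score updated over a dirty set is compared against a full recomputation after every accepted move.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <random>
#include <utility>
#include <vector>

// Toy stand-in for a tree score (hypothetical, not the TreeSearch model):
// the total is a sum of local costs, and the cost at position i depends on
// the values at i and i + 1 (wrapping around).
struct ToyState {
    std::vector<int> value;  // per-position state
    std::vector<int> cost;   // cached local costs
    long score = 0;          // cached total, maintained incrementally
};

int local_cost(const std::vector<int>& v, std::size_t i) {
    return std::abs(v[i] - v[(i + 1) % v.size()]);
}

// Authoritative score: recompute every local cost from scratch.
long full_score(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += local_cost(v, i);
    return total;
}

ToyState make_state(std::vector<int> v) {
    ToyState s;
    s.value = std::move(v);
    assert(!s.value.empty());  // local_cost takes indices modulo the size
    s.cost.resize(s.value.size());
    for (std::size_t i = 0; i < s.value.size(); ++i) {
        s.cost[i] = local_cost(s.value, i);
        s.score += s.cost[i];
    }
    return s;
}

// Incremental rescore: changing value[pos] dirties only the local costs at
// pos - 1 and pos, so only those two entries are recomputed.
void apply_move(ToyState& s, std::size_t pos, int new_value) {
    const std::size_t n = s.value.size();
    s.value[pos] = new_value;
    const std::size_t dirty[2] = {(pos + n - 1) % n, pos};
    for (std::size_t i : dirty) {
        const int updated = local_cost(s.value, i);
        s.score += updated - s.cost[i];
        s.cost[i] = updated;
    }
}

// Drive many moves and count how often the cached score disagrees with the
// full recomputation. A correct dirty set yields zero mismatches.
int count_drift(unsigned seed, std::size_t n_positions, int n_moves) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> val(0, 9);
    std::uniform_int_distribution<std::size_t> pos(0, n_positions - 1);
    std::vector<int> init(n_positions);
    for (int& x : init) x = val(rng);
    ToyState s = make_state(std::move(init));
    int mismatches = 0;
    for (int m = 0; m < n_moves; ++m) {
        apply_move(s, pos(rng), val(rng));
        if (s.score != full_score(s.value)) ++mismatches;
    }
    return mismatches;
}
```

In this toy, `count_drift(seed, 8, 1000)` returns 0 because the dirty set is complete. Dropping the `pos - 1` entry from `dirty` would leave a stale cached cost and would typically produce mismatches within a few moves, which is the kind of silent drift the new test is meant to catch.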
No production code change.
Found by /red-team area 8 (2026-05-26).
GHA agent-check: https://github.com/ms609/TreeSearch/actions/runs/27526933840
🤖 Generated with Claude Code