
T-304: regression test for T-300 dirty-set incremental rescore #248

Merged
ms609 merged 5 commits into cpp-search from feature/t304-tbr-dirty-rescore-test on Jun 15, 2026
ms609 (Owner) commented Jun 15, 2026

T-304 (P2) — enduring regression test for the T-300 dirty-set rescore

The EW+NA dirty-set incremental rescores wired into the tbr_search SPR accept
path (src/ts_tbr.cpp:1138-1180) had no enduring guard: the original
DEBUG_RESCORE / DEBUG_NA_RESCORE / DEBUG_NNI_RESCORE cross-checks were
removed (5b210fd, 44a4ebe, 2be8228), and an earlier incremental attempt
regressed with a systematic delta = -3 and had to be reverted (b7303ee).

What this adds (test-only)

tests/testthat/test-ts-tbr-dirty-rescore.R: a Tier-2 test that drives many
accepted SPR moves (small tip counts, weak phylogenetic signal, high maxHits)
and asserts that the search-reported score equals an independent full
recomputation, i.e. result$score == ts_score(result_tree, ds). It covers all
four scoring regimes exercised by the dirty-set code paths:

  • EW (is_spr && !has_na, equal weights)
  • IW (is_spr && !has_na, implied weights via compute_weighted_score)
  • NA (is_spr && has_na, fitch_na_dirty_* + pass-3, includes ew_offset)
  • NA-IW (is_spr && has_na, implied weights)

Modelled on tests/testthat/test-ts-spr-state-restore.R. Verified locally that
the searches accept long chains of SPR moves (e.g. 36 → 29/30), so the
dirty-set path is genuinely exercised; the test fails if the incremental
rescore ever drifts from the authoritative full score.
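
For readers unfamiliar with the invariant being guarded, here is a minimal C++ sketch of the same idea on a toy problem. It is not the code in src/ts_tbr.cpp: the `Tree` struct and the functions `update_node`, `full_score`, `incremental_rescore`, `make_tree` and `count_drift` are all hypothetical names invented for illustration, it scores a single equal-weights Fitch character, and it perturbs a tip state instead of performing a real SPR move. What it does share with the test is the shape of the check: after every incremental update, the running score is compared with a full recomputation on a fresh copy of the tree.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy rooted binary tree (hypothetical layout): nodes 0..3 are tips,
// 4 = (0,1), 5 = (2,3), 6 = root (4,5).
struct Tree {
  std::vector<int> parent, left, right;  // -1 where absent
  std::vector<std::uint32_t> state;      // Fitch state set as a bitmask
  std::vector<int> cost;                 // 1 if the node needed a union
  int root;
};

// Apply the Fitch downpass rule to one internal node.
// Returns true if the node's state set changed.
bool update_node(Tree& t, int node) {
  std::uint32_t a = t.state[t.left[node]];
  std::uint32_t b = t.state[t.right[node]];
  std::uint32_t inter = a & b;
  std::uint32_t next = inter ? inter : (a | b);
  bool changed = next != t.state[node];
  t.state[node] = next;
  t.cost[node] = inter ? 0 : 1;
  return changed;
}

// Authoritative score: recompute every internal node in postorder.
int full_score(Tree& t, int node) {
  if (t.left[node] < 0) return 0;  // tips contribute no steps
  int below = full_score(t, t.left[node]) + full_score(t, t.right[node]);
  update_node(t, node);
  return below + t.cost[node];
}

// Incremental score: only ancestors of the edited tip are "dirty".
// Walk towards the root and stop once a state set is unchanged, because
// nodes above it cannot be affected.
int incremental_rescore(Tree& t, int tip, std::uint32_t new_state,
                        int old_score) {
  assert(t.left[tip] < 0 && "expected a tip");
  t.state[tip] = new_state;
  int score = old_score;
  for (int node = t.parent[tip]; node >= 0; node = t.parent[node]) {
    int before = t.cost[node];
    bool changed = update_node(t, node);
    score += t.cost[node] - before;
    if (!changed) break;
  }
  return score;
}

Tree make_tree() {
  Tree t;
  t.parent = {4, 4, 5, 5, 6, 6, -1};
  t.left = {-1, -1, -1, -1, 0, 2, 4};
  t.right = {-1, -1, -1, -1, 1, 3, 5};
  t.state = {1u, 1u, 2u, 4u, 0u, 0u, 0u};
  t.cost = std::vector<int>(7, 0);
  t.root = 6;
  return t;
}

// Drive a chain of edits and count how often the incremental score
// disagrees with a full recomputation on a fresh copy of the tree.
int count_drift() {
  Tree t = make_tree();
  int score = full_score(t, t.root);
  const int moves[][2] = {{1, 2}, {0, 2}, {3, 2}, {2, 1},
                          {0, 4}, {1, 4}, {2, 4}, {3, 1}};
  int drift = 0;
  for (const auto& m : moves) {
    score = incremental_rescore(t, m[0],
                                static_cast<std::uint32_t>(m[1]), score);
    Tree fresh = t;  // independent copy, rescored from scratch
    if (full_score(fresh, fresh.root) != score) ++drift;
  }
  return drift;
}
```

Calling `count_drift()` from a test driver and requiring it to return 0 mirrors the R assertion above. The comparison runs after every edit, not only at the end of the chain, so a systematic offset would be caught at the first step where it appears.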

No production code change.

Found by /red-team area 8 (2026-05-26).

GHA agent-check: https://github.com/ms609/TreeSearch/actions/runs/27526933840

🤖 Generated with Claude Code

Add an enduring Tier-2 guard for the EW/IW/NA/NA-IW dirty-set incremental
rescore wired into the tbr_search SPR accept path (src/ts_tbr.cpp ~1138-1180).
The DEBUG_RESCORE / DEBUG_NA_RESCORE / DEBUG_NNI_RESCORE cross-checks were
removed (5b210fd, 44a4ebe, 2be8228), leaving no guard; an earlier
incremental attempt regressed with a systematic delta=-3 (reverted b7303ee).

The test drives many accepted SPR moves (small tips, weak signal, high
maxHits) across all four scoring regimes and asserts the search-reported
score equals an independent full recomputation (result$score ==
ts_score(result_tree, ds)). Fails if the dirty-set rescore ever drifts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ms609 added a commit that referenced this pull request Jun 15, 2026
@ms609 ms609 merged commit bb42134 into cpp-search Jun 15, 2026
1 of 7 checks passed
@ms609 ms609 deleted the feature/t304-tbr-dirty-rescore-test branch June 15, 2026 09:50