T-306: gate HSJ/XFORM SPR/NNI accept-paths to full_rescore#249
Conversation
The SPR accept path in tbr_search() (src/ts_tbr.cpp) and the accept path in nni_search() (src/ts_search.cpp) updated best_score with a Fitch-only (EW) or IW/profile incremental delta. For HSJ and XFORM scoring, score_tree() additionally adds a topology-dependent hierarchy-DP (HSJ) or Sankoff (XFORM) term that the delta omits, so the search's internal accept/reject decisions tracked the wrong objective (best_score drifted from the authoritative score). Final user-visible scores stayed correct because run_single_replicate always recomputes via score_tree() before pooling, so this was a silent search-quality regression. Fix: gate the incremental fast path on ds.scoring_mode being one of EW/IW/XPIWE/PROFILE; HSJ and XFORM fall back to full_rescore() in both accept paths (same scoring-mode classification as the T-275/T-303 guards). The NNI rejection-restore path likewise rebuilds the postorder for HSJ/XFORM instead of a stale incremental downpass. Adds a Tier-2 regression test (test-ts-t306-accept-guard.R) driving the full HSJ and XFORM driven search on a brute-forceable 7-tip dataset and asserting the search reaches the true global optimum, that every returned tree's independent score equals the reported score (no best_score desync), and that identical seeds are deterministic. Documents the HSJ/XFORM full-rescore fallback in vignettes/search-algorithm.Rmd. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
R CMD check reported
This PR touches only Local targeted tests on the built branch pass: the new Recommend merging the MaxMin/WideSample CI fix (example dependency + WORDLIST) on |
Avoid a direct phangorn::phyDat call in the regression test helper; use TreeTools::MatrixToPhyDat (already loaded) instead. Test still passes (42 assertions): the dataset retains a sharp HSJ optimum that is a strict subset of the Fitch optimum, and the search reaches the true optimum. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… T-308 secondary) (#250) * T-307: Fix HSJ primary absent_state (was "-" index, not "0") The driven HSJ pipeline hard-coded absent_state = 0L, which for the usual levels c("-","0","1") is the index of the inapplicable token "-", not the primary's "absent" state "0" (index 1). Primaries coded "0" were therefore treated as PRESENT, so gain/loss of the structure was never counted; the HSJ score was insensitive to primary presence/absence (verified: alpha=0 single-absent-tip scored 0 instead of 1). - Add hsj_absent_state(dataset): 0-based index of the "0" token in `levels` (falls back to "-"), so the value tracks the dataset's level ordering instead of being hard-coded. build_tip_labels is level-order generic, so a constant was fragile regardless of which constant. - Use it at every driven call site: MaximizeParsimony, Morphy (resample), TreeLength.phylo and TreeLength.list. Reconcile the test helper to compute it the same way (was hard-coded 1L). - C++ score_hierarchy_block: treat the primary as absent when its label is the absent_state OR the inapplicable token, mirroring recode_hierarchy's `pri == "0" || pri == "-"` and fixing nested hierarchies where a primary may itself be inapplicable. - Regression tests: alpha=0 primary sensitivity, driven-vs-direct agreement, hsj_absent_state across level orderings, primary-scoring level invariance. Separate from T-306 (PR #249); discovered during that work. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Add WideSample false positives to WORDLIST agent-check was failing on cpp-search before this branch: the spelling test errored on DropAdd, FarFirst, Fathi, MMDP, MaxMin, Porumbel, Resende, Sayyady, WideSample, medoid, pkg — author surnames, algorithm acronyms and identifiers from recent WideSample work that were never added to the wordlist. Added via spelling::update_wordlist(); restored config/maximin/ warmup that the tool would otherwise have pruned. Unrelated to the HSJ fix but required to get this branch's CI green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Declare 'highs' in Suggests R/WideSample.R conditionally uses highs via requireNamespace() and in \dontrun/guarded examples, but it was never declared, so R CMD check --as-cran (windows job, error-on="warning") failed with "'library' or 'require' call not declared from: 'highs'". Pre-existing on cpp-search from recent WideSample work; required to get this branch's CI green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * T-308: Make HSJ secondary dissimilarity invariant to level ordering fitch_label_char()'s uppass resolved ambiguous internal nodes to the LOWEST SET BIT. Which token that bit denotes depends on the phyDat `levels` ordering, so differently-ordered levels produced different secondary reconstructions, different per-branch mismatch counts d, and therefore different scores (the repro dataset scored 2.5 vs 2.0 at hsj_alpha=1). A parsimony-style score must not depend on the arbitrary internal ordering of levels. Resolve instead toward the best-supported token in the node's own subtree (max tip count, ties broken by smallest supporting tip index). Both keys are properties of the tokens and the tree, not of the bit encoding, giving a strict, deterministic, order-independent resolution that is still a valid MPR — so the dissimilarity stays non-zero (the concern that motivated adding the uppass in T-134). Only the primary term was order-safe before (T-307); this closes the secondary term. Verified invariant across all level permutations on the repro plus a 1200-case randomised battery (multistate secondaries, "?" missing data, random trees, alpha in {0,0.5,1}). Existing HSJ value assertions are unchanged. Regression test extended to hsj_alpha in {0.5, 1} across all orderings, plus a multistate-secondary case. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * T-308: make HSJ absent-state helper private (.HSJAbsentState) `hsj_absent_state` (added with T-307) breached the package naming conventions: R functions are BigCamelCase and private helpers are dot-prefixed (r-conventions). It was also @export-ed with a public man page despite being internal C++-bridge plumbing that returns a 0-based token index. Rename to `.HSJAbsentState`, drop @export and the man page, and document it with a plain comment like the package's other private helpers (.ParseOneBlock, .HierarchicalResampleWeights). Call sites in MaximizeParsimony.R, Morphy.R, tree_length.R and the test bridge updated. No behaviour change; the function is still reachable via ::: for tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Rename hierarchy/HSJ module to package naming conventions The whole hierarchical-scoring module was written in snake_case, breaching the package convention (functions BigCamelCase; private functions dot- prefixed; variables camelCase). Following the .HSJAbsentState fix, sweep the rest of the module: Exported helpers -> BigCamelCase (snake_case names removed): recode_hierarchy -> RecodeHierarchy (public) hierarchy_from_names -> HierarchyFromNames (public) validate_hierarchy -> ValidateHierarchy (exported-internal) hierarchy_chars -> HierarchyChars (exported-internal) hierarchy_controlling -> HierarchyControlling (exported-internal) Pure C++-bridge plumbing -> private, unexported (dot-prefixed), documented with plain comments like the package's other private helpers: build_tip_labels -> .BuildTipLabels hierarchy_to_blocks -> .HierarchyToBlocks non_hierarchy_weights -> .NonHierarchyWeights (.HSJAbsentState already done) Also: nested helpers (.recode_block -> .RecodeBlock, .flatten_block -> .FlattenBlock); the public `char_names` arg -> `charNames`; and local variables in the dedicated module files (CharacterHierarchy.R, recode_hierarchy.R) camelCased. The internal resample-weights list fields were renamed to camelCase too (`non_hierarchy_weights` -> `nonHierarchyWeights`, `block_counts` -> `blockCounts`), updated in producer, consumer and tests. Call sites updated across R/, tests/, the vignette and the Shiny app; man pages and NAMESPACE regenerated via roxygen2. NEWS records the breaking rename. All hierarchy/HSJ/xform/recode/resample/tree_length tests pass; spelling clean. Local variables in the large shared scoring functions (tree_length.R, Morphy.R, MaximizeParsimony.R) keep their existing `tip_data`/`adj_weight` style — those bridge locals are pervasive across non-hierarchy branches too, so renaming them belongs to a separate cleanup. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
T-306 (P3) — HSJ/XFORM SPR/NNI accept-paths omit hierarchy DP contribution from
best_scoreProblem
The SPR accept path in
tbr_search()(src/ts_tbr.cpp) and the accept path innni_search()(src/ts_search.cpp) updatedbest_scorewith a Fitch-only (EW) or IW/profile incremental delta. For HSJ and XFORM scoring,score_tree()additionally adds a topology-dependent hierarchy-DP (HSJ) / Sankoff (XFORM) term that the delta omits, sobest_scoredrifted from the authoritative score and the search's internal accept/reject decisions tracked the wrong objective.Final user-visible scores stayed correct (
run_single_replicatealways recomputes viascore_tree()before pooling), so this was a silent search-quality regression, not a wrong-output bug.Fix
Gate the incremental fast path on
ds.scoring_mode ∈ {EW, IW, XPIWE, PROFILE}; HSJ and XFORM fall back tofull_rescore()in both accept paths — the same scoring-mode classification used by the T-275/T-303 guards. The NNI rejection-restore path likewise rebuilds the postorder for HSJ/XFORM rather than running a stale incremental downpass.hsj_score()/score_tree()callfitch_score_ew()internally, so the fallback leaves the Fitch state arrays in the same state as the EWfull_rescorebranch already exercised for TBR rerooting moves — safe for continued iteration.The standalone
spr_search()alreadyfull_rescores on accept, so it was unaffected; scope matches the task (only the two named paths needed changes).Tests
New Tier-2 test
tests/testthat/test-ts-t306-accept-guard.Rdrives the full HSJ and XFORM driven search on a brute-forceable 7-tip dataset (945 unrooted trees) and asserts:TreeLength);best_scoredesync);The HSJ optimum on the test dataset is a strict subset of the Fitch optimum (3 ⊊ 15 of 945), so reaching it exercises the hierarchy-aware scoring, not just the Fitch component.
Note on discrimination: the bug is silent at the reported-score level — an empirical sweep confirmed that a buggy build still reaches the optimum from every Fitch-optimal "trap" start tree, because the driven pipeline's TBR rerooting moves (always
full_rescore) recover the optimum regardless. A black-boxMaximizeParsimony()test therefore cannot discriminate buggy from fixed (matching the T-303 sibling finding). The test instead locks in that the newfull_rescorefallback yields a correct, consistent, deterministic search.Validation
test-ts-hsj(37),test-ts-xform(68),test-ts-tbr-search(28),test-ts-iw(86),test-ts-spr-nni-opt(45),test-ts-nni-iw-rescore(20) — no regressions.agent-checkdispatched: run 27528341506.search-algorithm.Rmdupdated to document the HSJ/XFORM full-rescore fallback (per the AGENTS.md scoring-change rule).Sibling task T-303 (sector path, same bug family) is handled separately in a different file set.
🤖 Generated with Claude Code