Skip to content

Fix --scfl gap detection for models with rate heterogeneity#142

Open
roblanf wants to merge 1 commit intomasterfrom
roblanf
Open

Fix --scfl gap detection for models with rate heterogeneity#142
roblanf wants to merge 1 commit intomasterfrom
roblanf

Conversation

@roblanf
Copy link
Member

@roblanf roblanf commented Mar 18, 2026

Summary

  • Fixes --scfl giving incorrect results with any model using multiple rate categories (+G, +R, +I+G, +I+R, mixture models)
  • The gap detection filter in computeSubtreeAncestralState() used a fixed threshold (sum > 1.0) that broke for ncat_mix > 1 and for deep subtrees with scaled partial likelihoods
  • The fix uses scale_num to distinguish real data from gaps at internal nodes, handling both normal and SAFE_NUMERIC modes

Test plan

  • Empirical validation on reporter's dataset (10 models × 4 methods)
  • Simulation study (8-taxon tree, 1,100 runs across 4 models, 5 methods, 11 concordance levels, 5 gap levels)
  • sCFL = sCFL --safe (max |diff| = 0.01)
  • Single-category models unchanged (old == new)
  • Multi-category models fixed (sN improved from ~1 to ~590 informative sites)

Validation scripts and full results: https://github.com/roblanf/iqtree3/tree/issue-137/tests/scfl_simulation

Full report of empirical and simulation test results: #137 (comment)

Fixes #137

🤖 Generated with Claude Code

The gap detection filter in computeSubtreeAncestralState() used a
fixed threshold (sum > 1.0) that broke for models with multiple rate
categories (ncat_mix > 1) and for deep subtrees with scaled partial
likelihoods. This caused --scfl to incorrectly filter nearly all
sites as gaps, producing wrong sCF values for any model using +G, +R,
+I+G, +I+R, or mixture models.

The fix uses scale_num to distinguish real data from gaps at internal
nodes: scaled sites (scale_num > 0) are always real data, while
all-gap subtrees are detected by sum > ncat_mix with no scaling.
Leaf nodes retain the original threshold. Both normal and
SAFE_NUMERIC scale_num layouts are handled.

Fixes #137

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roblanf
Copy link
Member Author

roblanf commented Mar 19, 2026

@bqminh can you take a look at this - would be great to get it into the latest binaries ASAP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

iqtree3 gives weird (incorrect?) sCF values compared to iqtree2

1 participant