feat(pro): shap_to_feat_imp + CPPPlot sample= (SHAP per-sample plumbing) by breimanntools · Pull Request #323 · breimanntools/aaanalysis

breimanntools · 2026-07-01T03:03:06Z

Part of #305 / #313, refreshed onto current master.

What this adds

shap_to_feat_imp (pro) — convert a per-sample SHAP vector into normalized signed feature impact / absolute importance, reusing ShapModel.add_feat_impact's backend so the two never diverge.
CPPPlot.ranking / profile / feature_map sample= — resolve a sample by name (feature-impact column + TMD-JMD parts) so per-sample SHAP plots need no manual col_imp=f"feat_impact_{name}" string plumbing.
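The normalization described for shap_to_feat_imp can be sketched in plain Python. This is a minimal illustration only, not the library's implementation: the function name `normalize_shap_vector` and its `signed` flag are hypothetical, and the only behavior taken from this PR is that both outputs are scaled so the absolute values sum to 100 and that an all-zero vector raises instead of returning nan.

```python
from typing import List, Sequence


def normalize_shap_vector(shap_values: Sequence[float], signed: bool = True) -> List[float]:
    """Hypothetical helper: scale a per-sample SHAP vector so that sum(|x|) == 100.

    signed=True keeps the sign of each value (feature impact);
    signed=False returns absolute values (feature importance).
    """
    total = sum(abs(v) for v in shap_values)
    if total == 0:
        # An all-zero vector cannot be normalized; fail loudly rather than emit nan.
        raise ValueError("SHAP vector is all zeros; cannot normalize.")
    if signed:
        return [v / total * 100 for v in shap_values]
    return [abs(v) / total * 100 for v in shap_values]


shap_vector = [0.5, -1.5, 2.0]
impact = normalize_shap_vector(shap_vector, signed=True)
importance = normalize_shap_vector(shap_vector, signed=False)
print(impact)      # signed feature impact
print(importance)  # absolute feature importance
```

Both results come from the same divisor, which is the kind of consistency the shared backend is meant to guarantee.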

Consolidation note

The explanation-similarity clustermap moved to the core AAPredPlot.clustermap (drawing from provided importance vectors), so ShapModelPlot (which only held the clustermap + its dendrogram cut) was removed here along with its backend and tests. This PR now carries only the SHAP per-sample helpers above.

Part of #313 (no closing keyword).

🤖 Generated with Claude Code

…CPPPlot sample= shortcut

Ports the explanation-similarity clustermap from the original gamma-secretase project into a library-grade pro API, adds the shap_to_feat_imp normalization helper, and lets CPPPlot.ranking/profile/feature_map resolve a sample by name.

- ShapModelPlot.clustermap: correlation-of-SHAP-vectors clustermap with row/col class-color sidebars, a class legend, a labelled horizontal colorbar, and font via plot_gco; returns the seaborn ClusterGrid.
- ShapModelPlot.get_clusters: deterministic dendrogram cut (n_clusters / color_threshold), replacing the original dendrogram-color parsing.
- shap_to_feat_imp: signed impact (reusing the ShapModel backend) / absolute importance, both normalized to sum(|.|)=100.
- CPPPlot sample=: resolves col_imp=feat_impact_<entry> (+ TMD-JMD parts from df_parts for profile/feature_map) and sets shap_plot=True; default output unchanged when sample is None.

ShapModelPlot / shap_to_feat_imp stay unwired at the top level (TODO #305, CONFIRM-FIRST). pro-gated; tests skip cleanly when shap is absent. Part of #305 / prototype for #313. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
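As a rough illustration of the sample= resolution described in this commit, the sketch below maps a sample given by entry name or positional index to its feat_impact_<entry> column name. The function `resolve_impact_column` and the plain list of entries are assumptions made for the example; the actual CPPPlot code works on df_parts and also resolves the TMD-JMD parts, which is not shown here.

```python
from typing import List, Optional, Union


def resolve_impact_column(sample: Optional[Union[str, int]], entries: List[str]) -> Optional[str]:
    """Hypothetical resolver: return the impact column for a sample, or None."""
    if sample is None:
        # No sample requested: the caller keeps its default behavior.
        return None
    if isinstance(sample, int):
        # Positional lookup, analogous to indexing df_parts.index.
        entry = entries[sample]
    else:
        if sample not in entries:
            raise ValueError(f"Unknown sample {sample!r}; expected one of {entries}")
        entry = sample
    return f"feat_impact_{entry}"


entries = ["P1", "P2", "P3"]  # illustrative entry names
print(resolve_impact_column("P2", entries))
print(resolve_impact_column(0, entries))
print(resolve_impact_column(None, entries))
```

Returning None for a missing sample keeps the default plotting path untouched, which matches the stated goal that output is unchanged when sample is None.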

…e-consistency test

- comp_shap_correlation: raise a clear ValueError naming samples with a constant (zero-variance) SHAP vector instead of an opaque scipy non-finite-distance error (covers clustermap + get_clusters).
- shap_to_feat_imp: raise on an all-zero vector instead of silently returning nan.
- Add a regression test proving get_clusters uses the same linkage the clustermap dendrogram draws, plus negative tests for both new guards.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
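The zero-variance guard can be pictured as follows. A constant vector has no defined Pearson correlation with any other vector, so it yields non-finite distances further down the pipeline. This sketch uses hypothetical names (`find_constant_samples`, `check_no_constant_samples`) and an exact-equality test for constancy; the real comp_shap_correlation may detect this differently.

```python
from typing import Dict, List, Sequence


def find_constant_samples(shap_by_sample: Dict[str, Sequence[float]]) -> List[str]:
    """Return names of samples whose SHAP vector has zero variance (all values equal)."""
    return [name for name, vec in shap_by_sample.items() if len(set(vec)) <= 1]


def check_no_constant_samples(shap_by_sample: Dict[str, Sequence[float]]) -> None:
    """Raise a ValueError that names the offending samples before any correlation is computed."""
    constant = find_constant_samples(shap_by_sample)
    if constant:
        raise ValueError(
            f"Constant (zero-variance) SHAP vectors cannot be correlated: {constant}"
        )


shap_by_sample = {
    "P1": [0.2, -0.1, 0.4],
    "P2": [0.0, 0.0, 0.0],   # constant: correlation is undefined
    "P3": [0.3, 0.3, 0.3],   # constant as well
}
print(find_constant_samples(shap_by_sample))
```

Naming the samples in the error message lets the user drop or inspect them directly, rather than tracing a distance-matrix error back to its input.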

Under default Matplotlib rcParams the class legend overflowed the figure's right edge (~35px) and was clipped on a plain savefig without bbox_inches='tight'. Reserve a right margin via grid.gs.update(right=0.80) so the legend fits inside the canvas; verified under both default rcParams and plot_settings(). Add a regression test asserting the legend stays within the figure bounds. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…Counter dedup

- Remove unused n_features unpacking (3 spots; only n_samples is used).
- df_parts.index[int(sample)] instead of list(df_parts.index)[int(sample)].
- Duplicate-name detection via collections.Counter (single pass) instead of an O(n^2) list(names).count(...) scan.

Output byte-identical; all tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
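The Counter-based duplicate detection mentioned in this commit can be compared with the quadratic scan it replaces. The function names below are illustrative and not taken from the codebase.

```python
from collections import Counter
from typing import List, Sequence


def find_duplicates_quadratic(names: Sequence[str]) -> List[str]:
    """O(n^2): calls list.count once per element."""
    seen: List[str] = []
    for name in names:
        if list(names).count(name) > 1 and name not in seen:
            seen.append(name)
    return seen


def find_duplicates_counter(names: Sequence[str]) -> List[str]:
    """Single pass: Counter tallies every name once, in first-seen order."""
    counts = Counter(names)
    return [name for name, n in counts.items() if n > 1]


names = ["a", "b", "a", "c", "b"]
print(find_duplicates_counter(names))
```

Both functions return the duplicated names in order of first appearance, so swapping one for the other leaves the output unchanged.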

The auto-discovering pro-contract meta-test requires every public *_pro symbol's one-line summary to carry the [pro] / aaanalysis[pro] install marker; shap_to_feat_imp lacked it and was failing test_pro_marker_in_summary. Add the marker. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codecov · 2026-07-01T04:10:59Z

Codecov Report

❌ Patch coverage is 95.74468% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.94%. Comparing base (1a152de) to head (db3ee55).
⚠️ Report is 29 commits behind head on master.

Files with missing lines	Patch %	Lines
aaanalysis/feature_engineering/_cpp_plot.py	93.75%	0 Missing and 2 partials ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #323   +/-   ##
=======================================
  Coverage   94.93%   94.94%           
=======================================
  Files         185      186    +1     
  Lines       17883    17915   +32     
  Branches     3038     3040    +2     
=======================================
+ Hits        16978    17010   +32     
+ Misses        598      597    -1     
- Partials      307      308    +1

Files with missing lines	Coverage Δ
aaanalysis/explainable_ai_pro/__init__.py	`100.00% <100.00%> (ø)`
aaanalysis/explainable_ai_pro/_shap_model_plot.py	`100.00% <100.00%> (ø)`
aaanalysis/feature_engineering/_cpp_plot.py	`97.96% <93.75%> (-0.38%)`	⬇️

... and 2 files with indirect coverage changes

Components	Coverage Δ
cpp_core	`94.95% <ø> (ø)`

breimanntools

The clustermap is not only for SHAP values but also for features or other numerical representations. Perhaps we need a plot_clustermap utility instead of assigning it to ShapModel. Should we make a general plotting class called AAPlot (aap), to which the prediction plots could be assigned as well (or AAPredPlot)? I do not know right now.

…d by AAPredPlot.clustermap The explanation-similarity clustermap now lives on the core AAPredPlot (drawing from provided importance vectors). Remove the ShapModelPlot class (clustermap + get_clusters), its clustermap backend (sm_plot.py), and its tests. This #323 branch now carries only the stand-alone shap_to_feat_imp helper (pro) + the CPPPlot sample= plumbing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

breimanntools

I want to check this in the example notebook! Make sure that this capability and shortcut is shown there clearly: first the way we do it so far, then an introduction that a shortcut for this exists, as follows...

…ooks (review) Per PR review: show the sample= shortcut in the example notebooks after the existing manual per-sample SHAP path. Each of feature_map / ranking / profile now presents the explicit way first (name the impact column, resolve TMD-JMD parts via get_seq_kws, pass col_imp + parts + shap_plot=True) and then the equivalent one-argument shortcut (sample=<entry or index>). feature_map/profile key the impact column on the entry name (as add_feat_impact writes it) and pass df_seq + df_parts; ranking needs no parts. Both calls render side by side so the equivalence is visible. Notebooks re-executed with outputs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

breimanntools and others added 5 commits July 1, 2026 05:02

breimanntools commented Jul 1, 2026

breimanntools and others added 3 commits July 1, 2026 18:52

Merge remote-tracking branch 'origin/master' into feat/shap-clustermap

550471a

Merge remote-tracking branch 'origin/master' into feat/shap-clustermap

473a39f

breimanntools changed the title ~~feat: ShapModelPlot.clustermap + shap_to_feat_imp + CPPPlot sample= (prototype for #313)~~ feat(pro): shap_to_feat_imp + CPPPlot sample= (SHAP per-sample plumbing) Jul 2, 2026

breimanntools mentioned this pull request Jul 2, 2026

feat(prediction): AAPred + AAPredPlot — evaluate & deploy prediction models #332

Merged

breimanntools marked this pull request as ready for review July 2, 2026 12:48

breimanntools commented Jul 3, 2026

breimanntools merged commit 0bc2266 into master Jul 3, 2026
13 checks passed

breimanntools deleted the feat/shap-clustermap branch July 3, 2026 18:19

breimanntools mentioned this pull request Jul 4, 2026

feat: ShapModelPlot.clustermap (explanation-similarity) + shap_to_feat_imp + sample= SHAP plots #313

Closed

9 tasks

breimanntools merged 9 commits into master from feat/shap-clustermap
