Part of #305.
Problem
Two prediction-result figures from the original project are reused across the
paper but have no library API, so each must be hand-built every time:
- Prediction-score distribution — per-class histogram of a model's
prediction score (0–100%) showing how substrate / non-substrate / unknown
separate (plot_pred1 / _plot_pred1, scripts/plot_cpp_pred.py): a
sns.histplot(hue=..., binrange=(0,100), multiple="stack") plus ~30 lines of
int-ytick / xlim / despine / custom-legend fiddling.
- Ranked candidates with agreement — proteins ranked by predicted score as
horizontal bars colored by class, with per-protein std error bars and gene
yticks (plot_pred3_top_hits, reused 3×): sns.barplot(orient="h") +
plt.errorbar(fmt='none', capsize=...).
The library already has plot_rank, but it has no error-bar / class-color
variant for the "named top-N candidates" figure.
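For reference, the ranked-candidates recipe described above can be approximated in a few lines. This is only a sketch under assumptions: plot_top_hits and the gene / score / std / class column names are hypothetical, and it uses matplotlib's barh with xerr in place of the sns.barplot + plt.errorbar pair from the original script.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch


def plot_top_hits(df_pred, score="score", std="std", group="class",
                  label="gene", top_n=10, colors=None, ax=None):
    """Hypothetical helper: top-N items as horizontal bars with std error bars."""
    # Rank by predicted score and keep the best top_n rows.
    top = df_pred.sort_values(score, ascending=False).head(top_n)
    classes = list(pd.unique(df_pred[group]))
    if colors is None:
        # Assumption: fall back to the default matplotlib color cycle per class.
        cycle = plt.rcParams["axes.prop_cycle"].by_key()["color"]
        colors = {c: cycle[i % len(cycle)] for i, c in enumerate(classes)}
    if ax is None:
        _, ax = plt.subplots(figsize=(5, 0.4 * len(top) + 1))
    y = list(range(len(top)))
    ax.barh(y, top[score].to_numpy(), xerr=top[std].to_numpy(), capsize=3,
            color=[colors[c] for c in top[group]])
    ax.set_yticks(y)
    ax.set_yticklabels(list(top[label]))
    ax.invert_yaxis()  # highest-ranked item at the top
    ax.set_xlim(0, 100)  # scores are assumed to be percentages
    ax.set_xlabel("Prediction score [%]")
    shown = set(top[group])
    ax.legend(handles=[Patch(color=colors[c], label=c)
                       for c in classes if c in shown], frameon=False)
    return ax
```

An error-bar / class-color variant of plot_rank would presumably take over this bookkeeping (ranking, color lookup, tick labels) so that callers only pass the data frame and column names.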
Goal
Add a prediction-score histogram and extend ranking so the two standard
prediction figures are one call each, driven by a model's per-sample scores.
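As a rough illustration of the goal, the hand-built histogram recipe could collapse into a single helper. The sketch below is a stand-in, not the library's API: plot_score_hist, its parameters, and the score / class column names are hypothetical, and it uses matplotlib's ax.hist(stacked=True) in place of the sns.histplot call from the original script.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import MaxNLocator


def plot_score_hist(df_pred, score="score", group="class", bins=20, ax=None):
    """Hypothetical helper: stacked per-class histogram of scores in 0-100%."""
    if ax is None:
        _, ax = plt.subplots(figsize=(5, 3))
    classes = list(pd.unique(df_pred[group]))
    # One array of scores per class, stacked on a shared 0-100 binning.
    data = [df_pred.loc[df_pred[group] == c, score].to_numpy() for c in classes]
    ax.hist(data, bins=bins, range=(0, 100), stacked=True, label=classes)
    ax.set_xlim(0, 100)
    ax.yaxis.set_major_locator(MaxNLocator(integer=True))  # integer y ticks
    ax.spines["top"].set_visible(False)    # manual equivalent of despine
    ax.spines["right"].set_visible(False)
    ax.set_xlabel("Prediction score [%]")
    ax.set_ylabel("Count")
    ax.legend(title=group, frameon=False)
    return ax
```

Returning the Axes keeps the helper composable with further styling, which is one plausible way to absorb the tick / limit / legend fiddling into the library call.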
Requirements
- Add aa.plot_prediction_hist(df_pred, score=..., group=..., ...)
(class-separated score distribution); consider a TreeModelPlot /
ShapModelPlot home.
- Extend plot_rank (or add a sibling) with per-item error bars + class
colors + optional cutoff line, for the ranked-candidates figure.
- plotting.md compliance. Examples include.
KPIs / Acceptance criteria
- df_pred with score / std / class columns (visual + shape check).
- plot_rank default output unchanged (additive args only; regression).
Scope / non-goals
- aap.predict. SHAP-driven scores are item-8.
Dependencies
Standards checklist
- (__init__.py / __all__ if new symbols) · plotting.md compliance ·
numpydoc · tests · no-print
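The shape check named in the acceptance criteria could start from a small input validator like the one below. It is a sketch under assumptions: check_df_pred is a hypothetical helper, and the literal column names score, std and class are placeholders for whatever the final API accepts.

```python
import pandas as pd

# Assumed column names; the real API may let callers pass their own.
REQUIRED_COLUMNS = ("score", "std", "class")


def check_df_pred(df_pred, required=REQUIRED_COLUMNS, score="score"):
    """Hypothetical validator: required columns present, scores within 0-100%."""
    missing = [c for c in required if c not in df_pred.columns]
    if missing:
        raise ValueError(f"df_pred is missing columns: {missing}")
    # between() is inclusive on both ends; NaN scores fail the check.
    if not df_pred[score].between(0, 100).all():
        raise ValueError("scores must lie in the 0-100% range")
    return df_pred
```

A test could call this on a small fixture before plotting, then assert on the number of bars or patches in the returned Axes.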