Part of #336 (usability epic).
Problem
Two sharp edges I hit while using StructurePreprocessor on ~400 PDB chains:
- depth silently needs the external msms binary. With on_failure="nan", one unavailable
feature NaNs the whole entry (not just that column), so I got "0/1000 windows mapped"
with no warning, and only found the cause by testing features one at a time.
- Feature/backend split is opaque. Features are divided across encode_dssp vs encode_pdb, and
mixing them errors cryptically: "contact_count_8A … should be encoded by … use the matching
method". The user doesn't know which method owns which key.
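As a stopgap on the user side, a preflight check along these lines would have surfaced the missing binary before encoding. This is only a sketch: REQUIRED_BINARIES and split_available are hypothetical names, not part of the library, and the only dependency taken from this report is depth needing msms.

```python
import shutil
import warnings

# Hypothetical table: feature key -> external binary it needs.
# Only "depth" -> "msms" comes from this report; other entries are unknown.
REQUIRED_BINARIES = {"depth": "msms"}


def split_available(features, which=shutil.which):
    """Split feature keys into (usable, skipped) based on binaries on PATH.

    `which` is injectable so the check can be exercised without touching PATH.
    """
    usable, skipped = [], []
    for key in features:
        binary = REQUIRED_BINARIES.get(key)
        if binary is not None and which(binary) is None:
            # One line per missing dependency, naming both binary and feature.
            warnings.warn(f"{binary} not found → {key} unavailable", stacklevel=2)
            skipped.append(key)
        else:
            usable.append(key)
    return usable, skipped
```

Passing only the usable keys to the encoder avoids the all-NaN entries, at the cost of maintaining the dependency table by hand.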
Suggestion
- Per-feature failure isolation: drop/NaN only the failing feature column, keep the rest; emit a
one-line warning ("msms not found → depth unavailable").
- A single encode(features=[...]) that routes each feature key to the right backend
(dssp / pdb / pae), so users don't need to know the split.
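To make the two suggestions concrete, here is a rough sketch of the behaviour I have in mind. It is standalone and does not use the real StructurePreprocessor internals: the encode signature, the owner table, the backend callables, and the ss_helix key are all hypothetical, and which backend actually owns depth or contact_count_8A is an assumption made for illustration.

```python
import math
import warnings


def encode(features, backends, owner, n_rows):
    """Route each feature key to its backend and isolate failures per column.

    backends: dict of backend name -> callable(keys) -> {key: list of values}
    owner:    dict of feature key -> backend name (hypothetical ownership table)
    n_rows:   number of rows, used to build a NaN column on failure
    """
    unknown = [k for k in features if k not in owner]
    if unknown:
        # Fail early and list the valid keys instead of a cryptic backend error.
        raise KeyError(f"unknown feature keys {unknown}; known: {sorted(owner)}")
    columns = {}
    for key in features:
        try:
            # One call per key keeps a failure confined to that key's column.
            columns[key] = backends[owner[key]]([key])[key]
        except Exception as exc:
            warnings.warn(f"{key} unavailable: {exc}", stacklevel=2)
            columns[key] = [math.nan] * n_rows
    return columns


# Stand-in backends for demonstration only; they return made-up values.
def fake_pdb(keys):
    if "depth" in keys:
        raise FileNotFoundError("msms not found")
    return {k: [1.0, 2.0] for k in keys}


def fake_dssp(keys):
    return {k: [0.5, 0.5] for k in keys}


BACKENDS = {"pdb": fake_pdb, "dssp": fake_dssp}
# Assumed ownership; the real split between encode_dssp and encode_pdb may differ.
OWNER = {"depth": "pdb", "contact_count_8A": "pdb", "ss_helix": "dssp"}

result = encode(["depth", "contact_count_8A", "ss_helix"], BACKENDS, OWNER, n_rows=2)
```

Here only the depth column comes back as NaN, with a single warning, while the other two columns keep their values. Calling the backend once per key is the simplest way to get isolation; a real implementation would probably batch keys per backend and retry individually only when the batch fails.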