feat: clinical infrastructure for blood biomarker clocks #203

Open
marcbal77 wants to merge 1 commit into bio-learn:master from marcbal77:feature/clinical-infrastructure

@marcbal77 (Member)

Summary

  • Add clinical data layer to GeoData (5th layer alongside dnam, rna, protein_alamar, protein_olink)
  • Add GeoData.from_clinical_matrix(df, source_units=, units=) factory for loading clinical blood test data
  • Add biomarker registry (biolearn.clinical.registry) with canonical names, units, valid ranges, and unit conversions for 16 biomarkers
  • Add required_features() method to all 12 model classes, returning {"layer": str, "features": list, "metadata": list}
  • Add load_nhanes_as_geodata() bridge function in biolearn.load
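
For reviewers who want a concrete picture of what a registry entry with canonical units, valid ranges, and unit conversions might look like, here is a minimal standalone sketch. It does not import biolearn: `BiomarkerSpec`, `SKETCH_REGISTRY`, `convert`, and `in_valid_range` are hypothetical names, the valid ranges are placeholders, and the actual structure of `BIOMARKER_REGISTRY` may differ.

```python
from dataclasses import dataclass, field


@dataclass
class BiomarkerSpec:
    """Hypothetical registry entry; field names are illustrative only."""
    canonical_unit: str
    valid_range: tuple  # (low, high) in canonical units, placeholder values
    to_canonical: dict = field(default_factory=dict)  # source unit -> multiplier


# Two illustrative entries; the registry in this PR covers 16 biomarkers.
SKETCH_REGISTRY = {
    # Glucose molar mass is about 180.16 g/mol, so mg/dL -> mmol/L divides by 18.016.
    "glucose": BiomarkerSpec("mmol/L", (1.0, 40.0), {"mg/dL": 1 / 18.016}),
    # 1 dL is 0.1 L, so g/dL -> g/L multiplies by 10.
    "albumin": BiomarkerSpec("g/L", (10.0, 70.0), {"g/dL": 10.0}),
}


def convert(name, value, source_unit):
    """Convert a value to the biomarker's canonical unit."""
    spec = SKETCH_REGISTRY[name]
    if source_unit == spec.canonical_unit:
        return value
    if source_unit not in spec.to_canonical:
        raise ValueError(f"no conversion from {source_unit!r} for {name!r}")
    return value * spec.to_canonical[source_unit]


def in_valid_range(name, value):
    """Check a canonical-unit value against the entry's valid range."""
    low, high = SKETCH_REGISTRY[name].valid_range
    return low <= value <= high


albumin_g_per_l = convert("albumin", 4.5, "g/dL")
print(albumin_g_per_l, in_valid_range("albumin", albumin_g_per_l))
```

Keeping the conversion factors next to the canonical unit means a loader only needs the caller's source unit to normalize a column, which is roughly the role the `source_units=` argument appears to play.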

This is PR 1 of the clinical clocks initiative. It ships infrastructure only, with no clock implementations, and all 69 existing clocks continue to work unchanged. It lays the foundation for PhenoAge Clinical, KDM, Bortz Blood Age, and other clinical aging clocks in subsequent PRs.

Partial fix for #194.

1.0 API surfaces for review

These interfaces will become stable at 1.0.0. Please review carefully:

  • GeoData.from_clinical_matrix(df, source_units=, units=) signature
  • model.required_features() return format {"layer": str, "features": list, "metadata": list}
  • GeoData.clinical attribute (features-as-rows, samples-as-columns)
  • BIOMARKER_REGISTRY structure
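
To make the `required_features()` contract easier to review, the following standalone sketch shows one way the return shape could be checked. `SketchClinicalClock` and `validate_required_features` are hypothetical and not part of biolearn, and the `"clinical"` layer string and the feature names are assumptions for illustration.

```python
REQUIRED_KEYS = {"layer", "features", "metadata"}


def validate_required_features(spec):
    """Return a list of problems; an empty list means the shape matches."""
    if not isinstance(spec, dict):
        return ["result is not a dict"]
    if set(spec) != REQUIRED_KEYS:
        return [f"unexpected keys: {sorted(spec)}"]
    problems = []
    if not isinstance(spec["layer"], str):
        problems.append("'layer' is not a str")
    for key in ("features", "metadata"):
        if not isinstance(spec[key], list):
            problems.append(f"{key!r} is not a list")
    return problems


class SketchClinicalClock:
    """Hypothetical stand-in for a model class; not a biolearn class."""

    def required_features(self):
        # Layer name and feature names are assumed values for illustration.
        return {
            "layer": "clinical",
            "features": ["albumin", "glucose"],
            "metadata": ["age"],
        }


print(validate_required_features(SketchClinicalClock().required_features()))
```

A check of this kind is strict about the key set on purpose: if the format is frozen at 1.0.0, any extra or missing key is worth surfacing during review.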

Test plan

  • 11 tests for GeoData clinical layer (init, copy, save/load roundtrip, from_clinical_matrix, unit conversion, validation)
  • 16 tests for biomarker registry and unit conversions
  • Parametrized required_features() interface test across all 69 models
  • Consistency test: required_features() matches methylation_sites() for dnam models
  • Full existing test suite passes (323 passed, 10 skipped)
  • make format clean
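
The consistency check between `required_features()` and `methylation_sites()` can be pictured with a small standalone sketch. All classes here are hypothetical stand-ins rather than biolearn models, the CpG identifiers are placeholders, and the tests in this PR may be structured differently (for example as parametrized pytest cases).

```python
class SketchDnamClock:
    """Hypothetical dnam model whose features come from methylation_sites()."""

    def __init__(self, sites):
        self._sites = list(sites)

    def methylation_sites(self):
        return list(self._sites)

    def required_features(self):
        return {"layer": "dnam", "features": self.methylation_sites(), "metadata": []}


class BrokenDnamClock(SketchDnamClock):
    """Deliberately inconsistent model, used to show what a failure looks like."""

    def required_features(self):
        spec = super().required_features()
        spec["features"] = spec["features"][:-1]  # drops the last site
        return spec


class SketchClinicalClock:
    """Hypothetical non-dnam model; skipped by the consistency check."""

    def required_features(self):
        return {"layer": "clinical", "features": ["albumin"], "metadata": ["age"]}


def inconsistent_dnam_models(models):
    """Return names of dnam models whose features differ from methylation_sites()."""
    bad = []
    for name, model in models.items():
        spec = model.required_features()
        if spec["layer"] != "dnam":
            continue  # only dnam models expose methylation_sites()
        if list(spec["features"]) != list(model.methylation_sites()):
            bad.append(name)
    return bad


MODELS = {
    "ok_clock": SketchDnamClock(["cg00000029", "cg00000108"]),  # placeholder CpG ids
    "broken_clock": BrokenDnamClock(["cg00000029", "cg00000108"]),
    "clinical_clock": SketchClinicalClock(),
}
print(inconsistent_dnam_models(MODELS))
```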

Add GeoData clinical layer, biomarker registry with unit conversions,
required_features() interface on all models, and NHANES-to-GeoData bridge.
Foundation for clinical aging clocks (PhenoAge, KDM, Bortz, etc.)

Addresses bio-learn#194