SPECTRA predicts sample-level microbiome-associated phenotype patterns from metagenomic abundance data. You can run the local example files in this repository, or use the online xMICARE web server:
https://www.biosino.org/iMAC/xmicare/
For the web interface workflow, see xMICARE_Tutorial.md.
Yin W., Liu L., Zhu R. et al. Gut Microbiome-derived Disease-associated Indices for Non-Invasive Detection of Chronic Diseases. Journal, Year.
- Main SPECTRA workflow for metagenomic abundance data
- Main metagenomic MRI calculators and SPECTRA model
- Example metagenomic abundance, MRI, BMI-MRI, 16S abundance, and metadata files
- Extension application: 16S abundance workflow
- Extension application: high-BMI SPECTRA workflow
- Command-line scripts and shared utility functions
.
├── README.md
├── xMICARE_Tutorial.md
├── environment.yml
├── requirements.txt
├── data/
│ ├── example_metagenomic_abundance.csv
│ ├── example_mri.csv
│ ├── example_bmi_mri.csv
│ ├── example_16s_abundance.csv
│ └── example_metadata.csv
├── models/
│ ├── metagenomic/
│ │ ├── spectra_model.pkl
│ │ └── mri_calculators/
│ └── extensions/
│ ├── 16S/
│ └── high_bmi/
├── scripts/
│ ├── utils.py
│ ├── predict_spectra_from_abundance.py
│ ├── predict_mri_from_abundance.py
│ ├── predict_spectra.py
│ ├── predict_16s.py
│ └── predict_bmi.py
├── tutorial_images/
└── results/
Recommended Python version: 3.12.
conda env create -f environment.yml
conda activate spectra-envOr install the Python packages directly:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txtMain package versions:
numpy==1.26.4pandas==2.2.1scikit-learn==1.4.1.post1scipy==1.11.4imbalanced-learn==0.12.2scikit-bio==0.6.2shap==0.47.0
All input CSV files use sample IDs in the first column.
Use this input for the main SPECTRA workflow:
data/example_metagenomic_abundance.csv
Rows are samples and columns are microbial taxa/features. Values should follow the clr transformed abundance format as the example file.
If MRI values have already been calculated, SPECTRA can also start from an MRI table:
data/example_mri.csv
This example MRI file is generated from data/example_metagenomic_abundance.csv with scripts/predict_mri_from_abundance.py, so the one-step and two-step example workflows produce the same SPECTRA probabilities.
Expected MRI columns:
ACVD, AS, BPA, CL, IBD, IGT, T2D, CI, HC, FL, ME, SC
The 16S extension uses:
data/example_16s_abundance.csv
The high-BMI extension uses:
data/example_bmi_mri.csv
The example metadata file contains sample IDs and phenotype labels:
data/example_metadata.csv
This one-step command starts from a metagenomic abundance matrix and directly writes SPECTRA prediction outputs.
python scripts/predict_spectra_from_abundance.py \
--input data/example_metagenomic_abundance.csv \
--mri-model models/metagenomic/mri_calculators \
--spectra-model models/metagenomic/spectra_model.pkl \
--output results/spectra_metagenomic_predictions.csvOutputs:
results/spectra_metagenomic_predictions.csv: predicted label plus class probabilitiesresults/spectra_metagenomic_predictions_probability.csv: class probabilities only
Use this command only if you want to inspect or reuse the intermediate MRI values.
python scripts/predict_mri_from_abundance.py \
--input data/example_metagenomic_abundance.csv \
--mri-model models/metagenomic/mri_calculators \
--output results/metagenomic_mri.csvUse this command only when MRI values have already been calculated.
python scripts/predict_spectra.py \
--model models/metagenomic/spectra_model.pkl \
--input data/example_mri.csv \
--output results/spectra_predictions.csvpython scripts/predict_16s.py \
--input data/example_16s_abundance.csv \
--mri-model models/extensions/16S/mri_models_all.pkl \
--spectra-model models/extensions/16S/spectra_16s_model.pkl \
--output results/16s_predictions.csvpython scripts/predict_bmi.py \
--model models/extensions/high_bmi/spectra_high_bmi.pkl \
--input data/example_bmi_mri.csv \
--output results/bmi_predictions.csvpython scripts/predict_spectra_from_abundance.py \
--input path/to/your_metagenomic_abundance.csv \
--output results/your_spectra_predictions.csv
python scripts/predict_mri_from_abundance.py \
--input path/to/your_metagenomic_abundance.csv \
--output results/your_mri.csv
python scripts/predict_spectra.py \
--input path/to/your_mri.csv \
--output results/your_mri_based_predictions.csv
python scripts/predict_16s.py \
--input path/to/your_16s_abundance.csv \
--output results/your_16s_predictions.csv
python scripts/predict_bmi.py \
--input path/to/your_bmi_mri.csv \
--output results/your_bmi_predictions.csvAfter installing the environment, run:
python scripts/predict_spectra_from_abundance.py
python scripts/predict_mri_from_abundance.py
python scripts/predict_spectra.py
python scripts/predict_16s.py
python scripts/predict_bmi.pyThe output CSV files will be written to results/.