DeepRESIS is a command-line tool for predicting ncRNA-drug resistance relationships. You provide an ncRNA FASTA file, a drug SMILES file, and an ncRNA-drug pair file. The tool produces resistance prediction results and top-k gene ranking outputs.
On a fresh machine:
git clone https://github.com/idrblab/DeepRESIS.git
cd DeepRESIS
bash scripts/bootstrap_all.sh
conda activate deepresis
deepresis --help
This is the main installation path. It installs the environment, downloads the required large files from Hugging Face, generates deepresis.toml, and validates that the runtime is ready.
bash scripts/bootstrap_all.sh will automatically:
- create a conda environment
- install Python dependencies
- install the R runtime and required R packages
- install the R package LncFinder
- install DeepRESIS in editable mode
- download 5 model checkpoint files from Hugging Face
- download drug_gene.csv and ncrna_gene.csv from Hugging Face
- generate deepresis.toml
- validate the runtime and downloaded assets
Large model and gene matrix files are not stored in the GitHub repository. They are fetched automatically during bootstrap.
DeepRESIS needs three input files.
Example ncRNA FASTA file:
>ncRNA_1
AUGCUAGCUAGCUA
>ncRNA_2
GGCUAAGCUU
The FASTA record ID becomes the ncrna_id.
Example drug SMILES file:
drug1 CCO
drug2 CCN(CC)CC
The first column is drug_id, and the second column is the SMILES string.
Example ncRNA-drug pair file:
ncRNA_1 drug1
ncRNA_2 drug2
The first column is ncrna_id, and the second column is drug_id.
Important: every ncrna_id in the pair file must exist in the FASTA file, and every drug_id in the pair file must exist in the SMILES file.
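The consistency rule above can be checked before running a prediction. The following is a minimal sketch, not part of DeepRESIS: the function names are hypothetical, and it assumes that the SMILES and pair files are whitespace-delimited with no header row and that the FASTA record ID is the first token after ">".

```python
def fasta_ids(fasta_text):
    """Collect record IDs (first token after '>') from FASTA text."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                ids.add(header.split()[0])
    return ids


def first_column(text):
    """Collect first-column values from whitespace-delimited text."""
    values = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            values.add(fields[0])
    return values


def find_unknown_ids(fasta_text, smiles_text, pairs_text):
    """Return (missing_ncrna_ids, missing_drug_ids) referenced by the pair file."""
    known_ncrna = fasta_ids(fasta_text)
    known_drugs = first_column(smiles_text)
    missing_ncrna, missing_drugs = set(), set()
    for line in pairs_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        ncrna_id, drug_id = fields[0], fields[1]
        if ncrna_id not in known_ncrna:
            missing_ncrna.add(ncrna_id)
        if drug_id not in known_drugs:
            missing_drugs.add(drug_id)
    return missing_ncrna, missing_drugs
```

To use it, read each file with pathlib.Path(...).read_text() and pass the three strings to find_unknown_ids. Two empty sets mean every pair refers to a known ncrna_id and drug_id.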
After bootstrap finishes, run prediction with the generated config file:
deepresis predict \
--config ./deepresis.toml \
--fasta /path/to/test.fasta \
--smiles /path/to/test_cid.txt \
--pairs /path/to/test_pairs.txt \
--output-dir ./outputs \
--topk 300
You do not need to pass --model-dir or --gene-matrix-dir in the normal workflow.
DeepRESIS writes three output files in the output directory.
resistance_predictions.tsv contains resistance prediction results for each requested ncRNA-drug pair.
Columns:
- ncrna_id
- drug_id
- score
- label
- label_id
topk_genes.tsv contains the top-k ranked genes for each pair when gene ranking is available.
Columns:
- ncrna_id
- drug_id
- rank
- gene_id
- score
- drug_gene_score
- ncrna_gene_score
gene_ranking_warnings.tsv records why gene ranking could not be generated for specific pairs.
Columns:
- ncrna_id
- drug_id
- message
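The output files can be joined on (ncrna_id, drug_id) to see which predicted pairs have no gene ranking and why. The following is a rough sketch using only the Python standard library. It assumes the files are tab-separated with a header row that uses the column names listed above. The helper names are hypothetical and not part of DeepRESIS.

```python
import csv
import io


def parse_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def pairs_without_ranking(predictions, warnings):
    """Return (ncrna_id, drug_id, message) for predicted pairs that have a warning."""
    # Map each warned pair to its message for constant-time lookup.
    warned = {(w["ncrna_id"], w["drug_id"]): w["message"] for w in warnings}
    result = []
    for row in predictions:
        key = (row["ncrna_id"], row["drug_id"])
        if key in warned:
            result.append((key[0], key[1], warned[key]))
    return result
```

For example, pass pathlib.Path("outputs/resistance_predictions.tsv").read_text() and the corresponding text of gene_ranking_warnings.tsv through parse_tsv, then hand both lists to pairs_without_ranking.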
You can run the repository demo after installation:
bash examples/run_demo.sh
Or run the same example explicitly:
deepresis predict \
--config ./deepresis.toml \
--fasta ./examples/test.fasta \
--smiles ./examples/test_cid.txt \
--pairs ./examples/test_pairs.txt \
--output-dir ./demo_outputs
Expected behavior for the current sample:
- resistance_predictions.tsv contains exactly 3 requested pairs
- topk_genes.tsv may contain only the header
- gene_ranking_warnings.tsv may contain 3 warnings because the sample circRNA IDs are not present in ncrna_gene.csv
Conda is not installed or not on your PATH.
Next step: install Miniconda or Anaconda first, then rerun bash scripts/bootstrap_all.sh.
The machine cannot access Hugging Face, or the connection was interrupted.
Next step: check internet access and rerun the bootstrap command.
One or more downloaded checkpoint files are missing under artifacts/models.
Next step: delete the incomplete artifacts directory and rerun bash scripts/bootstrap_all.sh.
drug_gene.csv or ncrna_gene.csv is missing under artifacts/gene_matrix.
Next step: delete the incomplete artifacts directory and rerun bash scripts/bootstrap_all.sh.
The ViennaRNA binary is not available in the installed environment.
Next step: rerun the bootstrap command and confirm that environment creation completed successfully.
The R package LncFinder was not installed correctly.
Next step: rerun the bootstrap command. If it still fails, inspect the R installation step printed by the script.
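For the missing-asset cases above, a quick check of the artifacts directory can show what is incomplete before you delete and re-download it. This is a hedged sketch, not the validation that the bootstrap script performs. It assumes the checkpoints sit directly under artifacts/models and that an empty file means an interrupted download. The checkpoint file names are not assumed, only their count.

```python
from pathlib import Path

# File names and count taken from the bootstrap step list in this README.
EXPECTED_GENE_MATRIX_FILES = ("drug_gene.csv", "ncrna_gene.csv")
EXPECTED_CHECKPOINT_COUNT = 5


def check_artifacts(artifacts_dir="artifacts"):
    """Return a list of human-readable problems found under artifacts_dir."""
    root = Path(artifacts_dir)
    problems = []

    # Count non-empty regular files directly under models/ (assumed layout).
    models_dir = root / "models"
    checkpoints = []
    if models_dir.is_dir():
        checkpoints = [
            p for p in models_dir.iterdir()
            if p.is_file() and p.stat().st_size > 0
        ]
    if len(checkpoints) < EXPECTED_CHECKPOINT_COUNT:
        problems.append(
            f"found {len(checkpoints)} of {EXPECTED_CHECKPOINT_COUNT} "
            f"checkpoint files in {models_dir}"
        )

    # Each gene matrix file must exist and be non-empty.
    for name in EXPECTED_GENE_MATRIX_FILES:
        path = root / "gene_matrix" / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(f"missing or empty: {path}")

    return problems
```

An empty list from check_artifacts() suggests the downloads are complete. Any entries indicate which part of the artifacts directory to inspect before rerunning bash scripts/bootstrap_all.sh.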
bash scripts/bootstrap_all.sh deepresis_prod
bash scripts/bootstrap_all.sh deepresis_prod /data/deepresis_assets /data/deepresis.toml
Normally you do not need this. If needed, you can override the Hugging Face directory URLs:
export DEEPRESIS_MODELS_URL="https://huggingface.co/swallow-design/DeepRESIS/tree/main/model_pharameter"
export DEEPRESIS_GENE_MATRIX_URL="https://huggingface.co/swallow-design/DeepRESIS/tree/main/gene_matrix"
bash scripts/bootstrap_all.sh
The bootstrap script will automatically normalize tree/main or blob/main URLs to real resolve/main download URLs.
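To illustrate what this normalization means, here is a small Python sketch of the rewrite. It is an illustration only: the real logic lives in the bootstrap script and may differ, and the function name normalize_hf_url is hypothetical.

```python
import re


def normalize_hf_url(url):
    """Rewrite a Hugging Face /tree/main or /blob/main URL to /resolve/main."""
    # Match the path segment pair only when it is followed by '/' or the end
    # of the URL, so names that merely start with "main" are left alone.
    return re.sub(r"/(tree|blob)/main(?=/|$)", "/resolve/main", url, count=1)
```

With this sketch, the DEEPRESIS_GENE_MATRIX_URL value shown above becomes https://huggingface.co/swallow-design/DeepRESIS/resolve/main/gene_matrix, and a URL that already uses resolve/main is returned unchanged.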
from deepresis.pipeline import run_prediction
run_prediction(
    fasta_path="test.fasta",
    smiles_path="test_cid.txt",
    pairs_path="test_pairs.txt",
    output_dir="outputs",
    config_path="deepresis.toml",
    topk=300,
)