Plant-Net/GENIAL_Framework

🍅Tomato bulk RNA-seq and GENIAL framework

Conditions 🧪

The dataset covers tomato under 7 infection conditions (Meloidogyne incognita at 7 and 14 dpi, Botrytis cinerea, Phytophthora infestans, Cladosporium fulvum, and Potato spindle tuber viroid mild and severe strains), for a total of 83 samples. The data are publicly available through their respective BioProjects:

| Infection | Tissue | Controls hpi | Controls replicates | Infected hpi | Infected replicates | BioProject | Reference DOI |
|---|---|---|---|---|---|---|---|
| M.incognita | Root | 168 | 8 | 168 | 8 | PRJNA734743 | DOI |
| M.incognita | Root | 336 | 8 | 336 | 8 | PRJNA734743 | DOI |
| PSTVd Mild strain | Root | 408 | 6 | 408 | 6 | PRJNA515609 | DOI |
| PSTVd Severe strain | Root | | | 408 | 6 | PRJNA515609 | DOI |
| B.cinerea | Leaf | 0 | 7 | 30 | 8 | PRJNA662936 | DOI |
| P.infestans | Leaf | 0 | 6 | 72 | 6 | PRJNA505207 | DOI |
| C.fulvum | Leaf | 72 | 3 | 72 | 3 | PRJNA781749 | DOI |

Summary 🧭

We applied HIVE to the multi-transcriptomics bulk RNA-seq data. HIVE returned a list of genes, available in the Data/ folder, but the framework below can be applied to any gene list. From this list, we retrieved the GRN using the TomTom neo4j database, then curated the GRN to balance confidence and sparsity.

We used decoupleR's ULM (univariate linear model) to infer TF activities and kept the significant ones. The inputs are the GRN above and the t-statistics from DESeq2, run on each infection independently. We then used decoupleR's MLM (multivariate linear model) to infer pathway activities from Mercator pathways (also available through TomTom).
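The idea behind a univariate linear model can be sketched in a few lines. This is a minimal illustration, not decoupleR's implementation or the repository's code: for one TF, the gene-level statistics are regressed on the TF's regulatory weights (0 for non-targets), and the t-value of the slope is taken as the activity. The gene names, TF names, and weights below are hypothetical.

```python
import math

def ulm_activity(stats, targets):
    """t-value of the slope when regressing gene statistics on one TF's weights.

    stats:   {gene: statistic}, e.g. a Wald statistic per gene
    targets: {gene: weight} for the TF; genes not listed get weight 0
    """
    genes = sorted(stats)
    x = [targets.get(g, 0.0) for g in genes]
    y = [stats[g] for g in genes]
    n = len(genes)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no slope can be estimated
    r = sxy / math.sqrt(sxx * syy)
    # For simple linear regression, t(slope) = r * sqrt((n - 2) / (1 - r^2)).
    # A perfect fit (r = +/-1) is not handled in this sketch.
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical toy data: six genes, two TFs.
stats = {"g1": 3.0, "g2": 2.5, "g3": -0.2, "g4": 0.1, "g5": -2.8, "g6": 0.3}
grn = {
    "TF_A": {"g1": 1.0, "g2": 1.0, "g5": -1.0},  # activates g1, g2; represses g5
    "TF_B": {"g3": 1.0, "g4": 1.0},
}
activities = {tf: ulm_activity(stats, tg) for tf, tg in grn.items()}
```

In this toy case TF_A gets a large positive activity because its activated targets are up and its repressed target is down, while TF_B stays close to zero.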

Topological Data Analysis was performed on the same GRN with the corresponding TF activities. We first applied the Mapper algorithm to obtain a simpler representation of the GRN, then used the ToMATo algorithm to find groups on the resulting Mapper graph.
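As a rough illustration of the Mapper step, the sketch below builds a Mapper graph in pure Python: cover the range of a filter function with overlapping intervals, cluster the points falling in each interval, and link clusters that share points. The filter, the distance, the clustering rule (connected components under a distance threshold), and all parameter values are assumptions chosen for the example; the repository's TDA/mapper.py may use different choices, and the ToMATo grouping is not shown.

```python
from itertools import combinations

def cover(lo, hi, n_intervals, overlap):
    """Overlapping intervals covering [lo, hi]; overlap is a fraction of interval length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]

def components(ids, dist, eps):
    """Single-linkage clusters: connected components linking points within eps."""
    remaining, clusters = set(ids), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            u = stack.pop()
            near = {v for v in remaining if dist(u, v) <= eps}
            remaining -= near
            comp |= near
            stack.extend(near)
        clusters.append(frozenset(comp))
    return clusters

def mapper(points, filter_values, dist, n_intervals=3, overlap=0.3, eps=1.0):
    """Return (nodes, edges): nodes are clusters, edges join clusters sharing points."""
    lo, hi = min(filter_values.values()), max(filter_values.values())
    nodes = []
    for a, b in cover(lo, hi, n_intervals, overlap):
        # Small tolerance so boundary points are not lost to rounding.
        inside = [p for p in points if a - 1e-9 <= filter_values[p] <= b + 1e-9]
        nodes.extend(components(inside, dist, eps))
    edges = {(i, j) for (i, u), (j, v) in combinations(enumerate(nodes), 2) if u & v}
    return nodes, edges

# Hypothetical toy data: five points on a line, filtered by their coordinate.
coords = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 3.0, "e": 4.0}
nodes, edges = mapper(
    list(coords), coords, lambda u, v: abs(coords[u] - coords[v]),
    n_intervals=2, overlap=0.5, eps=1.0,
)
```

Here the two overlapping intervals each yield one cluster, and the clusters are linked because point "c" belongs to both.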

To find representative nodes in each group, we selected TFs with significant activities in multiple conditions.
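The selection rule above can be sketched as a simple filter. The significance threshold, the minimum number of conditions, and the TF and condition names are assumptions made for illustration; the values used in the actual analysis are not stated here.

```python
def representative_tfs(pvalues, alpha=0.05, min_conditions=2):
    """Keep TFs whose activity is significant in at least min_conditions conditions.

    pvalues: {tf: {condition: p-value}}
    Returns {tf: sorted list of conditions where p < alpha}.
    """
    selected = {}
    for tf, by_condition in pvalues.items():
        hits = sorted(c for c, p in by_condition.items() if p < alpha)
        if len(hits) >= min_conditions:
            selected[tf] = hits
    return selected

# Hypothetical p-values for three TFs across three conditions.
pvalues = {
    "TF_A": {"B.cinerea": 0.001, "P.infestans": 0.02, "C.fulvum": 0.40},
    "TF_B": {"B.cinerea": 0.030, "P.infestans": 0.60, "C.fulvum": 0.70},
    "TF_C": {"B.cinerea": 0.200, "P.infestans": 0.30, "C.fulvum": 0.90},
}
selected = representative_tfs(pvalues)
```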

Installation ⚙️

    conda create --name ENV_NAME python=3.12
    pip install -r requirements.txt

You also need the R packages used in the R scripts, such as DESeq2, edgeR, ggplot2, ...

Run the GENIAL framework ▶️

You need the HIVE selection present in Data/, or any other matrix.

From the raw transcriptomics table: run DEA/DEA.R to perform the differential expression analysis (DEA) and get the required Wald statistics, then DEA/Merge_matrix.ipynb to build the merged data used in the following steps.

GRN and activities: GRN.ipynb retrieves the necessary networks (GRN) from TomTom and checks them. TF_pathway_activity_mercator.ipynb computes TF and pathway activities. To correct the activities, the Shuffle folder contains shuffle_TF.ipynb for TF activities and shuffle_pathways for pathway activities.
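One common way to correct an activity score by shuffling is an empirical p-value from a permutation null. The sketch below is an assumption about what such a correction could look like, not a description of the Shuffle notebooks: it shuffles the gene-to-statistic pairing, recomputes a toy score (a weighted mean over targets, standing in for the real activity method), and counts how often the shuffled score is at least as extreme as the observed one. All names and values are hypothetical.

```python
import random

def score(stats, targets):
    """Toy activity score: weighted mean of the statistics of a TF's targets."""
    return sum(w * stats[g] for g, w in targets.items()) / len(targets)

def empirical_pvalue(stats, targets, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the score of one TF."""
    rng = random.Random(seed)
    observed = abs(score(stats, targets))
    genes, values = list(stats), list(stats.values())
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)  # break the pairing between genes and statistics
        if abs(score(dict(zip(genes, values)), targets)) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data.
stats = {"g1": 3.0, "g2": 2.5, "g3": -0.2, "g4": 0.1, "g5": -2.8, "g6": 0.3}
targets = {"g1": 1.0, "g2": 1.0, "g5": -1.0}
p = empirical_pvalue(stats, targets)
```

Fixing the seed keeps the result reproducible; the number of permutations bounds the smallest p-value that can be reported.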

TDA: TDA/Prepare_data.ipynb formats the data for TDA and creates all the necessary matrices. TDA/mapper.py produces the TDA network colored for all pathogens and the four configurations. Finally, TDA/Pathway_TDA_and_link_TF_Pathway.ipynb checks for TF - Pathway links.

Most plots are obtained with Plot/Plot_clean.ipynb or directly within the dedicated notebooks.

The supplementary table, with all information for each TF of the GRN (t-stats, activities, TDA group, TF family, ...), can be created after running the workflow with Create_SUPP_table.ipynb.

Reference ✍️

You can find all the detailed and explained results here

DOI
