The scripts included in this repository were used to analyze the composition of TDP-43-associated cryptic exons, differential gene expression resulting from TDP-43 knockdown (KD), toxicity of single-target knockdown, and survival of cells after over-expression of selected TDP-43 targets in shTDP43-treated cells. Some raw data files are also included; others are available upon request.
Sinha IR(#), Ye Y(#), Li Y, Sandal PS, Burns GD, Cruz AF, Sun S, Wong PC, Ling JP. Expanded Detection of Cryptic Exons Through Nonsense-Mediated Decay Inhibition Reveals Broad Landscape of TDP-43 Targets. bioRxiv. doi: TBD
Irika R. Sinha and Yingzhi Ye are co-first authors (#).
| File Name | Purpose |
|---|---|
| 250329_CrypticPSIfill_InitialVis.Rmd* | Initial analysis and visualization of cryptic exon data (e.g., PSI calculations, CE type pie charts, NMD effect on PSI) |
| 250409_CrypticPSI_CE_ALS_Genes.Rmd* | Comparison of known ALS-FTD risk genes with CE-including genes for overlap |
| 250410_bulkseq.Rmd* | Differential gene expression analysis after TDP-43 KD and adjustment for functional transcripts |
| 250411_UGrepeat.Rmd* | Defines a function to analyze dinucleotide repeats within +/- 600 bp of cryptic exon splice sites. Includes UG and CA repeat analysis. |
| 250425_shRNA_imaging.Rmd | Calculation of shRNA toxicity & visualization |
| 250506_YY_Rescue.Rmd | Calculation of survival after shTDP43 + target gene overexpression & visualization |
*R Markdown document included as html. Download to open properly.
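
The dinucleotide repeat analysis listed for 250411_UGrepeat.Rmd can be illustrated with a minimal base-R sketch. This is not the function from the repository: the name `find_dinuc_runs`, its arguments, and the default minimum of three repeats are assumptions made for this example, and UG is written as TG because the sketch assumes DNA-alphabet input.

```r
# Hypothetical helper (not the repository's function): locate runs of a
# repeated dinucleotide, e.g. (TG)n, within a sequence window.
find_dinuc_runs <- function(seq, motif = "TG", min_repeats = 3) {
  stopifnot(nchar(motif) == 2, min_repeats >= 1)
  # Regex matching at least `min_repeats` consecutive copies of the motif
  pattern <- sprintf("(?:%s){%d,}", motif, as.integer(min_repeats))
  hits <- gregexpr(pattern, seq, perl = TRUE)[[1]]
  if (hits[1] == -1) {
    # No run found: return an empty table with the same columns
    return(data.frame(start = integer(0), n_repeats = integer(0)))
  }
  data.frame(
    start = as.integer(hits),                       # 1-based start of each run
    n_repeats = attr(hits, "match.length") %/% 2L   # motif copies per run
  )
}

# Example on a made-up window (assumed sequence, for illustration only)
window <- "AATGTGTGTGCCCACACACA"
find_dinuc_runs(window, motif = "TG")
find_dinuc_runs(window, motif = "CA")
```

In practice the window would be the sequence flanking each cryptic exon splice site; how the repository extracts that window is not shown here.
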
| File Name | Description |
|---|---|
| 250409_ALSgenes.csv | ALS-associated risk genes as per alsod.ac.uk |
| 250423_CellProfiler_Settings_p1-5.csv, 250423_CellProfiler_Settings_p6-8.csv | Exported CellProfiler settings for shRNA image analysis + quantification of nuclei and PI specks |
| 250509_CE_table.csv | Identified cryptic exon genes, coordinates, splice junctions, and calculated PSIs |
| 250509_IS_Rescue_YY_Quant_Pool.csv | Manual counts of nuclei and PI specks after rescue experiments |
| GRch38sequence_report.tsv | Includes information on chromosome length |
| 250410_nfcore_RNAseq_pipeline_report.html | nf-core/RNAseq pipeline info output |
| File Name | Description |
|---|---|
| 250412_cryptic_df_UG_freq_L.csv | [UG]n near 5' end of CEs |
| 250412_cryptic_df_UG_freq_R.csv | [UG]n near 3' end of CEs |
- CRAN: BiocManager, gghighlight, ggplot2, ggpubr, ggrepel, ggstats, pheatmap, tidyverse, viridis, progress, makeunique, readxl
- Bioconductor: apeglm, biomaRt, Biostrings, DESeq2, clusterProfiler, org.Hs.eg.db
- GitHub: snapmine
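
One possible way to install the CRAN and Bioconductor packages listed above is sketched below. It assumes a working R installation with internet access and only uses `install.packages()` and `BiocManager::install()`. The GitHub source for snapmine is not given in this README, so it is left as a comment rather than guessed.

```r
# CRAN packages named in the list above
cran_pkgs <- c("BiocManager", "gghighlight", "ggplot2", "ggpubr", "ggrepel",
               "ggstats", "pheatmap", "tidyverse", "viridis", "progress",
               "makeunique", "readxl")
missing_cran <- setdiff(cran_pkgs, rownames(installed.packages()))
if (length(missing_cran) > 0) install.packages(missing_cran)

# Bioconductor packages named in the list above
bioc_pkgs <- c("apeglm", "biomaRt", "Biostrings", "DESeq2",
               "clusterProfiler", "org.Hs.eg.db")
missing_bioc <- setdiff(bioc_pkgs, rownames(installed.packages()))
if (length(missing_bioc) > 0) BiocManager::install(missing_bioc)

# snapmine is distributed via GitHub; install it following the instructions
# in its own repository (repository owner not specified in this README).
```
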
This work was supported in part by the Alzheimer’s Association (to J.P.L.), the Institute for Data-Intensive Engineering and Science (to J.P.L.), the Robert Packard Center for ALS Research at Johns Hopkins (to P.C.W.), the Target ALS Foundation (to P.C.W.), ALS Finding a Cure (to P.C.W.), the ALS Association (to P.C.W.), and the US Food and Drug Administration (no. 1U01FD008129 to P.C.W.).
This material is based upon work supported by the National Science Foundation (NSF) Graduate Research Fellowship Program under Grant No. DGE2139757 (to I.R.S.). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
This work was supported by resources from the Advanced Research Computing at Hopkins (ARCH) core facility (rockfish.jhu.edu), which is supported by the NSF (no. OAC 1920103).
We thank Katherine E. Irwin and Anya A. Kim for their troubleshooting support and suggestions.