A collection of code from my Master of Biomedical Science.
This repository is divided into a few sections, each covering a different type of code used over the course of my research:
This mostly consists of code for jobs submitted to the Slurm scheduler on the University of Melbourne's Spartan HPC system. The directory contains the following:
- Phylogeny pipeline part 1, which produces a core SNP alignment using Snippy.
- Phylogeny pipeline part 2, which builds a tree from this alignment for reference using FastTree.
- Phylogeny pipeline part 3, which uses Gubbins to detect likely recombinant sites in the core alignment.
- Phylogeny pipeline part 4, which uses Snippy to mask the recombinant sites detected by Gubbins, based on a .bed file produced from the Gubbins outputs.
- Phylogeny pipeline part 5, which produces a phylogeny of the masked core alignment using IQ-TREE 2.
- CRISPRCasFinder script, which runs the CRISPRCasFinder program inside a Singularity container.
- MLST script for assignment of multi-locus sequence types to the dataset.
- NCBI Data script for the download of Staphylococcus argenteus genome data.
- Prokka script for genome annotation.
- Roary script for pangenome analysis of .gff files.
- Snippy scripts for contig and FASTQ formats.
- SPAdes script for assembly of short-read data into FASTA files.
- A basic renaming script to rename files from a .csv reference.
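
As an illustration of the step between parts 3 and 4 of the phylogeny pipeline, here is a minimal sketch of how a .bed mask could be derived from Gubbins' GFF recombination predictions. It is not the script in this repository: the function name `gff_to_bed`, the file names, and the sequence ID argument are hypothetical. It assumes the input is a tab-separated GFF file with start and end coordinates in columns 4 and 5, and that the sequence ID written to the .bed file has to match the reference used for the core alignment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert GFF recombination predictions into a BED mask.
# GFF coordinates are 1-based and inclusive; BED coordinates are 0-based and
# half-open, so the start is shifted down by one and the end is kept as-is.
gff_to_bed() {
    local gff="$1"    # GFF file (assumed: Gubbins recombination predictions)
    local chrom="$2"  # sequence ID for BED column 1 (assumed to match the reference)
    awk -F'\t' -v OFS='\t' -v chrom="$chrom" '
        /^#/ || NF < 5 { next }       # skip comment lines and malformed rows
        { print chrom, $4 - 1, $5 }   # 1-based inclusive -> 0-based half-open
    ' "$gff"
}

# Example call (file names and reference ID are placeholders):
#   gff_to_bed core.recombination_predictions.gff my_reference_id > mask.bed
```

The resulting .bed file can then be handed to whichever masking step is used; `snippy-core`, for example, accepts a BED file through its `--mask` option.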
This directory contains Markdown files with code used mostly to construct figures for my thesis:
- Piechart-map figures
- Phylogenies
- More specific, in-depth NT phylogenies
- Bayesian analysis figures