This repository contains the scripts for the course project. The scripts download GenBank sequence files and analyze them.
DownloadMitoGenBanks:
- ExtractGenBankIDs.py: Extracts GenBank IDs from a list of mitochondrial genomes
- MitoGenomeList.txt: File used for testing ExtractGenBankIDs.py
- MitoLocusIDs.txt: File generated by ExtractGenBankIDs.py and used as input for DownloadGenBank.py
- DownloadGenBank.py: Script for downloading GenBank files from NCBI
DownloadGenomeGenBanks:
- getGenomeGenBanks.py: Script for downloading GenBank files for entire genomes using FTP
- GenomeList.txt: List of genomes used as input for getGenomeGenBanks.py
- MitoGenomeList.txt: List of mitochondrial genomes used as input for getGenomeGenBanks.py
Analysis:
- AnalyzeSequence.py: Script used to analyze the codon and amino acid usage of the coding sequence of a genome
- calculateGCcontent.py: Script used to analyze the GC content of the coding sequence of a genome
- NC_001224.1.gb: GenBank file used for testing the scripts for analysis
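The two analyses listed above can be illustrated with a minimal sketch. This is not the code in AnalyzeSequence.py or calculateGCcontent.py; the function names `gc_content` and `codon_counts` and the toy sequence are hypothetical, and the sketch assumes the coding sequence has already been extracted as a plain string.

```python
from collections import Counter


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / float(len(seq))


def codon_counts(cds):
    """Count codons in a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop 1 or 2 leftover bases at the end
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Hypothetical toy coding sequence, not taken from NC_001224.1.gb.
example_cds = "ATGGCCGCCTAA"
print(gc_content(example_cds))
print(codon_counts(example_cds))
```

Amino acid usage could be derived from the codon counts by mapping each codon through a translation table; mitochondrial genomes generally need a mitochondrial table rather than the standard one.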
Miscellaneous:
- CheckGenBank.py: Script that checks if the genome sequence is present in the GenBank file
- ConvertGenBank.py: Script that converts a GenBank file to FASTA (.fna) format
- NC_001224.1.gb: GenBank file used for testing scripts
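The kind of check CheckGenBank.py performs might look like the sketch below. It is an assumption about the approach, not the script itself: the function name `has_sequence` is hypothetical, and it relies only on the GenBank flat-file convention that the sequence follows an `ORIGIN` line and the record ends with `//`.

```python
def has_sequence(genbank_text):
    """Return True if any record has an ORIGIN section containing bases."""
    in_origin = False
    for line in genbank_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True  # sequence lines follow this keyword
            continue
        if line.startswith("//"):
            in_origin = False  # end of the current record
            continue
        if in_origin and any(ch.isalpha() for ch in line):
            return True
    return False


# Hypothetical toy records, much shorter than a real GenBank file.
with_seq = "LOCUS       TOY\nORIGIN\n        1 atggccgcct aa\n//\n"
without_seq = "LOCUS       TOY\nFEATURES             Location/Qualifiers\n//\n"
print(has_sequence(with_seq))
print(has_sequence(without_seq))
```

A check like this is useful because some downloaded records carry annotations but no sequence, which would make the analysis scripts fail or report empty results.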
Usage:
- ./pythonfile.py (replace pythonfile.py with the name of the script to run)
Requirements:
- Python 2.7
- Biopython version 1.69
- matplotlib version 2.0.2