**Inputs:**

- [ ] 0-x unpaired fastq/sff files
- [ ] 0-y pairs of Illumina paired fastq files
- [ ] [0-z PacBio files]

**Transformation:**

- [x] sff -> fastq

**Filter:**

- [ ] drop reads with Ns
- [x] drop Illumina reads where the smallest quality score in the index read is below some minimum
- [ ] **De novo**: run host_filter
- Note: individual files may be lost or empty after this process

**Transformation (trimming)**:

- [ ] cut 3' and 5' adapters `-a`, `-A`, `-g`, `-G`
- [ ] cut 3' and 5' primers
- [x] trim all N's from end of reads `--trim-n`
- [x] trim low quality bases from 5' and 3' `-q <five>,<three>` or `-q <five>`
- [x] remove X bases from beginning/end of reads `-u <X>`
- [ ] [run something on PacBio reads separately]

**Reduction (mapping/assembly)**: this is where the de novo and mapping paths branch

- Mapping:
  - [x] compile paired, unpaired reads
  - [ ] run bwa on the compiled paired and unpaired reads, separately
  - [ ] [run bwa/other mapper on PacBio reads, then merge bam files]
  - [ ] merge paired and unpaired bam files, if they exist
  - [ ] sort, index bam file
  - [ ] tag the bam file via tagreads
  - [ ] run `freebayes` on the bam file
  - [ ] create a consensus fasta
- DeNovo:
  - [ ] compile paired and unpaired (?)
  - [ ] run a De Novo assembler (sga and spades are both in bioconda, but not ray)
  - [ ] try to figure out the number of reads for each contig
  - [ ] maybe simulate iterative_blast
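
The two read-level items in the Filter step could look roughly like the sketch below. It is an illustration only, not the project's implementation: the function names (`parse_fastq`, `min_phred`, `filter_by_index_quality`) are hypothetical, and it assumes well-formed 4-line FASTQ records, Phred+33 quality encoding, and that reads and index reads arrive in the same order.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality string); hypothetical record layout for this sketch
Record = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def min_phred(qual: str, offset: int = 33) -> int:
    """Smallest Phred score in a non-empty quality string (Phred+33 assumed)."""
    return min(ord(c) - offset for c in qual)


def filter_by_index_quality(
    reads: Iterable[Record],
    index_reads: Iterable[Record],
    min_quality: int,
    drop_n: bool = False,
) -> Iterator[Record]:
    """Keep reads whose index read has no base below min_quality.

    With drop_n=True, reads containing an N are dropped as well.
    Assumes reads and index_reads are in the same order.
    """
    for read, index in zip(reads, index_reads):
        if min_phred(index[2]) < min_quality:
            continue
        if drop_n and "N" in read[1].upper():
            continue
        yield read
```

Because both functions are generators, reads are streamed rather than loaded into memory. A file can still come out empty if every read is dropped, which is the case the note in the Filter list warns about.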
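
The Mapping branch can be sketched as a function that only plans the commands, which makes the "if they exist" logic easy to check before anything runs. This is a rough sketch under stated assumptions, not the pipeline's actual code: `plan_mapping` and the output file names are hypothetical, it assumes `bwa mem`, `samtools`, and `freebayes` are on the PATH, and it sorts each BAM before merging because `samtools merge` expects sorted inputs (the checklist lists merge before sort). The tagreads and consensus steps are project-specific and left out.

```python
from typing import List, Optional, Tuple

Command = List[str]


def plan_mapping(
    reference: str,
    paired: Optional[Tuple[str, str]],
    unpaired: Optional[str],
    prefix: str = "out",
) -> List[Command]:
    """Return the argv lists for mapping, merging, indexing and variant calling.

    paired / unpaired may be None when that input was lost or emptied upstream.
    """
    inputs = []
    if paired:
        inputs.append(("paired", list(paired)))
    if unpaired:
        inputs.append(("unpaired", [unpaired]))

    cmds: List[Command] = []
    sorted_bams: List[str] = []
    for label, fastqs in inputs:
        sam = f"{prefix}.{label}.sam"
        bam = f"{prefix}.{label}.sorted.bam"
        # Map paired and unpaired reads separately
        cmds.append(["bwa", "mem", "-o", sam, reference, *fastqs])
        cmds.append(["samtools", "sort", "-o", bam, sam])
        sorted_bams.append(bam)

    if not sorted_bams:
        return cmds  # nothing to map

    final = sorted_bams[0]
    if len(sorted_bams) > 1:
        # Merge only when both paired and unpaired BAMs exist
        final = f"{prefix}.bam"
        cmds.append(["samtools", "merge", final, *sorted_bams])

    cmds.append(["samtools", "index", final])
    cmds.append(["freebayes", "-f", reference, "-v", f"{prefix}.vcf", final])
    return cmds
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order. Keeping planning separate from execution means the branching can be tested without the tools installed.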