Hello,
we would like to use Ra to assemble a ~5 Gb plant genome (the haploid genome is actually 2.5 Gb, but it is highly heterozygous, so we want to resolve the two alleles into separate contigs). We have about 45x coverage (relative to 5 Gb) of ONT data (N50 15 kb, QV >7). Is there a way to predict the size of the minimap2 output file and to optimize the alignment step, since it is the most expensive one?
For example, to simplify the output, adding/tweaking the minimap2 options: -X -p -N
and to be more sensitive: -A -B -O -E -z.
Do you think there is room at that stage to increase the specificity of the alignments and to reduce the disk footprint and computation time?
Lastly, can you estimate the memory usage for an assembly where we have up to 36 CPUs (72 threads with hyperthreading) and at most 500 GB of RAM available? Will the polishing and graph construction steps be more memory-demanding than the alignment step?
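For context, here is a rough sketch of the data volume implied by the numbers above. It is only back-of-the-envelope arithmetic, not something Ra or minimap2 reports. The function names are hypothetical, the 2 bytes per base figure ignores FASTQ headers, and the quadratic extrapolation from a subsample is an assumption (overlapping read pairs grow roughly with the square of the sampled fraction) that would no longer hold once options such as -N cap the number of alignments kept per read.

```python
def total_bases(coverage, genome_size_bp):
    """Total sequenced bases implied by a coverage depth and a genome size."""
    return coverage * genome_size_bp


def approx_fastq_bytes(n_bases):
    """Rough uncompressed FASTQ size: one base byte plus one quality byte.

    Read headers and newlines are ignored, so the real file is somewhat larger.
    """
    return 2 * n_bases


def extrapolate_overlap_bytes(subsample_bytes, fraction):
    """Extrapolate all-vs-all overlap file size from a read subsample.

    ASSUMPTION: overlap count scales with the square of the sampled
    fraction of reads. This is a heuristic, not a minimap2 guarantee.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return subsample_bytes / fraction ** 2


if __name__ == "__main__":
    bases = total_bases(45, 5_000_000_000)  # 45x of a 5 Gb target
    print(f"total bases: {bases:.3e}")
    print(f"approx. FASTQ size: {approx_fastq_bytes(bases) / 1e9:.0f} GB")
    # Hypothetical measurement: a 10% read subsample that produced 1 GB of overlaps.
    est = extrapolate_overlap_bytes(1e9, 0.10)
    print(f"extrapolated overlap file: {est / 1e9:.0f} GB")
```

Running the overlap step on a small random subsample of reads and extrapolating this way would at least give an order-of-magnitude figure before committing to the full run.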
Thanks,
Dario