estimation of footprint to assemble a large plant genome

Hello,
We would like to use Ra to assemble a ~5 Gb plant genome. The genome is actually 2.5 Gb in size, but it is highly heterozygous, so we want to resolve the two alleles into separate contigs. We have about 45x coverage (relative to 5 Gb) of ONT data (N50 15 kb, QV > 7). Is there a way to predict the size of the minimap2 output file and to optimize the alignment step, since it is the most expensive one?
For example, we could simplify the output by adding or tweaking the minimap2 options -X, -p and -N,
and make the alignment more sensitive with -A, -B, -O, -E and -z.
Do you think there is room at that step to increase the specificity of the alignments and to reduce the disk footprint and computation time?
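
To make the size question concrete, here is a back-of-the-envelope sketch of how the overlap file might scale. Everything in it is an assumption on our side, not something taken from Ra or minimap2: the function name `estimate_paf_size` is made up, the mean read length (10 kb) and the average PAF record size (150 bytes) are guesses, and the model assumes uniform coverage, equal-length reads, and each overlapping pair reported once (as we understand -X to do). Repeats and cross-allele overlaps would push the real number higher, while -N and -p filtering could pull it lower.

```python
def estimate_paf_size(genome_bp, coverage, mean_read_len, bytes_per_record=150):
    """Rough estimate of read count, overlap records, and PAF size in bytes.

    All inputs are assumptions supplied by the caller; the model ignores
    repeats, read-length variation, and any minimap2 filtering.
    """
    total_bases = genome_bp * coverage
    n_reads = total_bases / mean_read_len
    # With equal-length reads, a read overlaps the reads that start within one
    # read length on either side, i.e. about 2 * coverage partners. Reporting
    # each pair only once leaves about `coverage` records per read.
    n_records = n_reads * coverage
    size_bytes = n_records * bytes_per_record
    return n_reads, n_records, size_bytes


if __name__ == "__main__":
    # Assumed inputs: 5 Gb target, 45x coverage, 10 kb mean read length.
    n_reads, n_records, size_bytes = estimate_paf_size(5e9, 45, 10_000)
    print(f"reads:   {n_reads:.3g}")
    print(f"records: {n_records:.3g}")
    print(f"size:    {size_bytes / 1e9:.0f} GB (uncompressed, very rough)")
```

With these assumed inputs the model gives roughly 10^9 overlap records, on the order of 150 GB uncompressed, which is why we are asking whether the output can be trimmed.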

Lastly, can you estimate the memory usage for an assembly on a machine with up to 36 CPUs (72 threads with hyperthreading) and at most 500 GB of RAM? Will the polishing and graph construction steps be more memory-demanding than the alignment step?
Thanks,
Dario

