Parallelization
Depending on your hardware (personal computer, cluster etc.), you should use:
- pargenes.py: on single-node architecture (your personal computer, or remote machines with only one node)
- pargenes-hpc.py: on multiple-node architecture (for instance clusters). You need to have MPI installed.
Warning: the number of cores should always be at least 2 with pargenes-hpc.py.
When running ParGenes on a cluster, you might want to know the maximum number of cores you can request without wasting computational resources. To determine this number, first perform a dry run: set the number of cores to at least 2 (-c 2 or more) and add --dry-run to your command line. In dry-run mode, ParGenes only reads the MSA files and estimates the optimal number of cores to use. This number is printed in the logs (for instance Recommended number of cores: 14).
Please remember that this recommended number of cores is only a guess from ParGenes.
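If you want to feed the dry-run result into a job script, the recommended number can be extracted from the log text. The following is only a minimal sketch: it assumes the log line looks exactly like the example above (Recommended number of cores: 14), and the helper name parse_recommended_cores is hypothetical and not part of ParGenes. How you obtain the log text (file or captured output) is left to you.

```python
import re

# Assumption: the dry-run log contains a line such as
# "Recommended number of cores: 14", as in the example above.
_RECOMMENDED_RE = re.compile(r"Recommended number of cores:\s*(\d+)")


def parse_recommended_cores(log_text):
    """Return the recommended core count found in log_text, or None.

    Hypothetical helper, not part of ParGenes.
    """
    match = _RECOMMENDED_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1))


if __name__ == "__main__":
    sample_log = "Reading MSA files...\nRecommended number of cores: 14\n"
    print(parse_recommended_cores(sample_log))
```

Returning None when the line is missing lets the caller fall back to a manually chosen core count instead of failing.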
To decide how many cores to allocate to your ParGenes job:
- Never exceed the number of physical cores your hardware has, even if ParGenes recommends more cores (ParGenes does not know anything about your hardware).
- Exceeding the recommended number of cores will not slow down ParGenes, but might be a waste of resources.
- It is always fine to allocate fewer cores than recommended. If ParGenes recommends 100 cores but you allocate only 50, the overall runtime is likely to roughly double. On a cluster, however, requesting only 50 cores may be worth it if the job then spends less time waiting in the queue.
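The rules above can be sketched as a small helper. This is an illustration only: the function names choose_cores and estimated_slowdown are hypothetical, the physical core count must be supplied by you (ParGenes does not know it), and the slowdown estimate assumes roughly linear scaling, as in the 100-versus-50-cores example.

```python
def choose_cores(recommended, physical_cores, hpc=True):
    """Pick a core count that respects the guidelines above.

    Hypothetical helper, not part of ParGenes.
    - never more than the physical cores available
    - no more than the recommended count (extra cores would be wasted)
    - at least 2 when using pargenes-hpc.py
    """
    minimum = 2 if hpc else 1
    if physical_cores < minimum:
        raise ValueError("not enough physical cores for this scheduler")
    return max(minimum, min(recommended, physical_cores))


def estimated_slowdown(recommended, allocated):
    """Rough runtime factor relative to using the recommended core count.

    Assumes linear scaling; allocating more than recommended gives no gain.
    """
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    return max(1.0, recommended / allocated)


if __name__ == "__main__":
    cores = choose_cores(recommended=100, physical_cores=64)
    print(cores, estimated_slowdown(100, cores))
```

The slowdown factor is only a rough guide for weighing a smaller request against the expected queue waiting time on your cluster.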