Description of the Issue:
Currently, the num_procs setting on a QUEENS scheduler controls two things at once:
- How many CPUs the workload manager (e.g. SLURM) gives to a worker.
- How many MPI processes the simulation tool is started with (e.g. mpirun -n N 4C ...).
Because both come from the same value, a worker that gets 4 CPUs from the cluster is forced to run the simulation with exactly 4 ranks - and vice versa. There is no way to:
- give a worker some headroom for Python pre- or post-processing while running the simulation with fewer ranks, or
- run many short-lived jobs in small worker slots whose simulations still need several ranks.
This couples two things that are conceptually independent: how big the slot on the cluster is and how many cores the simulation actually uses.
Proposed Solution:
Let drivers carry their own optional num_procs value:
- If the driver's num_procs is set, it is used for the simulation call.
- If it is not set (default), the driver falls back to the scheduler's num_procs, which is exactly today's behaviour.
This is fully backward compatible (default None = unchanged behaviour) and adds one optional argument on the driver side. The scheduler keeps its current role of requesting the worker resources from the workload manager; the driver gets to say how many MPI ranks the simulation tool actually uses.
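The fallback rule could be sketched roughly as follows. This is only an illustration of the proposed behaviour: the helper name resolve_num_procs and its signature are hypothetical and not part of the current QUEENS API, and the check for positive values is an added assumption.

```python
from typing import Optional


def resolve_num_procs(driver_num_procs: Optional[int], scheduler_num_procs: int) -> int:
    """Return the number of MPI ranks to use for the simulation call.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    if driver_num_procs is None:
        # Default: keep today's behaviour and use the scheduler value.
        return scheduler_num_procs
    if driver_num_procs < 1:
        # Assumption: a rank count below 1 is treated as a configuration error.
        raise ValueError("num_procs must be a positive integer")
    # The driver's own value takes precedence for the simulation call only.
    return driver_num_procs
```

With this rule, a driver that leaves num_procs unset on a 4-CPU scheduler still runs with 4 ranks, while a driver that sets num_procs=2 runs with 2 ranks inside the same 4-CPU worker slot.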
Out of scope (future work): Different core counts for each step of a single driver pipeline (e.g. meshing on 1 core, simulation on N cores, plotting on 1 core in the same job). That would require a deeper architectural change. This issue is a prerequisite but does not attempt to solve it.
Action Items:
- Add an optional num_procs argument to the driver base class.
- Use it for the jobscript rendering when set; fall back to the scheduler value otherwise.
- Add a small regression test with two drivers using different values on the same scheduler.
- Add a one-paragraph note to the docs.
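As a starting point for the regression test, a minimal sketch might look like the following. SketchScheduler, SketchDriver, render_command and the executable name "sim" are stand-ins invented for illustration; the real test would use the actual QUEENS scheduler, driver and jobscript rendering, whose interfaces are not shown in this issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchScheduler:
    """Stand-in for a scheduler (hypothetical, not the QUEENS class)."""

    num_procs: int  # CPUs requested from the workload manager per worker


@dataclass
class SketchDriver:
    """Stand-in for a driver with the proposed optional num_procs."""

    executable: str
    num_procs: Optional[int] = None  # None means: fall back to the scheduler

    def render_command(self, scheduler: SketchScheduler) -> str:
        # Use the driver value when set, otherwise the scheduler value.
        n = self.num_procs if self.num_procs is not None else scheduler.num_procs
        return f"mpirun -n {n} {self.executable}"


def test_drivers_with_different_num_procs_share_one_scheduler():
    scheduler = SketchScheduler(num_procs=4)
    default_driver = SketchDriver(executable="sim")
    small_driver = SketchDriver(executable="sim", num_procs=2)

    # Unset driver value keeps the current behaviour.
    assert default_driver.render_command(scheduler) == "mpirun -n 4 sim"
    # Explicit driver value overrides the rank count only.
    assert small_driver.render_command(scheduler) == "mpirun -n 2 sim"
    # The worker slot size on the scheduler is left untouched.
    assert scheduler.num_procs == 4
```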
Related Issues:
No response
Interested Parties:
No response