import psutil

def set_cpu_affinity(local_rank):
    # A mapping from GCD to the closest CPU cores in a LUMI-G node.
    # Note that CPU cores 0, 8, 16, 24, 32, 40, 48, 56 are reserved for
    # the system and not available to the user.
    # See https://docs.lumi-supercomputer.eu/hardware/lumig/
    LUMI_GPU_CPU_map = {
        0: [49, 50, 51, 52, 53, 54, 55],
        1: [57, 58, 59, 60, 61, 62, 63],
        2: [17, 18, 19, 20, 21, 22, 23],
        3: [25, 26, 27, 28, 29, 30, 31],
        4: [1, 2, 3, 4, 5, 6, 7],
        5: [9, 10, 11, 12, 13, 14, 15],
        6: [33, 34, 35, 36, 37, 38, 39],
        7: [41, 42, 43, 44, 45, 46, 47],
    }
    cpu_list = LUMI_GPU_CPU_map[local_rank]
    # `rank` (the global rank) is assumed to be defined by the surrounding script.
    print(f"Rank {rank} (local {local_rank}) binding to cpus: {cpu_list}")
    psutil.Process().cpu_affinity(cpu_list)
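For reference, the binding step itself needs nothing beyond the standard library: a stdlib-only sketch using `os.sched_setaffinity` (Linux-specific) would look like the following. The `bind_to_cores` helper name is hypothetical, and the core set reuses GCD 0's list from the snippet above; binding to those exact cores only succeeds on a node that actually has them.

```python
import os

# Core list for GCD 0 from the LUMI mapping above; other GCDs are analogous.
gcd0_cores = {49, 50, 51, 52, 53, 54, 55}

def bind_to_cores(cores):
    """Restrict the current process to the given CPU cores (Linux only)."""
    os.sched_setaffinity(0, cores)  # pid 0 means "this process"

# On a LUMI-G node one would call, e.g.:
# bind_to_cores(gcd0_cores)
```

This avoids the `psutil` dependency at the cost of portability to non-Linux systems, which may or may not matter for a LUMI-specific guide.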
Quite clear that the CPU bindings are sensible, but it would be good to check whether it is smart to set them twice, i.e. both via srun and in the script. This was mentioned by @mitjasai in #89
LUMI-AI-Guide/5-multi-gpu-and-node/run_ds_srun.sh
Lines 35 to 40 in 6ca88fc
LUMI-AI-Guide/5-multi-gpu-and-node/ds_visiontransformer.py
Lines 18 to 35 in 6ca88fc
We think it does not hurt, but it might not be pedagogically sensible to claim that the PyTorch script is portable when it relies on srun.
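On the "set twice" question: whether srun's `--cpu-bind` has already restricted the process can be inspected from the inherited affinity mask before re-binding. A stdlib sketch (Linux-only; the function name and the "narrower than the full node" heuristic are assumptions, not part of the guide):

```python
import os

def affinity_already_restricted():
    """Return True if the inherited CPU mask is narrower than the whole node,
    which suggests srun --cpu-bind (or a similar launcher) already pinned us."""
    allowed = os.sched_getaffinity(0)  # mask inherited from srun/parent
    return len(allowed) < (os.cpu_count() or 1)
```

A check like this could let the script log a warning instead of silently overriding whatever binding srun applied, which may be the more pedagogical behaviour.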