Hello,
Thanks for your work and for making the repo public. I have been trying to reproduce the results by following the README.md.
When I run
torchrun --standalone --nproc_per_node=$NUM_OF_GPUs main.py run@global=namm_bam_i1.yaml
on 8xH100s, training progresses extremely slowly: each step takes a few hours.
Is this normal, or am I missing something?
TIA