Thanks for the great work!
https://github.com/cuda-mode/ring-attention/blob/main/ring-llama/test.ipynb
I can load the model with `LlamaRingFlashAttention` and move it to the device, but I get
`RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.`
when I run `y = model.generate`.
What did I miss? Thanks in advance!
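
From the error, my guess is that `torch.distributed.init_process_group` has to be called before `generate`. Is something like the sketch below what is expected? This is only my assumption, not something I found in the notebook: the helper name `init_default_process_group`, the `gloo` backend, and the single-process fallback values are all mine.

```python
import os
import socket

import torch.distributed as dist


def _free_port() -> int:
    # Ask the OS for an unused local TCP port (used only for the single-process fallback).
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def init_default_process_group(backend: str = "gloo") -> int:
    """Hypothetical helper: initialize the default process group if it is missing.

    When launched with torchrun, RANK / WORLD_SIZE / MASTER_ADDR / MASTER_PORT are
    already set in the environment. Otherwise this falls back to a single local
    process (rank 0, world size 1), which is an assumption made for this sketch.
    """
    if not dist.is_initialized():
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", str(_free_port()))
        rank = int(os.environ.get("RANK", "0"))
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        dist.init_process_group(backend=backend, rank=rank, world_size=world_size)
    return dist.get_world_size()
```

I would call `init_default_process_group()` once before `model.generate`, presumably with `backend="nccl"` and a `torchrun` launch when using several GPUs.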