Conversation
@CoffeeVampir3 mind if I push the FlashAttention commit to this branch? That would put everything Llama-related in a single PR. It would also be great if you could include a quick start guide for Llama, to check that the implementation is running correctly.
Yeah! That sounds like a plan. I think this is a fine place to put anything related to modeling/optimization.
I do have a testing repo. I did not want to pollute this repo with test code that was unrelated to llm-foundry, but I did confirm the modeling is correct and trainable here: https://github.com/CoffeeVampir3/Llama-3-Clean-Minimized. For the llm-foundry-specific integration, there are a variety of hooks that need to be added to the modeling. I don't think it'll be too difficult, just a matter of doing it. The quick start for the foundry side is waiting on that.
@CoffeeVampir3 I can try getting it to work with llm-foundry in a manner similar to how they did it for their MPT model, if you haven't already started on that.
I haven't started, would be great if you're interested 👍
Initial modeling files for llama3 -- not hooked into llm-foundry yet.