Thanks for your excellent work on DFlash!
I’m interested in comparing strategies such as MTP-3 and DFlash, with the goal of identifying the more cost-efficient approach for inference systems under a given SLO constraint.
Additionally, are there any plans to open-source the training code? That would make it easier for others to reproduce the results and explore related directions.