Hi,
Thanks for your great work and for open-sourcing the code. Is there a particular reason for setting max_length=256 for GRPO? Why not use a larger output length?