Stochasticity makes algorithms more robust. The same
is true of humans.
Hence, I shall embrace the uncertainty.
Do things that benefit the world.
I have no idea what happened /
but now I am not the same.
- Fudan University
- Shanghai, China
- UTC +08:00
- https://chi-shan0707.github.io/
- https://github.com/FDUGuideBook
- https://github.com/wdzdiy-wiki
Pinned
- TinyLoRA-GRPO-Coder (Public): Inspired by 《Learning to Reason in 13 parameters》, this repo uses TinyLoRA + GRPO (32 parameters) to fine-tune Qwen2.5-Coder-3B-Instruct (or other models) for competitive programming.
- Qwen4Luogu-RL (Public): This repo works, but I have made updates in a new repo. See https://github.com/Chi-Shan0707/TinyLoRA-Qwen-Coder for more.
Python 8




