
# Litex LLM Development

2025.9.22: GSM8K 93.5%, miniF2F 4.5%. Qwen2.5-Instruct-7B converges after one epoch on GSM8K. The training data (8k GSM8K problems, 100 miniF2F problems) is not enough, and high-quality data is even scarcer.
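The GSM8K accuracy above is typically computed by exact match on the final numeric answer. A minimal sketch, assuming the dataset's `#### <number>` gold-answer convention and that the model's last number is taken as its prediction (the helper names here are illustrative, not the project's actual evaluation code):

```python
import re

def extract_gold(solution: str) -> str:
    """Extract the gold answer after '####' in a GSM8K reference solution."""
    m = re.search(r"####\s*([-+]?[\d,\.]+)", solution)
    return m.group(1).replace(",", "") if m else ""

def extract_pred(completion: str) -> str:
    """Take the last number in the model completion as its answer."""
    nums = re.findall(r"[-+]?\d[\d,]*(?:\.\d+)?", completion)
    return nums[-1].replace(",", "") if nums else ""

def accuracy(golds, preds):
    """Fraction of problems where the predicted number matches the gold one."""
    correct = sum(extract_gold(g) == extract_pred(p) for g, p in zip(golds, preds))
    return correct / len(golds)

golds = ["... so she has 18 eggs left. #### 18"]
preds = ["She sells the rest, leaving 18."]
print(accuracy(golds, preds))  # 1.0
```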

## TODO

1. Collect more miniF2F training data.
2. 93% on GSM8K is not good enough; temperature, top-k, the prompt, and model hyperparameters all need tuning.
3. Use synthetic theorem-proof data for training instead of GSM8K.
4. Reproduce on DeepSeek-Math-7B for comparison with DeepSeek-Prover-1.5-Base.
5. Add more theorem-proving tasks of medium difficulty for RL.
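The temperature and top-k knobs in item 2 can be illustrated in pure NumPy, independent of any inference framework. This is a toy sketch of the standard decoding math, not the project's actual sampling code:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=0, rng=None):
    """Sample one token id from raw logits with temperature scaling
    and optional top-k filtering."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    if top_k > 0:
        kth = np.sort(logits)[-top_k]              # k-th largest logit
        logits = np.where(logits < kth, -np.inf, logits)  # drop the rest
    probs = np.exp(logits - logits.max())          # stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.1, -1.0]
token = sample_next_token(logits, temperature=0.7, top_k=2)
print(token)  # always 0 or 1: top_k=2 zeroes out the other tokens
```

Lower temperature sharpens the distribution toward the argmax; `top_k=1` is greedy decoding.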

## Environment Setup

    bash setup_env.sh

Download the LoRA weights from Google Drive.

## Training

    bash run.sh

## Evaluation

Evaluate on miniF2F:

    python eval.py
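miniF2F results are usually reported as pass@k over several sampled proofs per problem. A short sketch of the standard unbiased pass@k estimator, assuming the evaluation produces a per-problem count of total and correct samples (the variable names are illustrative):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n total with c correct, is correct."""
    if n - c < k:
        return 1.0  # not enough incorrect samples to fill all k slots
    return 1.0 - comb(n - c, k) / comb(n, k)

# (n_samples, n_correct) per miniF2F problem -- toy numbers
results = [(8, 0), (8, 2), (8, 8)]
score = sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
print(round(score, 3))  # 0.417
```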