rong-hash/chipseek

Curriculum-Guided Dynamic Policy Optimization

Framework

Overall Framework of ChipExplore

Setup

1. LLaMA Factory Environment (for SFT)

Supervised Fine-Tuning (SFT) relies on the LLaMA Factory library. Please follow their official installation instructions to set up the environment:

Ensure you have the necessary dependencies installed as per their guide.

Reasoning cold-start data is provided in the data zip file.

2. CDPO Environment (For RL)

a. Training Setup

Our CDPO implementation mainly relies on Verl. Please install it using the Docker image:

docker create --runtime=nvidia --gpus all --net=host --shm-size="10g" --cap-add=SYS_ADMIN -v .:/workspace/verl --name verl <image:tag> sleep infinity
docker start verl
docker exec -it verl bash

b. EDA Tool Docker Containers:

Build the necessary Docker containers for simulation (Icarus Verilog) and synthesis/PPA analysis (Yosys, OpenROAD):

  • Build Icarus Verilog Docker:

    docker build -t iverilog -f docker/iverilog/Dockerfile .
  • Build EDA Environment Docker (Yosys, OpenROAD):

    docker build -t eda_env -f docker/eda/Dockerfile .

Ensure Docker is running on your system before executing these commands.

Dataset Format

Example data formats used for training and evaluation can be found in the data/ directory:

  • SFT Data: See data/sft_dataset_example.json for the instruction/output format used in supervised fine-tuning.
  • CDPO/RTLLM Data: See data/r1_dataset_example.json for the format including problem descriptions, gold solutions, testbenches, and PPA metrics used in CDPO and RTLLM evaluation.
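As a rough illustration of how such records can be consumed, here is a minimal Python sketch that round-trips an SFT-style record through JSON. The field names (instruction/input/output, the common LLaMA Factory alpaca-style schema) are assumptions; check data/sft_dataset_example.json for the exact keys.

```python
import json

# Hypothetical SFT record -- the keys below are assumed, not taken from
# the repository's actual example file.
sft_record = {
    "instruction": "Implement a 4-bit synchronous counter in Verilog.",
    "input": "",
    "output": "module counter(input clk, rst, output reg [3:0] q); endmodule",
}

# Round-trip through JSON, the same way a dataset file would store it.
serialized = json.dumps([sft_record])
records = json.loads(serialized)
assert records[0]["output"].endswith("endmodule")
print(len(records))  # one record loaded
```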

Training

1. Supervised Fine-Tuning (SFT)

SFT is performed using LLaMA Factory.

  • Configuration: The SFT training configuration is defined in recipes/Qwen2.5-7B-Instruct/sft/verilog_sft.yaml. Modify this file to change dataset paths, hyperparameters, etc.

  • Run Training: Execute the training using the llamafactory-cli. Adjust CUDA_VISIBLE_DEVICES as needed.

    CUDA_VISIBLE_DEVICES=0,1,2,3 FORCE_TORCHRUN=1 llamafactory-cli train recipe/sft/qwen2.5-coder-7B-full_sft.yaml

2. CDPO Training

Please ensure that the paths to the base models and datasets are correct in the shell scripts.

Train CodeQwen

./recipe/cdpo/test_cdpo_7b_verilog_codeqwen.sh

Train Qwen2.5-Coder

./recipe/cdpo/test_cdpo_7b_verilog_qwen.sh

Train DeepSeek-Coder

./recipe/cdpo/test_cdpo_7b_verilog_deepseekcoder.sh

Train CodeLlama

./recipe/cdpo/test_cdpo_7b_verilog_codellama.sh

Evaluation

Evaluation scripts are located in the src/evaluation/ directory. Ensure the CDPO environment (Python packages and Docker containers) is set up as described above.

1. RTLLM Benchmark

Generate Verilog code solutions for the RTLLM benchmark problems.

  • Script: src/evaluation/test_on_RTLLM.py
  • Usage:
    python src/evaluation/test_on_RTLLM.py \
      --model <path_to_your_trained_model_or_hf_name> \
      --n <number_of_samples_per_problem> \
      --temperature <sampling_temperature> \
      --gpu_ids <gpu_ids_to_use> \
      --benchmark_path benchmark/rtllm_benchmark.json
      # Add other arguments like --lora_path if needed
    Example:
    python src/evaluation/test_on_RTLLM.py --model /root_extends/model/Qwen2.5-Coder-7B-Verilog-sft --n 10 --temperature 0.7 --gpu_ids 0
    This will generate a .jsonl file in the generated_code/ directory containing the generated solutions.
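A minimal sketch of reading a generated .jsonl file and grouping samples per problem. The record fields used here ("task_id", "completion") are assumptions for illustration; inspect an actual file under generated_code/ for the real schema.

```python
import json
import os
import tempfile

# Fake records standing in for model output; field names are assumed.
samples = [
    {"task_id": "rtllm/adder_8bit", "completion": "module adder(); endmodule"},
    {"task_id": "rtllm/adder_8bit", "completion": "module adder(); endmodule"},
]

# Write them out in JSON Lines format (one JSON object per line).
path = os.path.join(tempfile.mkdtemp(), "samples.jsonl")
with open(path, "w") as f:
    for rec in samples:
        f.write(json.dumps(rec) + "\n")

# Read back and group completions by problem, ready for per-problem scoring.
by_task = {}
with open(path) as f:
    for line in f:
        rec = json.loads(line)
        by_task.setdefault(rec["task_id"], []).append(rec["completion"])

print({k: len(v) for k, v in by_task.items()})  # {'rtllm/adder_8bit': 2}
```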

2. VerilogEval Benchmark

Generate Verilog code solutions for the VerilogEval benchmark problems (Human or Machine generated).

  • Script: src/evaluation/test_on_verilogeval_vllm.py
  • Usage:
    python src/evaluation/test_on_verilogeval_vllm.py \
      --model <path_to_your_trained_model_or_hf_name> \
      --bench_type <Human_or_Machine> \
      --n <number_of_samples_per_problem> \
      --temperature <sampling_temperature> \
      --gpu_ids <gpu_ids_to_use> \
      # Add other arguments as needed
    Example:
    python src/evaluation/test_on_verilogeval_vllm.py --model /root_extends/model/Qwen2.5-Coder-7B-Verilog-sft --bench_type Human --n 10 --temperature 0.7 --gpu_ids 0
    This will also generate a .jsonl file in the generated_code/ directory.
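Since both benchmarks sample n solutions per problem, results are typically reported as pass@k. The sketch below implements the standard unbiased pass@k estimator (Chen et al., 2021); whether the repo's scoring scripts use exactly this estimator is an assumption.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    draws (without replacement) from n samples, of which c pass, is correct."""
    if n - c < k:
        return 1.0  # fewer than k failing samples: some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# With n=10 samples and c=3 correct, pass@1 reduces to c/n = 0.3.
print(round(pass_at_k(10, 3, 1), 4))  # 0.3
```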

3. PPA Verification (for RTLLM)

python src/evaluation/benchmark_model_ppa.py --bench_type <bench_type>

bench_type should be one of: ppa, power, area, delay, area_delay, power_delay.
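To make the bench_type options concrete, here is a hypothetical aggregation sketch. The composite formulas (e.g. area_delay as an area-delay product) are assumptions for illustration only; consult src/evaluation/benchmark_model_ppa.py for the definitions actually used.

```python
def ppa_score(area: float, power: float, delay: float, bench_type: str) -> float:
    """Map a bench_type option to a scalar objective (assumed formulas)."""
    metrics = {
        "area": area,
        "power": power,
        "delay": delay,
        "area_delay": area * delay,    # area-delay product
        "power_delay": power * delay,  # power-delay product
        "ppa": area * power * delay,   # combined objective (assumption)
    }
    return metrics[bench_type]

print(ppa_score(100.0, 2.0, 1.5, "area_delay"))  # 150.0
```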

About

ChipSeek: Optimizing Verilog Generation via EDA-Integrated Reinforcement Learning (ACL 2026)
