KB4CT is a knowledge-based compiler tuning system that optimizes LLVM pass sequences through a combination of offline empirical prototype discovery and an online knowledge-guided personalized evolutionary algorithm. The system operates in two main stages: Offline Knowledge Base Construction and Online Knowledge-Guided Personalized Optimization.
Extract the LLVM IR datasets from the supplementary material. Place the extracted datasets in the project root directory, ensuring the structure is as follows:
KB4CT/
├── dataset/
│   ├── train/
│   │   ├── dataset1/
│   │   ├── dataset2/
│   │   └── ...
│   └── test/
│       ├── dataset1/
│       ├── dataset2/
│       └── ...
├── DLL
├── LLVMEnv/
├── llvm_tools/
├── output/
└── KB4CT.py
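
Before running, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of KB4CT itself: the helper name `missing_entries` and the two constant lists are hypothetical, and `DLL` is only checked for existence because the tree does not say whether it is a file or a directory.

```python
from pathlib import Path

# Entries taken from the directory tree above.
REQUIRED_DIRS = ["dataset/train", "dataset/test", "LLVMEnv", "llvm_tools", "output"]
# Checked for existence only (file or directory).
REQUIRED_PATHS = ["DLL", "KB4CT.py"]


def missing_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [p for p in REQUIRED_PATHS if not (root / p).exists()]
    return missing


if __name__ == "__main__":
    problems = missing_entries(".")
    if problems:
        print("Missing entries:", ", ".join(problems))
    else:
        print("Layout matches the expected structure.")
```

Run it from the project root (the `KB4CT/` directory); an empty result means every entry in the tree was found.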
python KB4CT.py

You can configure the parameters in the if __name__ == '__main__': section of KB4CT.py.

"offline_ga_params": {
    "seq_len": 100,          # Sequence length
    "population_size": 100,  # Population size
    "generations": 30,       # Number of generations
    "elite_size": 20,        # Number of elite individuals
    "crossover_rate": 0.8,   # Crossover probability
    "mutation_rate": 0.8     # Mutation probability
}

"online_ga_params": {
    "population_size": 50,   # Population size
    "generations": 5,        # Number of generations
    "elite_size": 10,        # Number of elite individuals
    "crossover_rate": 0.8,   # Crossover probability
    "mutation_rate": 0.99    # Mutation probability
}

After execution, the results will be saved in the output/ directory:
- pass_embeddings_visualization.png: Visualization of pass embeddings
- pass_clusters_visualization.png: Visualization of pass clusters
- ablation_study_*.png: Ablation study result figures
- ablation_study_report.txt: Ablation study report
- ablation_detailed_results.json: Detailed result data
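
The layout of ablation_detailed_results.json is not described here, so the sketch below makes no assumption about its schema. It only loads the file and reports the type of each top-level entry, as a first step before writing any analysis code. The function name `summarize_results` is hypothetical.

```python
import json
from pathlib import Path


def summarize_results(path="output/ablation_detailed_results.json"):
    """Load the JSON file and describe its top-level structure."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        # Map each top-level key to the type name of its value.
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (for example a list) are reported as a single entry.
    return {"<root>": type(data).__name__}
```

After a run has finished, `print(summarize_results())` from the project root shows which keys are available to inspect further.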
The system supports the following ablation study modes:
- full: Full knowledge-guided method
- no_knowledge_crossover: No knowledge-guided crossover
- no_knowledge_mutation: No knowledge-guided mutation
- random_init: Random population initialization
- no_knowledge: Standard GA without any knowledge guidance
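
One way to read the modes above is as switches on three knowledge-guided components: initialization, crossover, and mutation. The sketch below illustrates that reading with a generic GA generation step driven by the online parameters. It is an assumption-laden illustration, not the code in KB4CT.py: `MODE_FLAGS`, `next_generation`, the fallback operators, the toy fitness function, and the "lower fitness is better" convention are all hypothetical, and the `init` flag is listed only to complete the mapping.

```python
import random

# Hypothetical mapping from mode name to the knowledge-guided components left on.
# Inferred from the mode descriptions; the real switches in KB4CT.py may differ.
MODE_FLAGS = {
    "full": {"init": True, "crossover": True, "mutation": True},
    "no_knowledge_crossover": {"init": True, "crossover": False, "mutation": True},
    "no_knowledge_mutation": {"init": True, "crossover": True, "mutation": False},
    "random_init": {"init": False, "crossover": True, "mutation": True},
    "no_knowledge": {"init": False, "crossover": False, "mutation": False},
}


def one_point_crossover(a, b, rng):
    """Plain one-point crossover, used when knowledge guidance is off."""
    cut = rng.randrange(1, len(a))
    return list(a[:cut]) + list(b[cut:])


def random_mutation(seq, passes, rng):
    """Replace one randomly chosen position with a random pass."""
    child = list(seq)
    child[rng.randrange(len(child))] = rng.choice(passes)
    return child


def next_generation(population, fitness, passes, params, flags, rng,
                    guided_crossover=None, guided_mutation=None):
    """Build one generation with elitism, crossover, and mutation.

    Lower fitness is treated as better (an assumption of this sketch).
    """
    use_guided_x = flags["crossover"] and guided_crossover is not None
    use_guided_m = flags["mutation"] and guided_mutation is not None
    crossover = guided_crossover if use_guided_x else one_point_crossover
    mutate = guided_mutation if use_guided_m else random_mutation

    ranked = sorted(population, key=fitness)
    # Elitism: the best elite_size individuals survive unchanged.
    new_population = [list(ind) for ind in ranked[: params["elite_size"]]]
    while len(new_population) < params["population_size"]:
        parent_a, parent_b = rng.sample(ranked, 2)
        child = list(parent_a)
        if rng.random() < params["crossover_rate"]:
            child = crossover(parent_a, parent_b, rng)
        if rng.random() < params["mutation_rate"]:
            child = mutate(child, passes, rng)
        new_population.append(child)
    return new_population


def toy_fitness(seq):
    """Stand-in objective for the demo; not a real compiler measurement."""
    return seq.count("gvn")


rng = random.Random(0)
passes = ["mem2reg", "instcombine", "gvn", "simplifycfg"]  # small toy pass set
params = {"population_size": 50, "generations": 5, "elite_size": 10,
          "crossover_rate": 0.8, "mutation_rate": 0.99}
flags = MODE_FLAGS["no_knowledge"]  # no guided operators are supplied here
population = [[rng.choice(passes) for _ in range(8)]
              for _ in range(params["population_size"])]
initial_best = min(toy_fitness(ind) for ind in population)
for _ in range(params["generations"]):
    population = next_generation(population, toy_fitness, passes, params, flags, rng)
final_best = min(toy_fitness(ind) for ind in population)
```

Because the elites are copied unchanged, the best fitness in this sketch never gets worse from one generation to the next. Passing `guided_crossover` or `guided_mutation` callables with the same signatures as the fallbacks would model the knowledge-guided variants.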