MPRL: Multi-Perspective Reinforcement Learning for Enhancing Format Adherence Capability of Large Language Models


This repository is the official implementation of the paper: "MPRL: Multi-Perspective Reinforcement Learning for Enhancing Format Adherence Capability of Large Language Models".

The paper has been accepted as a Full Paper with an Oral Presentation at PAKDD 2026 (The 30th Pacific-Asia Conference on Knowledge Discovery and Data Mining) and will be included in the conference proceedings.

Note: The paper is currently in the camera-ready stage. The link to the official publication and the preprint (e.g., arXiv) will be updated soon.


📢 News

  • [2026-02-08]: 🎉 Our paper has been accepted by PAKDD 2026 as a Full Paper (Oral)!
  • [Status]: 🛠️ The codebase is being reorganized for readability and will be open-sourced shortly. Stay tuned!

🚀 TODO List

  • Paper acceptance (PAKDD 2026 Oral)
  • Release paper preprint (arXiv)
  • Release core MPRL training code and reward modeling scripts
  • Release evaluation benchmarks for structured formats (JSON, XML, YAML)
  • Upload pre-trained model weights & checkpoints

📂 Project Structure (Coming Soon)

The planned structure for this repository:

.
├── configs/          # Training and evaluation configurations (YAML/JSON)
├── data/             # Datasets and preprocessing scripts for structured data formatting
├── mprl/             # Core Multi-Perspective Reinforcement Learning implementation
│   ├── models/       # Policy and Reward model architectures
│   └── trainers/     # RL training loops
├── evaluation/       # Scripts for evaluating syntax validity and format adherence
├── scripts/          # Bash scripts for launching training and inference
└── README.md
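
Until the evaluation scripts are released, the following is a minimal sketch of what a syntax-validity check for model outputs could look like. It is not the repository's code: the function names `is_valid` and `validity_rate` are hypothetical, it covers only JSON and XML via the Python standard library (YAML would need a third-party parser), and it checks parseability only, not adherence to any schema.

```python
import json
import xml.etree.ElementTree as ET


def is_valid(text: str, fmt: str) -> bool:
    """Return True if `text` parses as `fmt` ("json" or "xml").

    Hypothetical helper for illustration; not part of the MPRL codebase.
    """
    try:
        if fmt == "json":
            json.loads(text)
        elif fmt == "xml":
            ET.fromstring(text)
        else:
            # Unsupported formats are a caller error, not an invalid output.
            raise ValueError(f"unsupported format: {fmt}")
    except (json.JSONDecodeError, ET.ParseError):
        return False
    return True


def validity_rate(outputs: list[str], fmt: str) -> float:
    """Fraction of outputs that parse as `fmt`; 0.0 for an empty list."""
    if not outputs:
        return 0.0
    return sum(is_valid(o, fmt) for o in outputs) / len(outputs)
```

For example, `validity_rate(['{"a": 1}', '{"a": 1'], "json")` counts the first string as valid and the second (missing its closing brace) as invalid.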


🎓 Citation

If you find this work or code helpful for your research, please consider citing:

@inproceedings{qian2026mprl,
  title={MPRL: Multi-Perspective Reinforcement Learning for Enhancing Format Adherence Capability of Large Language Models},
  author={Qian, Bo and Wu, Yuting and Zeng, Shuang and Wang, Ziming and Wang, Qiaochen},
  booktitle={Proceedings of the 30th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)},
  year={2026}
}
