MPRL: Multi-Perspective Reinforcement Learning for Enhancing Format Adherence Capability of Large Language Models
This repository is the official implementation of the paper: "MPRL: Multi-Perspective Reinforcement Learning for Enhancing Format Adherence Capability of Large Language Models".
The paper has been accepted as a Full Paper with an Oral Presentation at PAKDD 2026 (The 30th Pacific-Asia Conference on Knowledge Discovery and Data Mining) and will be included in the conference proceedings.
Note: The paper is currently in the camera-ready stage. The link to the official publication and the preprint (e.g., arXiv) will be updated soon.
- [2026-02-08]: Our paper has been accepted to PAKDD 2026 as a Full Paper (Oral)!
- [Status]: The codebase is being cleaned up for readability and will be open-sourced shortly. Stay tuned!
- Paper acceptance (PAKDD 2026 Oral)
- Release paper preprint (arXiv)
- Release core MPRL training code and reward modeling scripts
- Release evaluation benchmarks for structured formats (JSON, XML, YAML)
- Upload pre-trained model weights & checkpoints
The planned structure for this repository:
.
├── configs/        # Training and evaluation configurations (YAML/JSON)
├── data/           # Datasets and preprocessing scripts for structured data formatting
├── mprl/           # Core Multi-Perspective Reinforcement Learning implementation
│   ├── models/     # Policy and reward model architectures
│   └── trainers/   # RL training loops
├── evaluation/     # Scripts for evaluating syntax validity and format adherence
├── scripts/        # Bash scripts for launching training and inference
└── README.md
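Until the evaluation scripts are released, the following is a minimal sketch of what a syntax-validity check for two of the target formats could look like. It is not the MPRL evaluation code: the function names, the `CHECKERS` registry, and the `validity_rate` metric are hypothetical and chosen only for illustration. It uses the Python standard library only, so YAML is omitted (parsing it would need a third-party package such as PyYAML).

```python
import json
import xml.etree.ElementTree as ET


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as a JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def is_valid_xml(text: str) -> bool:
    """Return True if `text` parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical registry mapping a format name to its checker (an assumption,
# not part of the released code).
CHECKERS = {"json": is_valid_json, "xml": is_valid_xml}


def validity_rate(outputs: list[str], fmt: str) -> float:
    """Fraction of model outputs that parse in the given format.

    Returns 0.0 for an empty list so callers need not special-case it.
    """
    if not outputs:
        return 0.0
    check = CHECKERS[fmt]
    return sum(check(o) for o in outputs) / len(outputs)
```

A check like this only measures whether the output parses. Adherence to a specific schema or field layout would need additional checks on top of it.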
If you find this work or code helpful for your research, please consider citing:
@inproceedings{qian2026mprl,
title={MPRL: Multi-Perspective Reinforcement Learning for Enhancing Format Adherence Capability of Large Language Models},
author={Qian, Bo and Wu, Yuting and Zeng, Shuang and Wang, Ziming and Wang, Qiaochen},
booktitle={Proceedings of the 30th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)},
year={2026}
}