Thanks for your great work! I ran into a problem: both training_loss_l and training_loss_w of my diffusion-based model rise to large values when I set beta_dpo to 2000-5000 (or even higher). Could you please tell me how to solve this? An early reply would be highly appreciated!
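
For context, here is a minimal sketch of how beta_dpo would enter the objective, assuming the loss follows the usual Diffusion-DPO form (negative log-sigmoid of a scaled difference between model and reference denoising errors). The function names and arguments below are hypothetical and are not taken from the repository; the real training code may differ.

```python
import math


def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)) for positive and negative x.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def diffusion_dpo_loss(model_w: float, model_l: float,
                       ref_w: float, ref_l: float,
                       beta_dpo: float) -> float:
    """Hypothetical per-pair loss under the assumed Diffusion-DPO form.

    model_w / model_l: trained model's denoising MSE on the preferred
                       ("winner") and rejected ("loser") sample.
    ref_w / ref_l:     frozen reference model's MSE on the same samples.
    """
    model_diff = model_w - model_l
    ref_diff = ref_w - ref_l
    # beta_dpo multiplies the gap directly, so with beta_dpo in the
    # thousands a gap of 1e-3 already moves the logit by order 1.
    inside = -0.5 * beta_dpo * (model_diff - ref_diff)
    return -log_sigmoid(inside)


if __name__ == "__main__":
    # Same small gap (1e-3), evaluated at the two beta_dpo values from the question.
    for beta in (2000.0, 5000.0):
        print(beta, diffusion_dpo_loss(0.101, 0.100, 0.100, 0.100, beta))
```

Under this assumed form, only the differences between the winner and loser errors enter the objective, not their absolute values.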