Releases: OpenMLRL/CoMLRL

v1.3.6

13 Feb 22:47

This release (v1.3.6) provides the code used in "Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic".
See CoMLRL GitHub and Docs for more details about the latest version.

Core Features

  1. MAGRPO, IAC, and MAAC Trainers for Decentralized LLM Collaboration.
  2. Environments including Writing, Coding, and Minecraft Collaboration.

Changelog

  • #60: Fix a critical error in heterogeneous model loading and allow more flexible critics and agents (see [docs/model-loading](https://openmlrl.github.io/CoMLRL/docs/user-guide/model-loading/)).
  • Update the downstream repos' interfaces to match v1.3.6 and polish the docs.

v1.3.5

07 Feb 22:27
9948bdc

Changelog

  • Add unit tests for hyperparameter constraints.
  • Clean up legacy interfaces.

v1.3.4

07 Feb 03:34

Changelog

  • Fix the bug in loading heterogeneous models and rework the loading logic.
  • Enable mini-batch gradient descent (MBGD) in MAGRPO to align with MAAC and IAC.
  • Remove redundant and legacy hyperparameters (e.g., model kwargs, patching hyperparameters).
  • Clean up multi-device legacy options, such as drop last and num_workers.
  • Add unit tests for model loading and separate them from CI as a badge.
  • Clean up short functions.
  • Reorganize the docs and align the parameters.
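
Assuming MBGD here stands for mini-batch gradient descent, the idea can be sketched as splitting the collected samples into fixed-size chunks and taking one optimizer step per chunk. The helper name `iter_minibatches` and the choice to keep the final partial batch are assumptions for illustration, not CoMLRL's actual interface.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_minibatches(samples: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive mini-batches of at most `batch_size` samples.

    Hypothetical helper: the last batch may be smaller than `batch_size`
    (it is kept rather than dropped, which is an assumption).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


if __name__ == "__main__":
    rollouts = ["r0", "r1", "r2", "r3", "r4"]
    for batch in iter_minibatches(rollouts, batch_size=2):
        # A trainer would compute the loss on `batch` and take one step here.
        print(batch)
```

In a real trainer each yielded batch would feed one loss computation and one optimizer step, so a larger rollout buffer produces several updates instead of a single full-batch update.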

v1.3.3

05 Feb 22:18

Changelog

  • Compact the MAREINFORCETrainer derivation and move it to the new folder.
  • Unify the interface for different trainers.
  • Remove redundant patches and wrappers.
  • Reorganize the variables in the config yamls.

v1.3.2

29 Jan 14:45

Warning

Deprecated: v1.3.6 provides more flexible interfaces for loading heterogeneous LLMs.

This release (v1.3.2) provides the code used in "Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic".
See CoMLRL GitHub and Docs for more details about the latest version.

Core Features

  1. MAGRPO, IAC, and MAAC Trainers (off-policy) for Decentralized LLM Collaboration.
  2. Environments including Writing, Coding, and Minecraft Collaboration.

Changelog

  • Fix a wandb logging issue in MAGRPOTrainer.
  • Align all environment repos with version 1.3.2.

v1.3.1

30 Dec 18:20
bb0ad77

Changelog

  • Allow batch training in MAGRPOTrainer, IACTrainer, and MAACTrainer.
  • Allow multi-turn training in IACTrainer and MAACTrainer.
  • Change the x-axis from data_step to env_step.
  • Pair with LLM_Collab_Code_Generation v1.3.1.

v1.3.0

20 Dec 03:01
70d9662

Changelog

Use a TD loss for the critic update.
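
As a rough illustration of what a TD loss for a critic looks like, the sketch below computes one-step TD targets and the mean squared TD error in plain Python. The function names, the one-step target, and the default discount factor are assumptions chosen for illustration; the release notes do not say which TD variant CoMLRL uses.

```python
from typing import List, Sequence


def td_targets(
    rewards: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
) -> List[float]:
    """One-step TD targets: r + gamma * V(s'), with no bootstrap at episode end."""
    return [
        r + gamma * nv * (0.0 if done else 1.0)
        for r, nv, done in zip(rewards, next_values, dones)
    ]


def td_loss(
    values: Sequence[float],
    rewards: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
) -> float:
    """Mean squared TD error between the critic's V(s) and the TD targets."""
    if len(values) == 0:
        raise ValueError("need at least one transition")
    targets = td_targets(rewards, next_values, dones, gamma)
    errors = [t - v for t, v in zip(targets, values)]
    return sum(e * e for e in errors) / len(errors)


if __name__ == "__main__":
    loss = td_loss(
        values=[0.5, 1.0],
        rewards=[1.0, 2.0],
        next_values=[1.0, 3.0],
        dones=[False, True],
        gamma=0.5,
    )
    print(loss)
```

In a gradient-based trainer the targets would normally be treated as constants (no gradient flows through V(s')), which this scalar sketch does not model.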

v1.2.9

01 Dec 20:05

Changelog

Add MAAC for single-turn training.

v1.2.8

29 Nov 16:14
873a87e

Changelog

Make IAC estimate V rather than Q.

v1.2.7

22 Nov 02:45

Changelog

Rename IPPO to IAC, since it's on-policy.