MatInvent: Accelerating inverse materials design using generative diffusion models with reinforcement learning
MatInvent is a general and efficient reinforcement learning workflow that optimizes diffusion models for goal-directed crystal generation. MatInvent enables robust optimization for inverse material design tasks with single or multiple target properties. Compatible with diverse diffusion model architectures and property constraints, MatInvent could offer broad applicability in materials discovery.
System requirements: This package requires a standard Linux computer with GPU (supports CUDA >= 11) and enough RAM (> 2 GB). If you want to run the code on a GPU that does not support CUDA>=11, you need to modify the versions of PyTorch and CUDA in the env.yml file. Two methods for environment setup are provided, and you can choose one according to your preference. Typically, method 1 (~2 min) is much faster than method 2 (>10 min).
Method 1 (uv)
A quick way to setup the environment is by uv, a fast Python package and project manager.
- Run the script uv_install.sh to create a uv environment and install the dependencies into
.venvfolder:bash scripts/uv_install.sh
- Activate the uv environment with
source .venv/bin/activate.
Method 2 (conda/mamba)
- Use
condato install dependencies and set up the environment for a Nvidia GPU machine. We recommend using the Miniconda installer. You can also installmambato the conda base environment by the commandconda install mamba -n base -c conda-forge.mambais a faster, drop-in replacement forconda. - Then create a conda environment and install the dependencies:
Activate the conda environment with
conda env create -f env.yml # OR use mamba to create new env faster # mamba env create -f env.yml
conda activate matinvent.
Additional Configuration
- We use Weights & Biases (wandb) by default to record RL results, which can visualize RL curves online in real time. If you would like to use
wandb, please login to your account before running the RL experiment. If you do not want to usewandb, please change theloggerparameter in the run script tologger=csv. - If you want to calculate specific heat capacities for RL rewards using FairChem software and pre-trained ML potentials eSEN-30M-OAM, please follow this tutorial to install the additional conda environment.
Checkpoint files for the pretrained diffusion models and property prediction model are available at Hugging Face.
The material properties and rewards can be obtained through theoretical simulations, ML predictions, and empirical calculations. Any cost-acceptable property estimator can be used to calculate RL rewards without requiring gradients. This codebase provides over 15 property evaluators and corresponding RL rewards, encompassing electronic, magnetic, mechanical, thermal, dielectric, and physicochemical properties, as well as synthesizability and supply chain risk. Since DFT computation is expensive and time-consuming, we provide additional ML prediction models for rapid testing. All configuration files related to RL rewards can be found in configs/reward.
You can use the following commands or run_rl.sh script to run RL experiments for single or multiple property optimization tasks.
# option 1
python -u main.py \
expname=test \
pipeline=mat_invent \
model=mattergen \
reward=hhi \
logger=wandb# option 2
bash scripts/run_rl.shYou need to modify the relevant parameters for your own task. The parameter expname defines the name of this RL experiment. model defines which diffusion model is used in RL. reward defines the target property for the RL reward. We use the hydra package to manage the configuration of input parameters.
gen_eval.sh script is used to generate a large number of crystal structures using unconditional or RL-finetuned MatterGen models and to evaluate the generation quality (such as SUN ratio).
bash scripts/gen_eval.shPlease modify the environment variables in scripts/gen_eval.sh before running it.
This work was supported as part of NCCR Catalysis (grant number 225147), a National Centre of Competence in Research funded by the Swiss National Science Foundation.
If you find our work useful, please consider citing it:
@article{matinvent,
title={Accelerating inverse materials design using generative diffusion models with reinforcement learning},
author={Chen, Junwu and Guo, Jeff and Fako, Edvin and Schwaller, Philippe},
journal={arXiv preprint arXiv:2511.03112},
year={2025}
}