ddmms · kuryla · Feb 22, 2026
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -2,7 +2,7 @@
 # pre-commit install
 repos:
 - repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v5.0.0
+  rev: v6.0.0
   hooks:
     - id: end-of-file-fixer
     - id: mixed-line-ending
@@ -11,7 +11,7 @@ repos:
 
 - repo: https://github.com/astral-sh/ruff-pre-commit
   # Ruff version.
-  rev: v0.11.13
+  rev: v0.15.5
   hooks:
     # Run the linter.
     - id: ruff-check
@@ -20,7 +20,7 @@ repos:
     - id: ruff-format
 
 - repo: https://github.com/numpy/numpydoc
-  rev: v1.8.0
+  rev: v1.10.0
   hooks:
     - id: numpydoc-validation
       files: ^ml_peg/
diff --git a/docs/source/user_guide/benchmarks/conformers.rst b/docs/source/user_guide/benchmarks/conformers.rst
@@ -0,0 +1,42 @@
+==========
+Conformers
+==========
+
+ACONFL
+======
+
+Summary
+-------
+
+Performance in predicting relative conformer energies of 12 C12H26,
+16 C16H34 and 20 C20H42 conformers. Reference data from PNO-LCCSD(T)-F12/AVQZ calculations.
+
+Metrics
+-------
+
+1. Conformer energy error
+
+For each conformer, the relative energy is calculated by taking the difference in energy
+between the given conformer and the reference (zero-energy) conformer. This is
+compared to the reference relative energy, calculated in the same way.
+
+Computational cost
+------------------
+
+Low: tests are likely to take minutes to run on CPU.
+
+Data availability
+-----------------
+
+Input structures:
+
+* Conformational Energy Benchmark for Longer n-Alkane Chains
+  Sebastian Ehlert, Stefan Grimme, and Andreas Hansen
+  The Journal of Physical Chemistry A 2022 126 (22), 3521-3535
+  DOI: 10.1021/acs.jpca.2c02439
+
+Reference data:
+
+* Same as input data
+* PNO-LCCSD(T)-F12/AVQZ level of theory: a local, explicitly
+  correlated coupled cluster method.
diff --git a/docs/source/user_guide/benchmarks/index.rst b/docs/source/user_guide/benchmarks/index.rst
@@ -14,3 +14,5 @@ Benchmarks
     bulk_crystal
     lanthanides
     non_covalent_interactions
+    tm_complexes
+    conformers
diff --git a/docs/source/user_guide/benchmarks/molecular.rst b/docs/source/user_guide/benchmarks/molecular.rst
@@ -124,3 +124,53 @@ Reference data:
 
 * Same as input data
 * DLPNO-CCSD(T)/CBS
+
+
+BMIM Cl RDF
+===========
+
+Summary
+-------
+
+Tests whether MLIPs incorrectly predict covalent bond formation between chloride
+anions (Cl⁻) and carbon atoms in 1-butyl-3-methylimidazolium (BMIM⁺) cations.
+Such Cl-C bonds should NOT form in the ionic liquid under normal conditions.
+
+This benchmark runs NVT molecular dynamics simulations of BMIM Cl at 353.15 K and
+analyses the Cl-C radial distribution function (RDF) to detect unphysical bond formation.
+
+
+Metrics
+-------
+
+1. Cl-C Bonds Formed
+
+Binary metric indicating whether unphysical Cl-C bonds formed during the MD simulation.
+
+The Cl-C RDF is computed from the MD trajectory. If the RDF shows a peak (g(r) > 0.1)
+at distances below 2.5 Å, this indicates bond formation and the model fails the test.
+
+* 0 = no bonds formed (correct physical behaviour)
+* 1 = bonds formed (unphysical, model failure)
+
+
+Computational cost
+------------------
+
+Medium: tests require running 10,000 steps of Langevin MD for a system of 10 ion
+pairs, which may take tens of minutes on GPU.
+
+
+Data availability
+-----------------
+
+Input structures:
+
+* Generated using molify from SMILES representations of BMIM⁺ (CCCCN1C=C[N+](=C1)C)
+  and Cl⁻ ions, packed to experimental density of 1052 kg/m³ at 353.15 K.
+* Zills, F. molify: Molecular Structure Interface. Journal of Open Source Software
+  10, 8829 (2025). https://doi.org/10.21105/joss.08829
+* Density from: Yang, F., Wang, D., Wang, X. & Liu, Z. Volumetric Properties of
+  Binary and Ternary Mixtures of Bis(2-hydroxyethyl)ammonium Acetate with Methanol,
+  N,N-Dimethylformamide, and Water at Several Temperatures. J. Chem. Eng. Data 62,
+  3958-3966 (2017). https://doi.org/10.1021/acs.jced.7b00654
diff --git a/docs/source/user_guide/benchmarks/molecular_dynamics.rst b/docs/source/user_guide/benchmarks/molecular_dynamics.rst
@@ -0,0 +1,38 @@
+
+==================
+Molecular dynamics
+==================
+
+Water density
+=============
+
+Summary
+-------
+
+Performance in predicting the density of water at temperatures of 270, 290, 300, and 330 K.
+The water systems consist of 333 molecules.
+
+Metrics
+-------
+
+1. Density error
+
+For each system, the density is calculated by taking the average density of an NPT molecular
+dynamics run. The initial part of the simulation, here 500 ps, is omitted from the density
+calculation. This is compared to the reference density, obtained from experiment.
+
+Computational cost
+------------------
+
+High: tests are likely to take several days to run on GPU.
+
+Data availability
+-----------------
+
+Input structures:
+
+*
+
+Reference data:
+
+* Experiment
diff --git a/docs/source/user_guide/benchmarks/nebs.rst b/docs/source/user_guide/benchmarks/nebs.rst
@@ -46,3 +46,60 @@ Reference data:
 
 * Manually taken from https://doi.org/10.1149/1.1633511.
 * Meta-GGA (Perdew-Wang) exchange correlation functional
+
+
+Si defects
+==========
+
+Summary
+-------
+
+Performance in predicting DFT single-point energies and forces along fixed nudged-elastic-band
+(NEB) images for a silicon interstitial migration pathway.
+
+Metrics
+-------
+
+For each of the three NEB datasets (64 atoms, 216 atoms, and 216 atoms di-to-single), MLIPs are
+evaluated on the same ordered NEB images as the reference.
+
+1. Energy MAE
+
+Mean absolute error (MAE) of *relative* energies along the NEB, shifting image 0 to 0 eV
+for both the DFT reference and the MLIP predictions.
+
+2. Force MAE
+
+Mean absolute error (MAE) of forces across all atoms and images along the NEB.
+
+Computational cost
+------------------
+
+Medium: tests are likely to take several minutes to run on CPU.
+
+Data availability
+-----------------
+
+Input/reference data:
+
+* Reference extxyz trajectories (including per-image DFT energies and forces) are distributed as a
+  separate zip archive and downloaded on demand from the ML-PEG data store.
+  The calculation script uses the public ML-PEG S3 bucket to retrieve these inputs.
+* The reference DFT energies/forces come from Quantum ESPRESSO (PWscf) single-point calculations
+  with:
+
+  - Code/version: Quantum ESPRESSO PWSCF v.7.0
+  - XC functional: ``input_dft='PBE'``
+  - Cutoffs: ``ecutwfc=30.0`` Ry, ``ecutrho=240.0`` Ry
+  - Smearing: ``occupations='smearing'``, ``smearing='mv'``, ``degauss=0.01`` Ry
+  - SCF convergence/mixing: ``conv_thr=1.0d-6``, ``electron_maxstep=250``, ``mixing_beta=0.2``,
+    ``mixing_mode='local-TF'``
+  - Diagonalization: ``diagonalization='david'``
+  - Symmetry: ``nosym=.false.``, ``noinv=.false.`` (symmetry enabled)
+  - Pseudopotential: ``Si.pbe-n-kjpaw_psl.1.0.0.UPF`` (PSLibrary)
+
+  K-points by case:
+
+  - 64 atoms: Γ-only (``K_POINTS automatic 1 1 1 0 0 0``)
+  - 216 atoms: Γ-only (``K_POINTS gamma``)
+  - 216 atoms di-to-single: Γ-only (``K_POINTS gamma``)
diff --git a/docs/source/user_guide/benchmarks/tm_complexes.rst b/docs/source/user_guide/benchmarks/tm_complexes.rst
@@ -0,0 +1,42 @@
+==========================
+Transition Metal Complexes
+==========================
+
+3dTMV
+=====
+
+Summary
+-------
+
+Performance in predicting vertical ionization energies for 28 transition metal
+complexes.
+
+Metrics
+-------
+
+1. Ionization energy error
+
+For each complex, the ionization energy is calculated by taking the difference in energy
+between the complex in its oxidized state and its initial state, which differ by one electron
+and in spin multiplicity. This is compared to the reference ionization energy, calculated in the same way.
+
+Computational cost
+------------------
+
+Low: tests are likely to take minutes to run on CPU.
+
+Data availability
+-----------------
+
+Input structures:
+
+* Toward Benchmark-Quality Ab Initio Predictions for 3d Transition Metal
+  Electrocatalysts: A Comparison of CCSD(T) and ph-AFQMC
+  Hagen Neugebauer, Hung T. Vuong, John L. Weber, Richard A. Friesner, James Shee, and Andreas Hansen
+  Journal of Chemical Theory and Computation 2023 19 (18), 6208-6225
+  DOI: 10.1021/acs.jctc.3c00617
+
+Reference data:
+
+* Same as input data
+* ph-AFQMC level of theory: phaseless auxiliary-field quantum Monte Carlo.
diff --git a/ml_peg/analysis/conformers/37Conf8/analyse_37Conf8.py b/ml_peg/analysis/conformers/37Conf8/analyse_37Conf8.py
@@ -13,14 +13,18 @@
 import pytest
 
 from ml_peg.analysis.utils.decorators import build_table, plot_parity
-from ml_peg.analysis.utils.utils import build_d3_name_map, load_metrics_config, mae
+from ml_peg.analysis.utils.utils import (
+    build_dispersion_name_map,
+    load_metrics_config,
+    mae,
+)
 from ml_peg.app import APP_ROOT
 from ml_peg.calcs import CALCS_ROOT
 from ml_peg.models.get_models import load_models
 from ml_peg.models.models import current_models
 
 MODELS = load_models(current_models)
-D3_MODEL_NAMES = build_d3_name_map(MODELS)
+DISPERSION_NAME_MAP = build_dispersion_name_map(MODELS)
 
 EV_TO_KCAL = units.mol / units.kcal
 CALC_PATH = CALCS_ROOT / "conformers" / "37Conf8" / "outputs"
@@ -113,7 +117,7 @@ def get_mae(conformer_energies) -> dict[str, float]:
     filename=OUT_PATH / "37conf8_metrics_table.json",
     metric_tooltips=DEFAULT_TOOLTIPS,
     thresholds=DEFAULT_THRESHOLDS,
-    mlip_name_map=D3_MODEL_NAMES,
+    mlip_name_map=DISPERSION_NAME_MAP,
 )
 def metrics(get_mae: dict[str, float]) -> dict[str, dict]:
     """