This guide covers evaluating generated crystal structures against reference datasets and describes the available metrics.
(prerequisites)=
## Prerequisites
Before running evaluation metrics, download and extract the reference dataset files containing structure embeddings, composition features, and phase diagram data.
You can download directly from the web:
Download benchmarks_mp_20.tar.gz from Figshare
Or use the command line (from project root):
```bash
# Download the reference dataset
curl -L -A "Mozilla/5.0" -o benchmarks_mp_20.tar.gz https://figshare.com/ndownloader/files/59462369

# Extract the dataset
tar -zxvf benchmarks_mp_20.tar.gz
```

This will create the following directory structure:
```text
benchmarks/
└── assets/
    ├── mp_20_all_composition_features.pt             # VAE composition embeddings for diversity metrics
    ├── mp_20_all_structure_features.pt               # VAE structure embeddings for diversity metrics
    ├── mp_20_all_structure.json.gz                   # MP-20 reference structures for novelty checking
    ├── mp_all_unique_structure_250416.json.gz        # All MP unique structures for novelty checking
    └── ppd-mp_all_entries_uncorrected_250409.pkl.gz  # Phase diagram data for energy above hull
```
These files contain the reference data required for computing evaluation metrics against the MP-20 dataset.
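After extraction, you can sanity-check that all five reference files are in place. This is an illustrative helper, not part of the repository; the file names are taken from the listing above:

```python
from pathlib import Path

EXPECTED_ASSETS = [
    "mp_20_all_composition_features.pt",
    "mp_20_all_structure_features.pt",
    "mp_20_all_structure.json.gz",
    "mp_all_unique_structure_250416.json.gz",
    "ppd-mp_all_entries_uncorrected_250409.pkl.gz",
]

def missing_assets(assets_dir: str = "benchmarks/assets") -> list[str]:
    """Return the expected reference files that are not present in assets_dir."""
    root = Path(assets_dir)
    return [name for name in EXPECTED_ASSETS if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("Missing reference files:", ", ".join(missing))
    else:
        print("All reference files present.")
```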
(generate-samples)=
## Generate Samples

Generate crystal structures using a pre-trained LDM model (the default model is trained on the alex-mp-20 dataset).

```bash
# Generate 10000 samples with 2000 batch size using DDIM sampler
python src/sample.py --num_samples=10000 --batch_size=2000 --output_dir=outputs/samples
```

(evaluate-models)=
## Evaluate Models
Evaluate generated structures against reference datasets (e.g., MP-20) to assess quality and diversity.
Generate new structures and evaluate them in one command:
```bash
python src/evaluate.py \
  --model_path=ckpts/mp_20/ldm/ldm_null.ckpt \
  --structure_path=outputs/eval_samples \
  --reference_dataset=mp-20 \
  --num_samples=10000 \
  --batch_size=2000
```

If you already have generated structures:
```bash
python src/evaluate.py \
  --structure_path=outputs/dng_samples \
  --reference_dataset=mp-20 \
  --output_file=benchmark/results/my_results.csv
```

(evaluation-metrics)=
## Evaluation Metrics
The evaluation script computes several metrics to assess generation quality:

:::{note} Available Metrics
- Unique: Identifies structures that are not duplicates within the generated set
- Novel: Identifies structures not found in the reference dataset
- E Above Hull: Calculates the energy above hull for each structure to assess thermodynamic stability (also computes Metastable/Stable)
- Composition Validity: Checks if the composition is chemically valid using SMACT
- Structure Diversity: Computes inverse Fréchet distance (1/(1+FMD)) between generated and reference structure embeddings from the VAE (higher is better)
- Composition Diversity: Computes inverse Fréchet distance (1/(1+FMD)) between generated and reference composition embeddings from the VAE (higher is better)
:::
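The E Above Hull thresholds can be sketched as a simple classifier. This is illustrative only, not the repository's implementation; it assumes energies in eV/atom and uses the `metastable_threshold` default of 0.1 shown in the API example below:

```python
def classify_stability(e_above_hull: float, metastable_threshold: float = 0.1) -> str:
    """Label an energy above hull (eV/atom) as stable, metastable, or unstable.

    Sketch: structures on the hull (<= 0 within numerical noise) are stable;
    those within the threshold are metastable; the rest are unstable.
    """
    if e_above_hull <= 1e-8:
        return "stable"
    if e_above_hull <= metastable_threshold:
        return "metastable"
    return "unstable"
```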
For detailed implementation, see `src/utils/metrics.py`.
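The diversity scores invert a Fréchet distance between embedding distributions. Below is a minimal sketch that assumes FMD is the standard Fréchet distance between Gaussian fits of the two embedding sets; the repository's exact computation may differ, so treat this as illustrative:

```python
import numpy as np
from scipy.linalg import sqrtm

def inverse_frechet_score(gen_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    """Fit a Gaussian to each (n_samples, dim) embedding set and return 1/(1+FMD)."""
    mu_g, mu_r = gen_emb.mean(axis=0), ref_emb.mean(axis=0)
    cov_g = np.cov(gen_emb, rowvar=False)
    cov_r = np.cov(ref_emb, rowvar=False)
    covmean = sqrtm(cov_g @ cov_r)
    if np.iscomplexobj(covmean):  # numerical noise can introduce tiny imaginary parts
        covmean = covmean.real
    fmd = np.sum((mu_g - mu_r) ** 2) + np.trace(cov_g + cov_r - 2 * covmean)
    return float(1.0 / (1.0 + fmd))
```

A higher score means the generated embedding distribution sits closer to the reference distribution; identical distributions give a score near 1.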
You can also compute metrics using the Python API directly:
```python
from monty.serialization import loadfn

from src.utils.metrics import Metrics

# Load generated structures
gen_structures = loadfn("outputs/eval_samples/structures.json.gz")

# Create metrics object
metrics = Metrics(
    metrics=["unique", "novel", "e_above_hull", "composition_validity"],
    reference_dataset="mp-20",
    phase_diagram="mp-all",
    metastable_threshold=0.1,
    progress_bar=True,
)

# Compute metrics
results = metrics.compute(gen_structures=gen_structures)

# Save results
metrics.to_csv("outputs/results.csv")

# Or get as DataFrame
df = metrics.to_dataframe()
print(df.head())
```

Available reference datasets:
- mp-20: Materials Project structures with ≤20 atoms
- alex-mp-20: Alexandria MP structures with ≤20 atoms
Results are saved to the specified output file in CSV format for further analysis.
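Once you have a results CSV, you can summarize per-metric pass rates with pandas. This sketch assumes the file has one row per structure with boolean metric columns; the actual column names depend on which metrics you selected, and `unique`/`novel` below are illustrative:

```python
import pandas as pd

def summarize_results(csv_path: str) -> pd.Series:
    """Return the fraction of structures passing each boolean metric column."""
    df = pd.read_csv(csv_path)
    bool_cols = df.select_dtypes(include="bool").columns
    return df[bool_cols].mean()
```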
(benchmarks)=
## Benchmarks
Pre-computed benchmark results for de novo generation (DNG) are available in the benchmarks/dng/ directory:
- MP-20: `benchmarks/dng/chemeleon2_rl_dng_mp_20.json.gz` - 10,000 structures generated using the RL-trained model on MP-20
- Alex-MP-20: `benchmarks/dng/chemeleon2_rl_dng_alex_mp_20.json.gz` - 10,000 structures generated using the RL-trained model on Alex-MP-20
These files contain generated crystal structures in compressed JSON format:
```python
from monty.serialization import loadfn

# Load benchmark structures
structures = loadfn("benchmarks/dng/chemeleon2_rl_dng_mp_20.json.gz")
print(f"Loaded {len(structures)} structures")
```