Skip to content

Binary compound #33

Open
rayzxu wants to merge 37 commits intomainfrom
binary_compound_Ruizhi
Open

Binary compound #33
rayzxu wants to merge 37 commits intomainfrom
binary_compound_Ruizhi

Conversation

@rayzxu
Copy link
Copy Markdown
Collaborator

@rayzxu rayzxu commented Apr 24, 2025

Add crystal structure relaxing experiment code and documentation

@jgoumaz
Copy link
Copy Markdown
Collaborator

jgoumaz commented Apr 24, 2025

Hi, could you add the slurm logs (src/open_r1/tasks/crystal_structure/reward_server/reward_logs) to gitignore, please?

Copy link
Copy Markdown
Contributor

@doncamilom doncamilom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! It looks great :)

Before merging let's update the results section with the comments I've made above

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove the logs from the repo.
Instead, add relevant parts of it to the documentation file


Features
--------

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update this please, to match your task's features?

.. image:: _static/structure_relaxing_success_rate.png
:width: 400
:align: center
:alt: success rate of the model in generating structures with lower internal energy No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good!
Cann you please add the results under a new Results section at the end of the file?
Add the plots and the summary as you have here, but also some sample outputs. Basically we want to see, in general, how does the model solve this task?

@doncamilom
Copy link
Copy Markdown
Contributor

doncamilom commented Apr 6, 2026

Reproducibility Assessment

Reviewing for reviewer-reproducibility readiness.

Summary

This PR adds two new material science tasks: crystal structure relaxation (crystalrelax) and conditional material generation (cmg). These represent a new domain (solid-state/crystallography) alongside the existing organic chemistry tasks.

Hard Blockers (code will not import)

  1. Missing __init__.py: src/open_r1/tasks/crystal_structure/ has no __init__.py. The imports in src/open_r1/tasks/__init__.py (from .crystal_structure.relaxing import ...) will fail with ModuleNotFoundError. This breaks ALL task imports on this branch, not just the new tasks.

  2. Missing spacegroups.txt: src/open_r1/dataset/crystal_structure_relaxing.py opens os.path.join(THIS_DIR, "spacegroups.txt") at module level. The file only exists in reward_server/AIRS_preporcess/. Import-time crash.

  3. Hardcoded path in ConditionalMaterialGeneration.__init__: Line 684 opens "/iopsstor/store/cscs/swissai/a05/chem/comps_used_in_sft.json" with no override mechanism. Instant crash outside CSCS.

Additional Issues

  • Zero tests for either task
  • Undeclared heavy dependencies: mace-torch, gemmi, smact, pymatgen, ase, Flask are required but not in setup.py
  • device='cuda' hardcoded in MACE calculator -- no CPU fallback for testing or CI
  • MACE calculator created per reward call (mace_mp(model="large", device='cuda') inside the reward loop) -- extremely slow
  • Code duplication: CIF tokenizer implemented twice; reward_server/app.py and reward_server/reward_server.py are identical files
  • Logic bug: if "src-test.txt" (line ~925) is always True (string is truthy)
  • random.randint(1, 1) == 1 is always True -- dead conditional
  • Debug print() statements throughout
  • Typo: AIRS_preporcess should be AIRS_preprocess

Reproducibility Gaps

  • Dataset (/iopsstor/store/cscs/...) not publicly available, no download script
  • Recipes hardcode CSCS paths (/iopsstor/...)
  • No demo fixture or sample data
  • Documentation exists for crystalrelax but not for cmg
  • crystalrelax.rst contains copy-pasted SMILES task text ("Reads and processes variations of SMILES notations") -- incorrect for crystals

Checklist for reviewer reproducibility

  • Add src/open_r1/tasks/crystal_structure/__init__.py
  • Add spacegroups.txt to dataset directory
  • Remove hardcoded /iopsstor/ path from ConditionalMaterialGeneration.__init__
  • Add all dependencies to setup.py (mace-torch, gemmi, smact, pymatgen, ase)
  • Make dataset publicly available or provide generation instructions
  • Cache MACE calculator (class-level, not per-call)
  • Add CPU fallback for MACE device
  • Delete duplicate files (reward_server.py = app.py)
  • Fix if "src-test.txt" bug
  • Add unit tests for tokenizer round-trip and reward functions
  • Add documentation for cmg task
  • Fix copy-pasted SMILES description in crystalrelax.rst
  • Remove debug print statements
  • Add demo fixture data

@doncamilom
Copy link
Copy Markdown
Contributor

Paper-to-Code Mapping Update

This branch contains implementations for 2 paper tasks: CMG and CrR.

Paper claims:

CMG (Conditional Material Generation) -- ConditionalMaterialGeneration

CrR (Binary Compound Structure Relaxation) -- BinaryCompoundRelaxing

  • Paper appendix describes CrR: refine perturbed crystal structures to lower energy
  • Reward: -1 if invalid, -5 if valid but higher energy, -10 otherwise (matches paper description)
  • Uses MACE-MP universal potential for energy computation
  • Dataset: Materials Project, 2000 binary crystals with perturbations

What is MISSING:

  1. Hard blockers (missing init.py, missing spacegroups.txt) prevent ANY task from importing
  2. Dataset not available -- dataset_components.csv lists CrR as "needs export, planned"
  3. MACE-MP dependency not declared and requires GPU
  4. CIF tokenizer (M2S format) is duplicated and untested
  5. No evaluation harness for reproducing paper numbers

Priority for reviewer reproducibility: HIGH

CrR is one of 3 inorganic tasks in the paper. CMG is a headline result. Both must work for paper claims to be verifiable.

@doncamilom
Copy link
Copy Markdown
Contributor

Final Classification

Paper tasks: CMG (Conditional Material Generation), CrR (Crystal Relaxation)
Priority: HIGH -- CrR is the only implementation anywhere; CMG competes with PR #44
Paper relevance: 2 of 3 inorganic tasks in the paper

Verdict

This PR is the sole source of the CrR (Binary Compound Structure Relaxation) task. It is also one of two CMG implementations. Both tasks are described in the paper and results are reported in Table 4. However, the code has hard blockers (missing __init__.py breaks ALL task imports) and cannot run on a fresh clone.

Work needed for peer review

Item Effort Blocks
Add src/open_r1/tasks/crystal_structure/__init__.py Tiny ALL imports
Fix missing spacegroups.txt in dataset directory Tiny CrR import
Remove hardcoded /iopsstor/ path from CMG.__init__ Small CMG instantiation
Determine which CMG implementation (this or PR #44) matches paper Decision CMG results
Release CrR dataset (MP binary crystals with perturbations) Medium CrR results
Release CMG dataset (MP 1000 samples) Medium CMG results
Add mace-torch, gemmi, smact, pymatgen, ase to setup.py Small Installation
Cache MACE calculator at class level (not per-call) Small Performance
Add CPU fallback for MACE device Small CI/testing
Delete duplicate files (app.py = reward_server.py) Tiny Cleanup
Fix if "src-test.txt" bug (always True) Tiny Correctness
Add unit tests for CIF tokenizer and reward functions Medium CI
Write CrR evaluation script for Table 4 Medium Paper claims
Write CMG evaluation script for Table 4 Medium Paper claims
Fix copy-pasted SMILES description in crystalrelax.rst Tiny Documentation
Add documentation for cmg task Medium Completeness
Add demo fixtures Small Reviewer testing

@doncamilom doncamilom added priority: high Important for paper claims effort: large Multiple days, significant rework paper-task Implements a task described in the paper has-bugs Contains known runtime bugs labels Apr 6, 2026
@doncamilom doncamilom force-pushed the binary_compound_Ruizhi branch from f20885c to 7e71b84 Compare April 6, 2026 13:00
@doncamilom
Copy link
Copy Markdown
Contributor

doncamilom commented Apr 6, 2026

Updated checklist (after Ruizhi's fixes + regression cleanup)

Ruizhi addressed most items from the original checklist. Here is the current state:

Completed

  • Add src/open_r1/tasks/crystal_structure/__init__.py
  • Add spacegroups.txt to dataset directory
  • Remove hardcoded /iopsstor/ paths — recipes now use ${MIST_DATA_DIR}
  • Add expand_path() to relaxing.py
  • Add dependencies to setup.py (mace-torch, gemmi, smact, pymatgen, ase)
  • Cache MACE calculator at class level (not per-call)
  • Add CPU fallback for MACE device (torch.cuda.is_available())
  • Delete duplicate reward_server.py (= app.py)
  • Fix if "src-test.txt" always-truthy bug
  • Remove debug print statements (0 remaining)
  • Fix copy-pasted SMILES description in crystalrelax.rst
  • Remove CMG task from this PR (now in separate PR Add CMG task: Conditional Material Generation (paper task) #52)
  • Heavy imports (gemmi, pymatgen, ase, mace) are lazy
  • __init__.py uses try/except for crystal_structure registration
  • No regressions against main (forward.py, utils.py, smiles_competence.py all clean)

Still needed

  • CI not triggered on latest commit — needs a push (even empty) to re-trigger. Latest passing run was on 6424374, but HEAD is ecd0dc1.
  • No unit tests for CrR — no test coverage for the relaxing task reward functions or CIF tokenizer round-trip
  • No demo fixture for CrR — fixture_manifest.csv doesn't list crystalrelax
  • cmg.yaml recipe is orphaned — CMG task was removed from this PR (now in Add CMG task: Conditional Material Generation (paper task) #52) but the recipe file is still here. Should be removed.
  • Dataset not publicly available${MIST_DATA_DIR}/binary_compound_relaxing needs to be released or a generation script provided
  • app.py (reward server) still presentreward_server/ directory with Flask app remains. Is this needed, or can it be removed since rewards are computed in-process?

doncamilom and others added 16 commits April 6, 2026 22:24
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
 Add CPU fallback for MACE device
 Fix if "src-test.txt" bug
Move gemmi, pymatgen, ase, mace, and torch imports from module level
into the methods that use them in relaxing.py. Wrap the crystal_structure
task registration in __init__.py with try/except ImportError. Guard test
code in _tokenizer.py behind if __name__ == "__main__". Also fix the
AIRS_preporcess -> AIRS_preprocess typo in the import path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restore files deleted by bad rebase (reaction tasks, demos, docs, tests,
recipes).  Reset forward.py, utils.py, smiles_competence.py to main.
Use expand_path in crystal relaxing task and replace hardcoded /iopsstor
paths in recipe YAMLs with ${MIST_DATA_DIR}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  - demo/crystalrelax_tiny: 8 train / 2 test M2S binary compound fixtures
  - fixture_manifest.csv and run_fixture_smoke.py updated (conditional on
    heavy deps being available)
  - relaxing.py: add _has_local_files() guard to skip download_data when
    local files exist (consistent with ForwardReaction pattern)
  - Remove orphaned cmg.yaml (CMG task moved to PR #52)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@doncamilom doncamilom force-pushed the binary_compound_Ruizhi branch from 8d60d09 to d251b03 Compare April 6, 2026 20:29
@doncamilom
Copy link
Copy Markdown
Contributor

doncamilom commented Apr 6, 2026

Rebased onto main (post-#60). Clean diff — zero regressions in infrastructure files (utils.py, forward.py, smiles_competence.py, run_r1_grpo.py). All changes are CrR-specific.

Remaining items after merge:

  • Verify CrR fixture loads correctly with heavy deps (gemmi, mace, pymatgen, ase)
  • Confirm Figshare binary_compound_relax/ matches what was used for paper results
  • Add unit tests for relaxing.py reward functions (accuracy_reward, format_reward)
  • Verify CIF tokenizer round-trip works on the fixture data
  • Consider removing reward_server/app.py if not needed (Flask reward server not used in GRPO pipeline)

— Andres

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

effort: large Multiple days, significant rework has-bugs Contains known runtime bugs paper-task Implements a task described in the paper priority: high Important for paper claims

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants