Add CMG task: Conditional Material Generation (paper task) #52

Open
doncamilom wants to merge 2 commits into main from feature/condmatgen-task

@doncamilom
Contributor

Summary

  • Cherry-picks the CMG (Conditional Material Generation) task from condmatgen-jeffrey
  • This is one of 3 inorganic tasks described in the paper, with results in Table 4
  • Given a set of chemical elements, the model proposes a novel crystalline compound with a space group number
  • The reward combines SMACT validity, element precision, and a novelty bonus
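
For reviewers who want a concrete picture of how the three reward terms could combine, here is a minimal sketch. It is not the code in condmatgen.py: the function names, the additive weighting, the `novelty_bonus` value, and the canonical composition key are all assumptions made for illustration, and the SMACT check is replaced by a caller-supplied `is_valid` callable so the snippet runs without smact installed.

```python
import re
from typing import Callable, Dict, Iterable, Set

# Matches an element symbol followed by an optional integer count.
# Only flat formulas such as "Na2S" are handled (no parentheses).
ELEMENT_RE = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Dict[str, int]:
    """Parse a flat formula like 'Na2S' into {'Na': 2, 'S': 1}."""
    counts: Dict[str, int] = {}
    for symbol, n in ELEMENT_RE.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(n) if n else 1)
    return counts


def element_precision(proposed: Set[str], requested: Set[str]) -> float:
    """Fraction of proposed elements that were actually requested."""
    if not proposed:
        return 0.0
    return len(proposed & requested) / len(proposed)


def sketch_reward(
    formula: str,
    requested: Iterable[str],
    seen_in_sft: Set[str],
    is_valid: Callable[[str], bool],
    novelty_bonus: float = 0.5,  # hypothetical weight, not taken from the PR
) -> float:
    """Hypothetical combination: validity gate, then precision plus bonus."""
    counts = parse_formula(formula)
    if not counts or not is_valid(formula):
        return 0.0
    precision = element_precision(set(counts), set(requested))
    # Hypothetical canonical key: elements sorted, counts written explicitly.
    canonical = "".join(f"{el}{counts[el]}" for el in sorted(counts))
    bonus = novelty_bonus if canonical not in seen_in_sft else 0.0
    return precision + bonus
```

In this sketch, `seen_in_sft` plays the role that comps_used_in_sft.json plays in the PR, and validity acts as a hard gate. Whether the actual task gates on validity or adds it as a separate term is not stated here.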

Changes

  • src/open_r1/tasks/condmatgen/condmatgen.py — task implementation with lazy pymatgen/smact imports, expand_path for dataset loading, relative path for comps_used_in_sft.json
  • src/open_r1/tasks/condmatgen/comps_used_in_sft.json — compositions seen during SFT (for novelty bonus)
  • src/open_r1/tasks/__init__.py — register condmatgen task with try/except for optional deps
  • recipes/condmatgen.yaml — recipe with ${MIST_DATA_DIR}/condmatgen dataset path
  • docs/source/tasks/condmatgen.rst — Sphinx autodoc page
  • docs/source/modules.rst — add condmatgen to toctree
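
The try/except registration pattern mentioned for src/open_r1/tasks/__init__.py could look roughly like the sketch below. The helper name `register_optional` and its arguments are hypothetical, and the example registers a standard-library class only so that it runs anywhere. The real module path and class name for the CMG task are not shown in this PR description.

```python
import importlib
from typing import Any, Dict


def register_optional(
    registry: Dict[str, Any], name: str, module_path: str, attr: str
) -> bool:
    """Add `attr` from `module_path` to `registry`; skip if the import fails.

    Returns True when the task was registered and False when it was skipped.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Optional dependency (e.g. pymatgen or smact) is missing: skip quietly.
        return False
    registry[name] = getattr(module, attr)
    return True


# Stand-in registry; in the repository this role is played by CHEMTASKS.
tasks: Dict[str, Any] = {}
register_optional(tasks, "demo", "json", "JSONDecoder")        # registered
register_optional(tasks, "missing", "no_such_module_xyz", "T")  # skipped
```

Catching only `ImportError` keeps genuine bugs inside the task module (for example an `AttributeError` at import time) visible instead of silently dropping the task.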

Test plan

  • Verify python -c "from open_r1.tasks import CHEMTASKS; print(CHEMTASKS.get('condmatgen'))" succeeds (with pymatgen installed)
  • Verify the task is skipped gracefully when pymatgen/smact are not installed
  • Run a smoke test with the recipe against actual data
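
One way to exercise the "not installed" case without a separate environment is to hide the modules temporarily. The sketch below relies on the documented behaviour that a `None` entry in `sys.modules` makes the import raise `ModuleNotFoundError`. The helper names `hide_modules` and `can_import` are hypothetical, and the demo hides `json` purely as a stand-in for pymatgen/smact.

```python
import importlib
import sys
from contextlib import contextmanager

_MISSING = object()  # sentinel: module was not in sys.modules before hiding


@contextmanager
def hide_modules(*names: str):
    """Temporarily make `import name` fail for each of the given names."""
    saved = {n: sys.modules.get(n, _MISSING) for n in names}
    for n in names:
        sys.modules[n] = None  # a None entry makes the import machinery raise
    try:
        yield
    finally:
        # Restore the previous state exactly, including "never imported".
        for n, mod in saved.items():
            if mod is _MISSING:
                sys.modules.pop(n, None)
            else:
                sys.modules[n] = mod


def can_import(name: str) -> bool:
    """Return True if `name` can be imported right now."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


with hide_modules("json"):
    hidden_result = can_import("json")   # False while hidden
restored_result = can_import("json")     # True again afterwards
```

This works best in a fresh interpreter: submodules that were already imported (such as an already-loaded `pymatgen.core`) stay cached in `sys.modules` and would need to be hidden by name as well.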

🤖 Generated with Claude Code

@doncamilom doncamilom mentioned this pull request Apr 6, 2026
doncamilom added a commit that referenced this pull request Apr 6, 2026
  - demo/crystalrelax_tiny: 8 train / 2 test M2S binary compound fixtures
  - fixture_manifest.csv and run_fixture_smoke.py updated (conditional on
    heavy deps being available)
  - relaxing.py: add _has_local_files() guard to skip download_data when
    local files exist (consistent with ForwardReaction pattern)
  - Remove orphaned cmg.yaml (CMG task moved to PR #52)
doncamilom and others added 2 commits April 6, 2026 22:22
Cherry-picked from condmatgen-jeffrey. Given a set of chemical elements,
the model proposes a novel crystalline compound with a space group number.
Reward includes SMACT validity, element precision, and novelty bonus.

Paper task reported in Table 4.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@doncamilom doncamilom force-pushed the feature/condmatgen-task branch from 713abca to 4223010 Compare April 6, 2026 20:22
@doncamilom
Contributor Author

Rebased onto main (post-#60). Clean diff, no regressions, CI should be green.

Remaining items after merge:

  • Verify CMG fixture loads correctly with pymatgen/smact installed
  • Confirm Figshare CondMatGen GRPO data.json matches what was used for paper results
  • Align the tag format with the paper (<material>...<sgN>...</material> vs. the current <answer>); this needs a decision on whether to update the code or the paper
  • Add unit tests for accuracy_reward
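
Until the tag format is decided, a parser that tolerates both wrappers is one option. The sketch below is an illustration only. It assumes that `<sgN>` means a literal token such as `<sg225>` inside the `<material>` body, which the notes above do not spell out, and the function name `extract_payload` and its return shape are hypothetical. The 1 to 230 range check reflects the number of crystallographic space groups.

```python
import re
from typing import Optional, TypedDict


class Payload(TypedDict):
    tag: str
    text: str
    space_group: Optional[int]


# Accept either wrapper; the backreference forces matching open/close tags.
PAYLOAD_RE = re.compile(r"<(answer|material)>(.*?)</\1>", re.DOTALL)
# Assumed shape of the space-group token, e.g. "<sg225>".
SG_RE = re.compile(r"<sg(\d+)>")


def extract_payload(completion: str) -> Optional[Payload]:
    """Return the first tagged payload in `completion`, or None if absent."""
    match = PAYLOAD_RE.search(completion)
    if match is None:
        return None
    tag, body = match.group(1), match.group(2)
    sg_match = SG_RE.search(body)
    number = int(sg_match.group(1)) if sg_match else None
    if number is not None and not 1 <= number <= 230:
        number = None  # outside the valid space-group range
    # Drop space-group tokens and collapse whitespace in the remaining text.
    text = " ".join(SG_RE.sub(" ", body).split())
    return {"tag": tag, "text": text, "space_group": number}
```

A function like this also gives the planned unit tests for accuracy_reward a small, pure target: the parsing cases (missing tag, out-of-range space group, mixed whitespace) can be covered without pymatgen or smact.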

— Andres
