This file is the single source of agent guidance for the ai2cm/ace repository.
This is a Python machine learning project for atmospheric modeling (ACE - AI2 Climate Emulator).
- Code lives in the `fme/` directory (`ace`, `core`, `coupled`, `diffusion`, `downscaling` modules)
- Tests follow pytest conventions
- Configuration uses YAML files in `configs/`
- The project uses PyTorch for ML components
- The default conda environment for the repo is named `fme`
- Run all tests: `make test`
- Run fast tests only: `make test_fast`
- Run very fast tests: `make test_very_fast`
- Run tests with coverage: `make test_cov`
- Create development environment: `make create_environment`
- Build Docker image: `make build_docker_image`
- Run pre-commit hooks: `pre-commit run --all-files`
When running tests in a conda environment, use `python -m pytest` (not bare `pytest`) to ensure the correct interpreter is used.
Pre-commit hooks run ruff, ruff-format, and mypy. If ruff-format modifies files, re-stage them and create a new commit (do not amend).
Tests marked `@pytest.mark.parallel` must be run via `torchrun`. The environment variables `FME_DISTRIBUTED_BACKEND` (`torch`|`model`|`none`), `FME_DISTRIBUTED_H`, and `FME_DISTRIBUTED_W` control the backend. Set `FME_FORCE_CPU=1` to force CPU. Quick smoke test:

```shell
FME_FORCE_CPU=1 FME_DISTRIBUTED_BACKEND=model FME_DISTRIBUTED_H=2 FME_DISTRIBUTED_W=1 \
  torchrun --nproc-per-node 2 -m pytest -m parallel .
```

Full matrix (8 configs, ~3-4 min): `make cpu_test_all_parallel`
Narrow to specific tests: `make cpu_test_all_parallel TEST_PATH=fme/core/distributed/parallel_tests/test_step.py`
Regression tests that use `.pt` baselines must generate the baseline with a single-rank `python -m pytest` run, then verify under spatial parallelism via `torchrun`. Generating baselines under the same backend you test against does not validate cross-backend correctness.
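The pattern can be sketched as follows; `compute_step` and the baseline filename are illustrative placeholders, not actual repo APIs.

```python
import pathlib

import torch

BASELINE = pathlib.Path("step_baseline.pt")


def compute_step() -> dict[str, torch.Tensor]:
    # Stand-in for the real computation under test; seeded so that a
    # single-rank run and a parallel run produce identical tensors.
    torch.manual_seed(0)
    return {"air_temperature": torch.randn(4, 4)}


def test_step_matches_baseline():
    result = compute_step()
    if not BASELINE.exists():
        # Generate only under a single-rank `python -m pytest` run.
        torch.save(result, BASELINE)
    expected = torch.load(BASELINE)
    for name, tensor in result.items():
        torch.testing.assert_close(tensor, expected[name])
```

The point of generating under a single rank is that the subsequent `torchrun` verification then exercises the distributed code paths against numbers produced without them.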
- Config classes loaded from user-specified YAML: append `Config` to the built type (e.g. `TrainStepperConfig`).
- Private functions get a `_` prefix.
- Validate in `__post_init__`, not at runtime.
- Config-loading backwards compatibility is critical for inference but can be broken for training; use deprecation warnings when removing config fields. Ask the user if unsure.
- Create helpers for repeated test setup (threshold: 3+ instances).
- Prefer explicit helpers over pytest fixtures; use fixtures only when sharing scope across tests is valuable.
- When fixing a bug, add a failing test first.
- Prefer tests that cover user-story-level behavior over tests that lock down subjective API details.
- The helper function `validate_tensor_dict` is available for regression testing against a saved reference.
- Consolidate duplicated code to shared locations (e.g. `fme/core/`). `fme.core` is not allowed to import from `fme.ace` or other submodules; `fme.ace` may import from `fme.core`, but not from other submodules.
Use this same workflow for both initial review and re-review.
- `pull_request_read:get` for metadata/state
- `get_diff` for the full diff (or current context)
- `get_review_comments` for review threads
- `get_comments` for general PR discussion
- `list_commits` and `get_commit` when commit-by-commit analysis is needed
- Initial review: review the full PR diff and all current discussion.
- Re-review (delta review):
- If user provides a starting SHA, review changes from that point.
- If not, ask for starting SHA or default to changes since last review comment timestamp.
- Focus on what changed, whether prior comments were addressed, and whether new issues were introduced.
Use these severity buckets:
- Critical Issues (Must Fix): security vulnerabilities, logic bugs, breaking changes
- Suggestions (Should Consider): performance, error handling, clarity/design improvements
- Minor/Nitpicks (Optional): style, naming, docs polish
For re-reviews, classify prior comments as Addressed, Partially Addressed, Unaddressed, or Dismissed. Treat clear author rationale as addressed when appropriate.
Write concise markdown with:
- PR Summary: title/number, author, target branch, goal
- Changes Overview: files changed + high-level summary
- Code Review Findings: grouped by severity with file/line references
- Discussion Status: key unresolved comment threads
- Testing Assessment: gaps and edge cases
- Recommendation: Ready to merge / Needs minor changes / Needs revision
For re-reviews, include:
- Commits reviewed (`<start_sha>...<head_sha>`)
- Status of previously raised comments
- Outstanding items before merge
- Be specific, constructive, and explicit about blocking vs non-blocking items.
- Prefer delta-focused summaries for re-reviews.
- If the PR was heavily refactored, recommend a fresh full review.
- For large PRs, batch API calls to avoid rate limits.
- Remember that GitHub MCP access is read-only.