Merge ACE2.1-ERA5 (AIMIP) training and evaluation baseline configs #1027
Open
Remove intermediate fine-tuning explorations superseded by the separate-decoder + LR warmup approach (the final model). Trim the fine-tuning and seed-selection launch scripts to reference only the final approach.

Deleted configs (non-final fine-tuning variants):
- ace-fine-tune-decoder-pressure-level-config.yaml
- ace-fine-tune-decoder-pressure-level-lr-warmup-config.yaml
- ace-fine-tune-decoder-pressure-level-frozen-config.yaml
- ace-fine-tune-decoder-pressure-level-frozen-lr-warmup-config.yaml
- ace-fine-tune-decoder-pressure-level-reweight-config.yaml
- ace-fine-tune-decoder-pressure-level-separate-decoder-config.yaml
- restart-ace-fine-tune-decoder-pressure-levels.sh
Resolves conflicts in fme/core/step/single_module.py and test_step.py by accepting main's SecondaryDecoderConfig/SecondaryDecoder approach and dropping the branch's inline MLP + additional_diagnostic_names approach. Updates AIMIP configs to use the new secondary_decoder config format and moves loss/parameter_init from stepper to stepper_training per the TrainConfig restructuring in main.
Delete 15 pre-generated IC-specific config files and instead do the _r[N]i label substitution inside the gantry container at job runtime via sed, keeping only the 3 template configs committed.
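The runtime substitution described above could look roughly like the sketch below. The template file name, its single key, and the assumption that the placeholder appears literally as `_r[N]i` are illustrative only; the actual scripts in this PR may do it differently.

```shell
#!/bin/sh
# Hypothetical sketch: expand an IC label placeholder in a template config
# at job time instead of committing one config per initial condition.
set -eu

workdir=$(mktemp -d)
template="$workdir/template-config.yaml"   # hypothetical template name

# Stand-in template content; the key and value are made up for illustration.
cat > "$template" <<'EOF'
experiment_name: demo_r[N]i_suffix
EOF

render_config() {
    # $1 = IC index, $2 = template path, $3 = output path
    # The brackets are escaped so sed matches "[N]" literally rather than
    # treating it as a character class.
    sed "s/_r\[N\]i/_r${1}i/g" "$2" > "$3"
}

render_config 3 "$template" "$workdir/config-r3.yaml"
cat "$workdir/config-r3.yaml"
```

Escaping the brackets matters here: an unescaped `[N]` would match a bare `N` and leave the placeholder untouched.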
2 tasks
brianhenn
commented
Mar 31, 2026
configs/baselines/era5-aimip/ace-evaluator-seed-selection-single-config.yaml
Arcomano1234
approved these changes
Mar 31, 2026
Contributor
Arcomano1234
left a comment
Left a few comments / questions for my own curiosity but this looks mostly good to go. I've been using some of these scripts so it will be nice to have in main. My only real comment is removing your hard-coded wandb name in a lot of the job submission scripts.
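One way to address the hard-coded name, sketched under the assumption that the submission scripts pass it through a variable called `WANDB_USERNAME` (that variable name and the fallback to `$USER` are guesses, not taken from the scripts in this PR):

```shell
#!/bin/sh
# Hypothetical sketch: resolve the wandb user from the environment rather
# than hard-coding one person's name in each job submission script.
set -eu

resolve_wandb_user() {
    # Prefer an explicitly set WANDB_USERNAME, then the login name,
    # then a visible sentinel so a missing value is easy to spot.
    echo "${WANDB_USERNAME:-${USER:-unknown}}"
}

echo "wandb user: $(resolve_wandb_user)"
```

Each contributor can then export their own value once instead of editing the scripts.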
configs/baselines/era5-aimip/run-ace-evaluator-seed-selection-single.sh
Outdated
This PR adds the full set of scripts and configurations for "ACE2.1-ERA5", a modification of the deterministic ACE2-ERA5 model trained and evaluated under the AIMIP protocol. It also merges `main` to pick up the `SecondaryDecoderConfig` API used for the pressure-level decoder fine-tuning stage, while retaining all of the job names, output paths, and checkpoint IDs as used on the original branch where the workflow actually occurred.

Changes:
- `configs/baselines/era5-aimip/`: new directory containing all scripts and configs for the ACE2.1-ERA5 pipeline (previously `configs/baselines/era5/aimip/`)
  - `run-ace-train.sh` / `ace-train-config.yaml`: train 4-seed ensemble on ERA5 1979–2008
  - `run-ace-evaluator-seed-selection.sh` / `run-ace-evaluator-seed-selection-single.sh`: evaluate trained and fine-tuned checkpoints to select best seeds
  - `run-ace-fine-tune-decoder-pressure-levels.sh` / `ace-fine-tune-pressure-level-separate-decoder-config.yaml`: fine-tune a secondary MLP decoder for 65 pressure-level diagnostic variables, using `secondary_decoder` (main's `SecondaryDecoderConfig`)
  - `run-ace-inference.sh` / `ace-aimip-inference-{,p2k-,p4k-}config.yaml`: 46-year inference with 5 ICs × 3 SST scenarios; IC label expansion done via inline `sed` at job time (eliminates 15 near-identical committed config files)
  - `README.md`: documents the intended workflow

Tests added
If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated
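To make the inference layout in the change list concrete, the sketch below enumerates the 5 ICs × 3 SST scenarios from the three template names. The IC indices 1 through 5 and the `r<N>` labels printed next to each template are assumptions for illustration; only the three config file names come from the PR description.

```shell
#!/bin/sh
# Hypothetical sketch: enumerate the inference jobs implied by
# 3 template configs (control, p2k, p4k) and 5 initial conditions.
set -eu

list_jobs() {
    # The empty prefix yields ace-aimip-inference-config.yaml, matching the
    # {,p2k-,p4k-} brace pattern without relying on bash brace expansion.
    for prefix in "" p2k- p4k-; do
        template="ace-aimip-inference-${prefix}config.yaml"
        for ic in 1 2 3 4 5; do
            echo "${template} r${ic}"
        done
    done
}

list_jobs
echo "total jobs: $(list_jobs | wc -l | tr -d ' ')"
```

The 15 template and IC pairs correspond to the 15 pre-generated config files that the runtime substitution replaces.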