chore: nightly sync main into dev (11_05_2026) #4739
Closed
svcnvidia-nemo-ci wants to merge 10 commits into
Co-authored-by: Antoni-Joan Solergibert <asolergibert@nvidia.com>
…al tests` (#4730) Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Author
/ok to test 5835ecb
Author
/ok to test bd3c87d
bd3c87d to 6f7d410
Author
/ok to test 6f7d410
Merges 8 commits from main into dev. Dev already contains yesterday's sync (PR #4716) plus follow-up fixes, so this PR only carries main commits made after that sync.

Notable changes:
- 434368c build(deps): bump nvidia-modelopt to 0.43 (#4723)
- e42e2fa ci: Major refactor of release-workflows (#4602)
- 33d47e0 [ci] fix: treat cancelled run-main-script step as failure (#4727)
- 5123f6a ci: revert bad uv.lock bump and label future bumps with Run functional tests (#4730)
- ad58411 Add Python-side guardrail for DeepEP IB limits (#4719)
- e93755e chore(beep boop): Bump (main) (2026-05-11)
- a2ec5c1 Revert Add Python-side guardrail for HybridEP IB limit (#4718)
- 5e31514 Create a Protocol for the MLP layer of TransformerLayer (#3435)

Kept dev's pyproject.toml, uv.lock, docker/Dockerfile.ci.dev, and .github/CODEOWNERS (per nightly-sync skill). Ran black + isort on changed Python files.
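The keep-list rule in this commit message can be sketched as a small filter. This is only an illustration, not the sync bot's actual code: the names `KEEP_FROM_DEV`, `split_changed_files`, and `python_files` are hypothetical, and the four paths are the ones named in the message.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Paths the commit message says were kept from dev.
# The constant name is hypothetical; the bot's real configuration is not shown in this PR.
KEEP_FROM_DEV: FrozenSet[str] = frozenset(
    {
        "pyproject.toml",
        "uv.lock",
        "docker/Dockerfile.ci.dev",
        ".github/CODEOWNERS",
    }
)


def split_changed_files(
    changed: Iterable[str], keep: FrozenSet[str] = KEEP_FROM_DEV
) -> Tuple[List[str], List[str]]:
    """Partition changed paths into (taken_from_main, kept_from_dev)."""
    taken_from_main: List[str] = []
    kept_from_dev: List[str] = []
    for path in changed:
        if path in keep:
            kept_from_dev.append(path)
        else:
            taken_from_main.append(path)
    return taken_from_main, kept_from_dev


def python_files(paths: Iterable[str]) -> List[str]:
    """Select the Python files, i.e. the ones a black + isort pass would touch."""
    return [p for p in paths if p.endswith(".py")]
```

Keeping the exclusion list as data makes it easy to audit which files a sync deliberately leaves on dev's side.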
3bc523c to e2f070b
Author
/ok to test e2f070b
Author
/ok to test 23c6c77
The merge with -X theirs took main's version for many files where dev had follow-up changes after main's squash-merge. The most common pattern was main returning partial(...) while dev returns ModuleSpec(...).

Restored dev's version of:
- megatron/core/extensions/transformer_engine.py (TEFusedDenseMLP class)
- megatron/core/models/gpt/gpt_layer_specs.py (uses TEFusedDenseMLP)
- megatron/core/models/gpt/moe_module_specs.py (returns ModuleSpec)
- megatron/core/models/gpt/experimental_attention_variant_module_specs.py
- megatron/core/models/gpt/heterogeneous/heterogeneous_layer_specs.py
- megatron/core/models/T5/t5_spec.py
- megatron/core/models/bert/bert_layer_specs.py
- megatron/core/models/hybrid/hybrid_layer_specs.py
- megatron/core/models/vision/vit_layer_specs.py
- megatron/core/post_training/modelopt/hybrid/model_specs.py
- megatron/core/transformer/mlp.py
- megatron/core/transformer/moe/experts.py
- megatron/core/transformer/moe/fused_a2a.py
- megatron/core/transformer/moe/moe_layer.py
- megatron/core/transformer/transformer_layer.py (mlp_hyper_connection field)
- examples/multimodal/layer_specs.py
- pretrain_vlm.py
- Several tests/unit_tests/transformer/**/*.py
- Several tests/unit_tests/models/**/*.py
- tests/unit_tests/dist_checkpointing/models/test_mlp_glu.py
- tests/functional_tests/test_cases/common/moe_perf/__main__.py
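One way such a restore could be scripted is to build a `git checkout <ref> -- <paths>` command for the affected files. The sketch below only constructs the command and does not run it. The helper names and the choice of `origin/dev` as the ref are assumptions, since the commit message does not say how the files were restored.

```python
import shlex
from typing import List, Sequence


def restore_from_ref(paths: Sequence[str], ref: str = "origin/dev") -> List[str]:
    """Build the argv for `git checkout <ref> -- <paths>` without executing it.

    `ref` defaults to origin/dev, which is an assumption about where
    "dev's version" lives.
    """
    if not paths:
        raise ValueError("no paths to restore")
    return ["git", "checkout", ref, "--", *paths]


def as_shell(argv: Sequence[str]) -> str:
    """Render an argv list as a copy-pasteable, safely quoted shell line."""
    return shlex.join(argv)
```

For example, `as_shell(restore_from_ref(["megatron/core/transformer/mlp.py"]))` yields a single command line that can be reviewed before anyone runs it.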
23c6c77 to 55c13fa
Author
/ok to test 55c13fa
Summary
Nightly sync of main into dev for 2026-05-11.
- Dev already contains yesterday's sync PR (chore: nightly sync main into dev (10_05_2026) #4716) plus follow-up fixes
- nightly-sync-main-to-dev workflow

Commits merged from main
- 434368c81 build(deps): bump nvidia-modelopt to 0.43 (#4723)
- e42e2fad9 ci: Major refactor of release-workflows (#4602)
- 33d47e059 [ci] fix: treat cancelled run-main-script step as failure (#4727)
- 5123f6a61 ci: revert bad uv.lock bump and label future bumps with `Run functional tests` (#4730)
- ad58411dd Add Python-side guardrail for DeepEP IB limits (#4719)
- e93755e13 chore(beep boop 🤖): Bump (main) (2026-05-11)
- a2ec5c1df Revert "Add Python-side guardrail for HybridEP InfiniBand limit and rename seq_len (#4094)" (#4718)
- 5e3151416 Create a Protocol for the MLP layer of TransformerLayer (#3435)

Conflict resolution
Merge ran cleanly with -X theirs; no manual conflicts to resolve. Dev already contains yesterday's sync plus subsequent fixes, so the merge surface is small.
Files NOT taken from main
Kept dev's versions of the dependency-management triple and CODEOWNERS:
- pyproject.toml (dev-only deps)
- uv.lock (consistent with dev's pyproject.toml)
- docker/Dockerfile.ci.dev (dev-only uv sync flags)
- .github/CODEOWNERS (dev governance)

Formatting
Ran black --config pyproject.toml (24.10.0) and isort (5.13.2) on all changed Python files vs origin/dev. One file required isort reformatting (examples/multimodal/radio/radio_g.py).

Reviewer notes
This PR is created as a draft by the sync bot. CI will be triggered with /ok to test <sha> and the bot will mark the PR ready once all non-exempt required checks reach a terminal green state.
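The readiness rule described here can be sketched as a predicate over check results. This is a minimal illustration under stated assumptions, not the bot's code: the function name, the check names, and the use of the string "success" for a green conclusion are all hypothetical.

```python
from typing import Dict, Iterable, Optional

# Assumed label for a green terminal state; a missing or None entry means "still pending".
GREEN = "success"


def ready_for_review(
    checks: Dict[str, Optional[str]],
    required: Iterable[str],
    exempt: Iterable[str] = (),
) -> bool:
    """Return True when every required, non-exempt check has concluded green.

    `checks` maps a check name to its conclusion. Pending, failed, or
    cancelled checks all block readiness unless the check is exempt.
    """
    exempt_set = set(exempt)
    needed = [name for name in required if name not in exempt_set]
    return all(checks.get(name) == GREEN for name in needed)
```

Treating a missing entry as "not green" means a check that never started cannot be mistaken for a pass.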
🤖 Generated with Claude Code