chore: nightly sync main into dev (11_05_2026) #4739

Closed
svcnvidia-nemo-ci wants to merge 10 commits into dev from main2dev/11_05_2026

Conversation


svcnvidia-nemo-ci commented May 11, 2026

Summary

Nightly sync of main into dev for 2026-05-11.

Commits merged from main

Conflict resolution

Merge ran cleanly with -X theirs — no manual conflicts to resolve.
Dev already contains yesterday's sync plus subsequent fixes, so the
merge surface is small.

Files NOT taken from main

Kept dev's versions of the dependency-management triple and CODEOWNERS:

  • pyproject.toml (dev-only deps)
  • uv.lock (consistent with dev's pyproject.toml)
  • docker/Dockerfile.ci.dev (dev-only uv sync flags)
  • .github/CODEOWNERS (dev governance)
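
The "keep dev's version" step for the files listed above can be sketched as follows. This is a minimal illustration, not the sync bot's actual code: it assumes dev is reachable as origin/dev, and `KEEP_FROM_DEV` and `restore_commands` are hypothetical names introduced here.

```python
from typing import List

# Paths the PR description says are kept from dev rather than taken from main.
KEEP_FROM_DEV = [
    "pyproject.toml",
    "uv.lock",
    "docker/Dockerfile.ci.dev",
    ".github/CODEOWNERS",
]


def restore_commands(paths: List[str], ref: str = "origin/dev") -> List[List[str]]:
    """Build one `git checkout <ref> -- <path>` command per path.

    Only the argument lists are built here; pass each one to
    subprocess.run(cmd, check=True) inside a real checkout to apply it.
    """
    return [["git", "checkout", ref, "--", path] for path in paths]


if __name__ == "__main__":
    for cmd in restore_commands(KEEP_FROM_DEV):
        print(" ".join(cmd))
```

Building the commands separately from running them keeps the sketch side-effect free, so the list of overridden paths can be reviewed before anything touches the working tree.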

Formatting

Ran black --config pyproject.toml (black 24.10.0) and isort (5.13.2) on
all Python files that differ from origin/dev. One file required isort
reformatting (examples/multimodal/radio/radio_g.py).
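
The formatting pass described above could look roughly like this. It is a sketch under assumptions: the helper names `python_files` and `formatter_commands` are hypothetical, and the list of changed paths is assumed to come from something like `git diff --name-only origin/dev`.

```python
from typing import Iterable, List


def python_files(changed_paths: Iterable[str]) -> List[str]:
    """Keep only .py paths, sorted and de-duplicated."""
    return sorted({p for p in changed_paths if p.endswith(".py")})


def formatter_commands(files: List[str]) -> List[List[str]]:
    """Return the black and isort invocations for the given files."""
    if not files:
        # Nothing to format; avoid calling the tools with no file arguments.
        return []
    return [
        ["black", "--config", "pyproject.toml", *files],
        ["isort", *files],
    ]


if __name__ == "__main__":
    # A fixed list keeps the sketch runnable outside a git checkout.
    changed = ["examples/multimodal/radio/radio_g.py", "uv.lock"]
    for cmd in formatter_commands(python_files(changed)):
        print(" ".join(cmd))
```

Filtering to `.py` paths first means lock files and Dockerfiles in the diff are never handed to the formatters.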

Reviewer notes

This PR is created as a draft by the sync bot. CI will be triggered
with /ok to test <sha> and the bot will mark the PR ready once all
non-exempt required checks reach a terminal green state.
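
The readiness rule in the note above can be expressed as a small predicate. This is a hedged sketch, not the bot's implementation: `ready_to_mark` and `GREEN` are hypothetical names, and treating only the "success" conclusion as green is an assumption (a real bot might also accept "skipped" or "neutral").

```python
from typing import Dict, Iterable, Optional

# Assumption: only "success" counts as a terminal green state.
GREEN = frozenset({"success"})


def ready_to_mark(
    conclusions: Dict[str, Optional[str]],
    required: Iterable[str],
    exempt: Iterable[str] = (),
) -> bool:
    """Return True when every non-exempt required check has finished green.

    `conclusions` maps a check name to its conclusion, or to None while the
    check is still queued or running (that is, not yet terminal).
    """
    exempt_set = set(exempt)
    for name in required:
        if name in exempt_set:
            continue
        # A missing, pending (None), or non-green conclusion blocks readiness.
        if conclusions.get(name) not in GREEN:
            return False
    return True


if __name__ == "__main__":
    checks = {"lint": "success", "unit": None, "flaky-nightly": "failure"}
    print(ready_to_mark(checks, ["lint", "unit"]))
    print(ready_to_mark(checks, ["lint", "flaky-nightly"], exempt=["flaky-nightly"]))
```

A check that has not reported at all is treated the same as a pending one, so the PR stays a draft until every required result is actually present.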

🤖 Generated with Claude Code

nschank and others added 8 commits May 10, 2026 20:59
Co-authored-by: Antoni-Joan Solergibert <asolergibert@nvidia.com>
…ename seq_len (#4094)" (#4718)

Signed-off-by: oliver könig <okoenig@nvidia.com>
…al tests` (#4730)

Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>

copy-pr-bot Bot commented May 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

svcnvidia-nemo-ci added the labels "Run functional tests" and "Run MBridge tests" (Attach this for testing this PR against MBridge main) May 11, 2026
@svcnvidia-nemo-ci
Author

/ok to test 5835ecb

@svcnvidia-nemo-ci
Author

/ok to test bd3c87d

@svcnvidia-nemo-ci
Author

/ok to test 6f7d410

Merges 8 commits from main into dev. Dev already contains yesterday's
sync (PR #4716) plus follow-up fixes, so this PR only carries main
commits made after that sync.

Notable changes:
- 434368c build(deps): bump nvidia-modelopt to 0.43 (#4723)
- e42e2fa ci: Major refactor of release-workflows (#4602)
- 33d47e0 [ci] fix: treat cancelled run-main-script step as failure (#4727)
- 5123f6a ci: revert bad uv.lock bump and label future bumps with
  Run functional tests (#4730)
- ad58411 Add Python-side guardrail for DeepEP IB limits (#4719)
- e93755e chore(beep boop): Bump (main) (2026-05-11)
- a2ec5c1 Revert Add Python-side guardrail for HybridEP IB limit (#4718)
- 5e31514 Create a Protocol for the MLP layer of TransformerLayer (#3435)

Kept dev's pyproject.toml, uv.lock, docker/Dockerfile.ci.dev, and
.github/CODEOWNERS (per nightly-sync skill).

Ran black + isort on changed Python files.
@svcnvidia-nemo-ci
Author

/ok to test e2f070b

@svcnvidia-nemo-ci
Author

/ok to test 23c6c77

The merge with -X theirs took main's version for many files where dev
had follow-up changes after main's squash-merge. The most common pattern
was main returning partial(...) while dev returns ModuleSpec(...).
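
The commit message does not show the affected call sites, so the following only illustrates why the two return shapes are not interchangeable. Everything here is a stand-in: `SpecStandIn`, `DenseMLP`, and the helper functions are hypothetical and are not Megatron's classes or API.

```python
from dataclasses import dataclass, field
from functools import partial
from typing import Any, Dict


@dataclass
class SpecStandIn:
    """Hypothetical stand-in for a ModuleSpec-style object."""

    module: type
    params: Dict[str, Any] = field(default_factory=dict)


class DenseMLP:
    """Hypothetical module used only for this demo."""

    def __init__(self, hidden_size: int = 8):
        self.hidden_size = hidden_size


def mlp_spec_as_partial():
    """The shape main is described as returning: a pre-bound constructor."""
    return partial(DenseMLP, hidden_size=16)


def mlp_spec_as_spec():
    """The shape dev is described as returning: a declarative spec object."""
    return SpecStandIn(module=DenseMLP, params={"hidden_size": 16})


def module_class(spec) -> type:
    """A caller written for the spec shape; it reads the `.module` attribute."""
    return spec.module


if __name__ == "__main__":
    print(module_class(mlp_spec_as_spec()).__name__)
    try:
        module_class(mlp_spec_as_partial())
    except AttributeError as exc:
        # functools.partial exposes .func/.args/.keywords, not .module.
        print("partial is not a spec:", exc)
```

Under this assumption, code on dev that inspects a spec's attributes would fail if a file taken from main handed it a `partial` instead, which is consistent with restoring dev's version of each affected file as a unit.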

Restored dev's version of:
- megatron/core/extensions/transformer_engine.py (TEFusedDenseMLP class)
- megatron/core/models/gpt/gpt_layer_specs.py (uses TEFusedDenseMLP)
- megatron/core/models/gpt/moe_module_specs.py (returns ModuleSpec)
- megatron/core/models/gpt/experimental_attention_variant_module_specs.py
- megatron/core/models/gpt/heterogeneous/heterogeneous_layer_specs.py
- megatron/core/models/T5/t5_spec.py
- megatron/core/models/bert/bert_layer_specs.py
- megatron/core/models/hybrid/hybrid_layer_specs.py
- megatron/core/models/vision/vit_layer_specs.py
- megatron/core/post_training/modelopt/hybrid/model_specs.py
- megatron/core/transformer/mlp.py
- megatron/core/transformer/moe/experts.py
- megatron/core/transformer/moe/fused_a2a.py
- megatron/core/transformer/moe/moe_layer.py
- megatron/core/transformer/transformer_layer.py (mlp_hyper_connection field)
- examples/multimodal/layer_specs.py
- pretrain_vlm.py
- Several tests/unit_tests/transformer/**/*.py
- Several tests/unit_tests/models/**/*.py
- tests/unit_tests/dist_checkpointing/models/test_mlp_glu.py
- tests/functional_tests/test_cases/common/moe_perf/__main__.py
@svcnvidia-nemo-ci
Author

/ok to test 55c13fa

Labels

"Run functional tests", "Run MBridge tests" (Attach this for testing this PR against MBridge main)


5 participants