chore: nightly sync main into dev (06_05_2026) #4659
Open
svcnvidia-nemo-ci wants to merge 108 commits into dev from
Conversation
Force-pushed ad8f471 to 3f10d85 (compare).
Author: /ok to test 3f10d85

Force-pushed 3f10d85 to b83a102 (compare).
Author: /ok to test b83a102

Force-pushed b83a102 to 46ee761 (compare).
Author: /ok to test 46ee761

Author: Superseded by today's nightly sync.

Commit note: # Conflicts: # megatron/core/distributed/param_and_grad_buffer.py

Member: /ok to test 676f3fa

Commit note: …but in dev (Signed-off-by: Deyu Fu <deyuf@nvidia.com>)

Contributor: /ok to test 0cb4ec3
Member: /ok to test 2207908

Author: Superseded by today's nightly sync.
Summary
Nightly sync of `main` into `dev`, performed via `git merge origin/main -X theirs --no-edit` with manual reconciliation for known conflicts.

Files taken from main

- `megatron/core/optimizer/layer_wise_optimizer.py` (no-op; currently identical between main and dev)

Files kept on dev (overriding the skill's default of taking main's version)
The skill recommends taking main's version of these files for known semantic conflicts. In this sync the situation is reversed: dev's versions are the more current ones. Main's versions reference `args.hybrid_context_parallel`, but dev renamed that flag to `args.dynamic_context_parallel` (commit `cde56a4`, "Fix for rope when enabling THD + Dynamic-CP; use the naming Dynamic-CP"). Taking main's versions would cascade into an `AttributeError` at runtime.

- `megatron/training/training.py`
- `megatron/training/utils.py`
- `megatron/training/initialize.py`
- `megatron/training/datasets/data_samplers.py`

Files deleted in main, accepted as deletion
These were legacy GPT loaders removed in main by #4322 ("remove legacy GPT code"). Nothing in the merged tree references them.

- `tools/checkpoint/loader_legacy.py`
- `tools/checkpoint/loader_llama_mistral.py`

Files deleted in dev, NOT restored
`megatron/core/pipeline_parallel/hybrid_cp_schedule.py` was intentionally removed in dev (commit `cde56a4`) as part of the dynamic-CP refactor. It is not restored, since the merged tree uses dev's `wrap_data_iterator` mechanism; no caller imports `BalancedCPScheduler` or `HybridCPDataLoaderWrapper`.

Dependency triple kept on dev
Per the skill's hard rule, `pyproject.toml`, `uv.lock`, and `docker/Dockerfile.ci.dev` were restored from `origin/dev`. Dev's pinned `nvidia-resiliency-ext` revision (`15a8515`) was verified to contain all APIs the merged tree imports (`get_write_results_queue`, `CheckpointMetadataCache`, `CachedMetadataFileSystemReader`, etc.). No git-source reconciliation was required.

API mismatch detection
After taking main's version of files (then later reverting), the following call sites were audited:

- `multi_latent_attention.py` calls `off_interface.group_offload()` and `off_interface.group_commit()`: both exist on dev's `FineGrainedActivationOffloadingInterface`
- `gpt_model.py` and `hybrid_model.py` call `init_chunk_handler` (6 kwargs): matches dev's signature
- `_resolve_cu_seqlens` exists on dev's `GatedDeltaNet`
- `_is_distopt_quantized_param` exists on dev's `DistributedOptimizer`
- `CudaGraphScope` exists in dev's `enums.py`

No active mismatches remain.
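Checks like these can be scripted rather than eyeballed. Below is a minimal sketch of the pattern: assert that expected attributes exist and that a callee's parameter count matches its call sites. The classes and functions here are stand-ins for illustration, not the real Megatron types.

```python
import inspect

# Stand-in for dev's interface; the real audit targets classes such as
# FineGrainedActivationOffloadingInterface and DistributedOptimizer.
class OffloadingInterface:
    def group_offload(self): ...
    def group_commit(self): ...

# Stand-in callee; call sites in gpt_model.py pass 6 kwargs.
def init_chunk_handler(a, b, c, d, e, f): ...

def missing_attrs(obj, names):
    """Return the subset of expected attribute names absent from obj."""
    return [n for n in names if not hasattr(obj, n)]

def param_count(fn):
    """Number of parameters a callee accepts (signature-drift check)."""
    return len(inspect.signature(fn).parameters)

print(missing_attrs(OffloadingInterface, ["group_offload", "group_commit"]))  # []
print(param_count(init_chunk_handler))  # 6
```

An empty list from `missing_attrs` and a matching parameter count correspond to the "no active mismatches" conclusion above.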
Linting

- `black --config pyproject.toml` (24.10.0): no diff
- `isort` (5.13.2): no diff
- `pylint` on changed `megatron/core/` files (84 files): 10.00/10

Remerge diff
Remerge diff stat (file-level summary)
Full diff omitted to keep the PR body compact (~10k lines). Reviewers can run `git show --remerge-diff 431ac5df05104bc1d5015f5ac1842285d1c5e6ee` locally or browse the merge commit on GitHub.
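The reconciliation pattern described above (merge with `-X theirs`, then restore selected files from the pre-merge side and fold the restore into the merge commit) can be condensed into a runnable demo. This is a sketch in a scratch repository with stand-in file contents, not the actual sync script; the real file paths are listed in the sections above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email "ci@example.com"
git config user.name "ci"

printf 'v1\n' > training.py
printf 'deps-v1\n' > pyproject.toml
git add -A
git commit -qm "base"

git checkout -qb dev
printf 'dynamic_context_parallel\n' > training.py   # dev renamed the flag
printf 'deps-dev\n' > pyproject.toml                # dev's pinned deps
git commit -qam "dev changes"

git checkout -q main
printf 'hybrid_context_parallel\n' > training.py    # main's stale flag name
printf 'deps-main\n' > pyproject.toml
git commit -qam "main changes"

# The sync: merge main into dev, preferring main on textual conflicts...
git checkout -q dev
git merge main -X theirs --no-edit -q
# ...then keep dev's versions of the known-conflict files and fold the
# restore into the merge commit (ORIG_HEAD is the pre-merge dev tip).
git checkout ORIG_HEAD -- training.py pyproject.toml
git commit -q --amend --no-edit

cat training.py pyproject.toml
```

After the final amend, `training.py` and `pyproject.toml` carry dev's content while everything else carries main's, mirroring the "files kept on dev" outcome of this PR.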