Reorder mtp_post_process after attention backward in 1F1B schedule plan #4695

Open
gdengk wants to merge 2 commits into NVIDIA:main from gdengk:gdeng/main-pr-4430-mtp-post-process-order

Contributor

gdengk commented May 8, 2026

Ports #4430 from dev to main.

Moves mtp_post_process.forward after attention backward in the 1F1B schedule plan so the forward post-process no longer runs inside the moe_combine FP8 context block before attention backward.

Original dev PR: #4430
Original dev commit: 2e7fdc3
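
For reviewers who want the ordering change at a glance, here is a minimal toy sketch of the step order before and after this PR. It is not the actual schedule-plan code: `fp8_context`, `schedule_steps`, and the step labels are hypothetical names used only for illustration, and placing `moe_combine.forward` inside the FP8 block is an assumption.

```python
from contextlib import contextmanager


@contextmanager
def fp8_context(trace, name):
    # Hypothetical stand-in for the FP8 context block; it only records
    # when the block is entered and exited.
    trace.append(f"enter fp8:{name}")
    try:
        yield
    finally:
        trace.append(f"exit fp8:{name}")


def schedule_steps(reordered):
    """Return the recorded step order for the toy plan.

    reordered=False models the old placement described in this PR;
    reordered=True models the new placement.
    """
    trace = []
    with fp8_context(trace, "moe_combine"):
        # Assumption: the combine forward is what the FP8 block wraps.
        trace.append("moe_combine.forward")
        if not reordered:
            # Old: post-process ran inside the block, before attn backward.
            trace.append("mtp_post_process.forward")
    trace.append("attn.backward")
    if reordered:
        # New: post-process runs after attn backward, outside the block.
        trace.append("mtp_post_process.forward")
    return trace


if __name__ == "__main__":
    for flag in (False, True):
        label = "reordered" if flag else "original"
        print(label, schedule_steps(flag))
```

The sketch only records ordering; it does not model streams or overlap, so it says nothing about the performance effect itself.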

Move mtp_post_process.forward out of the moe_combine fp8 context block
so it runs after the attn backward, improving overlap between the
forward mtp_post_process and the backward attn compute.

(cherry picked from commit 2e7fdc3)

copy-pr-bot Bot commented May 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Phlip79 added the 26.06 label May 12, 2026
gdengk marked this pull request as ready for review May 12, 2026 21:29
gdengk requested review from a team as code owners May 12, 2026 21:29
svcnvidia-nemo-ci requested a review from a team May 12, 2026 21:29
svcnvidia-nemo-ci added the Final Review and complexity: low labels May 12, 2026
Labels

26.06, complexity: low, Final Review

3 participants