[Triton] DSV4 fusion phase2 #791

Open: k50112113 wants to merge 5 commits into main from shaoclee/dsv4-fusion-p2

k50112113 (Contributor) commented May 14, 2026

This PR depends on ROCm/aiter#3190

This PR includes:

  1. Optimize sparse page attention decode on DSV4 (+5% e2e)
  2. Add fused_hc_post_pre, covering both intra-layer and inter-layer fusions (+3% e2e)

Here is a summary of the Phase 1 (#0 and #1) and Phase 2 (#2 and #3) optimizations so far:

[image: summary of Phase 1 and Phase 2 optimizations]

k50112113 requested a review from valarLip on May 14, 2026, 18:01