Add option: ulysses qkv_fusion by xiehaohx · Pull Request #887 · ModelTC/LightX2V

xiehaohx · 2026-02-05T16:35:07Z

Add option: ulysses qkv_fusion

gemini-code-assist · 2026-02-05T16:35:24Z

Summary of Changes

Hello @xiehaohx, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Ulysses attention mechanism by adding a new use_tensor_fusion option. This option provides greater control over how Query, Key, and Value tensors are handled internally, allowing for flexible processing of these tensors either as a combined unit or individually. The changes primarily impact the tensor reshaping and communication logic within the attention computation, adapting it to support both fused and unfused QKV operations.

Highlights

- New Configuration Option: Introduced a new boolean parameter, use_tensor_fusion, to the apply function in ulysses_attn.py. This option controls whether Query, Key, and Value (QKV) tensors are processed in a fused manner.
- Conditional QKV Handling: Implemented conditional logic throughout the attention mechanism to handle QKV tensors differently based on the use_tensor_fusion flag. This affects how QKV tensors are stacked, reshaped, and prepared for communication.
- Parallel Communication Adaptation: Modified the communication pathways, including both FP8 and standard communication, to correctly process either a single fused QKV tensor or individual Q, K, and V tensors, especially when head parallelism is enabled.

Changelog

lightx2v/common/ops/attn/ulysses_attn.py
- Added use_tensor_fusion parameter with a default value of False to the apply function signature.
- Introduced an internal use_qkv_fusion variable, mirroring the use_tensor_fusion parameter, to drive conditional logic.
- Refactored tensor stacking and reshaping operations for image QKV to conditionally create a fused img_qkv or separate img_q, img_k, img_v tensors.
- Updated head-parallel processing logic to handle either fused img_qkv or individual img_q, img_k, img_v tensors during permutation and allocation of output tensors.
- Modified FP8 and non-FP8 communication blocks to conditionally perform all_to_all_single operations on either the fused img_qkv tensor or the separate img_q, img_k, img_v tensors.
- Adjusted the waiting mechanism for asynchronous communication operations to correctly await either a single fused QKV work item or multiple individual Q, K, V work items.
- Updated the final reshaping and assignment of shard_img_q, shard_img_k, shard_img_v to correctly extract values from either the fused output_qkv or the individual output_q, output_k, output_v tensors.
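
To make the two layouts in this changelog concrete, here is a minimal single-process sketch. It uses the permutations quoted in the review diff further down. It assumes the fused tensor is built by stacking Q, K, and V on a new leading axis, which is inferred from the shape comment in that diff and not confirmed by the PR. The toy sizes are arbitrary, and no communication is performed.

```python
import torch

# Toy sizes; the names follow the shape comments in the reviewed diff.
img_qkv_len, world_size, shard_heads, hidden_dims = 4, 2, 3, 5

torch.manual_seed(0)
shape = (img_qkv_len, world_size, shard_heads, hidden_dims)
img_q, img_k, img_v = (torch.randn(shape) for _ in range(3))

# Unfused path: three permutes, hence three buffers to exchange later.
q_sep = img_q.permute(2, 1, 0, 3).contiguous()
k_sep = img_k.permute(2, 1, 0, 3).contiguous()
v_sep = img_v.permute(2, 1, 0, 3).contiguous()

# Fused path. ASSUMPTION: Q, K, V are stacked on a new leading axis, the
# layout for which permute(3, 2, 1, 0, 4) yields the commented shape.
img_qkv = torch.stack([img_q, img_k, img_v], dim=0)
img_qkv = img_qkv.permute(3, 2, 1, 0, 4).contiguous()
# -> (shard_heads, world_size, img_qkv_len, 3, hidden_dims)

# Slicing the size-3 axis recovers the separately permuted tensors.
q_fused, k_fused, v_fused = img_qkv.unbind(dim=3)
layouts_match = (
    torch.equal(q_fused, q_sep)
    and torch.equal(k_fused, k_sep)
    and torch.equal(v_fused, v_sep)
)
print(tuple(img_qkv.shape), layouts_match)
```

Under this assumption the fused path carries the same data in one buffer, so a single collective call can replace three. Whether that is faster in practice depends on message sizes and the interconnect, which this sketch does not measure.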

Activity

No human activity (comments, reviews, or approvals) has been recorded on this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

The pull request successfully introduces the use_tensor_fusion option to enable QKV fusion within the Ulysses attention mechanism. This feature can potentially offer performance benefits. However, the implementation introduces significant code duplication across various conditional blocks, particularly where use_qkv_fusion and use_fp8_comm interact. This increases the complexity and reduces the maintainability of the code.

gemini-code-assist · 2026-02-05T16:36:56Z

lightx2v/common/ops/attn/ulysses_attn.py

+            if use_qkv_fusion:
+                img_qkv = img_qkv.permute(3, 2, 1, 0, 4).contiguous()  # (shard_heads, world_size, img_qkv_len, 3, hidden_dims)
+                output_qkv = torch.empty_like(img_qkv)
+            else:
+                img_q = img_q.permute(2, 1, 0, 3).contiguous()  # (shard_heads, world_size, img_qkv_len, hidden_dims)
+                img_k = img_k.permute(2, 1, 0, 3).contiguous()
+                img_v = img_v.permute(2, 1, 0, 3).contiguous()
+                output_q = torch.empty_like(img_q)
+                output_k = torch.empty_like(img_k)
+                output_v = torch.empty_like(img_v)


This section, and several others throughout the apply method (e.g., lines 123-166, 168-181, and the subsequent for loop), introduces significant code duplication due to the nested if use_qkv_fusion: and else: blocks. This pattern makes the code harder to read, understand, and maintain.

Consider refactoring to reduce this duplication. For instance, you could prepare the tensors (e.g., img_qkv or individual img_q, img_k, img_v) and their corresponding output placeholders (output_qkv or output_q, output_k, output_v) in a unified manner before entering the communication and processing loops. This would allow the subsequent logic to operate on a consistent structure, regardless of whether QKV fusion is enabled, thereby reducing the need for repeated if/else checks.

For example, you could define lists of tensors to communicate and lists of output tensors, and then iterate over these lists in the communication and waiting phases.
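
A minimal sketch of that list-based shape follows. It is not the PR's code: `exchange_all`, `alloc_like`, and `launch` are hypothetical names, and the communication call is injected so the example runs without GPUs or a process group. Plain Python lists stand in for tensors, and `FakeWork` stands in for the handle an asynchronous collective returns.

```python
from typing import Callable, List, Protocol, Sequence, TypeVar

T = TypeVar("T")


class Work(Protocol):
    """Anything with wait(), e.g. the handle an async collective returns."""

    def wait(self) -> None: ...


def exchange_all(
    inputs: Sequence[T],
    alloc_like: Callable[[T], T],
    launch: Callable[[T, T], Work],
) -> List[T]:
    """Launch one async exchange per input, then wait for all of them."""
    outputs = [alloc_like(t) for t in inputs]
    works = [launch(out, inp) for out, inp in zip(outputs, inputs)]
    for work in works:
        work.wait()
    return outputs


# --- Stand-ins (hypothetical) so the sketch runs on its own ---
waited: List[int] = []  # records the size of each buffer that was awaited


class FakeWork:
    def __init__(self, size: int) -> None:
        self.size = size

    def wait(self) -> None:
        waited.append(self.size)


def fake_alloc(t: list) -> list:
    return [0] * len(t)


def fake_launch(out: list, inp: list) -> FakeWork:
    out[:] = inp  # a real all-to-all would shuffle shards across ranks
    return FakeWork(len(inp))


def run(use_qkv_fusion: bool) -> List[list]:
    q, k, v = [1, 2], [3, 4], [5, 6]
    # The fusion flag is consulted once, when the work list is built.
    inputs = [q + k + v] if use_qkv_fusion else [q, k, v]
    return exchange_all(inputs, fake_alloc, fake_launch)


fused_out = run(True)      # one buffer, one wait
separate_out = run(False)  # three buffers, three waits
```

With real tensors, one plausible wiring is `alloc_like=torch.empty_like` and `launch=lambda out, inp: dist.all_to_all_single(out, inp, group=group, async_op=True)`. The FP8 path would need its own `launch` that also handles quantization and scales, which this sketch leaves out.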

root and others added 11 commits January 16, 2026 09:58

- ee69f31 ring_attn_fp8_kv_fusion
- 37febdb main-sync
- 31fcd26 ring_attn_fp8_kv_fusion
- 5bf8f6d ring_attn_fp8_kv_fusion_gemini_code_assist
- 19e3561 ring_attn_fp8_comm_kv_fusion
- 10e2701 Merge branch 'ModelTC:main' into xh/1
- 97c22b5 Merge branch 'ModelTC:main' into xh/1
- f4d9285 ring_attn.py
- a7fb314 Merge branch 'ModelTC:main' into xh/1
- 9a78347 Merge branch 'ModelTC:main' into xh/1
- e078521 add ulysses_qkv_fusion option

gemini-code-assist bot reviewed Feb 5, 2026

helloyongyang approved these changes Feb 5, 2026

helloyongyang merged commit 86bf5d4 into ModelTC:main Feb 5, 2026
1 check passed

helloyongyang merged 11 commits into ModelTC:main from xiehaohx:xh/1
