refactor: unify rec multi round decode mode with one-stage flag. by LMX-xin · Pull Request #1000 · jd-opensource/xllm

LMX-xin · 2026-03-05T03:23:10Z

What This PR Changes

Uses a single gflag to select decode mode:
- FLAGS_enable_xattention_one_stage == true -> one-stage
- FLAGS_enable_xattention_one_stage == false -> two-stage
Removes legacy/extra mode-selection paths:
- enable_xattention_two_stage_decode
- enable_xattention_one_stage_decode
- is_xattention_two_stage_decode_enabled()
Removes cache-state-based mode gating:
- No branch selection based on xattention_two_stage_decode_cache.has_value()
- No branch selection based on aggregated two_stage_* .defined() checks
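
For illustration, the single-flag selection described above could look roughly like the sketch below. This is not code from the PR: `DecodeMode`, `select_decode_mode`, and `decode_mode_name` are hypothetical names, and the flag is modelled as a plain global so the snippet stands alone (in the real code it would presumably be defined through gflags).

```cpp
#include <string>

// Stand-in for the real gflag. Assumption: in xllm this is defined via
// gflags rather than as a plain global; a global keeps the sketch standalone.
bool FLAGS_enable_xattention_one_stage = false;

// Hypothetical enum, introduced only for this illustration.
enum class DecodeMode { kOneStage, kTwoStage };

// The flag is the only input to the decision: no cache has_value() checks
// and no tensor .defined() checks take part in branch selection.
DecodeMode select_decode_mode() {
  return FLAGS_enable_xattention_one_stage ? DecodeMode::kOneStage
                                           : DecodeMode::kTwoStage;
}

// Hypothetical helper for logging the selected mode.
std::string decode_mode_name(DecodeMode mode) {
  return mode == DecodeMode::kOneStage ? "one-stage" : "two-stage";
}
```

With this shape, every call site asks the same function for the mode, so the two paths cannot disagree because of differing cache or tensor state.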

LMX-xin · 2026-03-05T03:24:07Z

I will rebase after #933 is merged.

gemini-code-assist

Code Review

This pull request introduces a significant refactoring to unify the multi-round decode modes for rec under a single flag, FLAGS_enable_xattention_one_stage. The changes are extensive, touching attention kernels, metadata builders, and the CUDA graph executor. The two modes, one-stage and two-stage decode, are now cleanly separated, with the two-stage path implementing a shared/unshared attention optimization. New components like XAttentionWorkspace and xattention_planinfo have been added to support this. Crucially, a new test has been added to compare the outputs of both decode paths, ensuring correctness of the refactoring. The implementation appears solid and consistent across the codebase. I have no high or critical severity comments.
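
As a rough illustration of why a one-stage and a two-stage path can be compared for equality, the sketch below computes single-query attention once over all keys and once as two partial results (a shared prefix and an unshared suffix) merged through their log-sum-exp values. It is an assumption that the PR's shared/unshared split works this way; the review only says the two-stage path implements a shared/unshared optimization. All names here (`PartialAttention`, `attend`, `merge`, `one_stage`, `two_stage`) are hypothetical, and values are scalars to keep the example small.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Partial attention over a subset of keys: the softmax-weighted value and
// the log-sum-exp (LSE) of the scores in that subset.
struct PartialAttention {
  double value;
  double lse;
};

// Single-query attention over the non-empty key range [begin, end).
PartialAttention attend(const std::vector<double>& scores,
                        const std::vector<double>& values,
                        std::size_t begin, std::size_t end) {
  double max_score = -std::numeric_limits<double>::infinity();
  for (std::size_t i = begin; i < end; ++i) {
    max_score = std::max(max_score, scores[i]);
  }
  double denom = 0.0;
  double numer = 0.0;
  for (std::size_t i = begin; i < end; ++i) {
    const double w = std::exp(scores[i] - max_score);  // stabilized weight
    denom += w;
    numer += w * values[i];
  }
  return {numer / denom, max_score + std::log(denom)};
}

// Combine two partial results as if softmax had run over the union of keys.
PartialAttention merge(const PartialAttention& a, const PartialAttention& b) {
  const double m = std::max(a.lse, b.lse);
  const double wa = std::exp(a.lse - m);
  const double wb = std::exp(b.lse - m);
  return {(wa * a.value + wb * b.value) / (wa + wb), m + std::log(wa + wb)};
}

// One-stage: a single pass over every key.
double one_stage(const std::vector<double>& scores,
                 const std::vector<double>& values) {
  return attend(scores, values, 0, scores.size()).value;
}

// Two-stage: shared prefix and unshared suffix computed separately, then
// merged. Requires 0 < shared_len < scores.size().
double two_stage(const std::vector<double>& scores,
                 const std::vector<double>& values, std::size_t shared_len) {
  const PartialAttention shared = attend(scores, values, 0, shared_len);
  const PartialAttention unshared =
      attend(scores, values, shared_len, scores.size());
  return merge(shared, unshared).value;
}
```

A comparison test in this style would assert that `one_stage` and `two_stage` agree within a small floating-point tolerance for every valid split point.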

DragonFive

LGTM

gemini-code-assist Bot reviewed Mar 5, 2026


LMX-xin mentioned this pull request Mar 5, 2026

feat: add REC multi-round two-stage xattention support with CUDA Graph integration. #933

Merged

LMX-xin force-pushed the feat/refactor_rec_attention branch 2 times, most recently from 973d75e to 4ed51fb Compare March 11, 2026 02:17

XuZhang99 reviewed Mar 11, 2026


Comment thread xllm/core/layers/cuda/xattention_test.cpp Outdated

LMX-xin force-pushed the feat/refactor_rec_attention branch from 4ed51fb to 43e1cc4 Compare March 11, 2026 03:52

LMX-xin marked this pull request as ready for review March 11, 2026 06:26

LMX-xin requested review from DongheJin, JimHsiung, RobbieLeung, liutongxuan, walsonyang and yq33victor as code owners March 11, 2026 06:26

LMX-xin requested review from DragonFive, Wang-1F and magicheng0816 March 11, 2026 06:28

JimHsiung previously approved these changes Mar 11, 2026


DragonFive previously approved these changes Mar 11, 2026


magicheng0816 previously approved these changes Mar 11, 2026


walsonyang reviewed Mar 11, 2026


Comment thread xllm/core/runtime/rec_worker_impl.cpp Outdated

refactor: unify rec multi round decode mode with one-stage flag.

b6134fc

LMX-xin dismissed stale reviews from magicheng0816, DragonFive, and JimHsiung via b6134fc March 11, 2026 08:14

LMX-xin force-pushed the feat/refactor_rec_attention branch from 43e1cc4 to b6134fc Compare March 11, 2026 08:14

walsonyang approved these changes Mar 11, 2026


DragonFive approved these changes Mar 11, 2026


DragonFive merged commit 10b8122 into jd-opensource:main Mar 12, 2026
13 of 39 checks passed

DragonFive merged 1 commit into jd-opensource:main from LMX-xin:feat/refactor_rec_attention

6 participants
