[ATOM-SGL][Attn refrac] Separate model-specific MLA from SGL full attention backend#28

Open
ZhiweiYan-96 wants to merge 3 commits into main from zhiwei/attn_model_decouple

@ZhiweiYan-96
Collaborator

Motivation

The SGLang plugin has three components:

  1. Attention: GDN and full attention
  2. Model forward patches, such as the DeepSeek MLA forward and the Qwen3.5 forward
  3. SGLang runtime management: managing forward-batch-related information

@ZhiweiYan-96 ZhiweiYan-96 changed the title Zhiwei/attn model decouple [ATOM-SGL][Attn refrac] Separate model-specific MLA from SGL full attention backend May 12, 2026