Pull requests: alibaba/rtp-llm
feat - refactor fmha python in cudagraph & adapt pymodel mla cudagraph
#463 opened Dec 17, 2025 by Nancheng-11
fix: extra tokens after stop word and glm missing separator in tool call
#458 opened Dec 15, 2025 by soaringk
opt startup speed: reduce too long health check interval & move import in func
#456 opened Dec 15, 2025 by ABNER-1
feat: support python-xqa with CUDA 12.9 and compatible with CUDA 12.6
#444 opened Dec 11, 2025 by qqbbiu
fix: wrong residual when pre decoder layernorm + mm embedding + quant
#426 opened Dec 4, 2025 by LLLLKKKK
feat: support return raw output and output ids in debug info
#421 opened Dec 2, 2025 by soaringk