Pull requests: alibaba/rtp-llm

Pull requests list

fix - some qwen3_reranker have no lm_weight
#462 opened Dec 17, 2025 by jianglan89
Fix shutdown
#461 opened Dec 17, 2025 by jianglan89
feat: rocm maskLogits
#459 opened Dec 16, 2025 by CrimsonDump
[WIP] Feature/support qwen next merge
#457 opened Dec 15, 2025 by alibaba-miji
Ll/host pool
#455 opened Dec 15, 2025 by jianglan89
fix: remove fallback and fastgen
#454 opened Dec 15, 2025 by xinfei-shi
[draft] Features/refactor frontend
#452 opened Dec 15, 2025 by wanglining97
Feat/refactor cuda graph ut
#450 opened Dec 15, 2025 by JackTan25
support fp8 fmha for rocm pymodel
#449 opened Dec 12, 2025 by liaocz
Feature/flashinfer python merge
#448 opened Dec 12, 2025 by zerozw
[WIP] feat: support dt customize embedding methods
#447 opened Dec 12, 2025 by yinjuncheng
feat: support trtllm_fp4_block_scale_routed_moe
#446 opened Dec 11, 2025 by CrimsonDump
hotfix: fix try catch
#442 opened Dec 10, 2025 by wanglining97
refactor(rtp_llm): fix frontend loop stuck issue
#438 opened Dec 10, 2025 by sunmiaozju
[draft] master 0.0.2
#427 opened Dec 4, 2025 by jianglan89
feat: support pure tp + ep for cuda graph
#425 opened Dec 3, 2025 by JackTan25
refactor: optimize token reorder impl
#419 opened Dec 2, 2025 by MMadhatter
feature - add viztracer for inference api
#417 opened Dec 2, 2025 by jianglan89
Support DeepSeek v3.2 encoding module
#415 opened Dec 2, 2025 by soaringk