
Pull requests: alibaba/rtp-llm

Pull requests list

Develop/bailian
#813 opened Mar 22, 2026 by xinfei-shi
fix - fix qwen35 moe reasoning parser
#812 opened Mar 20, 2026 by alibaba-miji
add emb_dim in modelConfig
#811 opened Mar 20, 2026 by yinjuncheng
fix: add a timeout to the checkout step and optimize parameter configuration
#810 opened Mar 20, 2026 by guoj14
fix: fix module_mha_batch_prefill
#808 opened Mar 20, 2026 by liaocz
feat: use rtp-kernel fp8 group gemm
#807 opened Mar 20, 2026 by moui0
fix: Skip cudagraph capture at prefillWarmup stage
#805 opened Mar 20, 2026 by bppps
feat: duplicated kv cache.
#804 opened Mar 19, 2026 by ZhangZhiPku
[WIP] feat: support xgrammer
#803 opened Mar 19, 2026 by wanglining97
Feat/remove cpp device
#802 opened Mar 19, 2026 by JackTan25
implement kvcache refactor
#801 opened Mar 19, 2026 by ZhihanYan
fix: qwen35 dense model weight loading
#799 opened Mar 18, 2026 by bppps
fix - use triton kernel to prepare cuda graph
#798 opened Mar 18, 2026 by zerozw
Feature/p2p connector 3
#795 opened Mar 17, 2026 by zhangchicc
SwapAB + Async Load Cache
#794 opened Mar 17, 2026 by alibaba-miji
4 tasks done
Feat/support sparse cp reuse cache
#792 opened Mar 17, 2026 by MMadhatter
Develop/logprobs opt
#791 opened Mar 17, 2026 by zongyuanwu
feat: support flashinfer qkrmsnorm
#789 opened Mar 17, 2026 by Bruce-Lee-LY
Feat/support sparse cp
#780 opened Mar 14, 2026 by Nancheng-11
Feat/support gqa cp reuse cache
#770 opened Mar 11, 2026 by MMadhatter
feat:refactor graph to support rocm backend
#761 opened Mar 10, 2026 by muse-coder