Pull requests: ROCm/TransformerEngine

Pull requests list

Fix CK FP8 grouped GEMM dtype gating for columnwise operands
#594 opened May 21, 2026 by aris134 Contributor Draft
13 tasks
Triton RMSNorm Optimizations
#593 opened May 20, 2026 by Micky774 Contributor
7 of 13 tasks
speed up nvte_multi_padding / nvte_multi_unpadding (label: ci-level 1)
#592 opened May 20, 2026 by matthiasdiener Contributor
13 tasks
Bump CI retention days (label: ci-level 1)
#591 opened May 20, 2026 by matthiasdiener Contributor
1 of 13 tasks
add production GEMM tests (label: ci-level 1)
#590 opened May 19, 2026 by matthiasdiener Contributor Draft
13 tasks
[WIP] Load TE core with RTLD_LOCAL to stop rocroller symbol leak (label: ci-level 3)
#589 opened May 15, 2026 by sudhu2k Contributor Draft
13 tasks
mxfp8: use nvte_multi_tensor_quantize (label: ci-level 1)
#588 opened May 15, 2026 by matthiasdiener Contributor Draft
1 of 13 tasks
MXFP8 training bug fixes for quantized_model_init and Torch FSDP fp8 all gather (label: ci-level 3)
#587 opened May 15, 2026 by sudhu2k Contributor
8 of 13 tasks
Optimized rocm specific multicast transpose kernel (label: ci-level 3)
#586 opened May 14, 2026 by alextmagro Contributor
Add custom multi_tensor_apply kernels (L2norm, Adam) (label: ci-level 1)
#585 opened May 13, 2026 by matthiasdiener Contributor Draft
1 of 13 tasks
CK JIT integration (label: ci-level 1)
#582 opened May 11, 2026 by ipanfilo Collaborator
1 of 13 tasks
Add Tealite: pure-Python TransformerEngine for ROCm/AMD GPUs
#581 opened May 7, 2026 by jayfurmanek Contributor
7 of 8 tasks
CK Tile MXFP8 Group GEMM gfx1250 (label: ci-level 1)
#578 opened May 6, 2026 by aris134 Contributor
1 of 13 tasks
CK Tile Group GEMM gfx1250 (label: ci-level 1)
#576 opened May 6, 2026 by aris134 Contributor
1 of 13 tasks
ck_tile grouped gemm: more padding (label: ci-level 1)
#574 opened May 5, 2026 by matthiasdiener Contributor
1 of 13 tasks
[ROCm] Allow bf16/bf16/fp32 in nvte_multi_tensor_gemm dispatcher (label: ci-level 1)
#573 opened May 4, 2026 by lizamd
13 tasks
[No Merge][No Review] testing aiter auto trigger on gh action (label: ci-level 2)
#570 opened May 1, 2026 by VeeraRajasekhar Contributor Draft
13 tasks
add MXFP8 pre-swizzling for gfx1250 GEMM (label: ci-level 1)
#568 opened Apr 29, 2026 by matthiasdiener Contributor
13 tasks
HipKittens MXFP8 GEMM Support (label: ci-level 3)
#566 opened Apr 28, 2026 by alextmagro Contributor
[WIP] TDM porting
#558 opened Apr 22, 2026 by wangye805 Collaborator Draft
13 tasks
IFU v2.14.dev0 (label: ci-level 3)
#557 opened Apr 21, 2026 by ipanfilo Collaborator
5 of 13 tasks
Enable CI lint gh action on ROCm (label: ci-level 3)
#547 opened Apr 17, 2026 by VeeraRajasekhar Contributor
13 tasks
CI: auto-trigger AITER prebuilt upload when 3rdparty/aiter updates on dev
#543 opened Apr 15, 2026 by VeeraRajasekhar Contributor
8 of 13 tasks
Integrate AITER fused RoPE kernels with fallback to TE native
#541 opened Apr 15, 2026 by suachong Contributor
7 tasks done