fix(inference): wide f16 GEMV kernel restores 160 tok/s decode throughput #151

Merged
ohdearquant merged 2 commits into main from fix/lm-head-wide-kernel
May 31, 2026

Commits on May 31, 2026