fix(inference): wide f16 GEMV kernel restores 160 tok/s decode throughput #151

Merged
ohdearquant merged 2 commits into main from fix/lm-head-wide-kernel on May 31, 2026

feat(inference): add ppl_metal binary + pyproject.toml for dev deps

51154a2