W8A8/W4A8 inference on Apple Silicon — unlocking unused INT8 TensorOps in M5 for 1.2–1.9× faster LLM prefill, built as MLX custom primitives.
Updated May 3, 2026 - Python