ZMLX targets macOS on Apple Silicon (M-series) with MLX installed.
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install mlx
pip install -e ".[dev]"

pytest -q

Note: Most tests require Metal on macOS arm64. On unsupported hosts, tests are skipped (see tests/conftest.py).
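
The contents of tests/conftest.py are not shown here, so the following is only a minimal sketch of how a host check of this kind could look. The function name `metal_supported` is hypothetical, and the check assumes that "supported" means macOS on arm64, as the note above states.

```python
import platform
from typing import Optional


def metal_supported(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Return True when the host looks like macOS on Apple Silicon.

    Hypothetical helper: the arguments default to the current host and exist
    only so the check can be exercised on any machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


# In a conftest.py, a check like this could feed a skip marker, for example:
#   requires_metal = pytest.mark.skipif(not metal_supported(), reason="needs Metal on macOS arm64")
```

With a marker like the one in the comment, tests decorated with it are reported as skipped rather than failed on Linux or Intel hosts.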
ruff check .
mypy src

- Pick the right module under `src/zmlx/kernels/` or add a new one.
- Follow existing patterns:
  - cache kernels with `@cache` and validate shapes early
  - accept `threadgroup` when relevant (all kernels compute in float32 internally)
  - use `DEFAULT_HEADER` for shared helpers (sigmoid/silu/gelu_tanh)
- Add or update `__all__` in the module.
- Add a correctness test in `tests/` (compare against MLX reference ops).
- Update `docs/KERNELS.md` and, if user-facing, `README.md`.
- Add an example under `examples/` for new public APIs.
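
The "cache and validate early" pattern from the list above can be sketched without MLX. This is an illustration, not the project's code: it assumes `@cache` refers to `functools.cache`, and the names `EXAMPLE_HEADER`, `silu_source`, and `prepare_silu` are hypothetical stand-ins (the real `DEFAULT_HEADER` is not reproduced here).

```python
from functools import cache

# Hypothetical stand-in for a shared helper header.
EXAMPLE_HEADER = (
    "inline float sigmoid_f32(float x) { return 1.0f / (1.0f + metal::exp(-x)); }"
)


@cache
def silu_source(threadgroup: int) -> str:
    """Return the kernel body; built once per threadgroup size, then reused."""
    return (
        f"// tuned for threadgroup={threadgroup}\n"
        "uint i = thread_position_in_grid.x;\n"
        "float x = (float)inp[i];  // compute in float32\n"
        "out[i] = (T)(x * sigmoid_f32(x));\n"
    )


def prepare_silu(shape: tuple, threadgroup: int = 256) -> str:
    """Validate arguments up front, then fetch the cached source."""
    if not shape or any(d <= 0 for d in shape):
        raise ValueError(f"expected a non-empty shape with positive dims, got {shape}")
    if threadgroup <= 0:
        raise ValueError(f"threadgroup must be positive, got {threadgroup}")
    return silu_source(threadgroup)
```

Validating before touching the cache keeps bad arguments from producing a confusing failure later, and caching on the hashable arguments means repeated calls with the same `threadgroup` reuse one object.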
- Keep public APIs small and documented.
- Prefer simple, explicit kernel sources over heavy DSL magic.
- Always add correctness tests vs MLX reference ops.
- For performance work, include a reproducible benchmark script and report settings (shape, dtype, device).
- Keep PRs focused.
- Update `README.md` if user-facing behavior changes.
- If you add a new public API, add at least one example in `examples/`.
- Ensure `docs/KERNELS.md` matches the actual kernel catalog.
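
For the benchmark guideline above, a reproducible script mostly needs fixed warmup and iteration counts plus a report line that records the settings. The sketch below uses only the standard library; `bench` and `report` are hypothetical helper names, not part of ZMLX.

```python
import platform
import statistics
import time


def bench(fn, *, warmup: int = 3, iters: int = 20) -> dict:
    """Time fn() after a few warmup calls; return timings in milliseconds."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return {
        "median_ms": statistics.median(samples) * 1e3,
        "min_ms": min(samples) * 1e3,
        "iters": iters,
    }


def report(name: str, shape: tuple, dtype: str, result: dict) -> str:
    """Format one result line that includes shape, dtype, and device."""
    device = f"{platform.system()} {platform.machine()}"
    return (
        f"{name} shape={shape} dtype={dtype} device={device} "
        f"median={result['median_ms']:.3f}ms min={result['min_ms']:.3f}ms "
        f"iters={result['iters']}"
    )
```

Because MLX evaluates lazily, the callable passed to `bench` should force evaluation of its result (for example with `mx.eval`), otherwise the timing covers only graph construction.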