feat: optional MLX embedder backend (torch-free, Apple-silicon native) by ry-ops · Pull Request #18 · ry-ops/aiana

ry-ops · 2026-06-13T19:36:57Z

Summary

Adds an optional MLX embedding backend so aiana can embed on Apple silicon without PyTorch, selectable via AIANA_EMBEDDER_BACKEND=mlx. Sentence-transformers stays the default — no behavior change unless opted in.

Motivation: running the sentence-transformers/torch embedder alongside a local LLM is heavy (it was the dependency that OOM'd a 16 GB M1 in practice). An MLX embedder shares the same unified-memory runtime as MLX-served models, so the embedding layer becomes lighter and fully Apple-silicon-native.

Changes

embeddings/mlx_embedder.py — new MLXEmbedder mirroring the Embedder interface (embed, dimension, similarity), running on mlx-embeddings. Optional import with a clear install hint; defaults to all-MiniLM-L6-v2 (384-dim) — the same model family and vector space as the sentence-transformers default, so existing aiana_memories vectors stay searchable (no migration).
embeddings/embedder.py — get_embedder() selects the backend from AIANA_EMBEDDER_BACKEND (sentence-transformers default, or mlx).
pyproject.toml — new optional extra mlx (mlx-embeddings), added to all.
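
The backend selection and optional-import pattern described above might look roughly like the following. This is a minimal sketch, not the code in this PR: `resolve_backend` and `require_optional` are hypothetical helper names, and the case-insensitive matching, the `ValueError` on unknown values, and the exact wording of the install hint are all assumptions. Only the `AIANA_EMBEDDER_BACKEND` variable, its two values, and the `mlx` extra come from the description.

```python
import importlib
import os

DEFAULT_BACKEND = "sentence-transformers"
KNOWN_BACKENDS = (DEFAULT_BACKEND, "mlx")


def resolve_backend(env=None):
    """Return the backend name chosen via AIANA_EMBEDDER_BACKEND."""
    env = os.environ if env is None else env
    # Assumption: an unset or empty variable means the default backend.
    value = (env.get("AIANA_EMBEDDER_BACKEND") or DEFAULT_BACKEND).strip().lower()
    if value not in KNOWN_BACKENDS:
        # Assumption: the PR does not say how unknown values are handled.
        raise ValueError(
            f"unknown AIANA_EMBEDDER_BACKEND {value!r}; expected one of {KNOWN_BACKENDS}"
        )
    return value


def require_optional(module_name, extra):
    """Import an optional dependency lazily, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f'{module_name} is not installed; try: pip install "aiana[{extra}]"'
        ) from exc
```

With helpers like these, an MLX embedder could call `require_optional(...)` with the import name of mlx-embeddings inside its constructor, so the dependency is only touched when `resolve_backend()` returns `"mlx"` and torch-only installs never try to import it.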

Usage

pip install "aiana[mlx]"
export AIANA_EMBEDDER_BACKEND=mlx

Verification

On Python 3.13 / Apple silicon:

AIANA_EMBEDDER_BACKEND=mlx → get_embedder() -> MLXEmbedder
model: mlx-community/all-MiniLM-L6-v2-bf16
dimension: 384 | embedding L2-norm: 1.0
semantic separation (related > unrelated): True
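
For readers who want to reproduce checks of this kind, the three properties reported above (dimension, unit L2 norm, related scoring above unrelated) can be expressed as below. This is a hedged sketch in pure Python: `check_embedder` is a hypothetical helper, the `embed` argument is assumed to be any callable mapping a string to a list of floats, and the norm tolerance of 1e-2 is an assumption chosen to allow for bf16 rounding. It is not the script used to produce the output above.

```python
import math


def l2_norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))


def check_embedder(embed, dimension, anchor, related, unrelated, tol=1e-2):
    """Run the three sanity checks against an embed(text) -> list[float] callable."""
    a, r, u = embed(anchor), embed(related), embed(unrelated)
    vectors = (a, r, u)
    return {
        # Every vector has the expected length (384 for all-MiniLM-L6-v2).
        "dimension_ok": all(len(v) == dimension for v in vectors),
        # Every vector is unit length within the (assumed) tolerance.
        "unit_norm": all(math.isclose(l2_norm(v), 1.0, abs_tol=tol) for v in vectors),
        # The related text scores higher against the anchor than the unrelated one.
        "separation": cosine_similarity(a, r) > cosine_similarity(a, u),
    }
```

Running the same texts through both backends and comparing cosine similarity across them would be a stronger test of the "same vector space" claim than checking either backend alone.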

Notes

Default backend and behavior are unchanged; this is purely additive/opt-in.
Switching an existing deployment from one model family to another implies re-embedding; defaulting the MLX backend to all-MiniLM-L6-v2 avoids that for the common case.
Independent of #17 ("fix: make aiana installable and compatible with qdrant-client >= 1.12", the install/qdrant-client-compat fixes); this branch is cut from main and stacks cleanly with it.

🤖 Generated with Claude Code

…tive)

Adds an alternative embedding backend that runs on MLX via mlx-embeddings, selectable with AIANA_EMBEDDER_BACKEND=mlx, so aiana can embed on Apple silicon without PyTorch. Defaults to all-MiniLM-L6-v2 (384-dim) -- the same model family and vector space as the sentence-transformers default, so existing aiana_memories vectors remain searchable.

- embeddings/mlx_embedder.py: MLXEmbedder (embed/dimension/similarity), lazy optional import; raises a clear error if mlx-embeddings isn't installed.
- embeddings/embedder.py: get_embedder() selects backend via AIANA_EMBEDDER_BACKEND (default 'sentence-transformers', or 'mlx').
- pyproject: new optional extra 'mlx' (= mlx-embeddings); added to 'all'.

Sentence-transformers remains the default; no behavior change unless opted in.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ry-ops wants to merge 1 commit into main from feat/mlx-embedder-backend
