feat: optional MLX embedder backend (torch-free, Apple-silicon native) #18

Open
ry-ops wants to merge 1 commit into main from feat/mlx-embedder-backend

@ry-ops ry-ops commented Jun 13, 2026

Owner

Summary

Adds an optional MLX embedding backend so aiana can embed on Apple silicon without PyTorch, selectable via AIANA_EMBEDDER_BACKEND=mlx. Sentence-transformers stays the default — no behavior change unless opted in.

Motivation: running the sentence-transformers/torch embedder alongside a local LLM is heavy (it was the dependency that OOM'd a 16 GB M1 in practice). An MLX embedder shares the same unified-memory runtime as MLX-served models, so the embedding layer becomes lighter and fully Apple-silicon-native.

Changes

  • embeddings/mlx_embedder.py: new MLXEmbedder mirroring the Embedder interface (embed, dimension, similarity), running on mlx-embeddings. The import is optional and fails with a clear install hint. Defaults to all-MiniLM-L6-v2 (384-dim), the same model family and vector space as the sentence-transformers default, so existing aiana_memories vectors stay searchable (no migration).
  • embeddings/embedder.py: get_embedder() selects the backend from AIANA_EMBEDDER_BACKEND (sentence-transformers by default, or mlx).
  • pyproject.toml: new optional extra mlx (mlx-embeddings), added to all.
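
For readers skimming without the diff, the environment-variable dispatch described above could look roughly like the sketch below. It is not the code from this branch: the two classes are empty stand-ins for the real Embedder and MLXEmbedder, and the `_BACKENDS` table, the case-insensitive matching, and the ValueError on unknown values are all assumptions made for illustration.

```python
import os


# Hypothetical stand-ins; the real classes live in embeddings/embedder.py
# and embeddings/mlx_embedder.py and load actual models.
class Embedder:
    backend = "sentence-transformers"


class MLXEmbedder:
    backend = "mlx"


# Assumed lookup table mapping backend names to classes.
_BACKENDS = {
    "sentence-transformers": Embedder,
    "mlx": MLXEmbedder,
}


def get_embedder():
    """Return the backend named by AIANA_EMBEDDER_BACKEND.

    Falls back to sentence-transformers when the variable is unset.
    """
    name = os.environ.get("AIANA_EMBEDDER_BACKEND", "sentence-transformers")
    name = name.strip().lower()  # assumption: tolerate case and whitespace
    if name not in _BACKENDS:
        raise ValueError(
            f"Unknown AIANA_EMBEDDER_BACKEND={name!r}; "
            f"expected one of {sorted(_BACKENDS)}"
        )
    return _BACKENDS[name]()
```

Reading the variable at call time rather than at import time means a process that never sets it takes the same path as before, which is what keeps the change opt-in.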

Usage

pip install "aiana[mlx]"
export AIANA_EMBEDDER_BACKEND=mlx

Verification

On Python 3.13 / Apple silicon:

AIANA_EMBEDDER_BACKEND=mlx → get_embedder() -> MLXEmbedder
model: mlx-community/all-MiniLM-L6-v2-bf16
dimension: 384 | embedding L2-norm: 1.0
semantic separation (related > unrelated): True
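
The three checks reported above (dimension, unit L2 norm, related > unrelated) can be reproduced against any backend with a small harness along these lines. This is a sketch, not the script used for the PR: `check_embedder` and the probe sentences are hypothetical, and it assumes the backend can be wrapped as a callable that maps one string to a list of floats.

```python
import math


def l2_norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))


def check_embedder(embed, expected_dim=384):
    """Run the sanity checks on `embed`, a callable str -> list[float].

    The probe sentences are arbitrary examples chosen for this sketch.
    """
    anchor = embed("how do I reset my password")
    related = embed("steps to recover a forgotten password")
    unrelated = embed("recipe for sourdough bread")
    return {
        "dimension": len(anchor),
        "dimension_ok": len(anchor) == expected_dim,
        "unit_norm": math.isclose(l2_norm(anchor), 1.0, abs_tol=1e-3),
        "separation": cosine_similarity(anchor, related)
        > cosine_similarity(anchor, unrelated),
    }
```

Passing these checks shows the backend is wired up and produces normalized 384-dim vectors; it does not by itself prove that the vectors match those of the sentence-transformers backend.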

Notes

  • Default backend and behavior are unchanged; this is purely additive/opt-in.
  • Switching an existing deployment from one model family to another implies re-embedding; defaulting the MLX backend to all-MiniLM-L6-v2 avoids that for the common case.
  • Independent of #17 ("fix: make aiana installable and compatible with qdrant-client >= 1.12", the install/qdrant-client-compat fixes); this branch is cut from main and stacks cleanly with it.

🤖 Generated with Claude Code

…tive)

Adds an alternative embedding backend that runs on MLX via mlx-embeddings,
selectable with AIANA_EMBEDDER_BACKEND=mlx, so aiana can embed on Apple silicon
without PyTorch. Defaults to all-MiniLM-L6-v2 (384-dim) -- the same model family
and vector space as the sentence-transformers default, so existing
aiana_memories vectors remain searchable.

- embeddings/mlx_embedder.py: MLXEmbedder (embed/dimension/similarity), lazy
  optional import; raises a clear error if mlx-embeddings isn't installed.
- embeddings/embedder.py: get_embedder() selects backend via
  AIANA_EMBEDDER_BACKEND (default 'sentence-transformers', or 'mlx').
- pyproject: new optional extra 'mlx' (= mlx-embeddings); added to 'all'.
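
A lazy optional import of the kind described in the first bullet is commonly written as below. This is a generic sketch rather than the code in mlx_embedder.py: the helper name `load_optional` is hypothetical, and the wording of the error message is an assumption.

```python
import importlib


def load_optional(module_name, extra):
    """Import `module_name` on first use.

    If the module is missing, raise an ImportError that names the
    pip extra to install instead of a bare traceback.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this backend. "
            f'Install it with: pip install "aiana[{extra}]"'
        ) from exc
```

Calling something like `load_optional("mlx_embeddings", "mlx")` inside the embedder's constructor, rather than at module import time, would keep the default sentence-transformers path free of any MLX dependency. The import name `mlx_embeddings` is assumed here.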

Sentence-transformers remains the default; no behavior change unless opted in.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>