Rust-native single-binary MLX inference + conversion backend for Apple Silicon
Updated Jun 17, 2026 - Rust
Two-tier caching (L1 in-memory + L2 Redis) with stale-while-revalidate (SWR), a circuit breaker, request deduplication, and LRU eviction. Built for multi-instance deployments.
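The two-tier lookup described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the class name `TwoTierCache`, the `loader` callback, and the plain dict standing in for Redis are all assumptions, and SWR, the circuit breaker, and request dedup are left out.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict


class TwoTierCache:
    """Hypothetical sketch: L1 is a bounded in-process LRU, L2 a shared store.

    A plain dict stands in for Redis here (an assumption for illustration).
    """

    def __init__(self, l2: Dict[str, Any], l1_capacity: int = 2) -> None:
        self.l1: "OrderedDict[str, Any]" = OrderedDict()
        self.l2 = l2
        self.l1_capacity = l1_capacity

    def _put_l1(self, key: str, value: Any) -> None:
        # Insert as most recently used, then evict the least recently used
        # entries until L1 is back within capacity.
        self.l1[key] = value
        self.l1.move_to_end(key)
        while len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)

    def get(self, key: str, loader: Callable[[str], Any]) -> Any:
        # 1. L1 hit: refresh recency and return.
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        # 2. L2 hit: read the shared copy.
        if key in self.l2:
            value = self.l2[key]
        else:
            # 3. Miss in both tiers: load from the origin and fill L2.
            value = loader(key)
            self.l2[key] = value
        # Promote into L1 so the next read on this instance is local.
        self._put_l1(key, value)
        return value


shared: Dict[str, Any] = {}   # stand-in for the shared L2 (Redis) tier
cache = TwoTierCache(shared, l1_capacity=2)
calls = []                    # records which keys reached the origin


def load(key: str) -> str:
    calls.append(key)
    return key.upper()


for k in ("a", "b", "c"):
    cache.get(k, load)
```

With an L1 capacity of 2, reading `a`, `b`, `c` evicts `a` from L1 while it remains in L2, so a later read of `a` is served from the shared tier without calling the loader again.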
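A probability-calculation backend engine that achieved a 75x performance improvement: 97 → 7,347 RPS (micro-batching + consolidating on PostgreSQL alone). Removing complex infrastructure delivered both performance and simplicity.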
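Micro-batching, named in the entry above, generally means grouping many small requests into one downstream call. A minimal sketch of the idea, assuming a size-triggered flush only (the names `MicroBatcher`, `flush_fn`, and `max_batch` are hypothetical, and the listed project's actual design is not shown in this listing):

```python
from typing import Any, Callable, List


class MicroBatcher:
    """Hypothetical sketch: buffer items and hand them downstream in batches."""

    def __init__(self, flush_fn: Callable[[List[Any]], None], max_batch: int = 3) -> None:
        self.flush_fn = flush_fn      # e.g. one multi-row database write
        self.max_batch = max_batch
        self.pending: List[Any] = []

    def submit(self, item: Any) -> None:
        # Buffer the item; flush as soon as the batch is full.
        self.pending.append(item)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered as a single downstream call.
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.flush_fn(batch)


writes: List[List[int]] = []          # each element is one downstream call
batcher = MicroBatcher(writes.append, max_batch=3)
for i in range(7):
    batcher.submit(i)
batcher.flush()                       # drain the final partial batch
```

Seven submissions result in three downstream calls instead of seven. A production batcher would typically also flush on a short timer so that a partial batch does not wait indefinitely.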