# mlx-vlm

Here are 20 public repositories matching this topic...

cubist38 / mlx-openai-server

A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Built with Python and FastAPI, it offers an efficient, scalable way to run MLX-based vision and language models locally.

flux queue speech-recognition image-generation whisper vision-api mlx fastapi multi-models apple-silicon continuous-batching tool-calling structured-outputs mlx-lm mlx-vlm openai-compatible

Updated Jun 8, 2026
Python

magicnight / Mac-MLX

macMLX brings local LLM inference to Apple Silicon with a first-class native macOS experience. No cloud, no telemetry, no Electron — just your Mac running models at full speed.

swift ai mlx swiftui mlx-swift mlx-lm mlx-vlm swama omlx-alternative

Updated May 10, 2026
Swift

cagataycali / strands-mlx

Experimental: MLX model provider for Strands Agents - Build, train, and deploy AI agents on Apple Silicon.

agents mlx mlx-lm mlx-vlm strands-agents

Updated Apr 22, 2026
Python

davepoon / mlx-vlm-smolvlm-realtime-webcam

Real-time webcam demo with SmolVLM (mlx-community/SmolVLM-Instruct-4bit) and MLX-VLM.

mlx vision-framework apple-silicon vision-transformer llms idefics mlx-vlm

Updated Jun 12, 2025
Python

NeptuneIsTheBest / chat-with-mlx

An all-in-one LLMs chat Web UI based on the MLX framework, designed for Apple Silicon.

python gradio mlx llm mlx-lm mlx-vlm

Updated Apr 18, 2026
Python

Harperbot / metal-guard

Defensive layer for mlx / mlx-lm / mlx-vlm on Apple Silicon. Prevents IOGPUFamily kernel panics, SIGABRT, and Mac reboots from MLX inference. Includes a community-curated registry of known-panic models (KNOWN_PANIC_MODELS) with hardware contexts, root-cause hypotheses, and verified workarounds.

Updated May 25, 2026
Python

Flor1an-B / Ka1zen

Local AI for Apple Silicon. Chat with open LLMs (Qwen, Gemma, DeepSeek, Mistral, Llama…) fully offline on your Mac via Apple MLX — private by design, zero telemetry.

Updated Jun 11, 2026

JoeJoe1313 / LLMs-Journey

Various LLM resources and experiments

python machine-learning agents mlx vlm rag apple-silicon llm agentic-ai mlx-lm mlx-vlm

Updated Apr 16, 2026
Jupyter Notebook

llmostlabs / llmost

llmost - the most easy llm host

openai-api ai-tools llm localllm localllama mlx-lm mlx-vlm

Updated Apr 13, 2026

SNiPERxDD / retix

RETIX is a local-first, MLX-based vision CLI for agents that need to inspect screenshots, extract visible text, and verify visual claims with deterministic output.

cli vision agentic-workflow mlx-vlm agentic-coding

Updated Apr 9, 2026
Python

dbiswas55 / VLM-Inferences

A lightweight, config-driven framework for unified vision-language model inference across local and cloud backends.

transformers vlm huggingface inference-framework vision-language-model vllm ollama multimodal-ai mlx-vlm multimodal-inference

Updated Apr 5, 2026
Python

darylalim / flux.2-klein-pipeline

Generate and edit images with the Black Forest Labs FLUX.2 Klein 4B model on Apple Silicon with MLX.

image-editing text-to-image image-to-image flux2 mflux mlx-vlm

Updated May 31, 2026
Python

AreChen / mlx_inference_openai

MLX inference service compatible with the OpenAI API, built on MLX-LM and MLX-VLM.

self-hosted openai mlx openai-api mlx-lm mlx-vlm

Updated May 28, 2025
Python

lebronsvienyahhdih / strands-mlx

🤖 Run Strands Agents on Apple Silicon with ease—perform inference, fine-tune models, and leverage vision capabilities using Python and LoRA training.

agents mlx mlx-lm mlx-vlm strands-agents

Updated Jun 14, 2026
Python

kiarina / mlx-qwen3-omni-server

OpenAI-compatible Qwen3-Omni server on Apple Silicon (MLX / mlx-vlm). Text + image + audio + video in, text / tool calls out. Single-flight queue, resident model, sync-only.

mlx multimodal fastapi apple-silicon llm qwen tool-calling mlx-vlm openai-compatible qwen3-omni

Updated Jun 4, 2026
Python

Sunwood-ai-labs / local-llm-bench-lab

Repeatable Apple Silicon local LLM benchmark scripts, reports, and docs

gemma vitepress apple-silicon llama-cpp local-llm ollama mlx-vlm llm-benchmark

Updated Apr 29, 2026
Python

kiarina / mlx-vlm-server

OpenAI-compatible multi-model Qwen server for Apple Silicon (MLX / mlx-vlm): Qwen3-Omni + Qwen3.6-27B in one process, memory-budgeted resident cache, text/image/audio/video in, text/tool-calls out.

inference-server mlx vlm multimodal fastapi apple-silicon llm vision-language-model qwen tool-calling mlx-vlm openai-compatible qwen3 qwen3-omni

Updated Jun 4, 2026
Python

darylalim / nuextract-pipeline

Extract structured information from documents, convert images to Markdown, and generate templates with the NuMind NuExtract3 model on Apple Silicon with MLX.

structured-extraction document-understanding document-to-markdown mlx-vlm nuextract

Updated May 27, 2026
Python

darylalim / medgemma-pipeline

Analyze medical text, 2D images (e.g. chest X-ray), 3D CT volumes, and whole-slide pathology images with the Google MedGemma 1.5 model on Apple Silicon with MLX.

image-text-to-text mlx-vlm

Updated Jun 8, 2026
Python

jrp2014 / check_models

This repository provides a Python script for running vision-language models via mlx-vlm.

ai mlx vision-language-model mlx-lm mlx-vlm

Updated Jun 14, 2026
Python
