Push-to-talk audio chat with the LiquidAI LFM2.5-Audio-1.5B model.
- 🎤 Push-to-talk recording - Click to start, click again to send
- 🔊 Real-time audio responses - Streaming speech-to-speech conversation
- 💬 Text transcription - See what the model understood and responded with
- 🎵 Seamless audio playback - Gap-free streaming audio synthesis
- 🌐 Web-based interface - No desktop app required
https://lab.kortexa.ai/lfm-2-5/ Requires llama.cpp running LFM2.5-Audio locally using the instructions below.
Run the download script to get the LiquidAI LFM2.5-Audio model and custom llama.cpp server:
```shell
./download.sh
```

This will download (~3GB total):

- `llama-liquid-audio-server` binary (platform-specific)
- `LFM2.5-Audio-1.5B-F16.gguf` (2.2GB) - Main model
- `mmproj-LFM2.5-Audio-1.5B-F16.gguf` (438MB) - Multimodal projector
- `vocoder-LFM2.5-Audio-1.5B-F16.gguf` (369MB) - Audio synthesis
- `tokenizer-LFM2.5-Audio-1.5B-F16.gguf` (136MB) - Tokenizer
Files are saved to ./model/ directory.
Note: This demo currently requires LiquidAI's custom `llama-liquid-audio-server` binary. Once audio support is fully merged into the main llama.cpp repository, you'll be able to use the standard command:

```shell
llama-server -hf LiquidAI/LFM2.5-Audio-1.5B-GGUF:F16 --host 0.0.0.0 --port 2026
```
macOS Security Note: The download script automatically tries to remove the quarantine attribute from the downloaded binary and all library files. If you get a security warning when running ./run.sh, you have two options:
- Go to System Settings > Privacy & Security and click "Allow Anyway" (you may need to do this multiple times for the binary and each .dylib)
- Run manually:

  ```shell
  xattr -dr com.apple.quarantine model/bin/
  ```
Platform Support: Currently, the download script only supports macOS arm64. For other platforms (Linux, Android), manually download from the HuggingFace runners folder.
```shell
./run.sh
```

The server will start on http://localhost:2026
In a separate terminal:
```shell
npm install
npm run dev
```

Open your browser to the URL shown (typically https://localhost:8033)
- Click the microphone button to start recording
- Speak your message
- Click again to stop and send
- Listen to the AI's response!
If you prefer to run the server manually:
```shell
./model/bin/llama-liquid-audio-server \
  -m ./model/LFM2.5-Audio-1.5B-F16.gguf \
  -mm ./model/mmproj-LFM2.5-Audio-1.5B-F16.gguf \
  -mv ./model/vocoder-LFM2.5-Audio-1.5B-F16.gguf \
  --tts-speaker-file ./model/tokenizer-LFM2.5-Audio-1.5B-F16.gguf \
  --host 0.0.0.0 \
  --port 2026
```

- Frontend: React + Vite + TailwindCSS
- Audio Processing: Web Audio API (16kHz input, 24kHz output)
- Backend: LiquidAI's custom llama.cpp server with audio support
- Model: LFM2.5-Audio-1.5B (interleaved text + audio generation)
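Since browsers typically capture microphone audio at 44.1 kHz or 48 kHz, the client has to resample down to the 16 kHz the model expects. A minimal sketch of that step, using linear interpolation on mono Float32 samples (the function name and shape are illustrative, not the demo's actual code):

```javascript
// Linear-interpolation resampler, e.g. 48 kHz -> 16 kHz mono Float32 samples.
// This is a sketch; production code might use an OfflineAudioContext instead.
function resample(samples, fromRate, toRate) {
  const outLength = Math.round(samples.length * toRate / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples consumed per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, samples.length - 1);
    const frac = pos - i0;
    // Interpolate between the two nearest input samples
    out[i] = samples[i0] * (1 - frac) + samples[i1] * frac;
  }
  return out;
}
```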
- Recording: Captures microphone input using MediaRecorder API
- Conversion: Converts to 16kHz mono WAV format
- Encoding: Base64 encodes and sends to server via HTTP POST
- Processing: Server streams back interleaved text and audio chunks
- Playback: Web Audio API schedules audio chunks for seamless playback
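The conversion and encoding steps above can be sketched as a plain function that wraps Float32 samples in a 16-bit PCM WAV container, ready for base64 encoding and the HTTP POST (this is a generic WAV writer, not the demo's exact implementation):

```javascript
// Step 2 of the pipeline: Float32 mono samples -> 16-bit PCM WAV ArrayBuffer.
// Step 3 would then base64-encode it (Buffer in Node, btoa over bytes in the browser).
function encodeWav(samples, sampleRate) {
  const dataSize = samples.length * 2; // 16-bit mono
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeStr = (off, s) => {
    for (let i = 0; i < s.length; i++) view.setUint8(off + i, s.charCodeAt(i));
  };
  writeStr(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);    // RIFF chunk size
  writeStr(8, "WAVE");
  writeStr(12, "fmt ");
  view.setUint32(16, 16, true);              // fmt chunk size
  view.setUint16(20, 1, true);               // audio format: PCM
  view.setUint16(22, 1, true);               // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true);  // byte rate
  view.setUint16(32, 2, true);               // block align
  view.setUint16(34, 16, true);              // bits per sample
  writeStr(36, "data");
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to [-1, 1]
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```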
- Input: 16kHz mono WAV (base64 encoded)
- Output: 24kHz mono PCM float32 (streamed)
- Protocol: Server-Sent Events (SSE) for streaming
- System Prompt:
"Respond with interleaved text and audio."
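On the wire, SSE frames are `data: ...` lines separated by blank lines. A minimal parser for one received chunk might look like the following; note that the JSON payload shape (`text` vs `audio` fields) and the `[DONE]` sentinel are assumptions here, not the server's documented schema, and a real client must also buffer partial frames across network chunk boundaries:

```javascript
// Parse one SSE text chunk into an array of JSON event payloads.
// Assumes each "data:" line carries a complete JSON object (hypothetical schema).
function parseSse(chunk) {
  const events = [];
  for (const block of chunk.split("\n\n")) {     // events are blank-line separated
    for (const line of block.split("\n")) {
      if (line.startsWith("data: ")) {
        const payload = line.slice(6);
        if (payload === "[DONE]") continue;      // assumed end-of-stream sentinel
        events.push(JSON.parse(payload));
      }
    }
  }
  return events;
}
```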
Once LiquidAI's audio support PR is merged into the main llama.cpp repository, the setup will be much simpler:
```shell
# Future simplified command (once merged)
llama-server -hf LiquidAI/LFM2.5-Audio-1.5B-GGUF:F16 \
  --host 0.0.0.0 \
  --port 2026
```

This will automatically:
- Download the model files to `~/.cache/huggingface/` or `~/Library/Caches/llama.cpp/`
- Load the vocoder and tokenizer
- Start the server with audio support
No need for manual downloads or custom binaries!
- Model: LiquidAI/LFM2.5-Audio-1.5B-GGUF
- Custom llama.cpp server with audio support (awaiting merge to main)
- Built with React, Vite, and Web Audio API
MIT