kortexa-ai/lfm-2-5.lab

LFM-2.5 Lab

Push-to-talk audio chat with the LiquidAI LFM2.5-Audio-1.5B model.

Features

  • 🎤 Push-to-talk recording - Click to start, click again to send
  • 🔊 Real-time audio responses - Streaming speech-to-speech conversation
  • 💬 Text transcription - See what the model understood and responded with
  • 🎵 Seamless audio playback - Gap-free streaming audio synthesis
  • 🌐 Web-based interface - No desktop app required

Demo

https://lab.kortexa.ai/lfm-2-5/ (requires llama.cpp running LFM2.5 Audio locally; set it up using the instructions below)

Quick Start

1. Download Model & Server

Run the download script to get the LiquidAI LFM2.5-Audio model and custom llama.cpp server:

./download.sh

This will download (~3GB total):

  • llama-liquid-audio-server binary (platform-specific)
  • LFM2.5-Audio-1.5B-F16.gguf (2.2GB) - Main model
  • mmproj-LFM2.5-Audio-1.5B-F16.gguf (438MB) - Multimodal projector
  • vocoder-LFM2.5-Audio-1.5B-F16.gguf (369MB) - Audio synthesis
  • tokenizer-LFM2.5-Audio-1.5B-F16.gguf (136MB) - Tokenizer

Files are saved to ./model/ directory.

Note: This demo currently requires LiquidAI's custom llama-liquid-audio-server binary. Once audio support is fully merged into the main llama.cpp repository, you'll be able to use the standard command:

llama-server -hf LiquidAI/LFM2.5-Audio-1.5B-GGUF:F16 --host 0.0.0.0 --port 2026

macOS Security Note: The download script automatically tries to remove the quarantine attribute from the downloaded binary and all library files. If you get a security warning when running ./run.sh, you have two options:

  1. Go to System Settings > Privacy & Security and click "Allow Anyway" (you may need to do this multiple times for the binary and each .dylib)
  2. Run manually: xattr -dr com.apple.quarantine model/bin/

Platform Support: Currently, the download script only supports macOS arm64. For other platforms (Linux, Android), manually download from the HuggingFace runners folder.

2. Start the Server

./run.sh

The server will start on http://localhost:2026

3. Start the Web App

In a separate terminal:

npm install
npm run dev

Open your browser to the URL shown (typically https://localhost:8033)

4. Talk!

  1. Click the microphone button to start recording
  2. Speak your message
  3. Click again to stop and send
  4. Listen to the AI's response!

Manual Server Command

If you prefer to run the server manually:

./model/bin/llama-liquid-audio-server \
  -m ./model/LFM2.5-Audio-1.5B-F16.gguf \
  -mm ./model/mmproj-LFM2.5-Audio-1.5B-F16.gguf \
  -mv ./model/vocoder-LFM2.5-Audio-1.5B-F16.gguf \
  --tts-speaker-file ./model/tokenizer-LFM2.5-Audio-1.5B-F16.gguf \
  --host 0.0.0.0 \
  --port 2026

Architecture

  • Frontend: React + Vite + TailwindCSS
  • Audio Processing: Web Audio API (16kHz input, 24kHz output)
  • Backend: LiquidAI's custom llama.cpp server with audio support
  • Model: LFM2.5-Audio-1.5B (interleaved text + audio generation)

How It Works

  1. Recording: Captures microphone input using MediaRecorder API
  2. Conversion: Converts to 16kHz mono WAV format
  3. Encoding: Base64 encodes and sends to server via HTTP POST
  4. Processing: Server streams back interleaved text and audio chunks
  5. Playback: Web Audio API schedules audio chunks for seamless playback
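Steps 2–3 above (raw microphone samples → 16kHz mono WAV → base64) can be sketched as a pure function. This is an illustrative sketch, not this repository's actual code; the function name is hypothetical:

```typescript
// Sketch of the WAV packaging step: wrap 16 kHz mono Float32 samples in a
// minimal 44-byte RIFF/WAVE header as 16-bit PCM. Hypothetical helper name;
// not taken from this repository.
function encodeWav16kMono(samples: Float32Array, sampleRate = 16000): Uint8Array {
  const dataSize = samples.length * 2;        // 16-bit PCM = 2 bytes/sample
  const buf = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buf);
  const writeStr = (off: number, s: string) => {
    for (let i = 0; i < s.length; i++) view.setUint8(off + i, s.charCodeAt(i));
  };
  writeStr(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);     // RIFF chunk size
  writeStr(8, "WAVE");
  writeStr(12, "fmt ");
  view.setUint32(16, 16, true);               // fmt chunk size
  view.setUint16(20, 1, true);                // audio format: PCM
  view.setUint16(22, 1, true);                // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true);   // byte rate
  view.setUint16(32, 2, true);                // block align
  view.setUint16(34, 16, true);               // bits per sample
  writeStr(36, "data");
  view.setUint32(40, dataSize, true);
  // Clamp floats to [-1, 1] and convert to signed 16-bit little-endian.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(buf);
}
```

In the browser, the resulting bytes would then be base64-encoded (e.g. via `FileReader.readAsDataURL`) and POSTed to the server as described in step 3.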

Technical Details

  • Input: 16kHz mono WAV (base64 encoded)
  • Output: 24kHz mono PCM float32 (streamed)
  • Protocol: Server-Sent Events (SSE) for streaming
  • System Prompt: "Respond with interleaved text and audio."
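The SSE stream and gap-free playback described above can be sketched as below. This is a hedged sketch: the event payload shape (`text`/`audio` fields, `[DONE]` terminator) is an assumption for illustration, not the server's documented schema:

```typescript
// Assumed payload shape for each SSE "data:" frame -- illustrative only.
interface Chunk { text?: string; audio?: string /* base64 float32 PCM */ }

// Split a raw SSE buffer into parsed "data:" payloads, stopping at [DONE].
function parseSseChunks(raw: string): Chunk[] {
  const chunks: Chunk[] = [];
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data:")) continue;
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") break;
    try { chunks.push(JSON.parse(payload)); } catch { /* skip partial frames */ }
  }
  return chunks;
}

// Gap-free scheduling arithmetic: each chunk starts exactly where the
// previous one ends, but never earlier than the audio clock's "now".
function nextStartTime(
  currentTime: number,   // e.g. AudioContext.currentTime
  prevEnd: number,       // end time of the previously scheduled chunk
  chunkSeconds: number,  // duration of this chunk (samples / 24000)
): { start: number; end: number } {
  const start = Math.max(currentTime, prevEnd);
  return { start, end: start + chunkSeconds };
}
```

In the browser, each decoded chunk would be copied into a 24kHz `AudioBuffer` and played through an `AudioBufferSourceNode` started at the computed time, which is what makes the playback seamless.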

Future: Standard llama-server Support

Once LiquidAI's audio support PR is merged into the main llama.cpp repository, the setup will be much simpler:

# Future simplified command (once merged)
llama-server -hf LiquidAI/LFM2.5-Audio-1.5B-GGUF:F16 \
  --host 0.0.0.0 \
  --port 2026

This will automatically:

  • Download the model files to ~/.cache/huggingface/ or ~/Library/Caches/llama.cpp/
  • Load the vocoder and tokenizer
  • Start the server with audio support

No need for manual downloads or custom binaries!

Credits

License

MIT
