| title | whisper.api |
|---|---|
| emoji | 😶🌫️ |
| colorFrom | purple |
| colorTo | gray |
| sdk | docker |
| app_file | Dockerfile |
| app_port | 7860 |

An open-source, high-performance, self-hosted API for speech-to-text transcription powered by whisper.cpp.
This project provides a Deepgram-compatible interface (REST & WebSocket), making it easy to integrate into existing workflows while maintaining full data ownership.
- Standardized API: Drop-in compatible with `/v1/listen` endpoints.
- Advanced Transcription: Custom vocabulary (prompting), audio cropping (`start`/`duration`), and speaker diarization.
- Flexible Formats: Native support for JSON, SRT, and VTT exports.
- Live Streaming: Real-time 16kHz PCM transcription via WebSockets.
- Offline Management: Simple CLI for secure API key generation and model management.
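
For the live-streaming feature above, a client has to supply 16 kHz PCM audio. The sketch below shows one way to prepare it in Python: convert float samples to PCM bytes and cut them into fixed-size frames. The feature list only states "16kHz PCM", so the 16-bit signed little-endian mono layout and the 100 ms frame size are assumptions, and `floats_to_pcm16` and `iter_frames` are hypothetical helper names. Opening the WebSocket and sending the frames is left out because the endpoint path and message framing are not shown here; check the docs for those.

```python
import struct
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000   # from the feature list: 16 kHz PCM
BYTES_PER_SAMPLE = 2   # assumption: 16-bit signed little-endian, mono


def floats_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to 16-bit little-endian PCM."""
    # Clip out-of-range samples so they cannot overflow the 16-bit range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(s * 32767) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def iter_frames(pcm: bytes, frame_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive frames of `frame_ms` milliseconds of audio.

    The last frame may be shorter than the others.
    """
    frame_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * frame_ms // 1000
    for offset in range(0, len(pcm), frame_bytes):
        yield pcm[offset:offset + frame_bytes]
```

At 16 kHz and 2 bytes per sample, a 100 ms frame is 3200 bytes. Smaller frames lower latency at the cost of more messages; the right trade-off depends on the server, so treat the default as a starting point.
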
Documentation lives in the `docs/` folder (Astro Starlight). Run it locally with Bun:

    cd docs && bun install && bun run dev

What you will find in the docs:
- Getting started and local setup
- Authentication and API keys
- REST and WebSocket API reference
- Code examples
- Models and deployment guides
- Contributing workflow

Install the dependencies, create your `.env`, and run the setup script:

    pip install -r requirements.txt
    cp .env.example .env
    chmod +x setup_whisper.sh
    ./setup_whisper.sh

Run the CLI `init` step, then create an API key:

    python -m app.cli init
    python -m app.cli create --name "MyAdminKey"

Note: For local testing only, you can enable `POST /v1/auth/test-token` in Swagger by setting `ENABLE_TEST_TOKEN_ENDPOINT=true`. It defaults to off; never enable it in production.


Start the server:

    uvicorn app.main:app --host 0.0.0.0 --port 7860

Transcribe a local file:

    curl -X POST 'http://localhost:7860/v1/listen' \
      -H "Authorization: Token <YOUR_KEY>" \
      -H "Content-Type: audio/wav" \
      --data-binary @audio.wav

Transcribe audio from a URL:

    curl -X POST 'http://localhost:7860/v1/listen' \
      -H "Authorization: Token <YOUR_KEY>" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://example.com/audio.mp3"}'

The server fetches the URL for you with SSRF protections (public hosts only, size limits; redirects off by default). See `docs/` or `.env.example` for `MAX_AUDIO_DOWNLOAD_BYTES`, `AUDIO_URL_FOLLOW_REDIRECTS`, and related settings.

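
The two `curl` calls to `/v1/listen` can be mirrored in Python using only the standard library. This is a minimal sketch, not an official client: `build_file_request`, `build_url_request`, and `send` are hypothetical helper names, `BASE_URL` assumes the server is running locally on port 7860, and `send` assumes the response body is JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: local server on port 7860


def build_file_request(api_key: str, audio_bytes: bytes,
                       content_type: str = "audio/wav") -> urllib.request.Request:
    """Mirror the file-upload curl call: raw audio bytes as the request body."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/listen",
        data=audio_bytes,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )


def build_url_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Mirror the URL curl call: a JSON body naming the audio to fetch."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/listen",
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 300.0) -> dict:
    """Send a prepared request and decode the response, assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a running server and a valid key, usage would look like `send(build_file_request(key, open("audio.wav", "rb").read()))`. Building the request separately from sending it makes the headers and body easy to inspect before any network call is made.
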
Author: Ved Gupta