Whisper API 🎙️

title	whisper.api
emoji	😶‍🌫️
colorFrom	purple
colorTo	gray
sdk	docker
app_file	Dockerfile
app_port	7860

An open-source, high-performance, self-hosted API for speech-to-text transcription powered by whisper.cpp.

This project provides a Deepgram-compatible interface (REST & WebSocket), making it easy to integrate into existing workflows while maintaining full data ownership.

Key Features

Standardized API: Drop-in compatible with Deepgram's /v1/listen endpoint.
Advanced Transcription: Custom vocabulary (prompting), audio cropping (start/duration), and speaker diarization.
Flexible Formats: Native support for JSON, SRT, and VTT exports.
Live Streaming: Real-time 16 kHz PCM transcription via WebSockets.
Offline Management: Simple CLI for secure API key generation and model management.
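
For the live-streaming feature, a client usually sends audio in small fixed-duration chunks. The sketch below shows the arithmetic for 16 kHz PCM. The 16-bit sample width and mono channel layout are assumptions (the feature list states only the sample rate), and the helper names are hypothetical, so check the docs before relying on them.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # from the feature list above
BYTES_PER_SAMPLE = 2     # assumption: 16-bit signed samples
CHANNELS = 1             # assumption: mono audio


def chunk_size_bytes(chunk_ms: int) -> int:
    """Number of bytes that hold chunk_ms milliseconds of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000


def iter_pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM; the last one may be shorter."""
    size = chunk_size_bytes(chunk_ms)
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]
```

Each yielded chunk would then be sent as one binary WebSocket message. Under these assumptions a 100 ms chunk is 3200 bytes.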

Documentation

Documentation lives in the docs/ folder (Astro Starlight). Run it locally with Bun:

cd docs && bun install && bun run dev

What you will find in the docs:

Getting started and local setup
Authentication and API keys
REST and WebSocket API reference
Code examples
Models and deployment guides
Contributing workflow

Quick Start

1. Installation

pip install -r requirements.txt
cp .env.example .env
chmod +x setup_whisper.sh
./setup_whisper.sh

2. Setup Database & Keys

python -m app.cli init
python -m app.cli create --name "MyAdminKey"
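
For readers curious about what secure key generation typically involves, here is a minimal sketch of one common approach: generate a random token, show it once, and store only its hash. This is an illustration under stated assumptions, not the project's actual CLI code; `generate_api_key` and `verify_api_key` are hypothetical names.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, sha256_hex_digest).

    The plaintext is shown to the user once; only the digest would be stored.
    """
    key = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key, digest


def verify_api_key(presented_key: str, stored_digest: str) -> bool:
    """Hash the presented key and compare it to the stored digest in constant time."""
    candidate = hashlib.sha256(presented_key.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Storing only the digest means a leaked database does not directly expose usable keys.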

Note: For local testing only, you can enable POST /v1/auth/test-token in Swagger by setting ENABLE_TEST_TOKEN_ENDPOINT=true. It defaults to off; never enable it in production.

3. Start the Server

uvicorn app.main:app --host 0.0.0.0 --port 7860

4. Transcribe a File (cURL)

curl -X POST 'http://localhost:7860/v1/listen' \
  -H "Authorization: Token <YOUR_KEY>" \
  -H "Content-Type: audio/wav" \
  --data-binary @audio.wav
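
The same request can be sketched in Python using only the standard library. The endpoint, header names, and port mirror the cURL command above; `build_file_request` and `transcribe_file` are hypothetical helper names, and the response is returned as raw text because its exact shape is not described here.

```python
import urllib.request

BASE_URL = "http://localhost:7860"  # matches the uvicorn command above


def build_file_request(api_key: str, audio: bytes,
                       content_type: str = "audio/wav") -> urllib.request.Request:
    """Build a POST /v1/listen request carrying raw audio bytes."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/listen",
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe_file(path: str, api_key: str) -> str:
    """Send a local audio file to the server and return the response body as text."""
    with open(path, "rb") as f:
        request = build_file_request(api_key, f.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe_file("audio.wav", "<YOUR_KEY>")` against a running server would be the equivalent of the cURL command.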

5. Transcribe from URL (cURL)

curl -X POST 'http://localhost:7860/v1/listen' \
  -H "Authorization: Token <YOUR_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/audio.mp3"}'

The server fetches the URL on your behalf and applies SSRF protections: only public hosts are allowed, download size is capped, and redirects are disabled by default. See docs/ or .env.example for MAX_AUDIO_DOWNLOAD_BYTES, AUDIO_URL_FOLLOW_REDIRECTS, and related settings.
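
To illustrate what a "public hosts only" rule can look like, here is a minimal sketch that resolves a URL's hostname and rejects any address that is not globally routable. It is an illustration only, not the server's actual check; `is_public_ip` and `url_targets_public_host` are hypothetical names, and the `resolve` parameter exists so the lookup can be stubbed out.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_ip(address: str) -> bool:
    """True only for globally routable IPs (not loopback, private, or link-local)."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        return False  # not a parseable IP address


def url_targets_public_host(url: str, resolve=socket.getaddrinfo) -> bool:
    """Accept http(s) URLs whose hostname resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = resolve(parsed.hostname, None)
    except OSError:
        return False  # resolution failed, so refuse the URL
    addresses = {info[4][0] for info in infos}
    # Every resolved address must be public; one private address is enough to reject.
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

A real implementation would also need to connect to the address it validated, since a hostname can resolve differently between the check and the fetch.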

License & References

MIT License

Author: Ved Gupta
