Dalston

Ollama for ASR. Run open-source speech recognition models on your machine or private cloud. Freedom from proprietary APIs, full privacy, no quality compromise.

Why Dalston

Pluggable and extensible — Mix and match transcription, alignment, diarization, and PII detection models. Swap components without breaking your pipeline. Completely open source and free.

Drop-in integration — OpenAI and ElevenLabs compatible APIs mean you can point your existing code at Dalston and it just works. Need more power? The native Dalston API unlocks advanced functionality like multi-engine routing, pipeline customization, and detailed engine metadata.
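As a rough illustration of what "point your existing code at Dalston" could look like at the HTTP level, the sketch below builds (but does not send) a multipart upload to the OpenAI-style route `/v1/audio/transcriptions`. The host and port `http://localhost:8000`, the assumption that Dalston serves that exact path, and the use of `distil-small` as the request's model name are all illustrative guesses, not confirmed by this README; check the REST API reference for the real values.

```python
import uuid
from urllib.request import Request


def build_transcription_request(
    base_url: str, audio: bytes, filename: str, model: str
) -> Request:
    """Build an OpenAI-style multipart transcription request without sending it."""
    boundary = uuid.uuid4().hex
    # Form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    # Form field carrying the raw audio bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + audio + b"\r\n"
    closing = f"--{boundary}--\r\n".encode()

    return Request(
        # Path follows the OpenAI audio API; assumed to be mirrored by Dalston.
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=model_part + file_part + closing,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Hypothetical local address; replace with wherever your server listens.
request = build_transcription_request(
    "http://localhost:8000", b"RIFF....", "test_merged.wav", "distil-small"
)
```

Passing `request` to `urllib.request.urlopen` would send it. If you already use an OpenAI client library, changing only its base URL should be the equivalent move, assuming the compatibility described above.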

Cheap to run — make dev is free. A 1-hour podcast on a spot GPU costs cents. A 24/7 ElevenLabs/OpenAI-compatible API on AWS runs around $87/month all-in. See the cost estimator.
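For a sense of scale, the monthly figure can be converted to an hourly rate. This is back-of-envelope arithmetic only, assuming the $87 covers a full month of continuous uptime (about 730 hours); the actual cost breakdown is not given here, so use the cost estimator for real numbers.

```python
# Monthly cost quoted above for a 24/7 deployment on AWS.
MONTHLY_COST_USD = 87.0

# Average hours in a month: 24 hours * 365 days / 12 months.
HOURS_PER_MONTH = 24 * 365 / 12

# Implied hourly rate under the continuous-uptime assumption.
hourly_cost_usd = MONTHLY_COST_USD / HOURS_PER_MONTH
print(f"~${hourly_cost_usd:.3f} per hour")
```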

What It Does

Transcribe audio files or live streams with speaker diarization, word-level timestamps, and GPU acceleration. Run it on your own infrastructure.

# One-command local transcription (M57 zero-config bootstrap)
# - auto-starts local server if missing
# - auto-ensures default model (distil-small)
DALSTON_SECURITY_MODE=none dalston transcribe tests/audio/test_merged.wav --format json

{
  "text": "Hello, welcome to the meeting...",
  "segments": [
    {"speaker": "SPEAKER_01", "start": 0.0, "end": 2.5, "text": "Hello, welcome to the meeting."},
    {"speaker": "SPEAKER_02", "start": 2.8, "end": 5.1, "text": "Thanks for having me."}
  ]
}
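
A minimal sketch of consuming that output from Python. `run_cli` wraps the command shown above and assumes the CLI is on your `PATH` and writes only the JSON document to stdout, which this README does not confirm. The helper names are hypothetical; the field names (`segments`, `speaker`, `start`, `end`, `text`) are taken from the sample.

```python
import json
import os
import subprocess
from collections import defaultdict


def run_cli(audio_path: str) -> str:
    """Run `dalston transcribe` as shown above and return its stdout."""
    env = {**os.environ, "DALSTON_SECURITY_MODE": "none"}
    result = subprocess.run(
        ["dalston", "transcribe", audio_path, "--format", "json"],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return result.stdout


def speaker_lines(payload: str) -> list[str]:
    """Render each segment as '[start-end] SPEAKER: text'."""
    segments = json.loads(payload)["segments"]
    return [
        f'[{s["start"]:.1f}-{s["end"]:.1f}] {s["speaker"]}: {s["text"]}'
        for s in segments
    ]


def talk_time(payload: str) -> dict[str, float]:
    """Sum segment durations (in seconds) per speaker."""
    totals: dict[str, float] = defaultdict(float)
    for s in json.loads(payload)["segments"]:
        totals[s["speaker"]] += s["end"] - s["start"]
    return dict(totals)


# The sample response from above, used here instead of a live CLI call.
SAMPLE = """
{
  "text": "Hello, welcome to the meeting...",
  "segments": [
    {"speaker": "SPEAKER_01", "start": 0.0, "end": 2.5, "text": "Hello, welcome to the meeting."},
    {"speaker": "SPEAKER_02", "start": 2.8, "end": 5.1, "text": "Thanks for having me."}
  ]
}
"""
```

Calling `speaker_lines(SAMPLE)` gives one labeled line per segment, and `talk_time(SAMPLE)` gives the seconds spoken by each speaker. Swap `SAMPLE` for `run_cli("tests/audio/test_merged.wav")` to work from a real transcription.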

Quick Start

git clone https://github.com/ssarunic/dalston.git && cd dalston
make dev      # full local stack on Docker

For zero-Docker single-process mode or AWS deployment, see the guides.

Features

Batch & Real-time — File uploads or WebSocket streaming
Speaker Diarization — Identify who said what
Word Timestamps — Precise timing for every word
OpenAI & ElevenLabs Compatible — Drop-in replacement for existing integrations
Modular Engines — Faster Whisper, NeMo Parakeet, Voxtral, Pyannote, and more
Private by Default — Runs entirely on your infrastructure, no data leaves your environment

Documentation

Start here:

Quickstart — first transcript in 5 minutes
Pick your deployment — laptop / spot GPU / 24/7 AWS
All guides → — engines, real-time, cost, principles

Engineering reference:

Architecture · REST API · WebSocket API

License

Apache 2.0

