TranscribeAudio is a macOS-first local transcription toolkit. It combines a native SwiftUI desktop app, a Python CLI, and a local MCP server for agent-driven workflows.
- Transcribes local audio or video files on your machine.
- Stores plain-text transcripts, segment-level JSON, and run metadata.
- Exposes the same transcription flow through a desktop UI, a CLI, and MCP tools.
- Keeps outputs local by default instead of sending media through a hosted web product.
The goal is to make local transcription practical for day-to-day workflows:
- a native macOS app for manual jobs
- a scriptable JSON CLI for automation
- an MCP server for local agent integrations
Features:
- Native macOS app built with SwiftUI
- Python CLI with stable JSON output
- Local MCP server over stdio
- File, multi-file, and directory transcription flows
- Plain text transcript export plus structured segment metadata
- Optional language hint and configurable Whisper model / compute type
Repository layout:
- apps/TranscribeAudioMac/: Native macOS client built with SwiftUI on macOS 14+.
- tools/transcribe_audio/: Python backend, CLI entrypoints, and history/output management.
- tools/transcribe_audio_mcp/: Local MCP server that wraps the CLI for tool-based agent workflows.
- examples/: Anonymized sample payloads that document the public JSON shapes without exposing local data.
Requirements:
- macOS only for the native app
- Python 3.10+ for the backend tooling
- Swift 5.9+ / macOS 14+ for the desktop app
Bootstrap the Python environments from the repository root:
tools/transcribe_audio/bootstrap.sh
tools/transcribe_audio_mcp/bootstrap.sh
This creates local virtual environments under .tools/, which are intentionally gitignored.
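A quick way to confirm that bootstrapping worked is to check that both virtual-environment interpreters exist. The following is a minimal sketch, not part of the repository: the helper name missing_interpreters is hypothetical, and the two interpreter paths are taken from the commands used elsewhere in this README.

```python
from pathlib import Path

# Interpreter paths as they appear in the CLI and MCP commands in this README.
VENV_PYTHONS = (
    ".tools/transcribe_audio/.venv/bin/python",
    ".tools/transcribe_audio_mcp/.venv/bin/python",
)


def missing_interpreters(repo_root):
    """Return the expected venv interpreters that do not exist under repo_root.

    Hypothetical helper for illustration; the bootstrap scripts themselves
    may perform their own checks.
    """
    root = Path(repo_root)
    return [rel for rel in VENV_PYTHONS if not (root / rel).exists()]
```

Calling missing_interpreters(".") from the repository root returns an empty list once both bootstrap scripts have completed; any path it returns points at the environment that still needs bootstrapping.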
Launch the native macOS app:
cd apps/TranscribeAudioMac
./run.sh
Export the current defaults and local history:
.tools/transcribe_audio/.venv/bin/python tools/transcribe_audio/transcribe_cli.py export_state
Run a transcription job:
.tools/transcribe_audio/.venv/bin/python tools/transcribe_audio/transcribe_cli.py transcribe \
--input "/path/to/media.wav" \
--model medium \
--language en \
--compute-type int8
Stable CLI commands documented for public use:
- transcribe
- list_outputs
- get_result
- export_state
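For automation, the transcribe command can be driven from a script. The sketch below assumes that the CLI prints a single JSON document on stdout and that --model, --language, and --compute-type may be omitted; neither assumption is confirmed here beyond the "stable JSON output" description. The helper names build_transcribe_command and run_cli are hypothetical, and the sketch makes no assumption about the keys in the returned JSON.

```python
import json
import subprocess

# Paths and flags are copied from the transcribe example in this README.
PYTHON = ".tools/transcribe_audio/.venv/bin/python"
CLI = "tools/transcribe_audio/transcribe_cli.py"


def build_transcribe_command(input_path, model=None, language=None, compute_type=None):
    """Build the argv list for the `transcribe` command (hypothetical helper)."""
    cmd = [PYTHON, CLI, "transcribe", "--input", str(input_path)]
    if model:
        cmd += ["--model", model]
    if language:
        cmd += ["--language", language]
    if compute_type:
        cmd += ["--compute-type", compute_type]
    return cmd


def run_cli(cmd, repo_root="."):
    """Run a CLI command from the repository root and parse stdout as JSON.

    Assumes stdout holds exactly one JSON document; raises
    subprocess.CalledProcessError on a non-zero exit status.
    """
    completed = subprocess.run(
        cmd, cwd=repo_root, capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)
```

A typical call would be run_cli(build_transcribe_command("/path/to/media.wav", model="medium", language="en", compute_type="int8")). Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.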
Start the local MCP server:
.tools/transcribe_audio_mcp/.venv/bin/python tools/transcribe_audio_mcp/server.py
The server exposes local transcription through MCP over stdio.
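Because the server speaks MCP over stdio, an MCP client has to launch it as a subprocess. Several clients accept a JSON configuration with an "mcpServers" map of command and args entries; the sketch below generates such an entry. This layout is an assumption about your client, not something this repository documents, and the server name "transcribe-audio" and the helper names are hypothetical. Check your client's documentation for the exact format.

```python
import json
from pathlib import PurePosixPath


def mcp_server_entry(repo_root):
    """Build a command/args entry that launches the server with its own venv.

    Uses absolute paths so the client can start the server from any
    working directory. macOS paths are POSIX-style.
    """
    root = PurePosixPath(repo_root)
    return {
        "command": str(root / ".tools/transcribe_audio_mcp/.venv/bin/python"),
        "args": [str(root / "tools/transcribe_audio_mcp/server.py")],
    }


def client_config(repo_root, name="transcribe-audio"):
    """Wrap the entry in an 'mcpServers' map (assumed client config layout)."""
    return {"mcpServers": {name: mcp_server_entry(repo_root)}}


# Print a config fragment for a placeholder checkout location.
print(json.dumps(client_config("/path/to/repo"), indent=2))
```

Replace /path/to/repo with the absolute path of your checkout before pasting the output into a client configuration.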
An anonymized sample export is available here:
examples/sample_export_state.json
The repository does not include real media, real output folders, or local history from the development machine.
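To get a feel for the public JSON shapes without reading the whole file, the sample export can be summarized by its top-level keys. This is a minimal sketch with a hypothetical helper name, top_level_shape; it makes no assumption about which keys the export contains and only reports what it finds.

```python
import json
from pathlib import Path


def top_level_shape(path):
    """Map each top-level key of a JSON file to the type name of its value.

    If the root is not a JSON object, report the root type instead.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}
```

For example, top_level_shape("examples/sample_export_state.json") returns a small dictionary of key names and Python type names such as list, dict, or str.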
Known limitations:
- The desktop client is macOS-only.
- Transcription quality and speed depend on local hardware and the selected model.
- The Python backend expects a local runtime that can install the dependencies listed in tools/transcribe_audio/requirements.txt.
- The MCP server is local stdio only; it does not expose an HTTP transport.
Privacy:
- Media stays local unless you explicitly move it elsewhere.
- Runtime history and output files are stored under local/, which is gitignored.
- Public examples are sanitized and do not contain real file paths or personal recordings.
Current release screenshot:
Additional release assets can be added under docs/screenshots/ as the demo set expands.
