A powerful, easy-to-use tool for transcribing audio and video files using multiple AI transcription services. No Python knowledge required - just download, run, and transcribe!
Download the Latest Release
- Go to Releases
- Download transcribe-windows-amd64.zip
- Extract to a folder of your choice
Set Up API Keys
transcribe.exe --setup
This interactive wizard will guide you through configuring API keys for:
- AssemblyAI
- ElevenLabs
- Groq
- OpenAI
Transcribe Your First File
transcribe.exe "path/to/your/audio.mp4" --api groq
That's it! The transcription will be saved next to your audio file.
- Copy transcribe.exe and any batch file from batch_templates/ to the same folder
- Drag and drop an audio/video file onto the batch file
- Wait for transcription to complete
Example batch files:
- transcribe_elevenlabs_de.bat - Transcribe with ElevenLabs (German, DaVinci Resolve optimized)
  - Automatically marks pauses and filler words for DaVinci Resolve Studio auto-cut
  - See "DaVinci Resolve Features" section below for details
- transcribe_groq_de.bat - Transcribe with Groq (German)
- transcribe_assemblyai.bat - Transcribe with AssemblyAI
- Multiple Transcription APIs: Choose from AssemblyAI, ElevenLabs, Groq, or OpenAI
- Multiple Output Formats:
  - Plain text
  - Standard SRT subtitles
  - Word-level SRT (each word as its own subtitle)
  - DaVinci Resolve optimized SRT
- Smart Processing:
  - Automatic audio extraction from video files
  - Intelligent file compression to meet API limits
  - Automatic chunking for large files
  - Filler word detection and removal
  - Pause detection and marking
- Easy to Use:
  - Interactive setup wizard
  - Drag-and-drop batch files
  - No Python installation required (standalone executable)
Download the latest release from the Releases page and extract the zip file.
Clone the repository:
git clone https://github.com/leotulipan/transcribe.git
cd transcribe
Install dependencies with uv:
uv sync
Run the tool:
uv run transcribe.py --help
transcribe.exe "path/to/audio.wav" --api groq
Note: The --file and --folder options are deprecated. Simply provide the file or folder path as a positional argument.
transcribe.exe "path/to/audio_files" --api groq
transcribe.exe "path/to/audio.wav" --api groq --language de
transcribe.exe "path/to/audio.wav" --api elevenlabs --davinci-srt
This creates an SRT file optimized for DaVinci Resolve Studio with pause markers that enable automatic cutting.
transcribe.exe "path/to/audio.wav" --api elevenlabs \
--davinci-srt \
--filler-lines \
--silent-portions 350 \
--padding-start -125
What this does:
- --davinci-srt: Enables DaVinci Resolve optimized output
- --filler-lines: Outputs filler words as separate UPPERCASE subtitle lines
- --silent-portions 350: Marks pauses and filler words longer than 350ms as (...) for auto-cut
- --padding-start -125: Adjusts timing by -125ms (starts earlier) for frame accuracy
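To make the timing and filler options more concrete, here is a minimal Python sketch of what a start-padding shift and uppercase filler marking could look like on a list of cues. It is not the tool's actual code: the function names, the (text, start_ms, end_ms) cue shape, the default filler list, and the clamping at zero are all assumptions made for illustration.

```python
# Illustrative sketch only; names and data shapes are hypothetical,
# not taken from the transcribe source code.

def apply_start_padding(cues, padding_start_ms=-125):
    """Shift each cue's start time; a negative value starts the cue earlier.

    Start times are clamped at 0 so no cue begins before the media does
    (an assumption of this sketch).
    """
    return [
        (text, max(0, start_ms + padding_start_ms), end_ms)
        for text, start_ms, end_ms in cues
    ]


def uppercase_fillers(cues, fillers=("um", "uh", "ähm")):
    """Render filler words in UPPERCASE so they stand out as their own lines."""
    return [
        (text.upper() if text.lower() in fillers else text, start_ms, end_ms)
        for text, start_ms, end_ms in cues
    ]


cues = [("Hello", 100, 400), ("um", 500, 900)]
padded = uppercase_fillers(apply_start_padding(cues))
```

With the sample cues, the first start time is clamped to 0 and the second moves from 500ms to 375ms, while "um" becomes "UM".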
The first time you run the tool, use the setup wizard:
transcribe.exe --setup
This will:
- Launch an interactive TUI (Text User Interface) to configure each API
- Validate your API keys
- Store keys securely in your user profile directory
API keys are stored in:
- Windows: %LOCALAPPDATA%\audio_transcribe\.env
- Linux/Mac: ~/.audio_transcribe/.env
These files are never committed to git and are only accessible by your user account.
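If you want to check where the keys live from a script, the locations above can be resolved roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the parser handles only simple KEY=VALUE lines, and the variable name EXAMPLE_API_KEY is a placeholder, since the actual key names written by the wizard are not listed here.

```python
import ntpath
import posixpath


def env_file_path(system: str, environ: dict, home: str) -> str:
    """Return the .env location described above for the given platform."""
    if system == "Windows":
        # %LOCALAPPDATA%\audio_transcribe\.env
        return ntpath.join(environ["LOCALAPPDATA"], "audio_transcribe", ".env")
    # Linux/Mac: ~/.audio_transcribe/.env
    return posixpath.join(home, ".audio_transcribe", ".env")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values
```

In a real script you would pass platform.system(), os.environ, and os.path.expanduser("~") as the three arguments.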
- AssemblyAI: Register at https://www.assemblyai.com/ and get the key at https://www.assemblyai.com/dashboard/api-keys
- ElevenLabs: Register at https://dub.link/elevenlabs and get the key at https://elevenlabs.io/app/developers/api-keys
- Groq: Register at https://groq.com/ and get the key at https://console.groq.com/keys
- OpenAI: Register at https://platform.openai.com/ and get the key at https://platform.openai.com/settings/organization/api-keys
Note: AssemblyAI, ElevenLabs, and Groq offer free credits. OpenAI, as far as I know, no longer does; you will need to load a small prepaid amount (e.g., $5) to try it out.
Simple text file with the transcription.
Standard subtitle format compatible with most video players and editors.
Each word appears as its own subtitle line with precise timestamps. Available for all APIs (AssemblyAI, ElevenLabs, Groq, OpenAI) that provide word-level timestamps.
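As an illustration of the word-level format, the sketch below turns a list of timed words into SRT text with one word per subtitle. It is a simplified example, not the tool's implementation; the function names and the (text, start_ms, end_ms) input shape are assumptions.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words) -> str:
    """Build SRT text from (text, start_ms, end_ms) tuples, one word per cue."""
    blocks = []
    for index, (text, start_ms, end_ms) in enumerate(words, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


print(words_to_srt([("Hello", 0, 400), ("world", 450, 900)]))
```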
Optimized for DaVinci Resolve Studio with special features for automatic editing:
- Pause Detection: Silences and filler words longer than a specified duration (default 350ms) are marked as (...) in the subtitles
- Auto-Cut Feature: DaVinci Resolve Studio recognizes these (...) markers and can automatically cut the video/audio at these pause points
- Filler Words as Separate Lines: Filler words (like "um", "uh", "ähm") appear as their own subtitle lines in UPPERCASE, making them easy to identify and remove
- Frame-Accurate Timing: Adjustable timing offsets for frame-perfect synchronization
- Customizable Padding: Fine-tune start/end times with millisecond precision
Example: If you set --silent-portions 350, any pause or filler word longer than 350ms will become (...) in the SRT file. When you import this SRT into DaVinci Resolve Studio, you can use the auto-cut feature to automatically split your timeline at these pause markers, making it easy to remove unwanted silences and filler words.
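The marking rule in the example can be sketched in a few lines of Python. This is one plausible reading of the behavior described above, not the tool's actual algorithm: the function name, the cue shape, the default filler list, and the choice to replace a long filler with the marker are assumptions.

```python
PAUSE_MARKER = "(...)"


def mark_pauses(words, threshold_ms=350, fillers=("um", "uh", "ähm")):
    """Insert (...) cues for long silences and replace long filler words.

    words: list of (text, start_ms, end_ms) tuples in playback order.
    """
    cues = []
    previous_end = None
    for text, start_ms, end_ms in words:
        # A silence between two words that exceeds the threshold gets its own cue.
        if previous_end is not None and start_ms - previous_end > threshold_ms:
            cues.append((PAUSE_MARKER, previous_end, start_ms))
        # A filler word that lasts longer than the threshold is marked the same way.
        is_long_filler = (
            text.lower().strip(".,!?") in fillers
            and end_ms - start_ms > threshold_ms
        )
        cues.append((PAUSE_MARKER if is_long_filler else text, start_ms, end_ms))
        previous_end = end_ms
    return cues


words = [("Hello", 0, 400), ("um", 450, 900), ("world", 1400, 1800)]
marked = mark_pauses(words)
```

In this sample, the 450ms "um" and the 500ms silence before "world" both exceed the 350ms threshold, so each becomes a (...) cue.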
- AssemblyAI: Up to 200 MB per file
- ElevenLabs: Up to 1000 MB per file (with automatic compression)
- Groq: 25 MB per file (~30 minutes of audio)
- OpenAI (Whisper): 25 MB per file
The tool automatically handles large files by:
- Extracting audio from video files
- Compressing audio to meet API limits
- Chunking files when necessary
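To get a feel for how the limits interact with file size, here is a rough Python sketch that uses the per-file limits listed above. It deliberately ignores compression, which the tool applies first, so the real number of chunks may be lower; the function names are hypothetical.

```python
import math

# Per-file limits in MB, taken from the list above.
LIMITS_MB = {"assemblyai": 200, "elevenlabs": 1000, "groq": 25, "openai": 25}


def chunks_needed(size_mb: float, api: str) -> int:
    """Upper-bound estimate of chunks if the file were split without compression."""
    return max(1, math.ceil(size_mb / LIMITS_MB[api]))


def apis_that_fit(size_mb: float) -> list:
    """APIs that could take the file in one piece, without compression or chunking."""
    return sorted(api for api, limit in LIMITS_MB.items() if size_mb <= limit)


print(chunks_needed(60, "groq"))  # a 60 MB file against Groq's 25 MB limit
print(apis_that_fit(60))
```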
Ready-to-use batch files are available in the batch_templates/ directory. These allow you to:
- Drag and drop files onto the batch file
- Automatically transcribe with pre-configured settings
- Customize the batch files for your needs
The transcribe_elevenlabs_de.bat file is pre-configured for DaVinci Resolve Studio workflows:
- German language transcription
- DaVinci Resolve optimized SRT output
- Pause detection at 350ms threshold
- Auto-cut markers: Pauses and filler words longer than 350ms are marked as (...)
When you import the resulting SRT into DaVinci Resolve Studio, you can use the auto-cut feature to automatically split your timeline at these (...) markers, making it easy to remove unwanted silences and filler words during editing.
See batch_templates/README.md for more details and customization options.
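If you prefer a script over a batch file, the same pre-configured call can be assembled in Python. The sketch below uses only options documented in this README; the exact contents of transcribe_elevenlabs_de.bat are not reproduced here, so treat the flag combination as an assumption based on the description above, and the function names as hypothetical.

```python
import subprocess


def build_command(media_path: str, exe: str = "transcribe.exe") -> list:
    """Assemble a German, DaVinci-optimized ElevenLabs call for one file."""
    return [
        exe,
        media_path,
        "--api", "elevenlabs",
        "--language", "de",
        "--davinci-srt",
        "--silent-portions", "350",
    ]


def transcribe(media_path: str) -> None:
    """Run the command; raises CalledProcessError if the tool exits with an error."""
    subprocess.run(build_command(media_path), check=True)
```

Calling transcribe("path/to/your/audio.mp4") from the folder that contains transcribe.exe would then mirror the drag-and-drop workflow.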
Usage: transcribe.exe [OPTIONS] [FILE_OR_FOLDER]
Arguments:
[FILE_OR_FOLDER] Audio/video file or folder to transcribe
Options:
-a, --api [assemblyai|elevenlabs|groq|openai]
API to use (default: groq)
-l, --language TEXT Language code (ISO-639-1 or ISO-639-3)
-o, --output [text|srt|word_srt|davinci_srt|json|all]
Output format(s) (default: text,srt)
-D, --davinci-srt Output SRT optimized for DaVinci Resolve
-p, --silent-portions INTEGER Mark pauses longer than X milliseconds with (...)
Used with --davinci-srt for auto-cut markers
--filler-lines Output filler words as their own subtitle lines (UPPERCASE)
--filler-words TEXT Custom filler words to detect
--remove-fillers Remove filler words from output
--speaker-labels Enable speaker labels in SRT (when available)
--diarize Enable speaker diarization (ElevenLabs)
--num-speakers INTEGER Maximum number of speakers (1..32)
-m, --model TEXT Model to use (API-specific)
--setup Run interactive setup wizard for API keys
-v, --verbose Show all log messages
-d, --debug Enable debug logging
--help Show this message
Note: The --file and --folder options are deprecated. Use positional arguments instead.
Run transcribe.exe --help for the complete list of options.
Run the setup wizard:
transcribe.exe --setup
The tool will automatically compress or chunk large files. If you still get errors:
- Check the file size limits for your chosen API
- Try a different API with higher limits (e.g., ElevenLabs: 1000MB)
- Try a different API (each has different strengths)
- Specify the language explicitly: --language de
- Use a higher-quality model: --model best (AssemblyAI) or --model whisper-large-v3 (Groq)
- Ensure you're on Windows (x64)
- Check Windows Defender isn't blocking it
- Try running from Command Prompt:
transcribe.exe --help
For developers who want to contribute or build from source:
uv run build.py
The executable will be in the dist/ directory.
- audio_transcribe/ - Main package
  - cli.py - Command-line interface
  - utils/ - Utilities and API adapters
  - transcribe_helpers/ - Transcription helpers
  - tui/ - Interactive setup wizard
- batch_templates/ - Ready-to-use batch files
- feature-sprints/ - Planning and documentation files
uv run pytest
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.
- Issues: GitHub Issues
- Security: See SECURITY.md for reporting vulnerabilities
See CHANGELOG.md for a list of changes and version history.