Audio Transcribe

A powerful, easy-to-use tool for transcribing audio and video files using multiple AI transcription services. No Python knowledge required - just download, run, and transcribe!

Quick Start

For End Users (No Python Required)

1. Download the Latest Release
- Go to Releases
- Download transcribe-windows-amd64.zip
- Extract to a folder of your choice

2. Set Up API Keys
```
transcribe.exe --setup
```
This interactive wizard will guide you through configuring API keys for:
- AssemblyAI
- ElevenLabs
- Groq
- OpenAI

3. Transcribe Your First File

transcribe.exe "path/to/your/audio.mp4" --api groq

That's it! The transcription will be saved next to your audio file.

Using Batch Files (Even Easier!)

1. Copy transcribe.exe and any batch file from batch_templates/ to the same folder
2. Drag and drop an audio/video file onto the batch file
3. Wait for transcription to complete

Example batch files:

transcribe_elevenlabs_de.bat - Transcribe with ElevenLabs (German, DaVinci Resolve optimized)
- Automatically marks pauses and filler words for DaVinci Resolve Studio auto-cut
- See "DaVinci Resolve Features" section below for details
transcribe_groq_de.bat - Transcribe with Groq (German)
transcribe_assemblyai.bat - Transcribe with AssemblyAI

Features

Multiple Transcription APIs: Choose from AssemblyAI, ElevenLabs, Groq, or OpenAI
Multiple Output Formats:
- Plain text
- Standard SRT subtitles
- Word-level SRT (each word as its own subtitle)
- DaVinci Resolve optimized SRT
Smart Processing:
- Automatic audio extraction from video files
- Intelligent file compression to meet API limits
- Automatic chunking for large files
- Filler word detection and removal
- Pause detection and marking
Easy to Use:
- Interactive setup wizard
- Drag-and-drop batch files
- No Python installation required (standalone executable)

Installation

Option 1: Pre-built Executable (Recommended)

Download the latest release from the Releases page and extract the zip file.

Option 2: From Source (For Developers)

Clone the repository:

git clone https://github.com/leotulipan/transcribe.git
cd transcribe

Install dependencies with UV:
```
uv sync
```
Run the tool:
```
uv run transcribe.py --help
```

Usage

Basic Usage

transcribe.exe "path/to/audio.wav" --api groq

Note: The --file and --folder options are deprecated. Simply provide the file or folder path as a positional argument.

Process an Entire Folder

transcribe.exe "path/to/audio_files" --api groq

With Language Selection

transcribe.exe "path/to/audio.wav" --api groq --language de

DaVinci Resolve Optimized Output

transcribe.exe "path/to/audio.wav" --api elevenlabs --davinci-srt

This creates an SRT file optimized for DaVinci Resolve Studio with pause markers that enable automatic cutting.

Advanced DaVinci Resolve Options

transcribe.exe "path/to/audio.wav" --api elevenlabs --davinci-srt --filler-lines --silent-portions 350 --padding-start -125

What this does:

--davinci-srt: Enables DaVinci Resolve optimized output
--filler-lines: Outputs filler words as separate UPPERCASE subtitle lines
--silent-portions 350: Marks pauses and filler words longer than 350ms as (...) for auto-cut
--padding-start -125: Adjusts timing by -125ms (starts earlier) for frame accuracy
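
To make the effect of --padding-start concrete, here is a minimal Python sketch of shifting a subtitle start time by a signed millisecond offset. The function name `apply_padding_start` and the clamping at zero are assumptions for illustration; the tool's actual implementation may differ.

```python
def apply_padding_start(start_ms: int, padding_ms: int) -> int:
    """Shift a start time by padding_ms; negative values start earlier.

    Clamping at 0 is an assumption, so that a subtitle never starts
    before the beginning of the file.
    """
    return max(0, start_ms + padding_ms)


# A subtitle at 2.000 s with a padding of -125 ms would begin at 1.875 s.
print(apply_padding_start(2000, -125))  # 1875
print(apply_padding_start(100, -125))   # 0 (clamped)
```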

API Keys Setup

The first time you run the tool, use the setup wizard:

transcribe.exe --setup

This will:

Launch an interactive TUI (Text User Interface) to configure each API
Validate your API keys
Store keys securely in your user profile directory

API keys are stored in:

Windows: %LOCALAPPDATA%\audio_transcribe\.env
Linux/Mac: ~/.audio_transcribe/.env

These files are never committed to git and are only accessible by your user account.
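
If you want to check from a script whether the key file exists, the documented locations can be resolved roughly as follows. This is a sketch, not the tool's own code: the function name `env_file_path` and the fallback used when LOCALAPPDATA is unset are assumptions.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def env_file_path(
    platform: str = sys.platform,
    environ: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Path:
    """Return the documented .env location for the given platform."""
    home = home or Path.home()
    if platform.startswith("win"):
        local_app_data = environ.get("LOCALAPPDATA")
        # Falling back to <home>/AppData/Local is an assumption.
        base = Path(local_app_data) if local_app_data else home / "AppData" / "Local"
        return base / "audio_transcribe" / ".env"
    # Linux and macOS: ~/.audio_transcribe/.env
    return home / ".audio_transcribe" / ".env"


path = env_file_path()
print(path, "exists" if path.exists() else "missing")
```

If the file is reported as missing, running the setup wizard should create it.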

Getting API Keys

AssemblyAI: Register at https://www.assemblyai.com/, then get the key at https://www.assemblyai.com/dashboard/api-keys
ElevenLabs: Register at https://dub.link/elevenlabs, then get the key at https://elevenlabs.io/app/developers/api-keys
Groq: Register at https://groq.com/, then get the key at https://console.groq.com/keys
OpenAI: Register at https://platform.openai.com/, then get the key at https://platform.openai.com/settings/organization/api-keys

Note: AssemblyAI, ElevenLabs, and Groq offer free credits. OpenAI, as far as I know, no longer does; you will need to load a small prepaid amount (e.g. $5) to try it out.

Output Formats

Plain Text (`.txt`)

Simple text file with the transcription.

Standard SRT (`.srt`)

Standard subtitle format compatible with most video players and editors.

Word-Level SRT (`.word.srt`)

Each word appears as its own subtitle line with precise timestamps. Available for all APIs that provide word-level timestamps (AssemblyAI, ElevenLabs, Groq, OpenAI).
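
As an illustration of what a word-level SRT looks like, the following sketch turns a list of timed words into SRT entries. The input shape (tuples of text, start and end in milliseconds) and the function names are assumptions made for this example; the APIs return their own response formats.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words):
    """Build an SRT string with one entry per (text, start_ms, end_ms) word."""
    entries = []
    for index, (text, start_ms, end_ms) in enumerate(words, start=1):
        entries.append(
            f"{index}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"
        )
    # Entries are separated by a blank line, as the SRT format expects.
    return "\n".join(entries)


print(words_to_srt([("Hello", 0, 420), ("world", 480, 900)]))
```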

DaVinci Resolve Optimized (`.srt`)

Optimized for DaVinci Resolve Studio with special features for automatic editing:

Pause Detection: Silences and filler words longer than a specified duration (default 350ms) are marked as (...) in the subtitles
Auto-Cut Feature: DaVinci Resolve Studio recognizes these (...) markers and can automatically cut the video/audio at these pause points
Filler Words as Separate Lines: Filler words (like "um", "uh", "ähm") appear as their own subtitle lines in UPPERCASE, making them easy to identify and remove
Frame-Accurate Timing: Adjustable timing offsets for frame-perfect synchronization
Customizable Padding: Fine-tune start/end times with millisecond precision

Example: If you set --silent-portions 350, any pause or filler word longer than 350ms will become (...) in the SRT file. When you import this SRT into DaVinci Resolve Studio, you can use the auto-cut feature to automatically split your timeline at these pause markers, making it easy to remove unwanted silences and filler words.
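
The marking rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the tool's actual algorithm: the input shape, the function name `mark_pauses`, and the small filler set (taken from the examples in this section) are assumptions.

```python
PAUSE_MARKER = "(...)"
FILLERS = {"um", "uh", "ähm"}  # examples from the text; the real list may differ


def mark_pauses(words, threshold_ms=350, fillers=FILLERS):
    """Replace long gaps and long filler words with the pause marker.

    words: list of (text, start_ms, end_ms), sorted by start time.
    Returns a new list of (text, start_ms, end_ms) entries.
    """
    marked = []
    previous_end = None
    for text, start_ms, end_ms in words:
        # A silence between two words longer than the threshold gets its own marker.
        if previous_end is not None and start_ms - previous_end > threshold_ms:
            marked.append((PAUSE_MARKER, previous_end, start_ms))
        is_long_filler = (
            text.lower().strip(".,!?") in fillers
            and end_ms - start_ms > threshold_ms
        )
        marked.append((PAUSE_MARKER if is_long_filler else text, start_ms, end_ms))
        previous_end = end_ms
    return marked


words = [("So", 0, 200), ("ähm", 250, 700), ("today", 1200, 1600)]
for entry in mark_pauses(words):
    print(entry)
```

In this example the 450 ms filler "ähm" and the 500 ms silence before "today" both become (...), while "So" and "today" are kept unchanged.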

File Size Limits

AssemblyAI: Up to 200 MB per file
ElevenLabs: Up to 1000 MB per file (with automatic compression)
Groq: 25 MB per file (~30 minutes of audio)
OpenAI (Whisper): 25 MB per file

The tool automatically handles large files by:

Extracting audio from video files
Compressing audio to meet API limits
Chunking files when necessary
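
As a rough way to estimate whether a file will need chunking, the limits listed above can be put into a small lookup. This sketch only shows the arithmetic on file size; the name `chunks_needed` is hypothetical, and the tool itself also compresses audio first, so real chunk counts may be lower.

```python
import math

# Per-file limits in MB, copied from the list above.
API_LIMITS_MB = {"assemblyai": 200, "elevenlabs": 1000, "groq": 25, "openai": 25}


def chunks_needed(file_size_mb: float, api: str) -> int:
    """Return how many pieces are needed so that each fits the API's limit."""
    limit = API_LIMITS_MB[api]
    return max(1, math.ceil(file_size_mb / limit))


print(chunks_needed(60, "groq"))        # 3
print(chunks_needed(60, "assemblyai"))  # 1
```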

Batch File Templates

Ready-to-use batch files are available in the batch_templates/ directory. These allow you to:

Drag and drop files onto the batch file
Automatically transcribe with pre-configured settings
Customize the batch files for your needs

DaVinci Resolve Batch File

The transcribe_elevenlabs_de.bat file is pre-configured for DaVinci Resolve Studio workflows:

German language transcription
DaVinci Resolve optimized SRT output
Pause detection at 350ms threshold
Auto-cut markers: Pauses and filler words longer than 350ms are marked as (...)

When you import the resulting SRT into DaVinci Resolve Studio, you can use the auto-cut feature to automatically split your timeline at these (...) markers, making it easy to remove unwanted silences and filler words during editing.

See batch_templates/README.md for more details and customization options.

Command Line Options

Usage: transcribe.exe [OPTIONS] [FILE_OR_FOLDER]

Arguments:
  [FILE_OR_FOLDER]                 Audio/video file or folder to transcribe

Options:
  -a, --api [assemblyai|elevenlabs|groq|openai]
                                  API to use (default: groq)
  -l, --language TEXT             Language code (ISO-639-1 or ISO-639-3)
  -o, --output [text|srt|word_srt|davinci_srt|json|all]
                                  Output format(s) (default: text,srt)
  -D, --davinci-srt               Output SRT optimized for DaVinci Resolve
  -p, --silent-portions INTEGER   Mark pauses longer than X milliseconds with (...)
                                  Used with --davinci-srt for auto-cut markers
  --filler-lines                   Output filler words as their own subtitle lines (UPPERCASE)
  --filler-words TEXT             Custom filler words to detect
  --remove-fillers                 Remove filler words from output
  --speaker-labels                 Enable speaker labels in SRT (when available)
  --diarize                        Enable speaker diarization (ElevenLabs)
  --num-speakers INTEGER           Maximum number of speakers (1..32)
  -m, --model TEXT                Model to use (API-specific)
  --setup                          Run interactive setup wizard for API keys
  -v, --verbose                    Show all log messages
  -d, --debug                      Enable debug logging
  --help                           Show this message

Note: The --file and --folder options are deprecated. Use positional arguments instead.

Run transcribe.exe --help for the complete list of options.
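
If you prefer to drive the executable from a script rather than a batch file, a thin wrapper along these lines may help. It is a sketch under assumptions: `build_command` and `run_transcribe` are hypothetical helper names, only options documented above are used, and it is assumed that the executable is on your PATH and returns a non-zero exit code on failure.

```python
import subprocess


def build_command(path, api="groq", language=None, davinci_srt=False,
                  silent_portions=None, exe="transcribe.exe"):
    """Assemble the argument list for a single transcription run."""
    command = [exe, str(path), "--api", api]
    if language:
        command += ["--language", language]
    if davinci_srt:
        command.append("--davinci-srt")
    if silent_portions is not None:
        command += ["--silent-portions", str(silent_portions)]
    return command


def run_transcribe(path, **options):
    """Run the CLI; raises subprocess.CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(path, **options), check=True)


# Only prints the command; call run_transcribe(...) to actually execute it.
print(build_command("talk.mp4", api="elevenlabs", language="de",
                    davinci_srt=True, silent_portions=350))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.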

Troubleshooting

"API key not found" Error

Run the setup wizard:

transcribe.exe --setup

File Too Large

The tool will automatically compress or chunk large files. If you still get errors:

Check the file size limits for your chosen API
Try a different API with higher limits (e.g., ElevenLabs: 1000 MB)

Transcription Quality Issues

Try a different API (each has different strengths)
Specify the language explicitly: --language de
Use a higher-quality model: --model best (AssemblyAI) or --model whisper-large-v3 (Groq)

Executable Won't Run

Ensure you're on Windows (x64)
Check Windows Defender isn't blocking it
Try running from Command Prompt: transcribe.exe --help

Development

For developers who want to contribute or build from source:

Building the Executable

uv run build.py

The executable will be in the dist/ directory.

Project Structure

audio_transcribe/ - Main package
- cli.py - Command-line interface
- utils/ - Utilities and API adapters
- transcribe_helpers/ - Transcription helpers
- tui/ - Interactive setup wizard
batch_templates/ - Ready-to-use batch files
feature-sprints/ - Planning and documentation files

Running Tests

uv run pytest

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE for details.

Support

Issues: GitHub Issues
Security: See SECURITY.md for reporting vulnerabilities

Changelog

See CHANGELOG.md for a list of changes and version history.
