feat(transcribe): add local Whisper transcription helper by abman4444 · Pull Request #60 · browser-use/video-use

abman4444 · 2026-06-08T17:32:40Z

What

Adds helpers/transcribe_whisper.py — a drop-in alternative to the ElevenLabs Scribe transcribe.py helper for users who don't have an API key or prefer a free, fully offline option.

Why

The existing pipeline requires an ElevenLabs API key for transcription. Users on free-tier accounts or those who want a zero-cost setup have no fallback. Local Whisper fills that gap.

How it works

- Uses openai-whisper with word_timestamps=True to produce word-level timestamps
- Output JSON matches the Scribe envelope shape (words, text, language_code, alignment) so pack_transcripts.py, render.py, and the rest of the pipeline work unchanged
- Caches transcripts per source file; skips re-transcription on repeat runs (same behaviour as Scribe helper)
- Extracts mono 16 kHz WAV via ffmpeg before passing to Whisper (same as the Scribe path)
- Supports all Whisper model sizes (tiny → large); defaults to medium (good balance of speed and accuracy for English)
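
The conversion step described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the function name `whisper_to_envelope` and the per-word keys (`text`, `start`, `end`, `type`) are assumptions, and the `alignment` field is left out because its contents are not described here. The input mirrors the dict that openai-whisper returns from `transcribe(..., word_timestamps=True)`, where each segment carries a `words` list.

```python
from typing import Any


def whisper_to_envelope(result: dict[str, Any]) -> dict[str, Any]:
    """Flatten a Whisper result into a Scribe-like envelope.

    The per-word keys ("text", "start", "end", "type") are assumptions,
    not taken from the PR.
    """
    words = []
    for segment in result.get("segments", []):
        # With word_timestamps=True, each segment carries a "words" list.
        for w in segment.get("words", []):
            words.append(
                {
                    "text": w["word"].strip(),  # Whisper usually prefixes a space
                    "start": round(float(w["start"]), 3),
                    "end": round(float(w["end"]), 3),
                    "type": "word",
                }
            )
    return {
        "language_code": result.get("language", ""),
        "text": result.get("text", "").strip(),
        "words": words,
    }


# Hand-written stand-in for the dict returned by
# whisper.load_model("medium").transcribe("audio.wav", word_timestamps=True)
sample = {
    "language": "en",
    "text": " Hello world.",
    "segments": [
        {
            "start": 0.0,
            "end": 1.0,
            "text": " Hello world.",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.42},
                {"word": " world.", "start": 0.42, "end": 1.0},
            ],
        }
    ],
}

envelope = whisper_to_envelope(sample)
print(envelope["words"][0])
```

Flattening segments into a single word list keeps downstream consumers independent of Whisper's segment boundaries, which is presumably why a Scribe-shaped envelope lets the rest of the pipeline run unchanged.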

Usage

# Basic
python helpers/transcribe_whisper.py my_video.mp4

# Custom model and language
python helpers/transcribe_whisper.py my_video.mp4 --model large --language en

# Custom output directory
python helpers/transcribe_whisper.py my_video.mp4 --edit-dir /path/to/edit
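
For reviewers who want to picture the command-line surface, the flags shown above could be parsed along these lines. This is a sketch under assumptions: the list of model names is limited to the five base openai-whisper sizes, and the defaults for `--language` and `--edit-dir` are guesses, since the PR text only states that the model defaults to medium.

```python
import argparse
from pathlib import Path

# Base openai-whisper model names, smallest to largest.
MODEL_SIZES = ["tiny", "base", "small", "medium", "large"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Transcribe a video locally with Whisper."
    )
    parser.add_argument("video", type=Path, help="source video file")
    parser.add_argument("--model", choices=MODEL_SIZES, default="medium")
    # Assumption: omitting --language lets Whisper auto-detect it.
    parser.add_argument("--language", default=None)
    # Assumption: the helper picks its own default when --edit-dir is omitted.
    parser.add_argument("--edit-dir", type=Path, default=None)
    return parser


args = build_parser().parse_args(
    ["my_video.mp4", "--model", "large", "--language", "en"]
)
print(args.video, args.model, args.language, args.edit_dir)
```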

Reviewer notes

- No changes to existing files; purely additive
- Requires openai-whisper (pip install openai-whisper) and ffmpeg on PATH, both already listed as setup dependencies
- Word-level timestamp accuracy is slightly lower than Scribe (Whisper drifts ~50–100 ms) but within the cut-padding working window defined in SKILL.md
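
One way to sanity-check the drift claim is to compare word start times from the two transcribers on the same clip. The sketch below is illustrative only: the function names are hypothetical, the timestamps are made up, and the 150 ms padding is a placeholder rather than the value defined in SKILL.md.

```python
def start_drift_ms(candidate: list[float], reference: list[float]) -> list[int]:
    """Absolute start-time difference per word, in whole milliseconds."""
    if len(candidate) != len(reference):
        raise ValueError("word lists must align one-to-one")
    return [round(abs(c - r) * 1000) for c, r in zip(candidate, reference)]


def fits_padding(drifts_ms: list[int], padding_ms: int) -> bool:
    """True when the worst drift stays inside the padding added around a cut."""
    return max(drifts_ms, default=0) <= padding_ms


# Made-up word start times in seconds, for illustration only.
whisper_starts = [0.05, 0.58, 1.10]
reference_starts = [0.00, 0.50, 1.02]

drifts = start_drift_ms(whisper_starts, reference_starts)
print(drifts)                     # [50, 80, 80]
print(fits_padding(drifts, 150))  # 150 ms is a placeholder, not the SKILL.md value
```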

Summary by cubic

Add a local Whisper transcription helper that outputs Scribe-compatible JSON, enabling offline, zero-API-key transcription without changing the rest of the pipeline. Adds caching and model selection for better control and speed.

New Features
- Added helpers/transcribe_whisper.py using openai-whisper with word_timestamps=True.
- Emits Scribe-compatible JSON (words, text, language_code, alignment) so pack_transcripts.py and render.py work unchanged.
- Extracts mono 16 kHz WAV via ffmpeg before transcription.
- Caches transcripts per source file to skip repeat runs.
- Supports all Whisper models (tiny → large); defaults to medium. Optional --language.

Dependencies
- Requires openai-whisper and ffmpeg available on PATH.
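
The audio extraction and caching steps listed above might look roughly like this. It is a sketch, not the helper's actual code: the function names and the `<edit_dir>/transcripts/<stem>.json` cache layout are hypothetical. The ffmpeg flags themselves are standard (`-ac 1` for mono, `-ar 16000` for 16 kHz, `-vn` to drop video), and the command is only built here, not executed.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def ffmpeg_extract_cmd(video: Path, wav: Path) -> list[str]:
    """Argument list for extracting mono 16 kHz 16-bit PCM audio."""
    return [
        "ffmpeg", "-y",       # overwrite an existing output file
        "-i", str(video),
        "-vn",                # drop the video stream
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # uncompressed 16-bit PCM
        str(wav),
    ]


def transcript_path(video: Path, edit_dir: Path) -> Path:
    # Hypothetical layout: <edit_dir>/transcripts/<video stem>.json
    return edit_dir / "transcripts" / f"{video.stem}.json"


def load_cached(video: Path, edit_dir: Path) -> Optional[dict]:
    """Return the cached transcript if one exists, else None."""
    path = transcript_path(video, edit_dir)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return None


with tempfile.TemporaryDirectory() as tmp:
    edit_dir = Path(tmp)
    video = Path("my_video.mp4")
    first = load_cached(video, edit_dir)   # nothing cached yet
    out = transcript_path(video, edit_dir)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps({"text": "hello", "words": []}), encoding="utf-8")
    second = load_cached(video, edit_dir)  # cache hit, no re-transcription needed
    cmd = ffmpeg_extract_cmd(video, edit_dir / "audio.wav")

print(first, second["text"])
```

Passing the command as a list to `subprocess.run(cmd, check=True)` would avoid shell quoting issues with file names that contain spaces.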

Written for commit ff6884c. Summary will update on new commits.

Adds helpers/transcribe_whisper.py, a drop-in alternative to the ElevenLabs Scribe transcribe.py helper for users who don't have an API key or prefer a free, offline option.

- Uses openai-whisper with word_timestamps=True to produce word-level timestamps matching the Scribe JSON envelope shape, so pack_transcripts.py, render.py, and the rest of the pipeline work unchanged
- Caches transcripts per source (skips re-transcription on repeat runs)
- Supports all Whisper model sizes (tiny → large); defaults to medium
- Extracts mono 16 kHz WAV via ffmpeg before transcription (same as Scribe path)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cubic-dev-ai

No issues found across 1 file

cubic-dev-ai Bot reviewed Jun 8, 2026

abman4444 wants to merge 1 commit into browser-use:main from abman4444:add-local-whisper-transcriber
