YouTube Transcript Fetcher

🚀 Try Live Demo | ⭐ Star on GitHub | 💻 CLI Guide

A tool for fetching YouTube video transcripts through a Web UI or a CLI, with proxy support to work around YouTube rate limiting.

Features

  • Web UI: Browser-based interface for fetching transcripts
  • CLI: Command-line interface for automation and scripting
  • Smart Proxy Support: Automatic proxy configuration to bypass YouTube rate limiting
  • Multiple Languages: Fetch transcripts in different languages
  • Multiple Formats: Output as plain text or JSON
  • Smart Caching: Database-backed caching to avoid redundant API calls

Quick Start 🚀

Option 1: Web UI (Easiest - No Installation) 🌐

🚀 Try Live Demo

Works instantly in your browser - no installation required!

Perfect for: Quick transcripts, testing, non-technical users


Option 2: CLI (Install Locally) 💻

Fetch transcripts from the command line. See Installation below for all installation methods.

# Example: Fetch transcript by URL
ytt fetch "https://youtu.be/dQw4w9WgXcQ"

Perfect for: Automation, scripting, power users


Option 3: Self-Hosted (Deploy Yourself) 🔧

Deploy your own instance:

📖 Deployment Guide

Perfect for: Production use, custom configuration, full control


Installation

Option 1: Homebrew (macOS/Linux) ⭐

brew tap nilukush/ytt
brew install youtube-transcript-tools

Why Homebrew?

  • ✅ Single command installation
  • ✅ Automatic dependency management
  • ✅ Easy updates: brew upgrade youtube-transcript-tools
  • ✅ Widely used package manager on macOS, also available on Linux

Option 2: pipx (Isolated Environment)

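pipx installs Python CLI tools in isolated environments, which avoids PEP 668 "externally-managed-environment" errors on systems where the Python installation is managed by a package manager (for example Homebrew on macOS or recent Debian/Ubuntu releases).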

# Install pipx (one-time setup)
brew install pipx
pipx ensurepath

# Install ytt
pipx install youtube-transcript-tools

Why pipx?

  • ✅ No system Python conflicts
  • ✅ PEP 668 compliant
  • ✅ Easy updates: pipx upgrade youtube-transcript-tools
  • ✅ Isolated from other tools

Option 3: pip (Virtual Environment)

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install
pip install youtube-transcript-tools

Option 4: pip (System-wide)

pip install youtube-transcript-tools

Note: If you see error: externally-managed-environment, use Option 1 (Homebrew), Option 2 (pipx), or Option 3 (virtual environment).


Option 5: From Source

git clone https://github.com/nilukush/youtube-transcript.git
cd youtube-transcript
pip install -e .

Development Installation

pip install -e ".[dev]"

Usage

Web UI

The Web UI provides the simplest way to fetch transcripts:

Starting the server:

uvicorn youtube_transcript.api.app:create_app --factory --reload --host localhost --port 8888

Then open http://localhost:8888 in your browser.

Supported URL formats:

  • https://youtu.be/dQw4w9WgXcQ (shortened)
  • https://www.youtube.com/watch?v=dQw4w9WgXcQ (full URL)
  • dQw4w9WgXcQ (video ID only)
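
The three input formats above could be normalized to a bare video ID along these lines. This is a minimal sketch, not the project's actual parser: the function name `extract_video_id` is hypothetical, and it assumes video IDs are 11 characters drawn from letters, digits, `_` and `-`.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters of letters, digits, "_" or "-".
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID, or None if the input is not recognized.

    Hypothetical helper covering only the three formats listed above.
    """
    text = url_or_id.strip()

    # Format 3: the input is already a bare video ID.
    if _VIDEO_ID.fullmatch(text):
        return text

    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Format 1: shortened URL, ID is the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "www.youtube.com"):
        # Format 2: full URL, ID is the "v" query parameter.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None

    return candidate if _VIDEO_ID.fullmatch(candidate) else None
```

Returning None rather than raising leaves it to the caller to decide how to report an unrecognized input.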

CLI

The CLI uses a fetch command to retrieve transcripts.

Quick examples:

# Fetch by URL
ytt fetch "https://youtu.be/dQw4w9WgXcQ"

# Fetch by video ID
ytt fetch dQw4w9WgXcQ

# Save to file
ytt fetch dQw4w9WgXcQ -o transcript.txt

# Output as JSON
ytt fetch dQw4w9WgXcQ --json

Advanced options:

# Language preference
ytt fetch dQw4w9WgXcQ --lang en

# Multiple languages
ytt fetch dQw4w9WgXcQ --lang en,es,fr

# Verbose mode
ytt fetch dQw4w9WgXcQ --verbose
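
To illustrate how a comma-separated language preference such as en,es,fr might be interpreted, here is a small sketch. It assumes the codes are tried in the order given and the first available one wins; the README does not state the exact fallback rule, and both function names are hypothetical.

```python
from typing import Iterable, List, Optional


def parse_lang_option(value: str) -> List[str]:
    """Split a comma-separated --lang value into cleaned codes, keeping order."""
    return [code.strip().lower() for code in value.split(",") if code.strip()]


def pick_language(preferred: Iterable[str], available: Iterable[str]) -> Optional[str]:
    """Return the first preferred code that is available, else None.

    Assumption: preference order is the order given on the command line.
    """
    available_set = {code.lower() for code in available}
    for code in preferred:
        if code in available_set:
            return code
    return None


if __name__ == "__main__":
    preferred = parse_lang_option("en,es,fr")
    # With only Spanish and French available, Spanish is chosen first.
    print(pick_language(preferred, ["es", "fr"]))
```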

All options:

Usage: ytt fetch [OPTIONS] URL_OR_ID

Options:
  --lang, -l    TEXT  Preferred language codes (comma-separated)
  --output, -o  TEXT  Output file path
  --json              Output in JSON format
  --verbose           Show detailed information
  --help, -h          Show this message
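
For scripting, the options above can be assembled into an argument list and passed to subprocess. The following is a sketch only: `build_fetch_command` and `run_fetch` are hypothetical helpers, and `run_fetch` assumes the transcript is written to standard output when no output file is given.

```python
import subprocess
from typing import List, Optional, Sequence


def build_fetch_command(
    url_or_id: str,
    langs: Optional[Sequence[str]] = None,
    output: Optional[str] = None,
    as_json: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build the argument list for `ytt fetch` from the documented options."""
    cmd = ["ytt", "fetch", url_or_id]
    if langs:
        cmd += ["--lang", ",".join(langs)]
    if output:
        cmd += ["--output", output]
    if as_json:
        cmd.append("--json")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_fetch(url_or_id: str, **options) -> str:
    """Run the CLI and return its standard output.

    Assumption: without --output, the transcript goes to stdout.
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    cmd = build_fetch_command(url_or_id, **options)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain characters such as & or ?.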

Troubleshooting

"No such command" Error

The URL or video ID must follow the fetch subcommand; passing it directly to ytt fails.

Wrong:

ytt "https://youtu.be/dQw4w9WgXcQ"

Correct:

ytt fetch "https://youtu.be/dQw4w9WgXcQ"

"Transcript Not Found" Error

This usually means one of the following:

  • The video doesn't have captions/subtitles enabled
  • The transcript is disabled by the uploader
  • The video ID is incorrect

Verification: Check if the video has captions on YouTube:

  1. Open the video on YouTube
  2. Click the "..." (more) button
  3. Look for "Show transcript" option

Rate Limiting (HTTP 429)

If you experience rate limiting:

  1. The application automatically uses proxy configuration (if set by the service provider)
  2. Try again later - rate limits reset over time
  3. Some videos may have stricter rate limits than others
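
When scripting around rate limits, "try again later" is commonly implemented as retries with exponential backoff. The sketch below is a general pattern and not a description of what the application does internally; the `RateLimited` exception and `retry_with_backoff` function are hypothetical names, and the caller is assumed to raise `RateLimited` when it sees HTTP 429.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error the caller raises when it receives HTTP 429."""


def retry_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or max_attempts is reached.

    The wait doubles after each rate-limited attempt (1s, 2s, 4s, ...),
    capped at max_delay, with random jitter of up to half the delay added.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The jitter spreads out retries so that several clients hitting the limit at once do not all retry at the same moment.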

CLI Not Found

If the ytt command is not found:

# Reinstall the package (run from a source checkout)
pip install -e .

# Or use Python module directly
python -m youtube_transcript.cli fetch "https://youtu.be/dQw4w9WgXcQ"
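
A script that must work in both situations could check for the ytt executable first and fall back to the module invocation shown above. This is a sketch under that assumption; `resolve_ytt_command` is a hypothetical helper name.

```python
import shutil
import sys
from typing import Callable, List, Optional


def resolve_ytt_command(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the command prefix for invoking the CLI.

    Uses the ytt executable when it is on PATH; otherwise falls back to
    running the module with the current Python interpreter.
    """
    path = which("ytt")
    if path:
        return [path]
    return [sys.executable, "-m", "youtube_transcript.cli"]


if __name__ == "__main__":
    # Append the subcommand and arguments to the resolved prefix.
    print(resolve_ytt_command() + ["fetch", "dQw4w9WgXcQ"])
```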

Development

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src/youtube_transcript --cov-report=html

# Run specific test file
pytest tests/test_fetcher.py -v

Code Quality

# Format code
black src/ tests/

# Lint code
ruff check src/ tests/

# Type check
mypy src/

Project Structure

youtube-transcript/
├── src/youtube_transcript/
│   ├── api/              # FastAPI endpoints and web routes
│   ├── cache/            # Redis caching layer
│   ├── config/           # Configuration management
│   ├── models/           # SQLModel database models
│   ├── repository/       # Database repository layer
│   ├── services/         # Business logic (fetcher, orchestrator)
│   ├── static/           # CSS and static assets
│   ├── templates/        # Jinja2 HTML templates
│   ├── utils/            # URL parsing utilities
│   └── cli.py            # CLI entry point
├── tests/                # Pytest tests
└── pyproject.toml        # Project configuration

API Endpoints

The web server exposes the following endpoints:

  • GET / - Web UI homepage
  • GET /transcript?url=URL - Fetch transcript via GET
  • GET /transcript/{video_id} - Fetch transcript by video ID
  • GET /htmx/transcript?url=URL - HTMX endpoint for dynamic updates
  • GET /docs - Interactive API documentation (FastAPI auto-docs)
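
As an illustration of calling these endpoints from Python, the sketch below builds the request URLs with the standard library. It assumes the server is running locally on port 8888 as in the uvicorn example; the helper names are hypothetical, and the response format is not specified in this README, so the body is returned as raw text.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: local server started as in the Web UI section.
BASE_URL = "http://localhost:8888"


def transcript_url_for_link(video_url: str, base_url: str = BASE_URL) -> str:
    """Build the GET /transcript?url=URL request, percent-encoding the video URL."""
    return f"{base_url}/transcript?{urlencode({'url': video_url})}"


def transcript_url_for_id(video_id: str, base_url: str = BASE_URL) -> str:
    """Build the GET /transcript/{video_id} request."""
    return f"{base_url}/transcript/{quote(video_id, safe='')}"


def fetch_body(url: str, timeout: float = 15.0) -> str:
    """Perform the GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(transcript_url_for_link("https://youtu.be/dQw4w9WgXcQ"))
    print(transcript_url_for_id("dQw4w9WgXcQ"))
```

Encoding the url parameter matters because the video URL itself contains characters such as : and / that would otherwise be ambiguous inside a query string.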

Performance

Metric             Target        Status
Cached Response    p95 < 500ms   ✅ Met
Uncached Response  p95 < 10s     ✅ Met
Test Coverage      > 80%         ✅ Met (100%)
URL Parse Success  > 99.5%       ✅ Met
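
The p95 targets refer to the 95th percentile of response times: 95% of requests complete at or below that value. The README does not say how the figures were measured; the sketch below shows one common definition (nearest rank) applied to made-up latency samples, with a hypothetical function name.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile using the nearest-rank method.

    The result is the smallest sample such that at least pct percent
    of all samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Illustrative latencies in milliseconds, not measurements from this project.
    latencies_ms = [120, 80, 450, 300, 95]
    p95 = percentile_nearest_rank(latencies_ms, 95)
    print(f"p95 = {p95} ms, target met: {p95 < 500}")
```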

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Write tests for your changes
  4. Ensure all tests pass
  5. Submit a pull request

For Application Owners

If you're deploying this application as a service, see DEPLOYMENT.md for:

  • Proxy configuration
  • Environment variables
  • Production deployment
  • Scaling considerations

License

MIT License - see LICENSE file for details.
