revisionhiep-create/LocalReader-Pro


LocalReader Pro

A modern, privacy-focused PDF/EPUB reader with AI-powered text-to-speech, multilingual support, and smart audio caching.

[Screenshot: LocalReader Pro main interface]

[Screenshot: LocalReader Pro settings]

🔘 Key Features

🔳 Core Reading

  • Multi-Format Support: PDF and EPUB files
  • Multilingual UI: Full interface translation (English, French, Spanish, Chinese)
  • Dual-Engine Architecture: Choose between Performance (CPU) and Quality (GPU) modes
  • Fast TTS Engine: Kokoro-82M v1.0 (~5x real-time synthesis speed)
  • Auto-Save Progress: Resume exactly where you left off
  • Sentence-Level Control: Click any sentence to start reading from there

🔘 Smart TTS Controls

  • Dynamic Voice Library: Automatically loads voices for English (US/UK), French, Spanish, Chinese, Japanese, Italian, and Portuguese.
  • Voice Settings Drawer: Floating button for quick access to voice, speed, and filter controls
  • Player Text Customization: New Text Size Slider to adjust subtitle/caption size (12px-24px) in real-time.
  • Decoupled Browsing: Browse other pages freely without the audio jumping to the page you are viewing. A "Back to Reading" button snaps you back to the current sentence instantly.
  • Natural Speech Flow: Intelligent line joining prevents mid-sentence stops
  • Smart Punctuation Logic:
    • Supports English (..., ?!) and CJK (。, !, ?) punctuation correctly.
    • Smart "Soft Newlines" prevent rushing without creating double pauses.
  • Custom Pause Settings: Granular control over pause duration for punctuation (0-2000ms).
  • Custom Pronunciation Rules: Fix mispronunciations with RegEx support.
  • Speed Control: 0.5x to 3.0x playback speed.

⚙️ Smart Features

  • Smart Start: Auto-skip blank/cover pages on first open
  • Header/Footer Filter: Detect and remove/dim repeated page clutter
  • Global Search: Full-book search with instant navigation (Ctrl+F)
  • SQLite Audio Cache: 200MB LRU cache with automatic cleanup (Self-healing).
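
The size-capped LRU behavior of the audio cache can be sketched as follows. This is a minimal illustration and not the application's actual code: the `AudioCache` class, the table layout, and the integer "tick" used to track recency are all assumptions made for the example.

```python
import sqlite3


class AudioCache:
    """Tiny LRU cache for synthesized audio, backed by SQLite (illustrative only)."""

    def __init__(self, path=":memory:", max_bytes=200 * 1024 * 1024):
        self.max_bytes = max_bytes  # 200 MB budget, as described above
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS audio ("
            "key TEXT PRIMARY KEY, data BLOB NOT NULL, "
            "size INTEGER NOT NULL, last_used INTEGER NOT NULL)"
        )

    def _next_tick(self):
        # A monotonically increasing counter stands in for a timestamp.
        row = self.db.execute("SELECT COALESCE(MAX(last_used), 0) FROM audio").fetchone()
        return row[0] + 1

    def get(self, key):
        row = self.db.execute("SELECT data FROM audio WHERE key = ?", (key,)).fetchone()
        if row is None:
            return None
        # Reading an entry marks it as the most recently used one.
        self.db.execute(
            "UPDATE audio SET last_used = ? WHERE key = ?", (self._next_tick(), key)
        )
        self.db.commit()
        return row[0]

    def put(self, key, data):
        self.db.execute(
            "INSERT OR REPLACE INTO audio (key, data, size, last_used) VALUES (?, ?, ?, ?)",
            (key, data, len(data), self._next_tick()),
        )
        self._evict()
        self.db.commit()

    def _evict(self):
        # Delete least recently used rows until the total size fits the budget.
        while True:
            total = self.db.execute("SELECT COALESCE(SUM(size), 0) FROM audio").fetchone()[0]
            if total <= self.max_bytes:
                break
            self.db.execute(
                "DELETE FROM audio WHERE key = "
                "(SELECT key FROM audio ORDER BY last_used LIMIT 1)"
            )
```

A cache key in practice would probably combine the sentence text, voice, and speed, so that changing any of them produces a fresh entry; that choice is also an assumption here.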

📁 MP3 Export

  • One-Click Export: Convert entire document to MP3
  • Background Processing: UI stays responsive during export
  • On-Demand FFMPEG: Auto-downloads encoder (~100MB) on first export

🔘 Sleep Timer

  • Auto-Shutdown: Automatically closes the application after a set duration.
  • Visual Feedback: Button displays remaining time in a neutral style when active.
  • Background Safe: Timer runs on the backend to guarantee shutdown.

🔳 Installation

Windows (Recommended)

One-Click Installer - No Manual Setup Required

  1. Extract the ZIP to your desired location
  2. Navigate to the dist folder
  3. Double-click: setup.exe
  4. Approve UAC Prompt when Windows requests administrator access
  5. Wait for Installation:
    • Checks for Python 3.12+ (downloads and installs if missing)
    • Deploys application files
    • Installs all dependencies automatically
    • Creates Desktop and Start Menu shortcuts
  6. Launch: Double-click "LocalReader Pro" on your Desktop

What the installer does:

  • ✅ Installs Python 3.12 if not present
  • ✅ Installs all required packages (FastAPI, PyTorch, Kokoro-TTS, etc.)
  • ✅ Creates shortcuts on Desktop and Start Menu
  • ✅ Sets up the application in the selected directory

Uninstalling:

  • Run uninstall.exe in the installation directory
  • Removes all shortcuts (application files remain for manual deletion)

To completely remove the supporting software (Python and libraries), work in this order, because pip stops working once Python is uninstalled:

  1. Remove Libraries: If you haven't deleted the folder yet, open a terminal in the "dist" folder and run: pip uninstall -r requirements.txt
  2. Uninstall Python: Go to Windows Settings > Apps > Installed Apps, search for "Python 3.12", and select Uninstall.
  3. Clear Model Cache: Many voices and AI models are stored in your user profile. Delete the Kokoro cache folder in your user directory (usually C:\Users\<YourName>\.cache\kokoro) to free up additional space.

Installation Size:

  • Installer: ~24 MB
  • Full installation: ~2.6 GB (including Python dependencies)

Linux / Manual Installation

Prerequisites: Python 3.10 - 3.13 (Recommended: Python 3.12)

⚠️ Important: Python 3.14+ is not yet supported due to onnxruntime compatibility.

Step 1: Install Python

# Ubuntu/Debian
sudo apt update
sudo apt install python3.12 python3.12-venv python3-pip

# Verify installation
python3.12 --version

Step 2: Extract and Navigate

unzip LocalReader_Pro_v2.5.zip
cd LocalReader_Pro_v2.5/dist

Step 3: Install Dependencies

# Option A: Using pip
pip install -r requirements.txt

# Option B: Using python -m pip (if pip not in PATH)
python3.12 -m pip install -r requirements.txt

This will install:

  • FastAPI (web framework)
  • uvicorn (web server)
  • torch (PyTorch for ML)
  • kokoro-onnx (TTS engine)
  • pydub (audio processing)
  • pywebview (desktop wrapper)
  • And other dependencies

Installation time: 5-10 minutes (downloading PyTorch ~2GB)

Step 4: Launch the App

python3.12 main.py

🔘 First-Time Setup

After launching the application:

  1. Choose Your Engine Mode:

    • Open Settings section in sidebar
    • Find "Processing Mode" dropdown
    • Choose between:
      • High Performance (CPU): Faster, lower RAM (~87MB model)
      • High Quality (GPU): Best audio quality (~309MB model)
  2. Download Voice Engine:

    • Click "Setup Voice Engine" button in sidebar
    • Downloads the model matching your selected mode
    • Wait for green status indicator (⚪ → 🔘)
    • Tip: You can download both models and switch anytime!
  3. Upload Your First Book:

    • Click "Upload Book (PDF/EPUB)"
    • Select any PDF or EPUB file
    • App will process and display the book
  4. Start Reading:

    • Click the blue Play button
    • Or press Space to play/pause
  5. First MP3 Export (Optional):

    • Click "Export Audio (MP3)" in sidebar
    • Prompt appears: "Download FFMPEG encoder (~100MB)"
    • Click "Download FFMPEG" and wait ~2-3 minutes
    • Export starts automatically after download
    • Subsequent exports skip this step

🔘 Usage Guide

Basic Reading

  • Navigate Pages: Use buttons (◀ ▶) or scroll to bottom/top for auto-flip
  • Play Audio: Press Space or click play button
  • Jump to Sentence: Click any sentence in the text
  • Change Voice: Use dropdown in sidebar settings
  • Adjust Speed: Drag speed slider (0.5x - 3.0x)

Smart Features

Smart Start:

  • Automatically activates on first open
  • Finds first page with >500 characters
  • Shows notification: "🔘 Skipped to start of content (Page X)"
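
The Smart Start rule above can be sketched in a few lines. This is an illustration only: the function name `find_start_page` and the choice to ignore leading and trailing whitespace when counting are assumptions, not the app's confirmed behavior.

```python
def find_start_page(pages, min_chars=500):
    """Return the 0-based index of the first page whose text, after trimming
    whitespace, is longer than min_chars. Falls back to page 0 if none qualifies."""
    for index, text in enumerate(pages):
        if len(text.strip()) > min_chars:
            return index
    return 0


# Example: a blank page, a short cover page, then real content.
pages = ["", "Cover", "x" * 501, "y" * 1000]
start = find_start_page(pages)
print(f"Skipped to start of content (Page {start + 1})")
```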

Header/Footer Filter:

  1. Open Settings section in sidebar
  2. Find "Header/Footer Filter" dropdown
  3. Choose: Off, Clean (remove), or Dim (show faded)
  4. TTS skips filtered content in all modes
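
One plausible way to detect repeated headers and footers is to count how often the first and last line of each page recur across the book. The sketch below assumes that approach; the helper names, the digit normalization, and the 60% threshold are all hypothetical and may differ from the app's detector.

```python
import re
from collections import Counter


def _normalize(line):
    # Replace digit runs so "Page 3" and "Page 4" count as the same line.
    return re.sub(r"\d+", "#", line.strip().lower())


def find_repeated_lines(pages, min_share=0.6):
    """Return normalized first/last lines that appear on at least min_share of pages."""
    counts = Counter()
    for page in pages:
        lines = [line for line in page.splitlines() if line.strip()]
        if not lines:
            continue
        # Use a set so a one-line page is not counted twice.
        counts.update({_normalize(lines[0]), _normalize(lines[-1])})
    threshold = max(2, int(len(pages) * min_share))
    return {line for line, n in counts.items() if n >= threshold}


def clean_page(page, repeated):
    """'Clean' mode: drop every line that matches a detected header or footer."""
    return "\n".join(
        line for line in page.splitlines() if _normalize(line) not in repeated
    )
```

A "Dim" mode would keep the matching lines for display but still exclude them from the text passed to the TTS engine.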

Global Search:

  1. Press Ctrl+F (or Cmd+F on Mac)
  2. Type query (minimum 2 characters)
  3. Click any result to jump to that page
  4. Press ESC to close
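
As a rough sketch of the search flow above (minimum query length, results that map back to a page), assuming the book is available as a list of page strings. The function name `search_book` and the snippet width are illustrative choices, not part of the app's API.

```python
import re


def search_book(pages, query, min_length=2, context=30):
    """Return (page_number, snippet) pairs for case-insensitive matches of query."""
    query = query.strip()
    if len(query) < min_length:
        return []  # queries under the minimum length are ignored
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    results = []
    for page_number, text in enumerate(pages, start=1):
        for match in pattern.finditer(text):
            start = max(0, match.start() - context)
            snippet = text[start:match.end() + context]
            results.append((page_number, snippet))
    return results
```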

Custom Pronunciation Rules

  1. Click "Pronunciation" tab in sidebar
  2. Click + button to add rule
  3. Configure:
    • Original Text: The text to replace (e.g., "SQL")
    • Replacement Text: How to pronounce (e.g., "S Q L")
  4. Options:
    • ☑️ Match Case: "SQL" ≠ "sql"
    • ☑️ Whole Word: "cat" won't match "category"
    • ☑️ Use Pattern Matching: Enable RegEx

Example Rules:

  • ChatGPT → Chat G P T (spell out)
  • COVID-19 → COVID nineteen (pronounce naturally)
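
The three rule options map naturally onto Python's `re` module. The following is a minimal sketch under that assumption; the `Rule` dataclass and `apply_rules` function are hypothetical names, and the replacement text is treated literally (no group references), which may not match the app's RegEx mode.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    original: str
    replacement: str
    match_case: bool = False
    whole_word: bool = False
    use_regex: bool = False


def apply_rules(text, rules):
    """Apply each pronunciation rule in order before the text is sent to TTS."""
    for rule in rules:
        # Plain-text rules are escaped so characters like "-" or "." match literally.
        pattern = rule.original if rule.use_regex else re.escape(rule.original)
        if rule.whole_word:
            # Reject matches that touch a word character on either side.
            pattern = r"(?<!\w)(?:" + pattern + r")(?!\w)"
        flags = 0 if rule.match_case else re.IGNORECASE
        text = re.sub(pattern, lambda m, r=rule: r.replacement, text, flags=flags)
    return text


rules = [
    Rule("SQL", "S Q L", match_case=True, whole_word=True),
    Rule("ChatGPT", "Chat G P T"),
    Rule("COVID-19", "COVID nineteen"),
]
print(apply_rules("ChatGPT knows SQL, not sql or MySQLite. COVID-19.", rules))
```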

Custom Pause Settings

  1. Open "Pause Settings" section in sidebar
  2. Adjust sliders to set pause duration (0-2000ms):
    • Comma (,) - Default: 250ms
    • Period (.) - Default: 600ms
    • Question (?) - Default: 600ms
    • Exclamation (!) - Default: 600ms
    • Colon (:) - Default: 500ms
    • Semicolon (;) - Default: 500ms
    • Newline - Default: 800ms (Hidden, smart auto-adjust)
  3. Settings save automatically

Smart Behavior:

  • Pauses apply only to single punctuation or the last char of a group
  • "..." creates ONE pause (e.g. 600ms), not three
  • "?!" creates ONE pause (based on !)
  • Title\n creates a soft pause (300ms)
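
The grouping rules above can be expressed as a small function that scans the text for punctuation runs. This is a sketch of one reading of those rules, not the app's implementation: `pause_points` is a hypothetical name, and skipping the newline pause when the line already ends in punctuation is an assumption about how double pauses are avoided. It also does not special-case periods inside numbers or abbreviations.

```python
import re

# Defaults taken from the Pause Settings list above (milliseconds).
DEFAULT_PAUSES_MS = {",": 250, ".": 600, "?": 600, "!": 600, ":": 500, ";": 500}
SOFT_NEWLINE_MS = 300


def pause_points(text, pauses=DEFAULT_PAUSES_MS):
    """Return (end_index, pause_ms) for each punctuation group or soft newline."""
    points = []
    for match in re.finditer(r"[,.?!:;]+|\n", text):
        group = match.group()
        if group == "\n":
            previous = text[match.start() - 1] if match.start() > 0 else ""
            if previous in pauses:
                continue  # the punctuation already produced a pause
            ms = SOFT_NEWLINE_MS
        else:
            ms = pauses[group[-1]]  # only the last character of a group counts
        points.append((match.end(), ms))
    return points
```

For example, `pause_points("Wait... really?! Yes, ok")` yields three pauses, one per punctuation group, rather than one per character.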

Exporting to MP3

  1. Open any PDF/EPUB document
  2. Click "Export Audio (MP3)" button
  3. Review time estimate (e.g., "~3 minutes")
  4. Confirm export
  5. Monitor real-time progress
  6. Click "📂 Open Folder" to access file

Export Details:

  • Format: MP3, 192 kbps
  • Naming: {document_name}_{voice_name}.mp3
  • Location: userdata/ folder in project directory
  • Speed: ~15 seconds per 1,000 characters
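
The time estimate and file naming follow directly from the figures above. The sketch below just applies them; the function names are hypothetical and the voice name in the example is only a placeholder.

```python
from pathlib import Path

SECONDS_PER_1000_CHARS = 15  # approximate export speed stated above


def estimate_export_seconds(char_count):
    """Rough export duration for a document with char_count characters."""
    return char_count / 1000 * SECONDS_PER_1000_CHARS


def export_path(document_path, voice_name, out_dir="userdata"):
    """Build {document_name}_{voice_name}.mp3 inside the output folder."""
    stem = Path(document_path).stem
    return Path(out_dir) / f"{stem}_{voice_name}.mp3"


# A 12,000-character document works out to about 3 minutes.
minutes = estimate_export_seconds(12_000) / 60
print(f"~{minutes:.0f} minutes")
print(export_path("books/Dune.epub", "af_heart"))
```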

Sleep Timer

  1. Click the Timer Icon (clock) on the right side of the screen.
  2. Set the desired duration in Hours and Minutes.
  3. Click "Start Timer".
  4. The drawer will show a countdown, and the main button will display the remaining minutes.
  5. The application will automatically close when the timer reaches zero.
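
A backend-side timer of this kind can be sketched with `threading.Timer` from the standard library. This is an assumption about the general shape, not the app's code: the `SleepTimer` class is hypothetical, and `on_expire` stands in for whatever routine actually closes the application.

```python
import math
import threading
import time


class SleepTimer:
    """Backend countdown that calls on_expire once when the time is up."""

    def __init__(self, on_expire):
        self._on_expire = on_expire  # hypothetical shutdown callback
        self._timer = None
        self._deadline = None

    def start(self, hours=0, minutes=0, seconds=0):
        self.cancel()  # restarting replaces any running timer
        total = hours * 3600 + minutes * 60 + seconds
        self._deadline = time.monotonic() + total
        self._timer = threading.Timer(total, self._on_expire)
        self._timer.daemon = True  # do not keep the process alive on exit
        self._timer.start()

    def cancel(self):
        if self._timer is not None:
            self._timer.cancel()
        self._timer = None
        self._deadline = None

    def remaining_minutes(self):
        """Minutes left, rounded up, for display on the timer button."""
        if self._deadline is None:
            return 0
        return max(0, math.ceil((self._deadline - time.monotonic()) / 60))
```

Because the countdown lives in the Python process rather than in the web UI, it keeps running even if the front end is busy or throttled, which is the point of the "Background Safe" behavior.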

🔳 Keyboard Shortcuts

| Key | Action |
| --- | --- |
| Space | Play/Pause |
| ← | Previous Sentence |
| → | Next Sentence |
| Ctrl+F / Cmd+F | Open Search |
| ESC | Close Search |

⚙️ Technical Details

Architecture

| Layer | Technology |
| --- | --- |
| Frontend | Vanilla JavaScript + Tailwind CSS |
| Backend | FastAPI (Python) |
| TTS Engine | Kokoro-82M (ONNX Runtime) |
| Desktop Wrapper | pywebview |
| PDF Parsing | PDF.js (Mozilla) |
| Audio Export | pydub + FFMPEG |
| EPUB Support | ebooklib + xhtml2pdf |

File Structure

LocalReader-Pro/
├── build_installer.py           # Master build script
├── installer_logic.py           # setup.exe core logic
├── README.md
├── CHANGELOG.md
│
└── dist/
    ├── setup.exe                # One-click installer (~22 MB)
    ├── main.py                  # App entry point (FastAPI + WebView)
    ├── launch.vbs               # Silent runner
    │
    ├── app/
    │   ├── server.py            # FastAPI initialization
    │   ├── state.py             # Global engine/status singleton
    │   ├── routers/             # API Controllers (TTS, Library, Export, etc.)
    │   ├── logic/               # Core logic (Normalize, Detector, Cache)
    │   ├── locales/             # UI Translations (EN, ES, FR, ZH, JA)
    │   └── ui/
    │       ├── index.html       # Main SPA
    │       ├── css/style.css    # Premium styling
    │       └── js/modules/      # ES6 Logic modules
    │
    └── userdata/                # User settings and book database

Additional folders created during use:

  • bin/ - FFMPEG binaries (auto-downloaded on first export)
  • models/ - TTS engine models (auto-downloaded based on your choice)
  • userdata/audio_cache.db - SQLite Audio Cache

Storage Requirements

| Component | Size |
| --- | --- |
| Installer | ~22 MB |
| App Files | ~10 MB |
| Python Dependencies | ~2 GB (PyTorch, etc.) |
| TTS Engine (GPU Mode) | ~309 MB |
| TTS Engine (CPU Mode) | ~87 MB |
| Voice Pack (shared) | ~30 MB |
| FFMPEG | ~100 MB (optional) |
| Audio Cache (SQLite) | ~200 MB max (auto-managed) |
| Per Document Cache | ~1-5 MB |
| Exported MP3 | ~1 MB per minute of audio |

Total (GPU Mode): ~2.6 GB (without exported audio)
Total (CPU Mode): ~2.4 GB (saves ~220MB)
Total (Both Engines): ~2.8 GB (maximum flexibility)

System Requirements

| Component | Minimum | Recommended |
| --- | --- | --- |
| OS | Windows 10+ / Ubuntu 20.04+ | Windows 11 / Ubuntu 22.04+ |
| Python | 3.10 - 3.13 | 3.12.10 |
| RAM | 4 GB | 8 GB+ |
| Disk Space | 3 GB free | 5 GB+ free |
| CPU | Dual-core 2.0 GHz | Quad-core 2.5 GHz+ |
| Internet | Required for setup only | Offline after setup |

🔘 Privacy & Security

Data Storage

  • 100% Local: All documents, settings, and exports stored on your machine
  • No Cloud: Zero data sent to external servers
  • No Accounts: No login, no sign-up, no user tracking

Network Usage

  • Setup Only: Internet required for:
    1. Downloading Python (Windows installer only, ~100 MB)
    2. Installing dependencies (~2 GB)
    3. Downloading the Kokoro-82M model (~87 MB for CPU mode, ~309 MB for GPU mode)
    4. Downloading FFMPEG (~100 MB, optional)
  • Fully Offline: After setup, works without internet indefinitely

Analytics & Telemetry

  • Zero Tracking: No analytics, no usage stats, no crash reports
  • No Cookies: Web UI runs locally
  • No Logs: App doesn't phone home

File Access

  • Read-Only Documents: PDFs/EPUBs are only read (never modified)
  • Writable Folders: Only userdata/, models/, bin/, and .cache/
  • No Background Access: App closes completely when you exit

🔳 License

LocalReader Pro

  • Code: Proprietary (review, modify, use personally)
  • Redistribution: Contact author for permission

Third-Party Components

| Component | License |
| --- | --- |
| Kokoro-82M | Apache 2.0 |
| FastAPI | MIT |
| PyTorch | BSD-3-Clause |
| PDF.js | Apache 2.0 |
| Tailwind CSS | MIT |
| Lucide Icons | ISC |
| FFMPEG | LGPL 2.1+ |

⚪ Credits

Core Technologies

Python Libraries

  • FastAPI, uvicorn, torch, onnxruntime, pydub, soundfile, pywebview, ebooklib, beautifulsoup4, and more (see requirements.txt)

🔘 Support

Found a Bug?

  1. Check Troubleshooting section above
  2. Verify you're on latest version (v2.5.0)
  3. Check CHANGELOG.md for known issues
  4. Contact developer with:
    • Python version (python --version)
    • Error message or screenshot
    • Steps to reproduce

Feature Requests

  • Review CHANGELOG.md to see if already implemented
  • Describe use case and expected behavior
  • Provide examples or mockups if applicable

Version: 3.5.0 (The "Explorer" Update)
Engine: Kokoro-82M (Dual-Mode: CPU/GPU)
Last Updated: January 6, 2026
Status: 🔘 Stable Release


Enjoy your reading! 🔳⚪
