Bind the paper you already own — notes, letters, manuscripts, public-domain works — into a personal library of clean, reflowable ebooks. Natively on your Mac, 100% offline.
No API keys · No subscriptions · No internet · Your files never leave your machine.
reepub isn't a "PDF→EPUB converter"; it's a capability you own: studio-grade OCR that lives on your shelf, not in someone's cloud. Most cloud scan-to-ebook services make you upload your books to a stranger's server, charge per page, and keep the result in their account. reepub inverts all of it: free, offline, yours.
What reepub is for. reepub is a tool for digitizing documents you own or have the right to digitize — your own writing, notes and correspondence, public-domain works, or books you physically own — into a personal ebook library you keep locally. Everything is processed on your own Mac; nothing is ever uploaded. Please respect copyright and the rights of authors and publishers.
You've got paper worth keeping — your own notes, a stack of letters, an
out-of-print book you own. Most "PDF to EPUB" tools either upload it to a cloud
service, charge per page, or spit out a fixed-layout EPUB that's really just
images glued together — unreadable on a phone. reepub is different — and the
difference is ownership:
- You own it; you don't rent it. Cloud OCR is a borrowed library card: revocable, priced per page, and your files pass through someone else's servers. reepub is the book on your own shelf: free, offline, and yours. No one can reprice it, gate it, or switch it off.
- Frontier-quality OCR you already paid for. reepub unlocks Apple's Vision framework and the Neural Engine already in your Mac (M1–M4+), so a tiny MIT-licensed tool matches paid cloud OCR, fully on-device.
- Reflowable output, not an image-glued fake EPUB: text is reconstructed into real paragraphs and chapters, so it reflows on any screen size instead of staying a frozen page image.
- There is no pipe. No API key, no account, no network call — your books physically cannot leave the machine. Privacy that's structural, not a promise.
- Traditional Chinese & English recognition out of the box (zh-Hant + en-US).
- Validated EPUB3: every book is run through a built-in, dependency-free structural validator before it's handed back; one that fails is rejected, not shipped.
- MIT-licensed, self-contained, forkable, free forever.
- Smart paragraph stitching: uses line bounding boxes, vertical gaps, indents, and punctuation cues to merge OCR lines back into clean paragraphs.
- Automatic cover: renders page 1 at 2× and wraps it as the EPUB cover.
- Hybrid text + image pages: pages with little text (illustrations, plates) are preserved as images instead of garbled OCR.
- Automatic chapter detection: splits on heading cues (e.g. 第一章 ("Chapter One"), Chapter).
- Three ways to use it: a one-click Mac app, a local web UI, or a CLI.
- Localized app UI: English / 繁體中文 (Traditional Chinese) / 日本語 (Japanese)
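
The paragraph stitching and chapter detection described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not reepub's actual code: the line shape (`{ text, x, top, height }` in normalized page coordinates, with `top` measured from the top of the page and lines already in reading order), the thresholds, and the function names `stitchParagraphs` and `splitChapters` are all hypothetical.

```javascript
// Hypothetical heuristics; thresholds and regexes are illustrative only.
const SENTENCE_END = /[.!?。！？」”"]$/;
const CJK = /[\u3000-\u303f\u4e00-\u9fff\uff00-\uffef]/;
const HEADING_CUE = /^(第[一二三四五六七八九十百零〇\d]+章|Chapter\s+\S+)/i;

// Join OCR lines: no space between two CJK characters, one space otherwise.
function joinLines(parts) {
  return parts.reduce((acc, part) => {
    if (acc === "") return part;
    const needsSpace = !(CJK.test(acc.slice(-1)) && CJK.test(part[0]));
    return acc + (needsSpace ? " " : "") + part;
  }, "");
}

// Merge lines into paragraphs. A new paragraph starts after a large vertical
// gap, or when an indented line follows a line that ended a sentence.
function stitchParagraphs(rawLines, { gapFactor = 0.8, indentMin = 0.02 } = {}) {
  const lines = rawLines
    .map((l) => ({ ...l, text: l.text.trim() }))
    .filter((l) => l.text !== "");
  if (lines.length === 0) return [];

  const leftMargin = Math.min(...lines.map((l) => l.x));
  const paragraphs = [];
  let current = [];
  let prev = null;

  for (const line of lines) {
    if (prev !== null) {
      const gap = line.top - (prev.top + prev.height);
      const bigGap = gap > gapFactor * prev.height;
      const indented = line.x - leftMargin >= indentMin;
      const prevEnded = SENTENCE_END.test(prev.text);
      if (bigGap || (indented && prevEnded)) {
        paragraphs.push(joinLines(current));
        current = [];
      }
    }
    current.push(line.text);
    prev = line;
  }
  paragraphs.push(joinLines(current));
  return paragraphs;
}

// Group paragraphs into chapters: a short paragraph matching a heading cue
// opens a new chapter; text before the first heading gets a null title.
function splitChapters(paragraphs) {
  const chapters = [];
  let current = { title: null, paragraphs: [] };
  const flush = () => {
    if (current.title !== null || current.paragraphs.length > 0) chapters.push(current);
  };
  for (const p of paragraphs) {
    if (p.length <= 40 && HEADING_CUE.test(p)) {
      flush();
      current = { title: p, paragraphs: [] };
    } else {
      current.paragraphs.push(p);
    }
  }
  flush();
  return chapters;
}
```

Calling `splitChapters(stitchParagraphs(lines))` on one page's lines would yield an array of `{ title, paragraphs }` objects. A real implementation would likely also need header/footer filtering and tuning of the thresholds per document.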
- macOS 13+ (Apple Silicon strongly recommended) for the native app
- Xcode Command Line Tools, for the Swift compiler (xcode-select --install). No full Xcode required.
- Node.js v20+, only for the optional web UI / CLI path
- zip / unzip / xmllint, preinstalled on macOS
git clone https://github.com/CVERInc/reepub.git
cd reepub
make app   # builds macos/build/Reepub.app (Command Line Tools only)

Option A: Native macOS app (recommended)
make app
open macos/build/Reepub.app

Pick a PDF (or drag one onto the window), let Vision OCR run, optionally set a title and author, then choose Save as EPUB… to save the finished book. Everything (OCR, assembly, and validation) happens in the app, fully offline.
Option B: Local web UI
make build   # compiles the Swift OCR CLI (bin/scan-ocr) used by the server
npm start    # serves http://localhost:30232

Open the page, drop in a PDF, enter a title and author, and download the finished EPUB once conversion completes. The conversion log streams live.
Option C: Command line
make build
node src/builder.js <input.pdf> <output.epub> [book-title] [book-author]

Example:
node src/builder.js ~/Documents/scanned_book.pdf ~/Desktop/my_book.epub "我的書名" "作者"   # title "My Book Title", author "Author"

- OCR extraction: bin/scan-ocr (Swift) loads the PDF via PDFKit, renders each page to a bitmap at 2× scale, and runs Apple's VNRecognizeTextRequest. It emits JSON of every recognized line with normalized bounding boxes, saves page 1 as the cover, and saves low-text pages as image plates.
- Text reassembly: src/builder.js filters out headers/footers, stitches lines into paragraphs using geometry + punctuation heuristics, detects headings, and groups everything into chapters.
- EPUB packaging: writes a standards-compliant EPUB3 (content.opf, toc.ncx, per-chapter XHTML, cover) and zips it with the uncompressed mimetype entry first.
- Validation: src/validator.js checks the ZIP mimetype layout, container.xml, the OPF manifest/spine, XHTML well-formedness (via xmllint), and orphan files. A book that fails validation is rejected, not shipped.
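
The "ZIP mimetype layout" check in the last step can be illustrated with a short sketch. EPUB's container format (OCF) expects the archive's first entry to be named mimetype, stored without compression and without an extra field, and to contain exactly application/epub+zip. The function name `checkMimetypeEntry` and the error strings below are hypothetical and are not taken from src/validator.js; the sketch also assumes the local file header carries the entry size (no trailing data descriptor).

```javascript
// Minimal sketch: inspect the first ZIP local file header of an EPUB.
// Takes a Node.js Buffer holding the file and returns a list of problems.
const EXPECTED_MIMETYPE = "application/epub+zip";
const LOCAL_HEADER_SIGNATURE = 0x04034b50; // "PK\x03\x04", little-endian
const LOCAL_HEADER_SIZE = 30;              // fixed part, before the file name

function checkMimetypeEntry(buf) {
  if (buf.length < LOCAL_HEADER_SIZE || buf.readUInt32LE(0) !== LOCAL_HEADER_SIGNATURE) {
    return ["file does not start with a ZIP local file header"];
  }

  const method = buf.readUInt16LE(8);          // 0 means stored (uncompressed)
  const compressedSize = buf.readUInt32LE(18); // assumed valid: no data descriptor
  const nameLen = buf.readUInt16LE(26);
  const extraLen = buf.readUInt16LE(28);

  const nameStart = LOCAL_HEADER_SIZE;
  const name = buf.toString("latin1", nameStart, nameStart + nameLen);
  const dataStart = nameStart + nameLen + extraLen;
  const data = buf.toString("latin1", dataStart, dataStart + compressedSize);

  const errors = [];
  if (name !== "mimetype") {
    errors.push(`first entry is "${name}", expected "mimetype"`);
  }
  if (method !== 0) {
    errors.push("mimetype entry is compressed; it must be stored");
  }
  if (extraLen !== 0) {
    errors.push("mimetype entry has an extra field");
  }
  if (data !== EXPECTED_MIMETYPE) {
    errors.push(`mimetype content is not "${EXPECTED_MIMETYPE}"`);
  }
  return errors;
}
```

For a file on disk, something like `checkMimetypeEntry(fs.readFileSync(path))` would return an empty array when the layout is correct. Reading the whole file is only for simplicity; the check needs just the first few dozen bytes.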
npm test # run the validator unit tests
npm run validate <file.epub> # validate any EPUB (or unpacked dir)

MIT; see LICENSE. © 2026 CVER Inc.