feat: external tool PDF to Markdown import #820
Open
S1933 wants to merge 14 commits into
First of all, thank you for this project! I use this tool in my daily professional work, and it has been useful for me. The idea for this feature comes from my own use case. I have a lot of PDF documents that I'd like to turn into a knowledge base, and I've found that Markdown files are much easier for AI agents to use than raw PDFs, especially with MCP. This is my first feature contribution to the project, so please don't hesitate to point me in the right direction if the PR description or the implementation should be improved.
Add pdf-to-markdown external tool integration that converts PDF files to Markdown and imports them into the vault. Includes context menu, command palette, file preview integration, localized UI copy, PostHog events, tests, and ADR 0138.
Summary
Add PDF to Markdown import using runtime-detected external tools (Poppler + Tesseract). Users can convert any PDF in the vault into an editable Markdown note, with optional OCR for scanned pages. The source PDF is never modified.
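As a rough illustration of what runtime detection could look like, here is a minimal Rust sketch that searches a PATH-style string for the four tools named in this PR. It is not the code in this PR: `find_in_path`, `detect_tools`, and `ToolAvailability` are hypothetical names, and the rule that OCR only needs `tesseract` to be present is an assumption made for the example.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

// Tool names come from the PR description; everything else is hypothetical.
const POPPLER_TOOLS: [&str; 3] = ["pdfinfo", "pdftotext", "pdftoppm"];
const OCR_TOOL: &str = "tesseract";

/// Returns the first file named `tool` (or `tool.exe`) found in `search_path`.
fn find_in_path(tool: &str, search_path: &OsStr) -> Option<PathBuf> {
    env::split_paths(search_path).find_map(|dir| {
        let candidates = vec![dir.join(tool), dir.join(format!("{}.exe", tool))];
        candidates.into_iter().find(|p| p.is_file())
    })
}

/// Which features could be offered, given the tools that were found.
#[derive(Debug, PartialEq)]
struct ToolAvailability {
    text_extraction: bool, // assumed to need all three Poppler tools
    ocr: bool,             // assumed to need only tesseract (simplification)
}

/// Checks every tool against the given PATH-style string.
fn detect_tools(search_path: &OsStr) -> ToolAvailability {
    ToolAvailability {
        text_extraction: POPPLER_TOOLS
            .iter()
            .all(|tool| find_in_path(tool, search_path).is_some()),
        ocr: find_in_path(OCR_TOOL, search_path).is_some(),
    }
}
```

A caller would pass the real search path, for example `detect_tools(&std::env::var_os("PATH").unwrap_or_default())`, and use the result to decide which import options to enable.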
What changed
- convert_pdf_to_markdown_note Tauri command (pdf_import_cmds.rs, pdf_import_extract.rs)
- PdfMarkdownImportDialog component with OCR mode/language selection (pdfImport.* locale keys in all 17 languages)
- PostHog events: pdf_markdown_import_started, pdf_markdown_import_completed, pdf_markdown_import_failed
- pdfMarkdownImport utility (src/utils/pdfMarkdownImport.ts) with typed request/response interfaces
Why
Users need to turn PDFs into editable Markdown notes while keeping source PDFs in the vault. Bundling a native PDF/OCR stack would increase packaging complexity before demand is proven. Runtime detection of pdfinfo, pdftotext, pdftoppm, and tesseract keeps Tolaria lean while still offering text extraction, page-by-page OCR, and a clear upgrade path.
How to test
Install Poppler (brew install poppler) and optionally Tesseract (brew install tesseract).
Screenshots
Checklist