Add PDF table extraction support #1
Open
Labels
enhancement (New feature or request)
Description
Feature Request
Currently, PDF processing extracts text via pypdfium2, but tables in PDFs (common in invoices and financial documents) are lost.
Consider integrating table detection:
- camelot-py or tabula-py for rule-based table extraction
- Or pass table regions to LLM separately
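A rough sketch of how the two options could fit together, not a proposed final design: `extract_tables` uses camelot-py's `read_pdf` for rule-based extraction, and `rows_to_markdown` renders each table as a Markdown pipe table that could be sent to the LLM next to the pypdfium2 text. Both function names are hypothetical, the `lattice` flavor assumes tables with ruled lines, and camelot-py is an optional dependency that is only imported when extraction is actually called.

```python
from typing import List, Sequence


def rows_to_markdown(rows: Sequence[Sequence[object]]) -> str:
    """Render a table as a Markdown pipe table, treating the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(row: Sequence[object]) -> str:
        # Flatten newlines and escape pipes so each cell stays on one line.
        cells = [str(c).replace("\n", " ").replace("|", "\\|").strip() for c in row]
        cells += [""] * (width - len(cells))  # pad ragged rows
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


def extract_tables(pdf_path: str, pages: str = "1") -> List[List[List[str]]]:
    """Hypothetical helper: rule-based extraction via camelot-py.

    Assumes camelot-py is installed; "lattice" targets tables drawn with
    ruled lines, "stream" would be the alternative for whitespace-separated ones.
    """
    import camelot  # optional dependency, imported lazily

    tables = camelot.read_pdf(pdf_path, pages=pages, flavor="lattice")
    # Each camelot table exposes a pandas DataFrame as `.df`.
    return [table.df.values.tolist() for table in tables]


if __name__ == "__main__":
    sample = [["Item", "Qty", "Price"], ["Widget", 2, "9.99"]]
    print(rows_to_markdown(sample))
```

Keeping the rendering step separate from the extraction step means tabula-py or an LLM-based region detector could be swapped in later without changing what is passed downstream.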
Motivation
Invoices and bank statements rely heavily on tabular data.