This guide explains how to add your own custom bank statement templates without modifying the core codebase.
FREE Tier Note: In the FREE tier, only templates with IBAN patterns can be used. Templates without IBAN patterns (e.g., generic fallback templates) will be automatically disabled and logged. This ensures that all processed PDFs have valid IBANs for proper transaction tracking.
- Create a directory for your custom templates:
  mkdir custom_templates
- Create a JSON file for your bank (e.g., custom_templates/mybank.json):
  {
    "id": "mybank",
    "name": "My Bank",
    "enabled": true,
    "detection": {
      "iban_patterns": ["IE[0-9]{2}MYBK[0-9A-Z]+"],
      "header_keywords": ["My Bank", "mybank.com"],
      "column_headers": ["Date", "Description", "Amount", "Balance"]
    },
    "extraction": {
      "table_top_y": 250,
      "table_bottom_y": 750,
      "columns": {
        "Date": [30, 100],
        "Details": [100, 300],
        "Debit €": [300, 380],
        "Credit €": [380, 460],
        "Balance €": [460, 540]
      }
    },
    "processing": {
      "supports_multiline": false,
      "date_format": "%d/%m/%Y",
      "currency_symbol": "€",
      "decimal_separator": "."
    }
  }
- Set the environment variable to point to your custom templates:
  export CUSTOM_TEMPLATES_DIR=./custom_templates
- Run the processor:
  python -m src.app
Your custom template will be loaded and used for detection alongside the built-in templates.
Templates are loaded in the following priority order:
- Custom templates (from CUSTOM_TEMPLATES_DIR): highest priority
- Built-in templates (from BANK_TEMPLATES_DIR or ./templates)
- Default template: fallback for unrecognized statements
If a custom template has the same ID as a built-in template, the custom version will override the built-in one.
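The override rule above can be sketched as a dictionary merge keyed on template ID. This is a minimal illustration, not the project's loader: the function names `load_templates`, `merge_templates`, and `load_all` are hypothetical, and reading every `*.json` file in a directory is an assumption.

```python
import json
import os
from pathlib import Path


def load_templates(directory):
    """Return {template_id: template_dict} for every *.json file in directory."""
    templates = {}
    path = Path(directory)
    if not path.is_dir():
        return templates
    for file in sorted(path.glob("*.json")):
        with file.open(encoding="utf-8") as handle:
            data = json.load(handle)
        templates[data["id"]] = data
    return templates


def merge_templates(builtin, custom):
    """Custom templates win when both sets contain the same ID."""
    merged = dict(builtin)
    merged.update(custom)
    return merged


def load_all(env=os.environ):
    """Resolve both directories from the environment, then merge (hypothetical helper)."""
    builtin_dir = env.get("BANK_TEMPLATES_DIR", "./templates")
    custom_dir = env.get("CUSTOM_TEMPLATES_DIR")
    builtin = load_templates(builtin_dir)
    custom = load_templates(custom_dir) if custom_dir else {}
    return merge_templates(builtin, custom)
```

Because the custom set is applied last, a custom `mybank` entry replaces a built-in `mybank` entry while every other built-in template stays available.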
Every template must have these fields:
{
"id": "unique_identifier", // Lowercase, alphanumeric, hyphens, underscores
"name": "Human Readable Name", // Display name
"enabled": true, // Whether template is active
"detection": { ... }, // How to detect this bank's PDFs
"extraction": { ... }, // How to extract transactions
"processing": { ... } // How to process the data
}
Detection determines which PDFs match this template. The system tries detection methods in order:
- IBAN Pattern (highest priority)
- Filename Pattern
- Header Keywords
- Column Headers (fallback)
"detection": {
"iban_patterns": [
"IE[0-9]{2}MYBK[0-9A-Z]+" // Regex pattern for your bank's IBANs
],
"filename_patterns": [
"MyBank_Statement_*.pdf", // Glob patterns for filenames
"statement_mybank_*.pdf"
],
"header_keywords": [
"My Bank", // Keywords in the PDF header
"mybank.com"
],
"column_headers": [
"Date", // Expected column headers
"Description",
"Amount",
"Balance"
]
}
Best Practices:
- Use specific IBAN patterns (include bank code) to avoid false positives
- FREE Tier: MUST include at least one IBAN pattern (an empty array disables the template)
- PAID Tier: Can use an empty [] for generic/fallback templates without the IBAN requirement
- Include multiple variations in header_keywords (with/without spaces, abbreviations)
- Match at least 70% of column_headers for detection to succeed
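Before running the processor, it can help to try an IBAN pattern and a column header list against sample text. The sketch below is an assumption about how such a check might look, not the project's detector: the helper names are hypothetical, and case-insensitive header comparison is a guess.

```python
import re


def iban_matches(patterns, text):
    """True if any IBAN regex matches somewhere in the text."""
    return any(re.search(pattern, text) for pattern in patterns)


def header_match_ratio(expected, found):
    """Fraction of expected column headers present among the found headers."""
    if not expected:
        return 0.0
    found_lower = {header.strip().lower() for header in found}
    hits = sum(1 for header in expected if header.strip().lower() in found_lower)
    return hits / len(expected)


patterns = ["IE[0-9]{2}MYBK[0-9A-Z]+"]
print(iban_matches(patterns, "IBAN: IE29MYBK93115212345678"))  # True
print(iban_matches(patterns, "IBAN: IE29OTHR93115212345678"))  # False

expected = ["Date", "Description", "Amount", "Balance"]
found = ["Date", "Description", "Balance"]
print(header_match_ratio(expected, found) >= 0.7)  # True: 3 of 4 headers match
```

A pattern that includes the bank code (here the made-up `MYBK`) rejects IBANs from other banks, which is the false-positive case the first best practice warns about.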
Extraction defines where to find transaction data on the PDF pages.
"extraction": {
"table_top_y": 250, // Y-coordinate where table starts (from top)
"table_bottom_y": 750, // Y-coordinate where table ends
"enable_page_validation": true, // Validate page structure before extraction
"enable_header_check": true, // Check for header row presence
"header_check_top_y": 200, // Y-coordinate to check for headers
"columns": {
"Date": [30, 100], // [x_start, x_end] coordinates
"Details": [100, 300],
"Debit €": [300, 380],
"Credit €": [380, 460],
"Balance €": [460, 540]
}
}
Finding Coordinates:
Use the included coordinate finder tool:
python -m src.tools.coordinate_finder input/your_statement.pdf
This will display the PDF with coordinates overlaid, helping you determine:
- Table boundaries (table_top_y, table_bottom_y)
- Column boundaries (columns x-coordinates)
Coordinate System:
- Origin (0, 0) is at the top-left of the page
- X increases to the right
- Y increases downward
- Typical A4 page: 595 points wide × 842 points tall
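Given that coordinate system, a quick sanity check can catch values that fall outside the page or are swapped. The following is a sketch under the assumption of an A4 page; `check_bounds` is a hypothetical helper, not part of the project.

```python
A4_WIDTH_PT = 595
A4_HEIGHT_PT = 842


def check_bounds(extraction, width=A4_WIDTH_PT, height=A4_HEIGHT_PT):
    """Return a list of problems found in an "extraction" block."""
    problems = []
    top = extraction["table_top_y"]
    bottom = extraction["table_bottom_y"]
    # Y grows downward, so the table top must have the smaller value.
    if not 0 <= top < bottom <= height:
        problems.append(f"table_top_y={top}, table_bottom_y={bottom}: need 0 <= top < bottom <= {height}")
    for name, (x_start, x_end) in extraction["columns"].items():
        # X grows to the right, so each column must start left of where it ends.
        if not 0 <= x_start < x_end <= width:
            problems.append(f"column {name!r}: need 0 <= {x_start} < {x_end} <= {width}")
    return problems


extraction = {
    "table_top_y": 250,
    "table_bottom_y": 750,
    "columns": {"Date": [30, 100], "Details": [100, 300], "Balance €": [460, 540]},
}
print(check_bounds(extraction))  # []
```

Statements on other page sizes (for example US Letter) need different `width` and `height` arguments.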
Processing controls how extracted data is interpreted.
"processing": {
"supports_multiline": false, // Whether transactions can span multiple rows
"date_format": "%d/%m/%Y", // Python strftime format
"currency_symbol": "€", // Currency symbol to expect
"decimal_separator": "." // Decimal separator (. or ,)
}
Date Format Examples:
- "%d/%m/%Y" → 31/12/2025
- "%d %b %Y" → 31 Dec 2025
- "%Y-%m-%d" → 2025-12-31
- "%d-%m-%Y" → 31-12-2025
Multiline Support:
- Set supports_multiline: true for banks that split long transaction descriptions across multiple rows
- Set supports_multiline: false for standard single-row transactions
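To illustrate what multiline handling involves, the sketch below folds continuation rows into the transaction above them. It assumes a continuation row is one with an empty Date cell, which may not match how the project actually identifies such rows; `merge_multiline` is a hypothetical name.

```python
def merge_multiline(rows):
    """Fold rows without a date into the previous transaction's details."""
    merged = []
    for row in rows:
        is_continuation = not row.get("Date", "").strip()
        if is_continuation and merged:
            extra = row.get("Details", "").strip()
            if extra:
                merged[-1]["Details"] = f"{merged[-1]['Details']} {extra}".strip()
        else:
            # Copy the row so the caller's data is left untouched.
            merged.append(dict(row))
    return merged


rows = [
    {"Date": "01 Jan 2025", "Details": "Card payment"},
    {"Date": "", "Details": "Coffee Shop Dublin"},
    {"Date": "02 Jan 2025", "Details": "Transfer"},
]
for row in merge_multiline(rows):
    print(row)
```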
CUSTOM_TEMPLATES_DIR: path to the directory containing your custom templates.
# Relative path
export CUSTOM_TEMPLATES_DIR=./custom_templates
# Absolute path
export CUSTOM_TEMPLATES_DIR=/path/to/my/templates
# Multiple users can have their own custom directories
export CUSTOM_TEMPLATES_DIR=$HOME/.bankstatements/templates
BANK_TEMPLATES_DIR: overrides the built-in templates directory (advanced use).
export BANK_TEMPLATES_DIR=./templates
DEFAULT_TEMPLATE: forces a specific template as the default fallback.
export DEFAULT_TEMPLATE=mybank
For a bank with standard format:
{
"id": "simplebank",
"name": "Simple Bank",
"enabled": true,
"detection": {
"iban_patterns": ["IE[0-9]{2}SMPL[0-9]+"],
"header_keywords": ["Simple Bank"],
"column_headers": ["Date", "Description", "Debit", "Credit", "Balance"]
},
"extraction": {
"table_top_y": 300,
"table_bottom_y": 720,
"columns": {
"Date": [26, 78],
"Details": [78, 255],
"Debit €": [255, 313],
"Credit €": [313, 369],
"Balance €": [369, 434]
}
},
"processing": {
"supports_multiline": false,
"date_format": "%d/%m/%Y",
"currency_symbol": "€",
"decimal_separator": "."
}
}
For a bank like Revolut with multiline transactions:
{
"id": "multibank",
"name": "Multi Bank",
"enabled": true,
"detection": {
"iban_patterns": ["[A-Z]{2}[0-9]{2}MULT[0-9A-Z]+"],
"header_keywords": ["Multi Bank", "multibank.io"],
"column_headers": ["Date", "Description", "Out", "In", "Balance"]
},
"extraction": {
"table_top_y": 140,
"table_bottom_y": 735,
"enable_page_validation": false,
"columns": {
"Date": [42, 120],
"Details": [124, 330],
"Debit €": [335, 416],
"Credit €": [417, 525],
"Balance €": [526, 556]
}
},
"processing": {
"supports_multiline": true,
"date_format": "%d %b %Y",
"currency_symbol": "€",
"decimal_separator": "."
}
}
For catching any unrecognized statement format (no IBAN pattern, so this template is fallback only and is disabled in the FREE tier):
{
"id": "generic",
"name": "Generic Bank Statement",
"enabled": true,
"detection": {
"iban_patterns": [],
"column_headers": ["Date", "Details", "Amount", "Balance"]
},
"extraction": {
"table_top_y": 250,
"table_bottom_y": 750,
"columns": {
"Date": [30, 100],
"Details": [100, 350],
"Debit €": [350, 420],
"Credit €": [420, 490],
"Balance €": [490, 560]
}
},
"processing": {
"supports_multiline": false,
"date_format": "%d/%m/%Y",
"currency_symbol": "€",
"decimal_separator": "."
}
}

# Use jq to validate JSON
jq . custom_templates/mybank.json

export CUSTOM_TEMPLATES_DIR=./custom_templates
python -c "from src.templates import TemplateRegistry; r = TemplateRegistry.from_default_config(); print(r.list_all())"

export CUSTOM_TEMPLATES_DIR=./custom_templates
python -m src.app

Check the logs for:
Loaded template: mybank from mybank.json
Template detected by IBAN: My Bank for statement.pdf
Check the output CSV files to ensure:
- All transactions are extracted
- Dates are parsed correctly
- Amounts are in correct columns
- No missing data
If extraction is incorrect:
- Use the coordinate finder:
  python -m src.tools.coordinate_finder input/statement.pdf
- Adjust coordinates in your template
- Test again
Problem: Template is ignored with message "no IBAN patterns configured".
Cause: FREE tier requires all templates to have IBAN patterns for proper transaction tracking.
Solutions:
- Add an IBAN pattern to your template's detection.iban_patterns array
- Use a bank-specific pattern (e.g., "IE[0-9]{2}MYBK[0-9A-Z]+")
- If you need generic templates without an IBAN, upgrade to the PAID tier
- Check logs for specific template names that were disabled
Example Log Message:
WARNING - FREE tier requires IBAN patterns for PDF processing.
Ignoring 1 template(s) without IBAN patterns: Default Bank Statement
INFO - Template 'Default Bank Statement' (id: default) disabled: no IBAN patterns configured
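The rule behind these log messages can be sketched as a simple partition: templates with at least one IBAN pattern stay active, the rest are disabled. This is an illustration of the described behavior only; `split_by_iban_support` is a hypothetical name and the project's actual check may differ.

```python
def split_by_iban_support(templates):
    """Partition templates into (usable, disabled) under the FREE tier rule."""
    usable, disabled = [], []
    for template in templates:
        patterns = template.get("detection", {}).get("iban_patterns") or []
        (usable if patterns else disabled).append(template)
    return usable, disabled


templates = [
    {"id": "mybank", "detection": {"iban_patterns": ["IE[0-9]{2}MYBK[0-9A-Z]+"]}},
    {"id": "default", "detection": {"iban_patterns": []}},
]
usable, disabled = split_by_iban_support(templates)
print([t["id"] for t in usable])    # ['mybank']
print([t["id"] for t in disabled])  # ['default']
```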
Problem: Your template isn't being used for your PDFs.
Solutions:
- Check that the IBAN pattern matches your PDF's IBAN
- Ensure header keywords appear in the top 250 points of the page
- Verify at least 70% of column headers match
- Check the template is "enabled": true
- Review logs for detection attempts
- FREE Tier: Verify the template has IBAN patterns configured
Problem: Some transactions are not extracted.
Solutions:
- Verify table_top_y is above the first transaction (a smaller Y value, since Y increases downward)
- Ensure table_bottom_y is below the last transaction (a larger Y value)
- Check if the bank uses a multiline format (supports_multiline: true)
- Use the coordinate finder to verify boundaries
Problem: Transaction details appear in wrong columns.
Solutions:
- Use coordinate finder to get exact column boundaries
- Ensure column x-coordinates don't overlap
- Check for extra spaces or padding in PDF
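Overlapping column ranges can be checked mechanically. The sketch below sorts columns by their left edge and reports neighbors whose ranges overlap; a shared boundary such as `[30, 100]` and `[100, 300]` is treated as fine, matching the examples in this guide. `find_overlaps` is a hypothetical helper.

```python
def find_overlaps(columns):
    """Return (left, right) name pairs of neighboring columns whose x-ranges overlap."""
    ordered = sorted(columns.items(), key=lambda item: item[1][0])
    overlaps = []
    for (left_name, left), (right_name, right) in zip(ordered, ordered[1:]):
        # The right column starts before the left column ends.
        if right[0] < left[1]:
            overlaps.append((left_name, right_name))
    return overlaps


columns = {
    "Date": [30, 110],
    "Details": [100, 300],
    "Debit €": [300, 380],
}
print(find_overlaps(columns))  # [('Date', 'Details')]
```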
Problem: Dates are not recognized or parsed incorrectly.
Solutions:
- Verify date_format matches your bank's format
- Check for locale-specific date formats
- Look at raw extracted text in debug logs
- Try different format strings
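Trying format strings by hand is tedious, so a short script can test a date copied from the statement against several candidates. This sketch uses Python's standard `datetime.strptime`; the candidate list and the name `matching_formats` are illustrative assumptions. Note that `%b` depends on the active locale, so month abbreviations in other languages will not match under the default one.

```python
from datetime import datetime

CANDIDATE_FORMATS = ["%d/%m/%Y", "%d %b %Y", "%Y-%m-%d", "%d-%m-%Y"]


def matching_formats(text, candidates=CANDIDATE_FORMATS):
    """Return every candidate format that parses the given date text."""
    matches = []
    for date_format in candidates:
        try:
            datetime.strptime(text.strip(), date_format)
        except ValueError:
            continue
        matches.append(date_format)
    return matches


print(matching_formats("31 Dec 2025"))  # ['%d %b %Y']
print(matching_formats("31/12/2025"))   # ['%d/%m/%Y']
print(matching_formats("not a date"))   # []
```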
Problem: Custom template doesn't override built-in template.
Solutions:
- Ensure the custom template has the same "id" as the built-in template
- Verify CUSTOM_TEMPLATES_DIR is set correctly
- Check that the custom template loads first (check logs)
- Confirm template is valid JSON
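A template that fails to parse cannot override anything, so it is worth checking both the JSON syntax and the required top-level fields. The sketch below uses only Python's standard `json` module; `check_template_text` is a hypothetical helper, and the project's own validation may be stricter.

```python
import json

REQUIRED_FIELDS = ("id", "name", "enabled", "detection", "extraction", "processing")


def check_template_text(text):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as error:
        return [f"invalid JSON: {error.msg} (line {error.lineno}, column {error.colno})"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]


print(check_template_text('{"id": "mybank", "name": "My Bank"}'))
```

To check a file, pass its contents, for example `check_template_text(open("custom_templates/mybank.json", encoding="utf-8").read())`.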
To use custom templates with Docker:
- Create custom templates directory on host:
  mkdir custom_templates
- Mount custom templates and set environment variable in docker-compose.yml:
  services:
    bank-statement-processor:
      volumes:
        - ./input:/app/input
        - ./output:/app/output
        - ./custom_templates:/app/custom_templates  # Mount custom templates
      environment:
        - CUSTOM_TEMPLATES_DIR=/app/custom_templates  # Point to mounted dir
- Run Docker:
  docker-compose up
If you create a template for a popular bank, consider contributing it to the main repository:
- Test your template thoroughly with multiple statement PDFs
- Remove any personal information (IBANs, account numbers, etc.)
- Add example detection patterns (sanitized)
- Submit a pull request to add your template to templates/
- Include a description of the sample PDF structure in the PR
- Never commit custom templates containing real IBANs or account numbers
- Use generic patterns in templates (e.g., "IE[0-9]{2}MYBK.*" not your actual IBAN)
- Add custom_templates/ to .gitignore if it contains sensitive patterns
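Before committing or sharing a template, a quick scan for literal IBANs can catch an account number pasted in by mistake. The regex below is a rough heuristic (two letters, two digits, then 11 to 30 alphanumerics) and can flag other long uppercase tokens; `find_literal_ibans` is a hypothetical helper.

```python
import re

# Two letters, two check digits, then 11 to 30 alphanumerics (IBANs are 15 to 34 characters).
LITERAL_IBAN = re.compile(r"\b[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}\b")


def find_literal_ibans(text):
    """Return IBAN-like literals found in the text."""
    return LITERAL_IBAN.findall(text)


safe = '"iban_patterns": ["IE[0-9]{2}MYBK.*"]'
risky = '"iban_patterns": ["IE29MYBK93115212345678"]'
print(find_literal_ibans(safe))   # []
print(find_literal_ibans(risky))  # ['IE29MYBK93115212345678']
```

A regex pattern such as `IE[0-9]{2}MYBK.*` is not flagged because the bracket after the country code breaks the literal match.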
If your custom templates contain proprietary information:
- Keep them in a private directory outside the repository
- Use absolute paths for CUSTOM_TEMPLATES_DIR
- Don't share templates publicly
- Start with built-in templates as reference
- Test incrementally - start with detection, then extraction, then processing
- Use specific patterns - avoid overly generic IBAN/filename patterns
- Document your coordinates - add comments to your JSON (they'll be ignored by the template loader, but strict JSON parsers such as Python's json module reject them)
- Version control - keep custom templates in a separate git repository
- Backup regularly - custom templates are valuable for your workflow
For help with custom templates:
- Review built-in templates in the templates/ directory
- Check logs for detection/extraction errors
- Use coordinate finder tool for troubleshooting
- Open an issue with sanitized template and error description