25 commits
4c02292
Remove reference to deprecated easy_install
May 3, 2021
9dca60f
Add test and documentation packages
May 3, 2021
06f2ade
Improve performance on large pdfs
May 3, 2021
58ea919
Make error handling in annotations generic (issue 53)
May 3, 2021
c8fb252
Update (un)supported Python versions
May 3, 2021
d870318
Fixed an isort issue
May 3, 2021
9a58286
Fix range() page numbers for Python3 & prevent long cache file names …
May 3, 2021
a6666d4
Remove references to old version of PDFMiner
May 3, 2021
23f1d79
Fix two broken testcases
May 4, 2021
a1e15a6
Update release notes
May 4, 2021
1c122e1
Update readme
May 4, 2021
d67b898
Preparing release 0.5.0
May 4, 2021
aebf3ca
Back to development: 0.5.1
May 4, 2021
e07defe
Remove dist folder
May 4, 2021
199ff9a
Test compatibility with Python 3.5, 3.6, 3.9
koendeleijer Oct 22, 2021
f04869c
Merge pull request #1 from isprojects/python35_36_39
kdleijer Oct 26, 2021
507429c
NEXTPY-569 -- Make pdfquery compatible with Python 3.9 and 3.11
koendeleijer Oct 17, 2023
94b4648
Merge pull request #2 from isprojects/feature/NEXTPY-569_python_311
kdleijer Oct 20, 2023
9d65f0a
Preparing release 0.5.1
koendeleijer Oct 20, 2023
623fff5
Back to development: 0.5.2
koendeleijer Oct 20, 2023
1ba6035
NEXTPY-2433 -- Make "pdfquery" compatible Python 3.11, 3.12 & 3.13
koendeleijer Aug 16, 2025
bfe9880
Merge pull request #3 from isprojects/feature/NEXTPY-2433_compatibility
kdleijer Aug 21, 2025
cbaeeea
Preparing release 0.5.2
koendeleijer Aug 21, 2025
877400f
Back to development: 0.5.3
koendeleijer Aug 21, 2025
c820e43
NEXTPY-4204 -- Make pdfquery compatible with Python 3.11 to 3.14
koendeleijer Mar 15, 2026
17 changes: 7 additions & 10 deletions .travis.yml
@@ -1,23 +1,20 @@
 language: python
 python:
-  - "3.5"
-  - "3.6"
-  - "3.7"
-  - "3.8"
+  - "3.11"
+  - "3.12"
+  - "3.13"
+  - "3.14"
 env: CFLAGS="-O0"

 cache:
   directories:
     - $HOME/.cache/pip

 install:
-  - if [[ $TRAVIS_PYTHON_VERSION < 3 ]]; then pip install -r requirements_py2.txt; fi
-  - if [[ $TRAVIS_PYTHON_VERSION > 3 ]]; then pip install -r requirements_py3.txt; fi
-  - if [[ $TRAVIS_PYTHON_VERSION == '2.6' ]]; then pip install unittest2; fi
-script:
-  python setup.py test
+  - pip install -e .
+script: python setup.py test
 after_success:
   - coveralls

 # See: http://docs.travis-ci.com/user/migrating-from-legacy/
-sudo: false
+sudo: false
26 changes: 26 additions & 0 deletions CHANGES.txt
@@ -1,3 +1,29 @@
+0.5.3 (unreleased)
+
+
+- Make pdfquery compatible with Python 3.11 to 3.14
+
+
+0.5.2 (2025-08-21)
+
+
+- Make pdfquery compatible with Python 3.11, 3.12 and 3.13
+
+
+0.5.1 (2023-10-20)
+
+
+- Make pdfquery compatible with Python 3.9 and 3.11
+
+
+0.5.0 (2021-05-04)
+- #67 Fix range() page numbers for Python3 & prevent long cache file names
+- Remove references to old version of PDFMiner
+- Fixed an isort issue
+- Update (un)supported Python versions
+- Improve performance on large pdfs
+- Remove reference to deprecated easy_install
+- Fix two broken testcases
 v0.4.3, 2016-03-27 -- Add laparams parameter to __init__.
 v0.4.2, 2016-02-07 -- Annotations bugfix.
 v0.4.1, 2015-12-21 -- Annotations bugfix.
12 changes: 9 additions & 3 deletions README.rst
@@ -18,10 +18,16 @@ PDFs with as little code as possible.

 .. contents:: **Table of Contents**

-Installation
-============
+Installation as a package
+=========================

+``pip install pdfquery``
+
+
+Installation for development
+============================
+
-``easy_install pdfquery`` or ``pip install pdfquery``.
+``pip install -e ".[test,flake8,docs,release]"``

 Quick Start
 ===========
12 changes: 5 additions & 7 deletions appveyor.yml
@@ -1,13 +1,11 @@
 environment:
   matrix:
-    # https://www.appveyor.com/docs/windows-images-software/#python
-    # currently lxml does not successfully install in 3.5 and 3.8
-    # - PYTHON: "C:\\Python35"
-    - PYTHON: "C:\\Python36"
-    - PYTHON: "C:\\Python37"
-    # - PYTHON: "C:\\Python38"
+    - PYTHON: "C:\\Python311"
+    - PYTHON: "C:\\Python312"
+    - PYTHON: "C:\\Python313"
+    - PYTHON: "C:\\Python314"

 build: off

 test_script:
-  - "%PYTHON%\\python.exe setup.py test"
+  - "%PYTHON%\\python.exe setup.py test"
1 change: 0 additions & 1 deletion dev_requirements.txt

This file was deleted.

Binary file removed dist/pdfquery-0.1.0.tar.gz
Binary file removed dist/pdfquery-0.1.1.tar.gz
Binary file removed dist/pdfquery-0.1.2.tar.gz
Binary file removed dist/pdfquery-0.1.3.tar.gz
Binary file removed dist/pdfquery-0.2.1.tar.gz
Binary file removed dist/pdfquery-0.2.2.tar.gz
Binary file removed dist/pdfquery-0.2.3.tar.gz
Binary file removed dist/pdfquery-0.2.4.tar.gz
Binary file removed dist/pdfquery-0.2.5.tar.gz
Binary file removed dist/pdfquery-0.2.6.tar.gz
Binary file removed dist/pdfquery-0.2.7.tar.gz
Binary file removed dist/pdfquery-0.2.tar.gz
Binary file removed dist/pdfquery-0.3.0.tar.gz
Binary file removed dist/pdfquery-0.3.1.tar.gz
Binary file removed dist/pdfquery-0.4.0.tar.gz
Binary file removed dist/pdfquery-0.4.1.tar.gz
Binary file removed dist/pdfquery-0.4.2.tar.gz
Binary file removed dist/pdfquery-0.4.3.tar.gz
2 changes: 1 addition & 1 deletion pdfquery/__init__.py
@@ -1 +1 @@
-from .pdfquery import PDFQuery
+from .pdfquery import PDFQuery
25 changes: 15 additions & 10 deletions pdfquery/cache.py
@@ -1,9 +1,10 @@
 import hashlib
 import zipfile
+
 from lxml import etree

-class BaseCache(object):

+class BaseCache(object):
     def __init__(self):
         self.hash_key = None

@@ -32,30 +33,34 @@ class DummyCache(BaseCache):


 class FileCache(BaseCache):
-
-    def __init__(self, directory='/tmp/'):
+    def __init__(self, directory="/tmp/"):
         self.directory = directory
         super(FileCache, self).__init__()

     def get_cache_filename(self, page_range_key):
         return "pdfquery_{hash_key}{page_range_key}.xml".format(
-            hash_key=self.hash_key,
-            page_range_key=page_range_key
+            hash_key=self.hash_key, page_range_key=page_range_key
         )

     def get_cache_file(self, page_range_key, mode):
         try:
-            return zipfile.ZipFile(self.directory+self.get_cache_filename(page_range_key)+".zip", mode)
+            return zipfile.ZipFile(
+                self.directory + self.get_cache_filename(page_range_key) + ".zip", mode
+            )
         except IOError:
             return None

     def set(self, page_range_key, tree):
-        xml = etree.tostring(tree, encoding='utf-8', pretty_print=False, xml_declaration=True)
-        cache_file = self.get_cache_file(page_range_key, 'w')
+        xml = etree.tostring(
+            tree, encoding="utf-8", pretty_print=False, xml_declaration=True
+        )
+        cache_file = self.get_cache_file(page_range_key, "w")
         cache_file.writestr(self.get_cache_filename(page_range_key), xml)
         cache_file.close()

     def get(self, page_range_key):
-        cache_file = self.get_cache_file(page_range_key, 'r')
+        cache_file = self.get_cache_file(page_range_key, "r")
         if cache_file:
-            return etree.fromstring(cache_file.read(self.get_cache_filename(page_range_key)))
+            return etree.fromstring(
+                cache_file.read(self.get_cache_filename(page_range_key))
+            )
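
Review note on pdfquery/cache.py: the FileCache changes in this diff look like formatting only, so the behaviour should be the same as before. For readers unfamiliar with the class, the round trip it performs (serialize the tree to XML, store it as a single member of a zip archive, read it back) can be sketched roughly as below. This is a minimal standalone sketch, not the code in this PR: the function names `cache_filename`, `cache_set` and `cache_get`, the hash key "abc123" and the page range key "_0" are hypothetical, the standard library's `xml.etree.ElementTree` stands in for `lxml.etree` so the sketch runs without third-party packages, and `os.path.join` replaces the string concatenation used in the diff.

```python
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET  # assumption: stand-in for lxml.etree


def cache_filename(hash_key, page_range_key):
    # Same naming pattern as FileCache.get_cache_filename in the diff.
    return "pdfquery_{hash_key}{page_range_key}.xml".format(
        hash_key=hash_key, page_range_key=page_range_key
    )


def cache_set(directory, hash_key, page_range_key, tree):
    # Serialize the tree and store it as the only member of a zip archive.
    name = cache_filename(hash_key, page_range_key)
    xml = ET.tostring(tree, encoding="utf-8")
    # The context manager closes the archive even if writestr raises.
    with zipfile.ZipFile(os.path.join(directory, name + ".zip"), "w") as archive:
        archive.writestr(name, xml)


def cache_get(directory, hash_key, page_range_key):
    # Return the parsed tree, or None when no usable cache file exists.
    name = cache_filename(hash_key, page_range_key)
    try:
        with zipfile.ZipFile(os.path.join(directory, name + ".zip"), "r") as archive:
            return ET.fromstring(archive.read(name))
    except (IOError, zipfile.BadZipFile):
        return None


with tempfile.TemporaryDirectory() as tmp:
    root = ET.Element("pdfxml")
    ET.SubElement(root, "LTPage", page_index="0")
    cache_set(tmp, "abc123", "_0", root)
    restored = cache_get(tmp, "abc123", "_0")  # cache hit
    missing = cache_get(tmp, "abc123", "_1")   # cache miss
```

One difference from the diff is deliberate: in the PR, `get_cache_file` returns None on IOError and `set` then calls `writestr` on that result, so a failed open in write mode would appear to surface as an AttributeError rather than the underlying I/O error. The sketch lets the error from opening the archive for writing propagate instead. Whether that is worth changing in this PR is a judgment call for the maintainers.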