A set of utility classes and functions for processing documents with Python.
- Free software: MIT license
- Documentation: https://document-clipper.readthedocs.io.
The document_clipper package uses libraries that rely on several command-line tools included in the poppler-utils package, such as:
- pdftohtml
- pdfimages
- pdftocairo
Before attempting to use document_clipper, please install the poppler-utils package.
For instance, on Ubuntu, you can do so by running the following command:
$ sudo apt-get install poppler-utils
Then, you can install document_clipper as usual with a Python package manager such as pip:
$ pip install document_clipper
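Since document_clipper depends on the poppler-utils command-line tools, it can be useful to confirm they are on your PATH before first use. The following is a minimal sketch using only the Python standard library; the helper name `missing_poppler_tools` is hypothetical and is not part of document_clipper.

```python
import shutil

# Tools named in the prerequisites above (shipped with poppler-utils).
POPPLER_TOOLS = ("pdftohtml", "pdfimages", "pdftocairo")


def missing_poppler_tools(tools=POPPLER_TOOLS):
    """Return the names of the given tools that cannot be found on PATH.

    Hypothetical helper for illustration; not part of document_clipper.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Missing poppler-utils tools: " + ", ".join(missing))
    else:
        print("All required poppler-utils tools were found.")
```

If any tool is reported as missing, install poppler-utils as described above and run the check again.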
- Fetch the number of pages in a PDF file.
- Extract the coordinates and dimensions of a given text located in a PDF file.
- Combine multiple PDFs into a single PDF.
- Combine multiple PDF and image files into a single PDF.
- Generate a new PDF file containing a subset of a provided source PDF file's pages. Rotations can be applied to each page individually.
- Optionally fix the document(s) involved in the slicing/merging processes beforehand.
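To illustrate the first feature in the list above, here is a rough sketch of counting pages with the `pdfinfo` tool, which also ships with poppler-utils. This is not document_clipper's own API, and the package may obtain the page count differently; the function names `parse_page_count` and `count_pages` are assumptions made for this example. Refer to the documentation linked above for the actual classes and methods.

```python
import re
import subprocess


def parse_page_count(pdfinfo_output):
    """Extract the page count from the text printed by `pdfinfo`.

    Hypothetical helper; relies on pdfinfo printing a line of the
    form "Pages:   <number>".
    """
    match = re.search(r"^Pages:[ \t]+(\d+)[ \t]*$", pdfinfo_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Pages:' line found in pdfinfo output")
    return int(match.group(1))


def count_pages(pdf_path):
    """Run `pdfinfo` on a PDF file and return its number of pages.

    Requires poppler-utils; raises CalledProcessError if pdfinfo fails.
    """
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_page_count(result.stdout)
```

Calling `count_pages("example.pdf")` on an existing file returns its page count as an integer, where `example.pdf` is a placeholder path.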