If you want to help improve invoice2data, this guide will help you get started quickly.
- Fork the main repository (optional)
- Clone the repository: git clone https://github.com/m3nu/invoice2data
- Install as editable: pip install -e invoice2data
Some little-used dependencies, such as pytesseract and pdfminer, are optional. Install them if needed.
Major folders in the invoice2data package and their purpose:
- input: Modules for extracting plain text from files. Currently mostly PDF files.
- extract: Gets useful data from plain text using templates. The main class, BaseInvoiceTemplate, is in base_template. Other classes can inherit from it and add extra functions. E.g. LineInvoiceTemplate adds support for getting individual items.
- extract/templates: Keeps all supported template files. Add new templates here.
- output: Modules to output structured data. Currently CSV, JSON and XML are supported.
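To illustrate how a base template class and a line-item subclass can relate, here is a minimal, self-contained sketch. It does not use the invoice2data classes themselves: ToyTemplate, ToyLineTemplate, the field names, and the sample text are all hypothetical stand-ins, and the real BaseInvoiceTemplate and LineInvoiceTemplate may work differently.

```python
import re


class ToyTemplate:
    """Hypothetical stand-in for a base template: maps field names to regexes."""

    def __init__(self, fields):
        self.fields = fields

    def extract(self, text):
        # Apply each field regex and keep the first capture group on a match.
        result = {}
        for name, pattern in self.fields.items():
            match = re.search(pattern, text)
            if match:
                result[name] = match.group(1)
        return result


class ToyLineTemplate(ToyTemplate):
    """Hypothetical subclass that also extracts individual line items."""

    def __init__(self, fields, line_pattern):
        super().__init__(fields)
        self.line_pattern = line_pattern

    def extract(self, text):
        # Reuse the base extraction, then add one dict per matching line.
        result = super().extract(text)
        result["lines"] = [
            m.groupdict()
            for m in re.finditer(self.line_pattern, text, re.MULTILINE)
        ]
        return result


# Plain text as an input module might produce it (made-up sample).
SAMPLE_TEXT = """Invoice No: 1234
Total: 30.00
Widget 2 10.00
Gadget 1 10.00
"""

template = ToyLineTemplate(
    fields={
        "invoice_number": r"Invoice No: (\d+)",
        "amount": r"Total: ([\d.]+)",
    },
    line_pattern=r"^(?P<name>[A-Za-z]+) (?P<qty>\d+) (?P<price>[\d.]+)$",
)
data = template.extract(SAMPLE_TEXT)
print(data)
```

The subclass only overrides extract and delegates the shared field handling to the parent, which is the general idea behind adding extra functions through inheritance.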
Every new feature should have a test to make sure it still works after later modifications, whether made by you or by someone else.
To install dependencies required for tests: pip install ".[test]"
To run tests using the current Python version: pytest
To run tests using all supported Python versions: tox (needs pyenv and the corresponding Python versions installed).
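As a rough idea of what a test for a new feature might look like, here is a small pytest-style sketch. The helper parse_amount and the test names are hypothetical and not part of invoice2data. By default, pytest collects functions named test_* from files named test_*.py.

```python
def parse_amount(raw):
    """Hypothetical helper: turn a string like '1,234.50' into a float."""
    return float(raw.replace(",", ""))


def test_parse_amount_strips_thousands_separator():
    # A comma used as a thousands separator should be ignored.
    assert parse_amount("1,234.50") == 1234.5


def test_parse_amount_plain_number():
    # A plain decimal string should convert unchanged.
    assert parse_amount("30.00") == 30.0
```

Keeping each test focused on one behavior makes it easier to see which change broke which feature when a test fails.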