This folder contains a growing number of text extraction approaches, along with getting-started information.
Currently implemented are:
- GROBID
- TIKA
- xpdf (not tested)
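
As a getting-started illustration for TIKA, here is a minimal sketch of requesting plain text from a Tika server over HTTP. It assumes a Tika server is already running on its default port (9998) on localhost. The helper names `build_tika_request` and `extract_text` are hypothetical, and this is not necessarily how the code in this folder calls TIKA.

```python
import urllib.request

# Assumption: a Tika server is reachable locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(pdf_bytes: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that asks Tika to return the document as plain text."""
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        method="PUT",
        headers={"Accept": "text/plain", "Content-Type": "application/pdf"},
    )


def extract_text(pdf_path: str, url: str = TIKA_URL) -> str:
    """Send a PDF file to the Tika server and return the extracted text."""
    with open(pdf_path, "rb") as fh:
        request = build_tika_request(fh.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `extract_text("paper.pdf")` (with a hypothetical file name) would return the extracted text as a string, provided the server is up. Building the request separately from sending it makes it possible to inspect the request without a live server.
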
Other options may include:
- pd3f (https://pd3f.com/docs/)
- pdfminer.six
- PyPDF2
These options have not been tested here, but have built up a reputation for being effective.