Releases: axa-group/Parsr
v0.12
Changes:
- PdfJS improved to be compatible with the 'Image Detection' module.
- Allow Abbyy as a PDF extractor.
- Python3 client improved.
- Added Spanish & Portuguese Readme.
- Headings detection improved.
- API 'postDocument' with optional 'defaultConfig'.
- Several bug fixes.
0.11.2
v0.11.1
Changes:
- Fixed dependencies with detected security vulnerabilities
- Several bug fixes
v0.11
Changes
- Advanced image detection module that allows scanning images using OCRs
- Improved data extraction & reconstruction when a document has pages with rotated content
- Parsr bare-metal installation process automated using a single Node.js script
- Removed GraphicsMagick & pdf2pic dependencies
- Updated documentation
- Several bug fixes
0.10.1
Security vulnerability fixed
Bump bleach from 3.1.0 to 3.1.1 in /demo/jupyter-notebook
0.10
Changes
- New input file type: *.docx
- New 'Table of contents' processing module
- UI: added a button for downloading outputs
- Added compatibility for PdfMiner '20200124'
- Improved PdfMiner extraction time using xml stream reader
- Allow running new OCRs through the API by extending the configuration file
- Several bug fixes
Breaking changes
- Deprecated pipeline configuration property 'extractor.img'
0.9: Merge branch 'develop'
Changes
- Integrated new OCRs in GUI
  - Google Vision
  - Amazon Textract
  - Microsoft Cognitive Services
  - Abbyy
- Updated GUI: Added official logo and fixed some cosmetic issues
- Several bug fixes
- Updated Readme.md
v0.8: Merge pull request #293 from axa-group/feature/Image_Module_Off
Changes
- Simple Image detection using PdfMiner.
- Allowed *.eml as input to be parsed (message body and attachments are used to extract data).
- GUI can display page margins by toggling a switch.
- Readme in French.
v0.7.1: Merge pull request #263 from axa-group/feature/better-error-trace
Changes
- Removed 'sharp' dependency from API
- Improved error handling
- Allow Tesseract to process multi-page PDFs
- Some JS vulnerabilities fixed
- Improved Jupyter Notebook document versioning display
v0.7
Changes
- Optimisation of images before Tesseract scan (rotation detection & shadow removal)
- New input module option Pdf.js (recommended for large PDFs)
- Jupyter Notebook: Added document versioning & comparison
- JavaScript vulnerability fixed
- Several GUI & Server bug fixes