This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

1.1

@jflesch jflesch released this 31 Jan 00:22

IMPORTANT NOTE FOR WINDOWS USERS:
'paperwork_x.y.z_win64.zip' contains ONLY Paperwork itself, NOT Tesseract. Tesseract and its data files are required to use Paperwork, and which data files you need depends on the languages you intend to use.
So please do not use this .zip. Use the installer (.exe) instead.


Hello,

I'm pleased to announce the release of Paperwork 1.1. This new release is mostly focused on optimisations.

Main changes are:

  • Paperwork-gui 1.1:
    • Windows: Activation mechanism has been disabled for now
    • Workarounds for Gtk-3.20.x / GLib 2.50 (Ubuntu):
      • Work around weird behavior of GLib.idle_add (multiple calls)
      • Work around lack of refresh of document list
    • Import: Display how many image files, PDFs, documents and pages have been
      imported.
    • Automatic Color Equalization: Reduce the 'circle side-effect' by increasing
      the number of samples used.
    • paperwork-shell scan: Quit after scanning
    • Settings window: "Source" becomes "Default source" (cosmetic)
    • Export: Don't lock the UI + Display the progress of the export
    • Improve keyword highlighting: Highlight words identical to search keywords
      (as before) and also words close enough (example: 'flesh' when 'flesch'
      is being searched for)
    • Optim: Document list: Only display the first 100 elements of the
      list, and extend it only when required. Reduces GTK latency and CPU usage
      (GtkListBox doesn't scale very well above 100 elements).
    • Optim: Improve PDF rendering speed: Let libpoppler take care of the
      rendering size (see backend:page.get_image())
    • Optim: Reduce the number of useless calls to Canvas.redraw()
  • Paperwork-backend 1.1:
    • paperwork-shell: Add commands 'search', 'dump', 'switch_workdir', 'rescan',
      'show', 'import', 'delete_doc', 'guess_labels', 'add_label', 'remove_label',
      'rename'
    • Add methods doc.has_ocr() and page.has_ocr() indicating whether OCR has
      already been run on a given doc/page.
      Used in the GUI for the option "Redo OCR on all documents", as it must act
      only on documents where OCR has already been done in the past (i.e. not
      PDFs with embedded text)
    • Optim: Provides a method page.get_image() returning an already resized
      Pillow image (PDF rendering optimisation)
    • Export: Report progress
    • Optim: PDF thumbnail rendering: Keep a cached version of the first page only.
      The other pages can be rendered on the fly
    • Fix: Label directory names use base64 encoding, and this encoding can result
      in strings containing '/'. Those characters must be replaced (with '_')
    • Fix: util/find_language(): If the system locale is not set properly, pycountry
      may raise UnicodeDecodeError.
    • Import: When importing a single PDF, don't import it if it has already
      been imported
    • Import: Provides detailed information and statistics regarding what has been
      imported (return value of Importer.import_doc() has changed)
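
To illustrate the label directory fix listed above: standard base64 uses '/' for the value 63, so an encoded label can contain a path separator. The sketch below is only an illustration; the helper name label_dirname() is hypothetical and is not taken from paperwork-backend.

```python
import base64


def label_dirname(label: str) -> str:
    """Hypothetical helper: build a directory name that is safe to use in a path."""
    # Standard base64 uses '+' and '/' for the values 62 and 63.
    encoded = base64.b64encode(label.encode("utf-8")).decode("ascii")
    # '/' is a path separator, so it must not appear in a directory name.
    return encoded.replace("/", "_")


# "???" is a short input whose standard base64 encoding ends with '/'.
raw = base64.b64encode("???".encode("utf-8")).decode("ascii")
safe = label_dirname("???")
print(raw)   # Pz8/
print(safe)  # Pz8_
```

Python also provides base64.urlsafe_b64encode(), which avoids '/' by construction, but it also substitutes '-' for '+', so its output differs from the replacement described in the fix.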

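The keyword highlighting change listed above treats 'flesh' as close enough to 'flesch'. These notes do not say how closeness is measured, so the following is only a sketch of one common approach: an edit distance with a threshold. The function names and the threshold of 1 are assumptions, not Paperwork's actual implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def should_highlight(word: str, keyword: str, max_distance: int = 1) -> bool:
    """Assumed rule: highlight exact matches and words within max_distance edits."""
    return edit_distance(word.lower(), keyword.lower()) <= max_distance


print(edit_distance("flesh", "flesch"))     # 1
print(should_highlight("flesh", "flesch"))  # True
print(should_highlight("paper", "flesch"))  # False
```
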
As usual, information regarding Paperwork installation and update can be found at
https://github.com/jflesch/paperwork#readme
Detailed ChangeLog for paperwork-gui is available here:
https://github.com/jflesch/paperwork/blob/stable/ChangeLog
Detailed ChangeLog for paperwork-backend is available here:
https://github.com/jflesch/paperwork-backend/blob/stable/ChangeLog

Best regards,