Merge pull request #4 from Monogramm/develop
Fix build URL for travis
madmath03 authored Nov 7, 2019
2 parents 8a8f9b1 + 4edfee7 commit d4b27d3
Showing 6 changed files with 82 additions and 30 deletions.
3 changes: 3 additions & 0 deletions .travis.yml
@@ -14,6 +14,8 @@ before_script:
- env | sort
- dir="test"
- export IMAGE_NAME=docker-erpnext-ext:erpnext_ocr-travis
- export BUILD_BRANCH=${TRAVIS_PULL_REQUEST_BRANCH:-${TRAVIS_BRANCH}}
- export BUILD_URL=https://github.com/${TRAVIS_PULL_REQUEST_SLUG:-${TRAVIS_REPO_SLUG}}

script:
- cd "$dir"
@@ -41,6 +43,7 @@ script:
- docker-compose -f docker-compose.${DATABASE}.yml logs "erpnext_web"
- docker-compose -f docker-compose.${DATABASE}.yml ps "erpnext_web" | grep "Up"
- echo 'Wait until test finished (1-2 minutes)' && sleep 90
- docker-compose -f docker-compose.${DATABASE}.yml logs "sut"
- docker-compose -f docker-compose.${DATABASE}.yml ps "sut" | grep "Exit 0"
# Test container restart
- docker-compose -f docker-compose.${DATABASE}.yml down
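
The two `export` lines added to `before_script` rely on the shell's `${VAR:-fallback}` expansion: the pull request value is used when it is set and non-empty, otherwise the push-build value. A minimal sketch of that behaviour follows. The `resolve_build_vars` helper and the sample values are illustrative assumptions, not part of the repository.

```sh
#!/bin/sh
# Hypothetical helper mirroring the two export lines from before_script.
resolve_build_vars() {
    # ${A:-B} expands to B when A is unset OR empty, which covers push builds
    # where the pull request variables are empty.
    BUILD_BRANCH=${TRAVIS_PULL_REQUEST_BRANCH:-${TRAVIS_BRANCH}}
    BUILD_URL=https://github.com/${TRAVIS_PULL_REQUEST_SLUG:-${TRAVIS_REPO_SLUG}}
    export BUILD_BRANCH BUILD_URL
}

# Simulated push build (sample values): the pull request variables are empty.
TRAVIS_BRANCH=develop
TRAVIS_REPO_SLUG=Monogramm/erpnext_ocr
TRAVIS_PULL_REQUEST_BRANCH=
TRAVIS_PULL_REQUEST_SLUG=
resolve_build_vars
echo "$BUILD_BRANCH $BUILD_URL"
# prints: develop https://github.com/Monogramm/erpnext_ocr
```

For a pull request from a fork, the same expansion would pick the fork's branch and slug instead, which appears to be the point of the fix: the image is built from the repository that actually contains the branch.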
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,17 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
### Added

### Changed

### Fixed

### Removed

<a name="0.9.0"></a>
## [0.9.0] - 2019-11-06

### Added
- PDF management in `OCR Read`
- `OCR Language` to manage available tesseract traindata files
88 changes: 63 additions & 25 deletions README.md
@@ -8,58 +8,62 @@

## ERPNext OCR

OCR with [tesseract](https://github.com/tesseract-ocr/tesseract).
> :alembic: **Experimental** Frappe OCR application with [tesseract](https://github.com/tesseract-ocr/tesseract).
#### License
This project is a fork of [ERPNext-OCR](https://github.com/jvfiel/ERPNext-OCR) by [John Vincent Fiel](https://github.com/jvfiel). Its aim is to fix and cleanup the original source code and add some new features.

MIT# ERPNext-OCR
https://discuss.erpnext.com/t/erpnext-ocr-app/33834/7

## About this project

This project is a fork of [ERPNext-OCR](https://github.com/jvfiel/ERPNext-OCR) by John Vincent Fiel.
Its aim is to fix and cleanup the original source code and add some new features.
## :chart_with_upwards_trend: Changes

**Changes**
* See [CHANGELOG](./CHANGELOG.md)
See [CHANGELOG](./CHANGELOG.md)


**Roadmap**
* See [Taiga.io](https://tree.taiga.io/project/monogrammbot-monogrammerpnext_ocr/ "Taiga.io monogrammbot-monogrammerpnext_ocr")
## :bookmark: Roadmap

See [Taiga.io](https://tree.taiga.io/project/monogrammbot-monogrammerpnext_ocr/ "Taiga.io monogrammbot-monogrammerpnext_ocr")

## Sample Screenshot
![Sample Screenshot](https://github.com/jvfiel/ERPNext-OCR/blob/master/erpnext_ocr/erpnext_ocr/Selection_046.png)

## File Being Read
![Sample Screenshot 2](https://github.com/jvfiel/ERPNext-OCR/blob/master/erpnext_ocr/erpnext_ocr/Selection_047.png)
## :construction: Install

**Pre-requisites: tesseract-python and imagemagick**

## Pre-requisites: tesseract-python and imagemagick

- Install tesseract-ocr, plus imagemagick and ghostscript (to work with pdf files) using this command on Debian:
Install tesseract-ocr, plus imagemagick and ghostscript (to work with pdf files) using this command on Debian:
```
sudo apt-get install tesseract-ocr imagemagick libmagickwand-dev ghostscript
```

## Installation
**Install Frappe application**

```
bench get-app --branch develop erpnext_ocr https://github.com/Monogramm/erpnext_ocr
bench install-app erpnext_ocr
```
```sh
bench get-app --branch develop erpnext_ocr https://github.com/Monogramm/erpnext_ocr
bench install-app erpnext_ocr
```

When installing Frappe app, the following python requirements will be installed:
* python binding for tesseract, [pytesseract](https://pypi.org/project/pytesseract/)
* image processing library in python, [pillow](https://pypi.org/project/Pillow/)
* HTTP library in python, [requests](https://pypi.org/project/requests/)
* python binding for imagemagick, [wand](https://pypi.org/project/Wand/)

## Tesseract trained data
## :rocket: Usage

**Sample Screenshot**:

![Sample Screenshot](./erpnext_ocr/erpnext_ocr/Selection_046.png)


**File Being Read**:

![Sample Screenshot 2](./erpnext_ocr/erpnext_ocr/Selection_047.png)

### Tesseract trained data

In order to use OCR with different languages, you need to install the appropriate trained data files.
Check tesseract Wiki for details https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
Check tesseract Wiki for details: https://github.com/tesseract-ocr/tesseract/wiki/Data-Files

## Known issues
### Known issues

* `wand.exceptions.PolicyError: not authorized '/opt/sample.pdf' @ error/constitute.c/ReadImage/412`
* This can happen due to security configuration in imagemagick, preventing it to read PDF files.
@@ -73,3 +77,37 @@ Check tesseract Wiki for details https://github.com/tesseract-ocr/tesseract/wiki
* `OSError: encoder error -2 when writing image file`
* This might happen when trying to open a TIFF image, but the real error is "_hidden_" and only displayed in console.
* If the original error in console is `Fax3SetupState: Bits/sample must be 1 for Group 3/4 encoding/decoding.` that usually happens when TIFF image compression is not valid / recognized.

## :white_check_mark: Run tests

```sh
bench bench run-tests --profile --app erpnext_autoinstall
```

## :bust_in_silhouette: Authors

**Monogramm**

* Website: https://www.monogramm.io
* Github: [@Monogramm](https://github.com/Monogramm)

**John Vincent Fiel**

* Github: [@jvfiel](https://github.com/jvfiel)

## :handshake: Contributing

Contributions, issues and feature requests are welcome!<br />Feel free to check [issues page](https://github.com/Monogramm/erpnext_ocr/issues).
[Check the contributing guide](./CONTRIBUTING.md).<br />

## :thumbsup: Show your support

Give a :star: if this project helped you!

## :page_facing_up: License

Copyright © 2019 [Monogramm](https://github.com/Monogramm).<br />
This project is [MIT](uri_license) licensed.

***
_This README was generated with :heart: by [readme-md-generator](https://github.com/kefranabg/readme-md-generator)_
2 changes: 1 addition & 1 deletion erpnext_ocr/__init__.py
@@ -1,5 +1,5 @@
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

__version__ = '0.0.1'
__version__ = '0.9.0'

4 changes: 2 additions & 2 deletions test/docker-compose.mariadb.yml
@@ -38,8 +38,8 @@ services:
context: ./
dockerfile: Dockerfile.${VARIANT}
args:
- BUILD_BRANCH=${TRAVIS_BRANCH}
- BUILD_URL=https://github.com/${TRAVIS_REPO_SLUG}
- BUILD_BRANCH=${BUILD_BRANCH}
- BUILD_URL=${BUILD_URL}
image: ${IMAGE_NAME}
container_name: erpnext_app
command: app
4 changes: 2 additions & 2 deletions test/docker-compose.postgres.yml
@@ -38,8 +38,8 @@ services:
context: ./
dockerfile: Dockerfile.${VARIANT}
args:
- BUILD_BRANCH=${TRAVIS_BRANCH}
- BUILD_URL=https://github.com/${TRAVIS_REPO_SLUG}
- BUILD_BRANCH=${BUILD_BRANCH}
- BUILD_URL=${BUILD_URL}
image: ${IMAGE_NAME}
container_name: erpnext_app
#restart: always
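
With this change, both compose files read `BUILD_BRANCH` and `BUILD_URL` from the calling environment instead of from Travis variables directly. Compose generally substitutes an empty string (with a warning) for a variable that is not set, so running these files outside Travis without exporting them would pass blank build args. A small guard, sketched below, could catch that early. The `require_vars` helper is hypothetical and not part of the repository.

```sh
#!/bin/sh
# Hypothetical guard: fail fast when a variable needed by the compose files is missing.
require_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable named by $name; the names come from
        # the caller, not from untrusted input.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            return 1
        fi
    done
}

# Possible usage before a local build (DATABASE is taken from the compose file names):
#   require_vars BUILD_BRANCH BUILD_URL IMAGE_NAME DATABASE \
#     && docker-compose -f "docker-compose.${DATABASE}.yml" build
```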
