- Overview
- Dependencies
- Additional Info
- Citations
A Cinder application that lets you write text and converts it to standard digital text. It also lets you save the converted output to a file of your choice.
- g++/gcc 4.8 and above, clang 3.4 and above
- cmake
- OpenCV 4.5.0
- Tesseract 4.1.1
Ubuntu:
$ sudo apt-get install gcc cmake libopencv-dev tesseract-ocr libtesseract-dev
- Can be built using CLion
macOS:
$ brew install gcc cmake opencv tesseract
- Can be built using CLion
- Check out the links above for more information on installing the dependencies for your system. On Windows, you will need MSVC 2015 or later to build this project.
- After obtaining the necessary dependencies, clone the repository onto your local machine:
$ git clone https://github.com/shruthikmusukula/DigitalTextConverter.git
This codebase was written in accordance with the Google C++ Style Guide (https://google.github.io/styleguide/cppguide.html). Code from the OpenCV and Tesseract frameworks may not adhere to these guidelines.
Below is a full demo of the application and its major features.
By passing in an input image path and an output text-file path, a user can translate the text in an image of their choice to digital text.
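The path-in, path-out flow described above can be sketched roughly as follows. This is a minimal illustration using only the C++17 standard library, not the application's actual code: `Recognizer` and `ConvertImageToTextFile` are hypothetical names, and the OCR step is passed in as a callable so that the file handling can be shown without assuming anything about how the project calls Tesseract.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical OCR hook: takes an image path and returns the recognized text.
// In the real application this role would be played by the Tesseract call.
using Recognizer = std::function<std::string(const fs::path&)>;

// Runs `recognize` on `image_path` and writes the result to `output_path`.
// Returns false if the image is missing or the output file cannot be written.
bool ConvertImageToTextFile(const fs::path& image_path,
                            const fs::path& output_path,
                            const Recognizer& recognize) {
  std::error_code ec;  // Use the non-throwing overload for the existence check.
  if (!fs::is_regular_file(image_path, ec)) {
    return false;  // Input path does not point to an existing file.
  }

  const std::string text = recognize(image_path);

  std::ofstream out(output_path, std::ios::trunc);
  if (!out) {
    return false;  // Output path could not be opened for writing.
  }
  out << text;
  return static_cast<bool>(out.flush());
}
```

Checking both paths up front means a mistyped input or output path is reported as a simple failure instead of surfacing later as an empty output file.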
Here are some additional resources if you are looking to build upon this codebase or if you encounter any errors:
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
If you see the error messages above, please refer to this GitHub issue. You need to add this training data file (eng.traineddata) to your Tesseract installation.
On Ubuntu, download and place this file at /usr/share/tesseract-ocr/4.1.1/tessdata/eng.traineddata.
On macOS, download and place this file at /usr/local/Cellar/tesseract/4.1.1/tessdata/eng.traineddata.
Excerpt from Tesseract Docs:
Tesseract does various image processing operations internally (using the Leptonica library) before doing the actual OCR. It generally does a very good job of this, but there will inevitably be cases where it isn’t good enough, which can result in a significant reduction in accuracy.
Please look here for additional information.
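One preprocessing step that is commonly applied before OCR is binarization, which turns a grayscale image into pure black and white. As an illustration only, here is a minimal standard-library sketch of Otsu's thresholding on a raw 8-bit grayscale buffer. It is not taken from this codebase or from Tesseract, and `OtsuThreshold` and `Binarize` are hypothetical names. In practice you would more likely use an image library for this step.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the threshold t (0..255) that maximizes the between-class variance
// of the pixels split into {p <= t} and {p > t} (Otsu's method).
int OtsuThreshold(const std::vector<std::uint8_t>& pixels) {
  std::array<std::size_t, 256> hist{};
  for (std::uint8_t p : pixels) ++hist[p];

  const double total = static_cast<double>(pixels.size());
  double sum_all = 0.0;
  for (int i = 0; i < 256; ++i) sum_all += i * static_cast<double>(hist[i]);

  double weight_low = 0.0;  // Number of pixels with value <= t.
  double sum_low = 0.0;     // Sum of those pixel values.
  double best_variance = -1.0;
  int best_t = 0;

  for (int t = 0; t < 256; ++t) {
    weight_low += static_cast<double>(hist[t]);
    sum_low += t * static_cast<double>(hist[t]);
    if (weight_low == 0.0) continue;  // No pixels in the low class yet.
    const double weight_high = total - weight_low;
    if (weight_high == 0.0) break;    // All pixels are in the low class.

    const double mean_low = sum_low / weight_low;
    const double mean_high = (sum_all - sum_low) / weight_high;
    const double diff = mean_low - mean_high;
    const double variance = weight_low * weight_high * diff * diff;
    if (variance > best_variance) {
      best_variance = variance;
      best_t = t;
    }
  }
  return best_t;
}

// Maps every pixel above the threshold to 255 and all others to 0.
std::vector<std::uint8_t> Binarize(const std::vector<std::uint8_t>& pixels) {
  const int t = OtsuThreshold(pixels);
  std::vector<std::uint8_t> out;
  out.reserve(pixels.size());
  for (std::uint8_t p : pixels) out.push_back(p > t ? 255 : 0);
  return out;
}
```

A single global threshold like this tends to work for evenly lit scans. Images with shadows or uneven lighting usually need a locally adaptive method instead.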
All necessary citations are listed wherever applied in the codebase.