AddOns
For GUI front ends to Tesseract and other third-party projects, please see User Projects - 3rd Party.
Platform support depends on the language each tool is written in and on the experience of the user.
Name | Last update | Language | Multipage support |
---|---|---|---|
jTessBoxEditor | 2017 | Java | yes |
QT Box Editor | 2016 | C++, Qt4/Qt5 | yes |
tesseract-box-editor | 2013 | .NET 4 | yes |
Tesseract-OCR boxfile AJAX editor | 2012 | online tool | |
cowboxer | 2012 | C++, Qt4 | no |
moshPyTT | 2011 | Python, GTK2 | no |
pytesseracttrainer | 2011 | Python, GTK2 | no |
Name | Last update | Language |
---|---|---|
Tesseract-OCR boxfile AJAX editor | March 2012 | online tool |
owlboxer | Jul 2010 | C++, Qt4 |
Tessboxer | Jun 2009 | .NET |
boxfilereader.php | Mar 2009 | php |
tessboxes | Nov 2008 | C |
JTesseract | Oct 2008 | C# |
wx-tetra | Sep 2008 | perl, wx |
bbtesseract | Jul 2008 | VB.NET 2008 |
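The editors in the tables above all work on Tesseract box files. As a rough illustration of what such a tool has to read, here is a minimal sketch in Python. It assumes the six-field Tesseract 3.x line layout (`symbol left bottom right top page`, with a zero-based page index); the `Box` class and the helper names are hypothetical and are not taken from any of the listed projects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """One line of a box file (hypothetical helper type)."""
    symbol: str   # the character or glyph sequence inside the box
    left: int
    bottom: int
    right: int
    top: int
    page: int     # assumed zero-based; 0 for single-page images


def parse_box_line(line: str) -> Box:
    # Split from the right so that the symbol field is taken whole,
    # even if it happens to contain a space.
    parts = line.rstrip("\n").rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    symbol, *numbers = parts
    left, bottom, right, top, page = (int(n) for n in numbers)
    return Box(symbol, left, bottom, right, top, page)


def parse_box_text(text: str) -> List[Box]:
    # Blank lines are skipped; every other line must be a valid box.
    return [parse_box_line(line) for line in text.splitlines() if line.strip()]


def is_multipage(boxes: List[Box]) -> bool:
    # A file counts as multipage here if any box sits on a page after the first.
    return any(box.page > 0 for box in boxes)
```

Under these assumptions, the "Multipage support" column corresponds to whether an editor handles boxes whose last field is greater than zero, which is what `is_multipage` checks.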
- jTessBoxEditor | Box Editor and Training Tool
- MzTesseract - MS Windows program that can train a new language from top to bottom
- FrankenPlus - tool for creating font training for the Tesseract OCR engine from page images. More information about Franken+ is at IT'S ALIVE! and the Franken+ homepage.
- python-tesseract-3.02-training - script to automate the generation of Tesseract 3.02 training files
- tesseract-box-file - AutoIt script to make editing the box file easier
- Serak Tesseract Trainer for Tesseract 3.02 - a front-end GUI for training Tesseract 3.02
- BoxMaker - online tool for generating image and box file pairs. An offline version is available in the download section of the PersianOCR project.
- boxFactory - a tool for quickly creating box files to train the Tesseract OCR engine. You can identify characters in the image by simply drawing boxes around them.
- https://github.com/BaltoRouberol/TesseractTrainer - TesseractTrainer is a simple Python API that takes over the tedious process of manually training Tesseract 3
- tess_school - a set of handy scripts to make the Tesseract training process a bit easier
- txt2img - Qt GUI application that generates an image and box file from text input
- DangAmbigs Generator - creates a DangAmbigs file automatically, given a set of OCR text output and the correct text. Requirements: Python
- train.ps1 - Windows PowerShell script to automate the Tesseract 3.01 language data pack generation process
- Update unicharambigs.exe - a small (Windows) C# program for editing the "lang.unicharambigs" file
- train_tess.pl - Perl script to facilitate training
- boxedit - a web-based editor for Tesseract box files
- TrainYourTesseract | Free online "no-hassle" TTF file to traineddata converter
- Tesseract-MICR-OCR: https://github.com/BigPino67/Tesseract-MICR-OCR
- MRZ: https://groups.google.com/group/tesseract-ocr/attach/10d7c711c9cc80/mrz.traineddata
- Latin: https://github.com/ryanfb/latinocr-lattraining
- tesseract-georgian: https://github.com/ddohler/tesseract-georgian
- Polish Fraktur: training as result of the IMPACT project, trained dataset
- Ancient Greek: http://ancientgreekocr.org
- Indic: http://code.google.com/p/tesseractindic/, https://github.com/debayan/Tesseract-Indic-OCR/, http://code.google.com/p/parichit/ (All are Obsolete)
- Indic-OCR http://indic-ocr.github.io/tessdata/
- Irish uncial: https://github.com/jimregan/tesseract-gle-uncial
- Polish: http://code.google.com/p/tesseract-polish/
- Fraktur (dan, deu, swe): https://github.com/paalberti/tesseract-dan-fraktur
- Myanmar: http://code.google.com/p/myaocr/
- Persian (Farsi): https://github.com/reza1615/PersianOcr
- 7 segments font: https://github.com/arturaugusto/display_ocr/tree/master/letsgodigital
- Project Naptha
- tesseract.js-core - Emscripten port of Tesseract C++ API
- tesseract.js - Pure Javascript OCR
C
- Tesseract versions 3.02 and up include a C API
.Net
- charlesw/tesseract - the project also offers a 64-bit Windows build of the tesseract-ocr library
- http://code.google.com/p/tesseractdotnet/
Python
- tesserocr - A Python wrapper (Pillow-friendly) for the tesseract-ocr API
- pyocr - A python wrapper for Tesseract and Cuneiform
- tesserwrap - Python bindings to the Tesseract API
- tesseract-sip - A python SIP wrapper for libtesseract (Apache license)
- http://code.google.com/p/pytess/ - A simple SWIG-based interface to Tesseract
- http://code.google.com/p/python-tesseract/ - a wrapper class for Tesseract OCR that accepts any conventional image file (SWIG based)
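
The wrappers above bind to the Tesseract library itself. For comparison, the simplest alternative is to shell out to the command-line program, which can be sketched as follows. This assumes a `tesseract` executable on `PATH` that accepts `stdout` as the output base; the function names are hypothetical and do not belong to any of the listed packages.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_command(image_path: str, lang: str = "eng") -> List[str]:
    # Passing "stdout" as the output base asks the CLI to print the
    # recognized text instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path: str, lang: str = "eng") -> str:
    # Fail early with a clear message if the executable is missing.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

A call such as `ocr_image("page.png", lang="deu")` would return the recognized text as a string, provided the matching language data is installed. The library wrappers avoid the cost of starting a process per image, which is the main reason to prefer them for batch work.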
R
- tesseract Bindings to the C++ API for the R programming language
Ruby
- ruby-tesseract-ocr - wrapper for tesseract 3.0x using the C++ API
- rtesseract
Java
- tess4j - JNA wrapper. Docs and discussions - http://tess4j.sourceforge.net/
Node.js
- penteract - The native node.js bindings to the Tesseract OCR project.
PHP
Objective-C
Clojure
Python
- https://github.com/hoffstaetter/python-tesseract/wiki
- http://code.google.com/p/pytesser/
- http://code.google.com/p/tesseract-python (pytesser clone)
- http://pokerai.org/pf3/viewtopic.php?f=3&t=2677&start=0&st=0&sk=t&sd=a
- patches of SWIG wrapper for python
.NET
Java
- tess4j (0.4) - JNA wrapper. Docs and discussions - http://tess4j.sourceforge.net/
These wiki pages are no longer maintained.
All pages were moved to tesseract-ocr/tessdoc.
The latest documentation is available at https://tesseract-ocr.github.io/.