OCR (Optical Character Recognition) consists of text localization + text recognition. (Text localization finds where the characters are, and text recognition reads them.)
You can use this text localization model I have studied.
After localization, each text area is cropped and used as input for text recognition. A typical text recognition model is the CRNN.
Combining the text detector with a CRNN makes it possible to create an OCR engine that operates end-to-end.
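The two-stage flow described above can be sketched as follows. This is a minimal illustration, not the code from this repository: the `(left, top, right, bottom)` box format and the `detect` / `recognize` callables are hypothetical stand-ins for a real text detector and a trained CRNN.

```python
from typing import Callable, List, Tuple

# Assumed formats (hypothetical, for illustration only):
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive
Image = List[List[int]]           # grayscale image as rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by `box`."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], object],
) -> list:
    """Localize text areas, crop each one, and recognize the crops in order."""
    return [recognize(crop(image, box)) for box in detect(image)]
```

In practice `detect` would wrap the localization model and `recognize` would wrap the CRNN prediction step, including any resizing the recognizer needs.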
CRNN is a network that combines CNN and RNN to process images containing sequence information such as letters.
It is mainly used for OCR technology and has the following advantages.
- End-to-end learning is possible.
- Sequences of arbitrary length can be processed, because the LSTM places no fixed limit on the length of its input and output sequences.
- There is no need for a detector or cropping technique to find each character one by one.
You can use CRNN for OCR, license plate recognition, text recognition, and so on. What it recognizes depends on the data you train it on.
I used a slightly modified version of the original CRNN model (input size: 70x30 -> 128x64, plus more CNN layers).
- The CNN layers (VGGNet, ResNet, ...) extract features from the input image.
- The features are split into fixed-size slices, which are fed as a sequence into the bidirectional LSTM or GRU.
- CTC (Connectionist Temporal Classification) converts the per-slice predictions into the final label.
I used CRNN to recognize captchas.
I updated the captcha generator for those who lack captcha images.
CRNN works well for license plate recognition as follows.
First, you need a lot of cropped captcha images.
The file name of each image is its label (the captcha 1234 is saved as "1234.jpg").
After creating training data in this way, put it in the 'DB/train' directory and run training.py.
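Since the label is stored in the file name, a batch generator has to recover it from the path. The sketch below shows one way to do that. It is not the code in Image_Generator.py: the digits-only `CHARSET` and both function names are assumptions made for illustration.

```python
import os
from typing import List

# Assumption: captchas contain digits only. Adjust to the real character set.
CHARSET = "0123456789"


def label_from_filename(path: str) -> str:
    """'DB/train/1234.jpg' -> '1234' (file name without directory or extension)."""
    return os.path.splitext(os.path.basename(path))[0]


def encode_label(label: str, charset: str = CHARSET) -> List[int]:
    """Map each character to its index in `charset`.

    Raises ValueError if the label contains a character outside the charset.
    """
    return [charset.index(ch) for ch in label]
```

For example, `encode_label(label_from_filename("DB/train/1234.jpg"))` gives the class indices that a CTC loss would be trained against.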
OS : Ubuntu 16.04.4 LTS
GPU : Tesla V100 (16GB)
Python : 3.6.5
Tensorflow : 1.9.0
Keras : 2.1.3
CUDA, CUDNN : 9.0, 7.0
File | Description |
---|---|
Model.py | Network using CNN (VGG) + Bidirectional LSTM |
Model_GRU.py | Network using CNN (VGG) + Bidirectional GRU |
Image_Generator.py | Image batch generator for training |
parameter.py | Parameters used in CRNN |
training.py | CRNN training |
Prediction.py | CRNN prediction |