Indic OCR

Pre-requisites

Python 3.7+
git clone <repo> and cd <repo>
pip install -r dependencies.txt
See the installation documentation to install the OCR dependencies and models

Running OCR

python run.py <config.json> <input_folder> [<output_folder> <preprocessors>]

Check configs folder for sample configs.
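
When OCR has to be run from another Python script (for example, over several input folders), the command above can be assembled programmatically. The following is only a sketch based on the usage line shown above: the helper names `build_run_command` and `run_ocr` and the example paths are hypothetical, and the pre-processors argument is deliberately left out because its exact format is not described here.

```python
import subprocess
import sys


def build_run_command(config_json, input_folder, output_folder=None):
    """Build the argument list for run.py, following the documented usage line."""
    # sys.executable is the Python interpreter running this script.
    cmd = [sys.executable, "run.py", config_json, input_folder]
    # The output folder is optional in the usage line, so add it only if given.
    if output_folder is not None:
        cmd.append(output_folder)
    return cmd


def run_ocr(config_json, input_folder, output_folder=None):
    """Run run.py as a subprocess; raises CalledProcessError on a non-zero exit."""
    cmd = build_run_command(config_json, input_folder, output_folder)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only; this just prints the command.
    print(" ".join(build_run_command("configs/sample.json", "images/", "output/")))
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when folder names contain spaces.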

Pre-processors supported:
(Most of them are not fully reliable, and the order in which they are listed is important)

deskew - To auto-deskew images
auto_rotate - To auto-rotate images
doc_crop - To automatically crop only document region
remove_bg - To automatically erase the background, keeping only the foreground

Evaluation

Computing Detection Scores

python evaluate.py -d -gt <ground_truth_json_folder> -det <detections_json_folder>

Computing Recognition Accuracies

Using OCR's JSON Format

python evaluate.py -r -gt <ground_truth_json_folder> -cfg <config_json_file>

Using a TSV file

python evaluate.py -r --gt-txt <ground_truth_tsv> -cfg <config_json_file>

Parameters:

--gt-txt: Tab-separated file with each line having image_path and corresponding text_label
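
As an illustration of the file layout described above, here is a minimal sketch that writes and reads such a TSV file. It assumes there is no header row and no quoting, which is not confirmed here; the function names `write_gt_tsv` and `read_gt_tsv` and the sample paths are hypothetical.

```python
from pathlib import Path


def write_gt_tsv(pairs, tsv_path):
    """Write (image_path, text_label) pairs, one tab-separated pair per line."""
    lines = []
    for image_path, text_label in pairs:
        # A tab or newline inside a field would break the two-column layout.
        for field in (image_path, text_label):
            if "\t" in field or "\n" in field:
                raise ValueError(f"field contains a tab or newline: {field!r}")
        lines.append(f"{image_path}\t{text_label}")
    # UTF-8 so that Indic-script labels are stored without loss.
    Path(tsv_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_gt_tsv(tsv_path):
    """Read the file back into a list of (image_path, text_label) tuples."""
    pairs = []
    for line in Path(tsv_path).read_text(encoding="utf-8").split("\n"):
        if line.strip():  # skip blank lines, such as the trailing one
            image_path, text_label = line.split("\t", 1)
            pairs.append((image_path, text_label))
    return pairs
```

For example, `write_gt_tsv([("images/001.jpg", "नमस्ते")], "gt.tsv")` produces a file that could then be passed via `--gt-txt gt.tsv`, provided the assumptions above match what evaluate.py expects.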

Running UI Server

Ensure Streamlit is installed (pip install streamlit)
Run streamlit run ocr_ui.py --server.port 80

It should automatically open the UI in your browser.

Running API Server

Use this to host the OCR as an HTTP API.

Development mode:

uvicorn api_server:app --host 0.0.0.0 --reload

Visit http://localhost:8000/docs for API documentation.

Example - Testing the API using Python:

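import requests

payload = {'additional_langs': ['hi']}
# Open the image in binary mode; the with-block closes the file after the request
with open('your_image.jpg', 'rb') as image_file:
    files = {'image': image_file}
    # If the server requires authentication, also pass auth=('admin', 'pass')
    response = requests.post('http://localhost:8000/ocr', data=payload, files=files)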

