- Python 3.7+
git clone <repo>
cd <repo>
pip install -r dependencies.txt
- Check the Installation documentation to install OCR dependencies/models
python run.py <config.json> <input_folder> [<output_folder> <preprocessors>]
Check the configs folder for sample configs.
Pre-processors supported:
(Most of them are not fully reliable, and order is important)
- deskew - To auto-deskew images
- auto_rotate - To auto-rotate images
- doc_crop - To automatically crop only the document region
- remove_bg - To automatically erase the background from the foreground
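Because order matters, it can help to picture the pre-processors as a chain where each step receives the output of the previous one. The following is a minimal sketch of that idea, not the code used by run.py: the `PREPROCESSORS` registry, `make_step`, and `apply_preprocessors` are hypothetical names, and the "image" is a plain list that records which steps ran so the ordering is visible.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: an "image" is just the list of steps applied so far.
Image = List[str]


def make_step(name: str) -> Callable[[Image], Image]:
    """Build a dummy pre-processor that records its own name."""
    def step(image: Image) -> Image:
        return image + [name]
    return step


# Hypothetical registry keyed by the pre-processor names listed above.
PREPROCESSORS: Dict[str, Callable[[Image], Image]] = {
    name: make_step(name)
    for name in ("deskew", "auto_rotate", "doc_crop", "remove_bg")
}


def apply_preprocessors(image: Image, names: List[str]) -> Image:
    """Apply the named pre-processors strictly in the order given."""
    for name in names:
        if name not in PREPROCESSORS:
            raise ValueError(f"Unknown pre-processor: {name}")
        image = PREPROCESSORS[name](image)
    return image


# The same two steps in a different order give a different processing chain.
first = apply_preprocessors([], ["auto_rotate", "doc_crop"])
second = apply_preprocessors([], ["doc_crop", "auto_rotate"])
print(first)
print(second)
```

With real image operations, cropping before rotating would act on a differently oriented page than rotating before cropping, which is why the two orders are not interchangeable.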
python evaluate.py -d -gt <ground_truth_json_folder> -det <detections_json_folder>
python evaluate.py -r -gt <ground_truth_json_folder> -cfg <config_json_file>
python evaluate.py -r --gt-txt <ground_truth_tsv> -cfg <config_json_file>
Parameters:
--gt-txt: Tab-separated file with each line having image_path and the corresponding text_label
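As an illustration of the --gt-txt layout, here is a small sketch that writes and reads such a file. It assumes one image_path and text_label pair per line, UTF-8 encoding, and no header row; those details, the helper names `write_gt_tsv` and `read_gt_tsv`, and the sample paths and labels are assumptions rather than something evaluate.py is known to require.

```python
import os
import tempfile
from typing import List, Tuple


def write_gt_tsv(path: str, rows: List[Tuple[str, str]]) -> None:
    """Write one 'image_path<TAB>text_label' pair per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for image_path, text_label in rows:
            for value in (image_path, text_label):
                # A tab or newline inside a field would corrupt the layout.
                if "\t" in value or "\n" in value:
                    raise ValueError(f"Tabs/newlines are not allowed: {value!r}")
            f.write(f"{image_path}\t{text_label}\n")


def read_gt_tsv(path: str) -> List[Tuple[str, str]]:
    """Read the file back into (image_path, text_label) tuples."""
    rows = []
    with open(path, encoding="utf-8", newline="") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            image_path, text_label = line.split("\t", 1)
            rows.append((image_path, text_label))
    return rows


# Hypothetical sample data, written to a temporary file for demonstration.
samples = [
    ("images/0001.jpg", "Invoice No. 42"),
    ("images/0002.jpg", "Total: 1,250.00"),
]
with tempfile.TemporaryDirectory() as tmp:
    tsv_path = os.path.join(tmp, "ground_truth.tsv")
    write_gt_tsv(tsv_path, samples)
    loaded = read_gt_tsv(tsv_path)
print(loaded)
```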
- Ensure Streamlit is installed (pip install streamlit)
- Run:
streamlit run ocr_ui.py --server.port 80
It should automatically open the UI in your browser.
To host the OCR as an API:
Development mode:
uvicorn api_server:app --host 0.0.0.0 --reload
Visit http://localhost:8000/docs for API documentation.
Example - Testing the API using Python:
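import requests

# Extra languages to recognise, sent as form data
payload = {'additional_langs': ['hi']}
# Image to OCR, sent as a multipart file upload
files = {'image': open('your_image.jpg', 'rb')}
response = requests.post('http://localhost:8000/ocr', data=payload, files=files)  # optionally pass auth=('admin', 'pass')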