Text Renderer

Generate text images for training deep learning OCR models (e.g. CRNN).

  • Modular design: you can easily add your own Corpus, Effect, or Layout.
  • Supports generating lmdb datasets compatible with PaddleOCR; see Dataset.
  • Supports rendering multiple corpora on one image with different fonts, font sizes, or font colors. Layout is responsible for arranging the corpora relative to each other.
  • Generates vertical text.
  • Corpus sampler: helps balance how often each character appears.
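
To illustrate what character balancing means in the last item, here is a minimal, self-contained sketch. It is not text_renderer's sampler: the class name BalancedCharSampler and the inverse-frequency weighting are assumptions made for illustration only.

```python
import random
from collections import Counter


class BalancedCharSampler:
    """Hypothetical helper (not part of text_renderer).

    Favors characters that have been sampled less often so far,
    so that character counts stay roughly even across the dataset.
    """

    def __init__(self, charset, seed=None):
        self.charset = list(charset)
        self.counts = Counter({ch: 0 for ch in self.charset})
        self.rng = random.Random(seed)

    def sample(self, length):
        chars = []
        for _ in range(length):
            # Weight each character by the inverse of (1 + times seen so far).
            weights = [1.0 / (1 + self.counts[ch]) for ch in self.charset]
            ch = self.rng.choices(self.charset, weights=weights, k=1)[0]
            self.counts[ch] += 1
            chars.append(ch)
        return "".join(chars)


sampler = BalancedCharSampler("abcde", seed=0)
texts = [sampler.sample(10) for _ in range(100)]
print(sampler.counts)
```

Each generated text could then be handed to the renderer as a label. Rare characters in a real corpus would need a similar reweighting step to avoid being underrepresented.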

Quick Start

To use text_renderer, you should prepare:

  • Font file: .ttf or...
  • Background image
  • Text: Optional. Depends on the corpus you use.
  • Character set: Optional. Depends on the corpus you use.

Run the following commands to generate images using the example data:

git clone https://github.com/oh-my-ocr/text_renderer
cd text_renderer
python3 setup.py develop
pip3 install -r docker/requirements.txt
python3 main.py \
    --config example_data/example.py \
    --dataset img \
    --num_processes 2 \
    --log_period 10

The data is generated in the example_data/output directory.

The main.py script has only 4 arguments:

  • config: Python config file path
  • dataset: Dataset format, img or lmdb
  • num_processes: Number of processes used
  • log_period: Period of log printing, a value in (0, 100)
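
As a rough illustration of the four arguments above, the following argparse sketch parses the same flags. It is not the actual main.py: the default values, the float type for log_period, and the range check are assumptions.

```python
import argparse


def build_parser():
    # Sketch only: flag names come from the list above, defaults are assumed.
    parser = argparse.ArgumentParser(description="Sketch of a text_renderer-style CLI")
    parser.add_argument("--config", required=True, help="Python config file path")
    parser.add_argument("--dataset", choices=["img", "lmdb"], default="img",
                        help="Dataset format")
    parser.add_argument("--num_processes", type=int, default=2,
                        help="Number of processes used")
    parser.add_argument("--log_period", type=float, default=10,
                        help="Period of log printing, a value in (0, 100)")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed interpretation: the value must lie strictly between 0 and 100.
    if not 0 < args.log_period < 100:
        parser.error("--log_period must be in the open interval (0, 100)")
    return args


args = parse_args([
    "--config", "example_data/example.py",
    "--dataset", "img",
    "--num_processes", "2",
    "--log_period", "10",
])
print(args)
```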

All parameters for the example image generation process are configured in example.py.

Learn more in the documentation.

Run in Docker

Build image

docker build -f docker/Dockerfile -t text_renderer .

The config file is specified through the CONFIG environment variable. In example.py, data is generated in the example_data/output directory, so we map this directory to the host.

docker run --rm \
-v `pwd`/example_data/docker_output/:/app/example_data/output \
--env CONFIG=/app/example_data/example.py \
--env DATASET=img \
--env NUM_PROCESSES=2 \
--env LOG_PERIOD=10 \
text_renderer
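
If you launch the container from a script, the same command can be assembled in Python. This is a minimal sketch: the helper name docker_run_command is hypothetical, and it only mirrors the volume mapping and environment variables shown above.

```python
import shlex
from pathlib import Path


def docker_run_command(host_output,
                       config="/app/example_data/example.py",
                       dataset="img",
                       num_processes=2,
                       log_period=10,
                       image="text_renderer"):
    """Hypothetical helper: build the `docker run` argument list shown above."""
    # Docker needs an absolute host path for the volume mapping.
    host_output = Path(host_output).resolve()
    env = {
        "CONFIG": config,
        "DATASET": dataset,
        "NUM_PROCESSES": num_processes,
        "LOG_PERIOD": log_period,
    }
    cmd = ["docker", "run", "--rm",
           "-v", f"{host_output}:/app/example_data/output"]
    for key, value in env.items():
        cmd += ["--env", f"{key}={value}"]
    cmd.append(image)
    return cmd


cmd = docker_run_command("example_data/docker_output")
# Print a shell-quoted version; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
print(shlex.join(cmd))
```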

Build docs

cd docs
make html

Open _build/html/index.html

Citing text_renderer

If you use text_renderer in your research, please consider using the following BibTeX entry.

@misc{text_renderer,
  author =       {weiqing.chu},
  title =        {text_renderer},
  howpublished = {\url{https://github.com/oh-my-ocr/text_renderer}},
  year =         {2021}
}