inference

Solène Tarride edited this page Sep 11, 2023 · 3 revisions

Use an existing model for inference

We have published some models on HuggingFace, trained on Norwegian historical documents. To use them, follow these steps:

1. Clone the HuggingFace repository

git clone https://huggingface.co/Teklia/pylaia-huginmunin

2. Prepare your test data

Download these two test images and create a text file (test_img_list.txt) containing the paths to your test images.

mkdir data
wget https://user-images.githubusercontent.com/100838858/219007024-f45433e7-99fd-43b0-bce6-93f63fa72a8f.jpg -O data/example01.jpg
wget https://user-images.githubusercontent.com/100838858/219008758-c0097bb4-c55a-4652-ad2e-bba350bee0e4.jpg -O data/example02.jpg
ls data/* > test_img_list.txt

Note that these images are already resized to a fixed height of 128 pixels. If you want to use your own images, resize them to the same height first.
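For your own images, the resizing step could look like the sketch below. It is not part of PyLaia: it assumes the Pillow library is installed and that the target is the same 128-pixel height as the example images, and `target_size` and `resize_to_height` are hypothetical helper names. The width is scaled by the same factor so that the aspect ratio is preserved.

```python
from pathlib import Path
from typing import Tuple

TARGET_HEIGHT = 128  # height of the example images above (assumed target)


def target_size(width: int, height: int,
                target_height: int = TARGET_HEIGHT) -> Tuple[int, int]:
    """Return (new_width, new_height) for the target height, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height


def resize_to_height(src: Path, dst: Path,
                     target_height: int = TARGET_HEIGHT) -> None:
    """Resize one image file. Requires Pillow (an assumption, not a PyLaia dependency)."""
    from PIL import Image  # imported lazily so target_size works without Pillow

    with Image.open(src) as img:
        resized = img.resize(target_size(img.width, img.height, target_height))
        resized.save(dst)
```

For example, `resize_to_height(Path("my_line.jpg"), Path("data/my_line.jpg"))` would write a 128-pixel-high copy into the data directory (the file names here are placeholders).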

3. Create your custom configuration file

Basic decoding

Copy the following YAML file into my_decode_config.yaml:

common:
  experiment_dirname: pylaia-huginmunin
  model_filename: pylaia-huginmunin/model
decode:
  convert_spaces: true
  join_string: ''
  use_language_model: False
img_list: test_img_list.txt
syms: pylaia-huginmunin/syms.txt

With an ARPA language model

Copy the following YAML file into my_decode_config_with_lm.yaml:

common:
  experiment_dirname: pylaia-huginmunin
  model_filename: pylaia-huginmunin/model
decode:
  convert_spaces: true
  join_string: ''
  use_language_model: True
  language_model_path: pylaia-huginmunin/language_model.arpa.gz
  language_model_weight: 1.5
  tokens_path: pylaia-huginmunin/tokens.txt
  lexicon_path: pylaia-huginmunin/lexicon.txt
img_list: test_img_list.txt
syms: pylaia-huginmunin/syms.txt
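
Both configurations refer to files in the cloned repository, so it can help to check that they exist before decoding. The sketch below is plain Python, not a PyLaia feature; `missing_files` is a hypothetical helper, and it assumes the paths are resolved relative to the current working directory.

```python
from pathlib import Path
from typing import Iterable, List

# Paths copied from the two configuration files above.
# Assumption: they are resolved relative to the working directory.
BASIC_PATHS = [
    "pylaia-huginmunin/syms.txt",
    "test_img_list.txt",
]
LM_PATHS = BASIC_PATHS + [
    "pylaia-huginmunin/language_model.arpa.gz",
    "pylaia-huginmunin/tokens.txt",
    "pylaia-huginmunin/lexicon.txt",
]


def missing_files(paths: Iterable[str], root: Path = Path(".")) -> List[str]:
    """Return the paths (relative to root) that are not existing files."""
    return [p for p in paths if not (root / p).is_file()]
```

Calling `missing_files(LM_PATHS)` from the directory that holds the configuration files returns an empty list when everything is in place.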

4. Predict using PyLaia

Basic decoding

To decode without a language model (faster), use the following command:

pylaia-htr-decode-ctc --config my_decode_config.yaml | tee predict.txt

In this configuration, you can also get confidence scores:

  • at line-level
pylaia-htr-decode-ctc --config my_decode_config.yaml --decode.print_line_confidence_scores true | tee predict.txt
data/example01.jpg 0.99 og Valstad kan vi vist
data/example02.jpg 0.98 ikke gjøre Regning paa,
  • at word-level
pylaia-htr-decode-ctc --config my_decode_config.yaml --decode.print_word_confidence_scores true | tee predict.txt
data/example01.jpg ['1.00', '1.00', '1.00', '1.00', '1.00'] og Valstad kan vi vist
data/example02.jpg ['1.00', '0.91', '1.00', '0.99'] ikke gjøre Regning paa,
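
To post-process these outputs, the lines can be split back into fields. The following is a minimal sketch that assumes the exact layouts shown above (image path without spaces, then the score or score list, then the text) and one score per whitespace-separated word; the function names are hypothetical and not part of PyLaia.

```python
import ast
from typing import List, Tuple


def parse_line_confidence(line: str) -> Tuple[str, float, str]:
    """Split '<image> <score> <text>' into its three fields."""
    image, _, rest = line.rstrip().partition(" ")
    score, _, text = rest.partition(" ")
    return image, float(score), text


def parse_word_confidence(line: str) -> Tuple[str, List[float], str]:
    """Split "<image> ['s1', 's2', ...] <text>" into its three fields."""
    image, _, rest = line.rstrip().partition(" ")
    end = rest.index("]") + 1  # the score list ends at the first ']'
    scores = [float(s) for s in ast.literal_eval(rest[:end])]
    return image, scores, rest[end:].strip()


def low_confidence_words(scores: List[float], text: str,
                         threshold: float = 0.95) -> List[Tuple[str, float]]:
    """Pair each word with its score and keep those below the threshold."""
    return [(w, s) for w, s in zip(text.split(), scores) if s < threshold]
```

With the second word-level example above and the (arbitrary) 0.95 threshold, `low_confidence_words` keeps only the word scored 0.91, which is a simple way to flag words for manual review.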

With an ARPA language model

This combines the PyLaia optical model with the ARPA language model located at pylaia-huginmunin/language_model.arpa.gz.

pylaia-htr-decode-ctc --config my_decode_config_with_lm.yaml | tee predict.txt

The predictions in the predict.txt file should look like this:

data/example01.jpg og Valstad kan vi vist
data/example02.jpg ikke gjøre Regning paa,
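
If you want to use the predictions in a script, a small sketch such as the one below can load them into a dictionary. It assumes the plain `<image> <text>` layout shown above (no confidence scores, image paths without spaces); `parse_predictions` is a hypothetical helper, not a PyLaia function.

```python
from typing import Dict, Iterable


def parse_predictions(lines: Iterable[str]) -> Dict[str, str]:
    """Map each image path to its predicted text, one '<image> <text>' pair per line."""
    predictions = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        image, _, text = line.partition(" ")
        predictions[image] = text.strip()
    return predictions
```

Since any iterable of lines works, an open file can be passed directly: `with open("predict.txt", encoding="utf-8") as f: preds = parse_predictions(f)`.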