Train your own model

To train your own model, you will need to follow three main steps:

  1. Prepare your dataset
  2. Initialize your model
  3. Train your model

1. Prepare your dataset

  • To train PyLaia, you will need line-level images and their corresponding transcriptions.
  • All images should be resized to a fixed height. The recommended value is 128 pixels (see the resizing sketch after this list).
  • Your images should be divided into three sets: train, validation, and test.
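
If your images are not yet at a fixed height, a minimal resizing sketch could look like the following. It assumes Pillow is installed and uses an illustrative raw/<split> → data/<split> directory layout; adapt the paths and image extension to your own data.

```python
# Resize every line image to a fixed height of 128 px, preserving the aspect
# ratio. Pillow and the directory layout are assumptions, not PyLaia requirements.
from pathlib import Path
from PIL import Image

FIXED_HEIGHT = 128

def resize_to_fixed_height(src_dir: str, dst_dir: str, height: int = FIXED_HEIGHT) -> None:
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for img_path in Path(src_dir).glob("*.png"):
        with Image.open(img_path) as im:
            new_width = max(1, round(im.width * height / im.height))
            im.resize((new_width, height), Image.LANCZOS).save(dst / img_path.name)

for split in ("train", "val", "test"):
    resize_to_fixed_height(f"raw/{split}", f"data/{split}")
```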

For each split, three files are required (a sketch showing how to generate them from a single listing follows this list):

  • train.txt mapping images to space-separated transcription tokens. This file is used to train the model.
train/im01 f o r <space> d e t <space> t i l f æ l d e <space> d e t <space> s k u l d e <space> l y k k e s <space> D i g
train/im02 a t <space> o p d r i v e <space> d e t <space> o m s k r e v n e <space> e x p l : <space> a f
train/im03 « F r u <space> I n g e r » , <space> a t <space> s e n d e <space> m i g <space> s a m m e
  • train_ids.txt listing image names. This file is used for prediction.
train/im01
train/im02
train/im03
  • train_eval.txt mapping images to raw transcriptions. This file is used during evaluation.
train/im01 for det tilfælde det skulde lykkes Dig
train/im02 at opdrive det omskrevne expl: af
train/im03 «Fru Inger», at sende mig samme
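
Since the three tables only differ in how the transcription is written, they can be derived from a single listing. Here is a minimal sketch that builds train.txt and train_ids.txt from a train_eval.txt-style file; the helper names and file paths are illustrative, not part of PyLaia.

```python
# Turn raw transcriptions into the space-separated character tokens expected
# in train.txt, using <space> for the whitespace character.
def tokenize(text: str) -> str:
    return " ".join("<space>" if char == " " else char for char in text)

def build_split(eval_table: str, txt_table: str, ids_file: str) -> None:
    """Derive the tokenized table and the id list from the raw table."""
    with open(eval_table, encoding="utf-8") as src, \
         open(txt_table, "w", encoding="utf-8") as tok, \
         open(ids_file, "w", encoding="utf-8") as ids:
        for line in src:
            image_id, _, transcription = line.rstrip("\n").partition(" ")
            tok.write(f"{image_id} {tokenize(transcription)}\n")
            ids.write(f"{image_id}\n")

for split in ("train", "val", "test"):
    build_split(f"{split}_eval.txt", f"{split}.txt", f"{split}_ids.txt")
```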

Finally, you need to generate the alphabet mapping table, beginning with the <ctc> token. You can find an example of the syms.txt file on HuggingFace.
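
A minimal sketch of how such a table could be generated from the tokenized train.txt is shown below; the "<symbol> <index>" layout mirrors the HuggingFace example, and the file paths are assumptions.

```python
# Build syms.txt: collect every token seen in train.txt, then write <ctc>
# first, followed by the remaining symbols with consecutive integer ids.
def build_syms(txt_table: str, syms_path: str) -> None:
    symbols = set()
    with open(txt_table, encoding="utf-8") as src:
        for line in src:
            symbols.update(line.split()[1:])  # skip the image id
    with open(syms_path, "w", encoding="utf-8") as out:
        for index, symbol in enumerate(["<ctc>"] + sorted(symbols)):
            out.write(f"{symbol} {index}\n")

build_syms("train.txt", "syms.txt")
```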

2. Initialize your model

Use the pylaia-htr-create-model command to initialize the model. Note that you need to provide the syms.txt file when building the model: it is used to compute the size of the alphabet, which determines the dimension of the last linear layer.
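
As a concrete illustration, the number of output units of that last layer equals the number of symbols in syms.txt (including <ctc>), which you can check with a short snippet; the file path is an assumption.

```python
# Number of classes predicted at each time step = number of entries in syms.txt.
with open("syms.txt", encoding="utf-8") as f:
    num_classes = sum(1 for line in f if line.strip())
print(num_classes)
```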

The model can be fully configured using a configuration file or command-line arguments.

  • Run the following command to get the full list of command-line arguments:
pylaia-htr-create-model --help
  • Or create a YAML configuration file config_create_model.yaml:
adaptive_pooling: avgpool-16
crnn:
  cnn_activation:
  - LeakyReLU
  - LeakyReLU
  - LeakyReLU
  - LeakyReLU
  cnn_batchnorm:
  - true
  - true
  - true
  - true
  cnn_dilation:
  - 1
  - 1
  - 1
  - 1
  cnn_kernel_size:
  - 3
  - 3
  - 3
  - 3
  cnn_num_features:
  - 12
  - 24
  - 48
  - 48
  cnn_poolsize:
  - 2
  - 2
  - 0
  - 2
  lin_dropout: 0.5
  rnn_dropout: 0.5
  rnn_layers: 3
  rnn_type: LSTM
  rnn_units: 256
fixed_input_height: 128
save_model: true
syms: syms.txt
  • Create the model:
pylaia-htr-create-model --config config_create_model.yaml 2>&1 | tee pylaia_create_model.log

3. Train your model

Training can also be configured using a configuration file or command-line arguments.

  • Run the following command to get the full list of command-line arguments:
pylaia-htr-train-ctc --help
  • Or create a YAML configuration file config_train.yaml:
syms: data/syms.txt
tr_txt_table: data/train.txt
va_txt_table: data/val.txt
common:
  experiment_dirname: experiment
data:
  batch_size: 8
  color_mode: L
optimizer:
  learning_rate: 0.0005
  name: RMSProp
scheduler:
  active: true
  monitor: va_loss
train:
  augment_training: true
  early_stopping_patience: 80
trainer:
  auto_select_gpus: true
  gpus: 1
  max_epochs: 600
  • Train the model:
pylaia-htr-train-ctc --config config_train.yaml 2>&1 | tee pylaia_train.log