This repository contains an add-on for dhSegment torch that enables the use of text embeddings maps.
For more details about text embeddings maps, please see the following publication:
Barman, Raphaël, Ehrmann, Maud, Clematide, Simon, Ares Oliveira, Sofia, and Kaplan, Frédéric (2020).
Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers.
Journal of Data Mining and Digital Humanities. https://arxiv.org/abs/2002.06144
This repository introduces new input code for using text embeddings maps as well as new networks.
When using the dhSegment torch training script, the following config parameters must be changed:
- `train_dataset` and `val_dataset` should now be of type `image_text_csv` (note that patches datasets are not supported).
- `train_loader` and `val_loader` must be set to `text_data_loader`.
- `model` type should be set to `text_segmentation_model`.
  - Either the `encoder` or `decoder` should be set to the `text_` variant (currently supported architectures are `text_resnet50` and `text_unet`).
  - The `text_` `encoder` or `decoder` should have the following additional parameters:
    - `"embeddings_encoder": {"target_embeddings_size": 300}` set to the size of the embeddings (here 300).
    - `"embeddings_level": 0` set to the level in the network where the embeddings map should be input (here 0).

An example config file can be found in `example_conf.json`.
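
The config changes listed above can be sketched as a small Python helper that rewrites an existing config. This is an illustration only: the helper name `to_text_config`, the use of a nested `"type"` key for each component, and the placement of the extra parameters directly inside the encoder entry are assumptions, not the confirmed schema. `example_conf.json` remains the authoritative reference.

```python
import copy
import json


def to_text_config(base_config, embeddings_size=300, embeddings_level=0,
                   text_encoder="text_resnet50"):
    """Return a copy of base_config adapted for text embeddings maps.

    Hypothetical helper: the nested "type" keys are an assumed layout.
    """
    config = copy.deepcopy(base_config)

    # Datasets must be of type image_text_csv (patches datasets are not supported).
    for key in ("train_dataset", "val_dataset"):
        config.setdefault(key, {})["type"] = "image_text_csv"

    # Loaders must be set to text_data_loader.
    for key in ("train_loader", "val_loader"):
        config.setdefault(key, {})["type"] = "text_data_loader"

    # The model type changes, and the encoder is swapped for its text_ variant.
    model = config.setdefault("model", {})
    model["type"] = "text_segmentation_model"
    # Note: this replaces any existing encoder entry rather than merging into it.
    model["encoder"] = {
        "type": text_encoder,
        "embeddings_encoder": {"target_embeddings_size": embeddings_size},
        "embeddings_level": embeddings_level,
    }
    return config


if __name__ == "__main__":
    # "placeholder" values stand in for whatever the original config contains.
    base = {
        "train_dataset": {"type": "placeholder"},
        "val_dataset": {"type": "placeholder"},
        "model": {"encoder": "placeholder"},
    }
    print(json.dumps(to_text_config(base), indent=2))
```

The sketch modifies the encoder; the same parameters would go on the decoder instead if `text_unet` is used there. The input config is copied, so the original dictionary is left unchanged.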
In addition to these changes to the config file, the training script should be modified by adding `import dh_segment_text_torch` at the top.
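
In the simplest case, the single line `import dh_segment_text_torch` at the top of the training script is all that is required. The sketch below is a more defensive variant that fails early with a readable message when the add-on is not installed. The helper name `require_addon` is hypothetical, and the reading that the import exists to make the `text_` components available to the script is an assumption.

```python
import importlib
import importlib.util


def require_addon(module_name="dh_segment_text_torch"):
    """Import the named module, or raise a clear error if it is not installed.

    Hypothetical helper; a plain `import dh_segment_text_torch` is equivalent
    when the package is known to be installed.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name!r} is not installed; it must be importable "
            "before a text embeddings config can be used."
        )
    return importlib.import_module(module_name)
```

Calling `require_addon()` as the first statement of the training script has the same effect as the plain import, with a more explicit error if the package is missing.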