This repo contains a model that converts Korean sentences into polite form, built on the OpenNMT Transformer.
!python preprocess.py
Inputs: the source text file (src) and the target text file (tgt).
Default tokenization: Mecab + SentencePiece.
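Since src and tgt form a parallel corpus, it can help to confirm that both files have the same number of lines before preprocessing. The following is a minimal sketch of such a check; the helper names `count_lines` and `check_parallel` are hypothetical and not part of this repo.

```python
def count_lines(path):
    """Return the number of lines in a UTF-8 text file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_parallel(src_path, tgt_path):
    """Return the shared line count, or raise ValueError on a mismatch.

    Hypothetical helper: assumes one sentence per line in both files.
    """
    n_src = count_lines(src_path)
    n_tgt = count_lines(tgt_path)
    if n_src != n_tgt:
        raise ValueError(
            f"line count mismatch: {src_path} has {n_src}, {tgt_path} has {n_tgt}"
        )
    return n_src
```

For example, `check_parallel("data/src-test.txt", "data/tgt-test.txt")` would check the test files used by the translate command.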
!python train.py
To resume training the model later, add --train_from (model path)/model.pt to the command.
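When training is launched from a script, the resume flag can be added only if a checkpoint already exists. This is a rough sketch under that assumption; `build_train_command` is a hypothetical helper, and only the `--train_from` flag comes from the instructions above.

```python
import os


def build_train_command(checkpoint=None):
    """Return the argument list for train.py.

    Hypothetical helper: appends --train_from only when `checkpoint`
    points to an existing file, so a first run starts from scratch.
    """
    cmd = ["python", "train.py"]
    if checkpoint and os.path.isfile(checkpoint):
        cmd += ["--train_from", checkpoint]
    return cmd
```

The returned list can be passed to `subprocess.run`, with the checkpoint path set to wherever your model.pt was saved.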
!python translate.py -model data/model/model.pt -src data/src-test.txt -tgt data/tgt-test.txt -replace_unk -verbose -gpu 0
!python ./onmt/tools/spacing.py -i ./data/pred.txt -o ./data/pred_sp.txt
!perl tools/multi-bleu.perl data/tgt-test.txt < data/pred.txt
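For a rough idea of what the BLEU step measures, here is a simplified Python sketch of corpus-level BLEU with one reference per sentence, whitespace tokenization, and 1- to 4-gram precision. It is an illustration only, not a replacement for multi-bleu.perl, and its scores may differ from the script's in edge cases. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Simplified corpus-level BLEU (0-100), one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref_tok, n)
            hyp_counts = ngrams(hyp_tok, n)
            totals[n - 1] += sum(hyp_counts.values())
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(
                min(c, ref_counts[g]) for g, c in hyp_counts.items()
            )
    # Any order with zero matches makes the geometric mean zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty applies only when the output is shorter than the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)
```

To try it on the files above, read data/tgt-test.txt and data/pred.txt into two lists of lines and pass them as `references` and `hypotheses`.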