This Git repository contains the data and two approaches for the second task of challenge 1: translating menu text from Vietnamese to English.
The two approaches are statistical machine translation (SMT) and neural machine translation (NMT). Translation quality is measured by BLEU score, as shown in the table below.

Approach | BLEU score | Details: overall score, 1-/2-/3-/4-gram precision (BP = brevity penalty, ratio = hyp_len / ref_len) |
---|---|---|
NMT | 17.53 | 17.53, 67.6/37.9/20.8/14.0 (BP=0.596, ratio=0.659, hyp_len=3505, ref_len=5319) |
SMT | 42.10 | 42.10, 71.4/47.3/34.4/28.3 (BP=0.989, ratio=0.989, hyp_len=5260, ref_len=5319) |

The likely reason for the low NMT score is lack of data: NMT requires much more training data than SMT. Based on these experimental results, SMT is the recommended approach.
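
The detail column of the table can be cross-checked against the standard BLEU definition (brevity penalty times the geometric mean of the n-gram precisions, uniform weights). The sketch below is only an illustration: the helper name `bleu_from_components` and the `reported` dictionary are made up for this example, and the numbers are copied from the table. Because the printed precisions are rounded, the recomputed scores may differ from the reported ones in the last digit.

```python
import math


def bleu_from_components(precisions, hyp_len, ref_len):
    """Recombine n-gram precisions (in percent) into a corpus-level BLEU score.

    Returns (bleu, brevity_penalty, bleu_without_penalty), scores in percent.
    """
    # Brevity penalty: 1 when the hypothesis is at least as long as the
    # reference, otherwise exp(1 - ref_len / hyp_len).
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    # Geometric mean of the n-gram precisions with uniform weights.
    log_sum = sum(math.log(p / 100.0) for p in precisions)
    geo_mean = math.exp(log_sum / len(precisions))
    return 100.0 * bp * geo_mean, bp, 100.0 * geo_mean


# Values copied from the table above: (precisions, hyp_len, ref_len).
reported = {
    "NMT": ([67.6, 37.9, 20.8, 14.0], 3505, 5319),
    "SMT": ([71.4, 47.3, 34.4, 28.3], 5260, 5319),
}

for name, (precisions, hyp_len, ref_len) in reported.items():
    bleu, bp, no_penalty = bleu_from_components(precisions, hyp_len, ref_len)
    print(f"{name}: BLEU={bleu:.2f} BP={bp:.3f} without_BP={no_penalty:.2f}")
```

The `without_BP` column shows how much of each score is lost to the brevity penalty. For NMT the penalty is large because its output is roughly a third shorter than the reference (ratio=0.659), so short translations account for a sizeable part of the gap to SMT, in addition to the lower n-gram precisions.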