tnq177/improving_lexical_choice_in_nmt

Tested with Python 2.7.3 and TensorFlow 1.1
Talk: https://www3.nd.edu/~tnguye28/naacl18.pdf

General

This is the code for the paper Improving Lexical Choice in Neural Machine Translation (accepted at NAACL HLT 2018). The branches are:

master: baseline NMT
tied_embedding: baseline NMT with tied embeddings
fixnorm: the fixnorm model from the paper
fixnorm_lex: the fixnorm+lex model from the paper
arthur: the method of Arthur et al. applied on top of the tied_embedding NMT

To train a model:

write a configuration function in configurations.py
run: python -m nmt --proto your_config_func
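As a rough illustration of the first step, a configuration function might look like the sketch below. It assumes such a function simply builds and returns a dictionary of settings. The function name and every key and value are hypothetical placeholders rather than the options this code actually reads, so check configurations.py in the branch you use for the real ones.

```python
def my_en2vi_config():
    """Hypothetical configuration function for illustration only.

    All keys and values below are assumptions, not the options the
    training code actually expects.
    """
    config = {}
    # Assumed to determine the directory name under nmt/saved_models/
    config['model_name'] = 'my_en2vi_model'
    # Illustrative language pair and training settings
    config['src_lang'] = 'en'
    config['trg_lang'] = 'vi'
    config['batch_size'] = 32
    config['max_epochs'] = 20
    return config


if __name__ == '__main__':
    # Print the settings in a stable order for a quick sanity check
    for key, value in sorted(my_en2vi_config().items()):
        print('%s = %s' % (key, value))
```

With a function like this in configurations.py, the training command above would be run with --proto my_en2vi_config.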

Depending on your config function, the code creates a directory at nmt/saved_models/your_model_name and saves all dev validations there, along with the dev perplexities, train perplexities, the best model checkpoint, and the latest checkpoint so far (I've only tested saving the 1 best checkpoint; I'm not sure about more than 1). Use the saved checkpoint to translate any other input.

To translate with UNK replacement:

run: python -m nmt --proto your_config_func --mode translate --unk-repl --model-file path_your_saved_checkpoint.cpkt --input-file path_to_input_file
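If you translate many files, it can help to assemble this command in a script. The sketch below only builds the argument list from the flags shown above. The helper name build_translate_command and the example paths are made up for illustration.

```python
def build_translate_command(config_func, checkpoint, input_file, unk_repl=True):
    """Build the argument list for the translate command shown above.

    Only flags that appear in this readme are used; the helper itself is
    a hypothetical convenience, not part of the repository.
    """
    cmd = ['python', '-m', 'nmt',
           '--proto', config_func,
           '--mode', 'translate']
    if unk_repl:
        # Enable UNK replacement
        cmd.append('--unk-repl')
    cmd += ['--model-file', checkpoint,
            '--input-file', input_file]
    return cmd


if __name__ == '__main__':
    # Example paths are placeholders
    cmd = build_translate_command(
        'your_config_func',
        'path_your_saved_checkpoint.cpkt',
        'path_to_input_file')
    print(' '.join(cmd))
    # To actually run it from the repository root:
    #   import subprocess; subprocess.check_call(cmd)
```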

Remember that a checkpoint consists of several files (data file, meta file, ...). Pass only the common path ending in .cpkt to --model-file and leave off the trailing extension.
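A small helper can derive that path from any one of the checkpoint's files. This is a minimal sketch, assuming the files share a prefix ending in .cpkt followed by suffixes such as .meta or .data-00000-of-00001. The function name and the example file names are hypothetical.

```python
def checkpoint_prefix(path, marker='.cpkt'):
    """Return the path cut off right after the last occurrence of marker.

    For example, 'model.cpkt.meta' becomes 'model.cpkt'. A path that
    already ends in the marker is returned unchanged.
    """
    idx = path.rfind(marker)
    if idx == -1:
        raise ValueError('no %r found in %r' % (marker, path))
    return path[:idx + len(marker)]


if __name__ == '__main__':
    # Illustrative file names; the real ones depend on your config
    for name in ['nmt/saved_models/your_model_name/your_model_name.cpkt.meta',
                 'nmt/saved_models/your_model_name/your_model_name.cpkt.data-00000-of-00001']:
        print(checkpoint_prefix(name))
```

Both example paths reduce to the same value, which is what --model-file expects.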

References

Code & scripts might be inspired/borrowed from some sources:
