seq2seq

Attention-based sequence to sequence learning

Dependencies

How to use

Train a model (CONFIG is a YAML configuration file, such as config/default.yaml):

./seq2seq.sh CONFIG --train -v 

Translate text using an existing model:

./seq2seq.sh CONFIG --decode FILE_TO_TRANSLATE --output OUTPUT_FILE

or for interactive decoding:

./seq2seq.sh CONFIG --decode
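All settings live in the YAML configuration file passed as CONFIG. As a rough sketch of what such a file looks like, consider the fragment below; the key names here are purely illustrative assumptions, not this project's actual schema (see config/default.yaml for the real options):

```yaml
# Hypothetical configuration sketch -- key names are illustrative only,
# NOT this project's actual schema; consult config/default.yaml.
model_dir: models/example     # where checkpoints are written
data_dir: data/example        # preprocessed training data

batch_size: 80
cell_size: 512
embedding_size: 512
dropout_rate: 0.2

steps_per_checkpoint: 1000    # save a checkpoint every N steps
steps_per_eval: 4000          # periodic BLEU evaluation on the dev set
```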

Example English→French model

This is the same model and dataset as Bahdanau et al. 2015.

config/WMT14/download.sh    # download WMT14 data into raw_data/WMT14
config/WMT14/prepare.sh     # preprocess the data, and copy the files to data/WMT14
./seq2seq.sh config/WMT14/baseline.yaml --train -v   # train a baseline model on this data

./seq2seq.sh config/S2S/dev.yaml --train -v    # train a model with the S2S dev configuration
./seq2seq.sh config/S2S/dev.yaml --decode -v   # decode with the trained S2S dev model

You should get BLEU scores similar to these (our model was trained on a single Titan X for about 4 days).

Dev     Test    +beam   Steps   Time
25.04   28.64   29.22   240k    60h
25.25   28.67   29.28   330k    80h
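The scores above were computed with the standard multi-bleu script; as a rough illustration of what BLEU measures, here is a minimal, simplified reimplementation (clipped n-gram precision plus brevity penalty), not the exact scoring tool used for this table:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list, as a Counter."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidates, references, max_n=4):
    """Corpus-level BLEU-4 with uniform weights and brevity penalty.

    `candidates` and `references` are parallel lists of token lists.
    A simplified sketch for illustration, not the multi-bleu script
    used to produce the scores above.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            # clipped n-gram matches against the reference
            matches[n - 1] += sum((ngrams(cand, n) & ngrams(ref, n)).values())
            totals[n - 1] += max(len(cand) - n + 1, 0)
    if 0 in matches:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = min(0.0, 1.0 - ref_len / cand_len)  # penalize short candidates
    return 100.0 * math.exp(brevity + log_precision)
```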

Download this model here. To use this model, just extract the archive into the seq2seq/models folder, and run:

 ./seq2seq.sh models/WMT14/config.yaml --decode -v

Example German→English model

This is the same dataset as Ranzato et al. 2015.

config/IWSLT14/prepare.sh
./seq2seq.sh config/IWSLT14/baseline.yaml --train -v
Dev     Test    +beam   Steps
28.32   25.33   26.74   44k

The model is available for download here.

Features

  • YAML configuration files
  • Beam-search decoder
  • Ensemble decoding
  • Multiple encoders
  • Hierarchical encoder
  • Bidirectional encoder
  • Local attention model
  • Convolutional attention model
  • Detailed logging
  • Periodic BLEU evaluation
  • Periodic checkpoints
  • Multi-task training: train on several tasks at once (e.g. French->English and German->English MT)
  • Subwords training and decoding
  • Input binary features instead of text
  • Pre-processing script: we provide a fully-featured Python script for data pre-processing (vocabulary creation, lowercasing, tokenizing, splitting, etc.)
  • Dynamic RNNs: we use symbolic loops instead of statically unrolled RNNs. This means that we don't need to manually configure bucket sizes, and that model creation is much faster.
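The beam-search decoder listed above keeps the k highest-scoring partial hypotheses at each step instead of greedily committing to one token. A minimal sketch over an arbitrary next-token scoring function (the scoring interface and end-of-sequence handling here are simplifying assumptions, not this project's implementation):

```python
def beam_search(score_fn, start, eos, beam_size=5, max_len=20):
    """Minimal beam search sketch.

    score_fn(prefix) -> dict mapping next token -> log-probability.
    Returns the highest-scoring sequence ending in `eos`.
    Simplified: no length normalization, no batched scoring.
    """
    beams = [([start], 0.0)]            # (prefix, cumulative log-prob)
    complete = []
    for _ in range(max_len):
        # expand every active hypothesis by one token
        candidates = []
        for prefix, score in beams:
            for token, logp in score_fn(prefix).items():
                candidates.append((prefix + [token], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # keep the k best; finished hypotheses leave the beam
        beams = []
        for prefix, score in candidates[:beam_size]:
            if prefix[-1] == eos:
                complete.append((prefix, score))
            else:
                beams.append((prefix, score))
        if not beams:
            break
    pool = complete or beams
    return max(pool, key=lambda c: c[1])[0]
```

With beam_size=1 this degenerates to greedy decoding, which is why the "+beam" columns above score higher than plain decoding.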

Credits
