My implementation of "Hierarchical Attention Networks for Document Classification" (Yang et al., 2016).
The Yelp data is downloaded from Duyu Tang's homepage (the same dataset used in Yang et al.'s paper):
Download link: http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp-2015-data.7z
- Put the data in a directory named "data/yelp_YEAR/" (where "YEAR" is the dataset year, e.g. "data/yelp2013/").
- Run "yelp-preprocess.ipynb" to preprocess the data. The output format is "label \t\t sentence1 \t sentence2 ...".
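The preprocessed format above can be parsed with a few lines of Python; this is a minimal sketch (the notebook's actual reader may differ), where the label and the toy review line are made up for illustration:

```python
# Parse one preprocessed line of the form "label \t\t sentence1 \t sentence2 ...".
def parse_line(line):
    label, text = line.rstrip("\n").split("\t\t", 1)     # label is before the double tab
    sentences = [s.split() for s in text.split("\t") if s]  # sentences are tab-separated
    return int(label), sentences

# Hypothetical example line (not from the real dataset):
label, sents = parse_line("4\t\tgreat food\tfriendly staff")
```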
- Then run "word2vec.ipynb" to train a word2vec model on the training set.
- Run "HAN.ipynb" to train the model.
- Run "case_study.ipynb" to visualize a few examples from the validation set, showing the sentence-level and word-level attention weights along with the predicted labels.
We currently get about 65% accuracy on the yelp2013 test set; further hyperparameter tuning may improve this.
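The word-level attention visualized in the case study follows the paper's formulation: u_it = tanh(W_w h_it + b_w), alpha_it = softmax(u_it^T u_w), s_i = sum_t alpha_it h_it. A NumPy sketch with illustrative shapes and random (untrained) weights:

```python
import numpy as np

def word_attention(H, W_w, b_w, u_w):
    # H: (T, d) GRU hidden states for one sentence of T words
    U = np.tanh(H @ W_w + b_w)        # (T, a) hidden representation of each word
    scores = U @ u_w                  # (T,) similarity with the word context vector
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()              # (T,) attention weights (softmax)
    return alpha, alpha @ H           # weights to visualize, sentence vector (d,)

# Illustrative dimensions and random weights, not the trained model's.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
alpha, s = word_attention(H, rng.normal(size=(8, 8)), np.zeros(8), rng.normal(size=8))
```

The `alpha` vector is what gets rendered in the case study as per-word highlighting; sentence-level attention is computed the same way over sentence vectors.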
epochs | batch size | GRU units | word2vec size | optimizer | learning rate | maximum sentence length |
---|---|---|---|---|---|---|
50 | 32 | 128 | 50 | Adam | 4e-4 | 200 |
train accuracy | validation accuracy | test accuracy |
---|---|---|
66.891% | 64.659% | 64.734% |