CLAN: Conditional Language Adversarial Networks

This is the PyTorch implementation of our paper, Improving Cross-Lingual Sentiment Analysis via Conditional Language Adversarial Training of Neural Networks.

This work was accepted at the 3rd Workshop on Research in Computational Typology and Multilingual NLP (SIGTYP) at NAACL 2021.

Abstract

Sentiment analysis has come a long way for high-resource languages due to the availability of large annotated corpora. However, it still suffers from lack of training data for low-resource languages. To tackle this problem, we propose Conditional Language Adversarial Network (CLAN), an end-to-end neural architecture for cross-lingual sentiment analysis without cross-lingual supervision. CLAN differs from prior work in that it allows the adversarial training to be conditioned on both learned features and the sentiment prediction, to increase discriminativity for learned representation in the cross-lingual setting. Experimental results demonstrate that CLAN outperforms previous methods on the multilingual multi-domain Amazon review dataset.
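To illustrate what "conditioned on both learned features and the sentiment prediction" can mean in practice, here is a minimal sketch of one common conditioning operator: the flattened outer product of the class probabilities and the feature vector, which would then be fed to the language discriminator. Whether CLAN uses exactly this operator is an assumption made for illustration, and the function name `condition_features` is hypothetical; see `train_clan_id.py` for the actual implementation.

```python
def condition_features(features, probs):
    """Return the flattened outer product of `probs` and `features`.

    features: list of floats, the learned representation of one example.
    probs:    list of floats, the sentiment classifier's class probabilities.

    The result has len(probs) * len(features) entries, so a discriminator
    that reads it sees the features scaled by each class probability.
    NOTE: illustrative only; not taken from the CLAN code base.
    """
    return [p * f for p in probs for f in features]


if __name__ == "__main__":
    feats = [0.5, -1.0, 2.0]   # toy 3-dimensional feature vector
    probs = [0.75, 0.25]       # toy 2-class sentiment prediction
    print(condition_features(feats, probs))
```

With this operator, two examples with identical features but different sentiment predictions produce different discriminator inputs, which is the point of conditioning.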

Python Requirements (Tested with the following versions)

PyTorch v1.4.0
PyYAML v5.3.1
NumPy v1.15.2
MeCab v0.996.3 (Japanese tokenization)
NLTK v3.4.5 (English / French / German tokenization)
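To compare an existing environment against the tested versions, a small check like the following may help. It is a sketch using only the standard library (`importlib.metadata`, Python 3.8+). The distribution names in `TESTED_VERSIONS` (for example `mecab-python3`) are assumptions about how each package is published; adjust them to match your installation.

```python
from importlib import metadata

# Versions from the list above. The keys are ASSUMED distribution names,
# not something this repository specifies.
TESTED_VERSIONS = {
    "torch": "1.4.0",
    "PyYAML": "5.3.1",
    "numpy": "1.15.2",
    "mecab-python3": "0.996.3",
    "nltk": "3.4.5",
}


def check_versions(expected):
    """Map each distribution name to (installed_version_or_None, tested_version)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted) in check_versions(TESTED_VERSIONS).items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: installed={installed} tested={wanted} ({status})")
```

A differing version is not necessarily a problem; the list only records what the code was tested with.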

Framework

[Figure: CLAN framework]

Dataset

Download

Download the Amazon review dataset:

git clone https://github.com/hemanthkandula/Conditional-Language-Adversaral-Networks.git
cd Conditional-Language-Adversaral-Networks
wget -P data/ https://zenodo.org/record/3251672/files/cls-acl10-unprocessed.tar.gz
tar xvf data/cls-acl10-unprocessed.tar.gz -C data/

Then run the following script to preprocess data:

python helper_utils/pre_process.py

Run CLAN In-Domain settings:

Using all language data

python train_clan_id.py --gpu_id 0 --sup_dom music --seed 0

Adapting specific languages

python train_clan_id.py --gpu_id 0 --source_lang en --target_lang ja --seed 0
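To run the language-pair setting over several target languages and seeds, the command lines can be generated programmatically. The sketch below only builds the argument lists, using the flags shown above (`--gpu_id`, `--source_lang`, `--target_lang`, `--seed`). The helper name `build_in_domain_commands` is hypothetical, and the language codes and seeds you pass in are your own choice; only `en`, `ja`, and `de` appear in this README.

```python
import itertools


def build_in_domain_commands(target_langs, seeds, source_lang="en", gpu_id=0):
    """Build one train_clan_id.py command (as an argv list) per (target, seed)."""
    commands = []
    for target, seed in itertools.product(target_langs, seeds):
        commands.append([
            "python", "train_clan_id.py",
            "--gpu_id", str(gpu_id),
            "--source_lang", source_lang,
            "--target_lang", target,
            "--seed", str(seed),
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_in_domain_commands(["ja", "de"], seeds=[0, 1]):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the runs one after another.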

Run CLAN Cross-Domain settings:

python train_clan_cd.py --gpu_id 0 --source_lang en --target_lang de --source_domain dvd --target_domain music --seed 0

Citing

@inproceedings{kandula2021improving,
  title={Improving Cross-Lingual Sentiment Analysis via Conditional Language Adversarial Nets},
  author={Kandula, Hemanth and Min, Bonan},
  booktitle={Proceedings of the Third Workshop on Computational Typology and Multilingual NLP},
  pages={32--37},
  year={2021}
}
