University of Amsterdam Deep Learning for Natural Language Processing Fall 2020 Mini Project
Part-of-speech (POS) tagging is an important pre-processing step in Natural Language Processing. State-of-the-art neural approaches typically produce rich, context-sensitive word encodings with recurrent networks. A recently proposed and highly successful meta recurrent architecture integrates sentence-level context from both character-based and word-based representations. In this work, we exploit Bayesian model averaging to analyze the uncertainty of the different components of this recurrent meta-architecture in the context of POS tagging. We find that the meta component mediates the signals from the word-based and character-based components. Most importantly, we show that the meta model is highly uncertain when its input signals disagree.
- Leila F.C. Talha
- Michael J. Neely
- Stefan F. Schouten
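For illustration only, the following minimal PyTorch sketch shows one common way to approximate the Bayesian model averaging mentioned above with Monte Carlo dropout and to compute per-token predictive entropy. It is not the project's implementation: the function name, the assumption that `model(inputs)` returns per-token tag logits, and the number of samples are all hypothetical.

```python
import torch
import torch.nn.functional as F

def mc_dropout_uncertainty(model: torch.nn.Module, inputs, num_samples: int = 20):
    """Approximate Bayesian model averaging with Monte Carlo dropout:
    average the tag distributions over stochastic forward passes and
    report the predictive entropy of the averaged distribution."""
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        # (num_samples, ..., num_tags): one softmax distribution per pass
        probs = torch.stack(
            [F.softmax(model(inputs), dim=-1) for _ in range(num_samples)]
        )
    mean_probs = probs.mean(dim=0)
    # Per-token entropy of the averaged distribution; higher = more uncertain
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy
```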
Prepare a Python virtual environment and install the necessary packages.
python3 -m venv v-dl4nlp-pos-tagging
source v-dl4nlp-pos-tagging/bin/activate
pip install torch
pip install -r requirements.txt
python -m spacy download en
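As an optional sanity check (not part of the original instructions), you can confirm that the core dependencies import correctly:

python -c "import torch, spacy; print(torch.__version__)"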
Download the train and test sets to the `datasets/conll2000` directory and run the `scripts/split_conll2000_train.py` script. Provide the percentage of the train set to use as the validation set as a positional argument (default: 0.1).
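For example, assuming the script is invoked with the validation fraction as its only positional argument, holding out 20% of the training data would look like:

python scripts/split_conll2000_train.py 0.2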
Train the Meta-BiLSTM morphosyntactic tagger, calculate its uncertainty on the test set, and generate some interesting figures by running:
allennlp uncertainty-experiment experiments/conll2000_meta_tagger_separate_mcdrop.jsonnet
By default, generated artifacts are saved in the `outputs/` directory.