Source code and user models from the experiments of our ACL 2020 paper.
Running the code will train a model from scratch in a simulated interactive learning setup. At each iteration, a C-Test is sampled and scored according to the proficiency of the current learner model. The model is then updated on the newly scored sample and used to sample the next C-Test. The implementation includes five learner models, five sampling strategies, and four learner behaviors.
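The following is a minimal Python sketch of this loop. All names (`combined_score`, `model.uncertainty`, `learner.solve`, ...) are hypothetical and only mirror the description above; they are not the actual API of this repository.

```python
import numpy as np

def combined_score(uncertainty, suitability, lam=0.5):
    # Trade off the system objective (informative samples) against the
    # user objective (suitable exercises) with a weighting factor lambda.
    # This weighted sum is one plausible formulation, not necessarily the
    # strategy implemented here.
    return lam * uncertainty + (1.0 - lam) * suitability

def simulate(model, learner, pool, iterations=50):
    for _ in range(iterations):
        # 1. Sample the C-Test that best balances both objectives.
        scores = [combined_score(model.uncertainty(x), learner.suitability(x))
                  for x in pool]
        c_test = pool.pop(int(np.argmax(scores)))
        # 2. The simulated learner "solves" the C-Test; its current
        #    proficiency determines the resulting gap-level labels.
        labels = learner.solve(c_test)
        # 3. Update the model on the newly scored sample and let the
        #    learner proficiency evolve (e.g., increasing behavior).
        model.update(c_test, labels)
        learner.step()
```

See the paper for the sampling strategies and learner behaviors that were actually evaluated.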
Abstract: Existing approaches to active learning maximize the system performance by sampling unlabeled instances for annotation that yield the most efficient training. However, when active learning is integrated with an end-user application, this can lead to frustration for participating users, as they spend time labeling instances that they would not otherwise be interested in reading. In this paper, we propose a new active learning approach that jointly optimizes the seemingly counteracting objectives of the active learning system (training efficiently) and the user (receiving useful instances). We study our approach in an educational application, which particularly benefits from this technique as the system needs to rapidly learn to predict the appropriateness of an exercise to a particular user, while the users should receive only exercises that match their skills. We evaluate multiple learning strategies and user types with data from real users and find that our joint approach better satisfies both objectives when alternative methods lead to many unsuitable exercises for end users.
- Contact person: Ji-Ung Lee, lee@ukp.informatik.tu-darmstadt.de
- UKP Lab: http://www.ukp.tu-darmstadt.de/
- TU Darmstadt: http://www.tu-darmstadt.de/
Drop me a line or report an issue if something is broken (and shouldn't be) or if you have any questions.
For license information, please see the LICENSE and README files.
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
- `active_learning` — Our active learning strategies
- `data` — Folder to put the data
- `learner_models` — Our simulated learner models
- `models` — Folder for storing our trained deep learning models
- `readers` — Data reader
- `results` — Result folder
- `user_simulation` — Code for handling the simulated learner models
Install the requirements and start the training:

```bash
pip install -r requirements.txt
python train_model.py
```
The code offers a range of parameters which can be set:
- `--train` — Path to the (initially unlabeled) training data.
- `--test` — Path to the test data.
- `--seed` — Random seed to use.
- `--epochs` — Number of epochs.
- `--init-weights` — Path to the initial model weights.
- `--best-model` — Path for storing the model that performs best on the validation set.
- `--sampling-strategy` — Sampling strategy: `random`, `uncertainty`, `user`, `combined`, or `tradeoff`.
- `--user-static-class` — Learner proficiency for the static learner behavior.
- `--user-step-size` — Step size t for the increasing or decreasing learner behavior.
- `--user-strategy` — Learner behavior: `increasing` (motivated), `decreasing`, `interrupted`, or `static`.
- `--al-iterations` — Number of active learning iterations.
- `--history` — Path to the active learning history; stores the user proficiency and the individual gap predictions at each iteration.
- `--results` — Path to the result file.
- `--uncertainty` — Parameter for selecting the uncertainty computation. Set to `softmax` for U_soft.
- `--lambda-schedule` — Schedule for an adaptive lambda: `None` (static lambda) or `root` for a decreasing lambda.
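For example, a run with the tradeoff strategy and an increasing learner could be started as follows; all paths and values are illustrative, not recommended settings:

```bash
python train_model.py \
    --train data/example_file.tc \
    --test data/example_file.tc \
    --seed 42 \
    --epochs 10 \
    --sampling-strategy tradeoff \
    --user-strategy increasing \
    --al-iterations 50 \
    --results results/run_42.txt
```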
Unfortunately, we are not allowed to share the data we used for our experiments due to privacy protection regulations. `data/example_file.tc` contains an example file. The feature extractor from Dr. Beinborn produces specific DKPro `.tc` files, for which we wrote a separate reader, implemented in `readers.load_data(input_file)`.
To use your own data, you can adapt the reader to read sentences or documents where each token consists of a triple: `(token: str, label: int, features: numpy.array)`. Don't forget to adapt the model in `train_model.py` accordingly to match the input dimensions.
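As a starting point, a custom reader could look like this minimal sketch. It assumes whitespace-separated `token label feature1 feature2 ...` lines with blank lines between sentences; this format is an assumption for illustration, not the DKPro `.tc` format.

```python
import numpy as np

def load_data(input_file):
    """Read sentences as lists of (token, label, features) triples."""
    sentences, current = [], []
    with open(input_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # blank line marks a sentence boundary
                if current:
                    sentences.append(current)
                    current = []
                continue
            token, label, *features = line.split()
            current.append((token, int(label),
                            np.array(features, dtype=float)))
    if current:  # flush the last sentence
        sentences.append(current)
    return sentences
```

Whatever input format you use, returning sentences as lists of such triples keeps the rest of the pipeline unchanged.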
Please cite our paper as:
```
@inproceedings{lee-etal-2020-empowering,
    title = "{E}mpowering {A}ctive {L}earning to {J}ointly {O}ptimize {S}ystem and {U}ser {D}emands",
    author = "Lee, Ji-Ung and
      Meyer, Christian M. and
      Gurevych, Iryna",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.390",
    pages = "4233--4247",
}
```