LanguageTransfer

Negative language transfer dataset

The negative language transfer annotations cover errors made by Chinese native speakers whose essays are part of the FCE dataset. To access the annotations, first download the FCE dataset from https://www.ilexir.co.uk/datasets/index.html.

After downloading the FCE dataset, and thereby signing its license agreement, fill out and submit this form. Once the form is submitted, you will be granted access to the negative language transfer annotated errors dataset.

Data format

The negative language transfer annotated data follows the format described in the data_description.md file.

Citation

When using the negative language transfer annotated data, you are encouraged to cite the following papers:

@inproceedings{farias-wanderley-etal-2021-negative,
    title = "Negative language transfer in learner {E}nglish: A new dataset",
    author = "Farias Wanderley, Leticia  and
      Zhao, Nicole  and
      Demmans Epp, Carrie",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-main.251",
    pages = "3129--3142"
}

@inproceedings{yannakoudakis2011fce,
 title={A new dataset and method for automatically grading ESOL texts},
 author={Yannakoudakis, Helen and Briscoe, Ted and Medlock, Ben},
 booktitle={Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1},
 pages={180--189},
 year={2011},
 organization={Association for Computational Linguistics}
}

Negative language transfer classification

To view the code used to analyse the predictive power of logistic regression and random forest models in classifying learner errors as negative language transfer, go to the nlt-classification repository.

Identifying negative language transfer in learner errors using POS information

In this work, learner errors were converted into part-of-speech (POS) tag sequences and analysed by language models that represented the POS tag sequences found in the learners' first and second languages. To learn more about using POS tag sequences to detect negative language transfer, see the identifying-nlt repository.
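The general idea can be illustrated with a minimal sketch. This is not the code from the identifying-nlt repository: the bigram models, the add-one smoothing, the toy tag sequences, and the decision rule in `more_l1_like` are all assumptions made for illustration. The sketch trains one POS bigram model per language and checks which model assigns a higher probability to the POS tag sequence of a learner error.

```python
import math
from collections import Counter


class BigramPOSModel:
    """Add-one smoothed bigram model over POS tag sequences (illustrative only)."""

    def __init__(self, sequences, tagset):
        self.bigrams = Counter()
        self.contexts = Counter()
        # Shared tagset plus the end marker, so both models smooth over the same events.
        self.vocab_size = len(set(tagset) | {"</s>"})
        for seq in sequences:
            padded = ["<s>", *seq, "</s>"]
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, seq):
        """Return the smoothed log-probability of a POS tag sequence."""
        padded = ["<s>", *seq, "</s>"]
        return sum(
            math.log((self.bigrams[(prev, cur)] + 1)
                     / (self.contexts[prev] + self.vocab_size))
            for prev, cur in zip(padded, padded[1:])
        )


# Hypothetical toy data; real models would be trained on large tagged corpora.
TAGSET = ["PRON", "ADV", "VERB", "NOUN"]
l1_sequences = [["PRON", "ADV", "VERB", "PRON"], ["PRON", "ADV", "VERB", "NOUN"]]
l2_sequences = [["PRON", "VERB", "PRON", "ADV"], ["PRON", "VERB", "NOUN", "ADV"]]

l1_model = BigramPOSModel(l1_sequences, TAGSET)
l2_model = BigramPOSModel(l2_sequences, TAGSET)


def more_l1_like(tags):
    """Assumed decision rule: is the sequence more probable under the L1 model?"""
    return l1_model.log_prob(tags) > l2_model.log_prob(tags)


# POS tags for a learner error and its correction (hypothetical example).
error_tags = ["PRON", "ADV", "VERB", "PRON"]
correction_tags = ["PRON", "VERB", "PRON", "ADV"]

print("error looks L1-like:", more_l1_like(error_tags))
print("correction looks L1-like:", more_l1_like(correction_tags))
```

An error whose POS pattern is better explained by the first-language model than by the second-language model is a candidate for negative language transfer. Both models here share one tagset so that their smoothed probabilities are comparable.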
