GitHub - imperialite/cmcl2022-unified-eye-tracking-ipa

NU HLT at CMCL 2022 Shared Task: Multilingual and Crosslingual Prediction of Human Reading Behavior in Universal Language Space

This repository contains a single Python notebook for extracting features for eye-tracking prediction, such as frequencies, n-grams, information-theoretic predictors, and psycholinguistically motivated predictors. As the title suggests, these feature values were extracted from the converted IPA form of the words.
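
As a rough illustration of the n-gram and information-theoretic predictors mentioned above, the sketch below counts character n-grams over IPA strings and computes a mean bigram surprisal with add-one smoothing. It is not taken from the notebook: the function names, the smoothing scheme, and the choice to treat each Unicode code point as one symbol are assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(word, n):
    """Return the character n-grams of an IPA string, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_counts(words, n):
    """Count character n-grams over a collection of IPA strings."""
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, n))
    return counts


def mean_bigram_surprisal(word, bigram_counts, unigram_counts):
    """Average -log2 P(c2 | c1) over the bigrams of `word`.

    Uses add-one smoothing over the observed symbol inventory
    (an assumption of this sketch, not the notebook's method).
    Returns 0.0 for words with fewer than two symbols.
    """
    bigrams = char_ngrams(word, 2)
    if not bigrams:
        return 0.0
    inventory_size = len(unigram_counts)
    total = 0.0
    for bigram in bigrams:
        prob = (bigram_counts[bigram] + 1) / (unigram_counts[bigram[0]] + inventory_size)
        total += -math.log2(prob)
    return total / len(bigrams)


# Toy corpus of IPA strings (hypothetical data).
ipa_words = ["kæt", "hæt"]
unigrams = ngram_counts(ipa_words, 1)
bigrams = ngram_counts(ipa_words, 2)
score = mean_bigram_surprisal("kæt", bigrams, unigrams)
```

Note that slicing by code point splits IPA symbols written with combining diacritics, so a real pipeline would likely need to segment the transcription into phones first.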

Paper: https://arxiv.org/abs/2202.10855

Requirements

Epitran, for converting words to IPA form. Conversion is available for English, German, Hindi, Dutch, Russian, and Mandarin.
Imageability and concreteness estimates derived from word embeddings, from the work of Ljubešić et al. (2018). Download the files here.
If you want to reproduce the results for the crosslingual task, you need phonetic transcriptions of the surprise language (Danish) data. You may subscribe to this paid service or email me for the file 😉.
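
The IPA conversion step in the first requirement might look roughly like the sketch below. It assumes Epitran's `Epitran(code).transliterate(text)` interface; the helper name `words_to_ipa`, the language-code mapping, and the injectable `transliterate` argument are hypothetical and are not taken from the notebook.

```python
# Epitran language codes (ISO 639-3 code plus script).
# This mapping is an assumption; check the notebook for the codes actually used.
EPITRAN_CODES = {
    "English": "eng-Latn",
    "German": "deu-Latn",
    "Hindi": "hin-Deva",
    "Dutch": "nld-Latn",
    "Russian": "rus-Cyrl",
    "Mandarin": "cmn-Hans",
}


def words_to_ipa(words, language, transliterate=None, **epitran_kwargs):
    """Convert a list of orthographic words to IPA strings.

    `transliterate` may be any str -> str callable, which makes the helper
    easy to try without Epitran installed. When omitted, an Epitran
    transliterator is built for `language`; extra keyword arguments are
    passed to the Epitran constructor.
    """
    if transliterate is None:
        import epitran  # third-party package: pip install epitran
        code = EPITRAN_CODES[language]
        transliterate = epitran.Epitran(code, **epitran_kwargs).transliterate
    return [transliterate(word) for word in words]
```

With Epitran installed, a call such as `words_to_ipa(["Haus", "Buch"], "German")` would return the IPA form of each word. According to the Epitran documentation, English additionally depends on the Flite `lex_lookup` tool and Mandarin on a CC-CEDICT dictionary file, so those two languages may need extra setup.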

Please refer to the official Shared Task website for more information and to get the train/valid/test dataset: https://cmclorg.github.io/shared_task

Contact

If you need any help reproducing the results, please don't hesitate to contact me:

Joseph Marvin Imperial
jrimperial@national-u.edu.ph
www.josephimperial.com

Folders and files

CMCL_Cross_Lingual_Eye_Tracking_Feature_Extraction.ipynb
LICENSE
README.md
bigram_dict.csv
cmcl_ipa_sentences.txt
trigram_dict.csv
