Estonian TIMEX Annotated Corpora
* ERY2012_t3-olp-ajav_modified
113 texts from the Reference Corpus of Estonian, with manually
corrected temporal expression annotations. Texts cover various
subgenres: news, historical articles, parliamentary transcripts,
and legal texts.
See "ERY2012_t3-olp-ajav_modified/readme.txt" for details.
* MThesis_2010_tml_mod_2
315 Estonian newspaper articles with manually corrected temporal
expression annotations. The majority of the articles come from the
Reference Corpus of Estonian; a small portion comes from an online
news portal.
See "MThesis_2010_tml_mod_2/readme.txt" for details.
* scripts
Scripts for converting TIMEX-annotated corpora to EstNLTK's JSON
files, and for evaluating EstNLTK's TimexTagger on the corpus.
Running the scripts requires EstNLTK v1.6.6 or later.
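
To illustrate what evaluating a tagger against manually corrected
annotations can involve, here is a minimal sketch of exact-match span
scoring. It is not the code in the scripts directory: the function name
score_spans, the (start, end) offset layout, and the sample offsets are
all assumptions made for illustration, and the sketch uses no EstNLTK
calls.

```python
# Hypothetical sketch: exact-match scoring of temporal expression spans.
# Spans are (start, end) character offsets; this layout is an assumption,
# not the format used by the repository's scripts.

def score_spans(gold, predicted):
    """Return (precision, recall, f1) for exact span matches."""
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up offsets for illustration only.
gold = [(0, 9), (25, 34), (50, 58)]
predicted = [(0, 9), (25, 30), (50, 58)]

precision, recall, f1 = score_spans(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice: a predicted span that overlaps
a gold span but has different boundaries counts as both a false
positive and a miss. A relaxed (overlap-based) variant would score such
cases more leniently.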