SunoikisisDC 2019, University of Zagreb: From Annotated Text to Vocabulary Exercises

Authors: Neven Jovanović, Petar Soldo, Department of Classical Philology, Faculty of Humanities and Social Sciences, University of Zagreb, Croatia

A Sunoikisis Digital Classics Session, Summer 2019

Zenodo record 3244012

Synopsis

Demonstrate how to use BaseX and XQuery to produce Anki spaced repetition vocabulary exercises from a set of morphologically annotated and lemmatized short texts in Greek.

Concentrate on reoccurring words, and on words which are very frequent in Greek (according to the Dickinson College Core Vocabulary list).

Produce three types of exercises:

from the form to the lemma
from the form to the grammatical description
from words in the text to entries in the DC Greek Core vocabulary list (Croatian version, converted to XML)

How to use

The Greek texts, annotated in Arethusa (on Perseids), are in the data directory.

The Croatian translation of the Greek and Latin DC Core lists, converted to XML with some additional fields, is in the grclatcore directory.

The BaseX scripts are in the scripts directory.

Activities

Create the main database sunGreek with linguistically annotated Greek texts: createDbGreek.xq
Create a DCC Greek list (with Croatian translations) as a BaseX database grclatcore: createDbGrcLatCore.xq

Analyze the collection

For a given lemma, get a list of forms and POS tags in the collection: forLemmaGetFormPOStag.xq
Create a list of lemmata: findLemma.xq
Create a list of lemmata, order by frequency: findLemmaFrequency.xq
Narrow the list to lemmata whose forms occur at least twice (and exclude punctuation): findLemmaFrequencyTwoPlus.xq
Explore frequencies of linguistic annotations: getFrequenciesAttributes.xq (lemma, form, postag)
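
The frequency steps above can be sketched outside BaseX as well. The following is a minimal Python illustration, not the repository's XQuery. It assumes the Arethusa export keeps each token in a word element with form, lemma and postag attributes, and that punctuation carries a postag beginning with "u". The sample sentences and the name lemma_frequencies are made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical two-sentence sample in the assumed Arethusa layout.
SAMPLE = """<treebank>
  <sentence id="1">
    <word id="1" form="ὁ" lemma="ὁ" postag="l-s---mn-"/>
    <word id="2" form="ἄνθρωπος" lemma="ἄνθρωπος" postag="n-s---mn-"/>
    <word id="3" form="λέγει" lemma="λέγω" postag="v3spia---"/>
    <word id="4" form="." lemma="." postag="u--------"/>
  </sentence>
  <sentence id="2">
    <word id="1" form="τὸν" lemma="ὁ" postag="l-s---ma-"/>
    <word id="2" form="ἄνθρωπον" lemma="ἄνθρωπος" postag="n-s---ma-"/>
    <word id="3" form="ὁρῶ" lemma="ὁράω" postag="v1spia---"/>
    <word id="4" form="." lemma="." postag="u--------"/>
  </sentence>
</treebank>"""


def lemma_frequencies(xml_text, min_count=1):
    """Return (lemma, count) pairs, most frequent first, skipping punctuation."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        word.get("lemma")
        for word in root.iter("word")
        # Assumption: a postag starting with "u" marks punctuation.
        if word.get("lemma") and not (word.get("postag") or "").startswith("u")
    )
    return [(lemma, n) for lemma, n in counts.most_common() if n >= min_count]


all_lemmata = lemma_frequencies(SAMPLE)                # every lemma with its count
repeated = lemma_frequencies(SAMPLE, min_count=2)      # lemmata occurring at least twice
```

Calling lemma_frequencies with min_count=2 corresponds to the "at least twice" narrowing; the threshold is a parameter so that other cut-offs can be tried.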

From repeated lemmata to Anki exercises

For lemmata where f >= 2, get a list of occurring forms: fromLemmaToForms.xq
For a pair of form and lemma, produce an Anki exercise: fromLemmaToAnki.xq
Narrow to a specific number of occurrences: fromLemmaToAnkiNarrowNumber.xq
Narrow to specific types of words (e.g. just inflected words: nouns, verbs, adjectives, pronouns): fromLemmaToAnkiNarrowMorphology.xq

A list of the codes / attributes used for Greek in Arethusa is quite helpful here.
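
As a rough illustration of the form-to-lemma exercises described in this section, here is a Python sketch (again, not the XQuery scripts themselves). It assumes the tokens are already available as (form, lemma, postag) triples and that the first postag character is n, v, a or p for nouns, verbs, adjectives and pronouns. The sample triples and the function name are hypothetical; the tag grmorf01 is borrowed from the Anki example in this README.

```python
from collections import Counter

# Hypothetical (form, lemma, postag) triples, as they might be read from the texts.
WORDS = [
    ("τὸν", "ὁ", "l-s---ma-"),
    ("ἄνθρωπον", "ἄνθρωπος", "n-s---ma-"),
    ("ἄνθρωπος", "ἄνθρωπος", "n-s---mn-"),
    ("καὶ", "καί", "c--------"),
    ("καὶ", "καί", "c--------"),
    ("λέγει", "λέγω", "v3spia---"),
]

# Assumption: first postag character for noun, verb, adjective, pronoun.
INFLECTED = set("nvap")


def anki_form_to_lemma(words, min_count=2, tag="grmorf01"):
    """Build 'form ; lemma ; tag' lines for inflected words of repeated lemmata."""
    freq = Counter(lemma for _, lemma, _ in words)
    seen = set()
    lines = []
    for form, lemma, postag in words:
        if freq[lemma] < min_count or postag[:1] not in INFLECTED:
            continue
        if (form, lemma) in seen:  # one exercise per distinct form/lemma pair
            continue
        seen.add((form, lemma))
        lines.append(f"{form} ; {lemma} ; {tag}")
    return lines


exercises = anki_form_to_lemma(WORDS)
```

With the sample above, only the two forms of the repeated noun survive: the conjunction is repeated but not inflected, and the verb occurs once.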

From POS tags to Anki exercises

Create a list of morphological descriptions (parts of speech, POS tags): findPOStag.xq
Get frequency of morphological configurations: findPOStagFrequency.xq
Select only POS tags for inflected forms, then select frequent configurations (e.g. where f >= 14): findPOStagInflectedFrequency.xq
For a set of POS tags, get forms, lemma, POS: retrievePOS.xq
Produce Anki exercises asking for the lemma and morphological description of a given form: retrievePOSmapToWords.xq (with Arethusa / Alpheios morphological codes expanded)
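
Expanding the morphological codes can be pictured as a position-by-position lookup. The Python sketch below assumes the nine-position postag (part of speech, person, number, tense, mood, voice, gender, case, degree) and a partial key written from memory of the Arethusa conventions; the key should be checked against the actual list of codes, and both function names are hypothetical.

```python
# Assumed, partial key to the nine-position postag; verify it against the
# list of Arethusa codes before relying on it.
POSITIONS = [
    {"n": "noun", "v": "verb", "a": "adjective", "d": "adverb", "l": "article",
     "g": "particle", "c": "conjunction", "r": "preposition", "p": "pronoun",
     "m": "numeral", "i": "interjection", "u": "punctuation"},
    {"1": "1st person", "2": "2nd person", "3": "3rd person"},
    {"s": "singular", "p": "plural", "d": "dual"},
    {"p": "present", "i": "imperfect", "r": "perfect", "l": "pluperfect",
     "t": "future perfect", "f": "future", "a": "aorist"},
    {"i": "indicative", "s": "subjunctive", "o": "optative", "n": "infinitive",
     "m": "imperative", "p": "participle"},
    {"a": "active", "p": "passive", "m": "middle", "e": "medio-passive"},
    {"m": "masculine", "f": "feminine", "n": "neuter"},
    {"n": "nominative", "g": "genitive", "d": "dative", "a": "accusative",
     "v": "vocative"},
    {"c": "comparative", "s": "superlative"},
]


def expand_postag(postag):
    """Spell out a postag; unknown codes are kept visible as '?position:code'."""
    parts = []
    for position, (char, codes) in enumerate(zip(postag, POSITIONS), start=1):
        if char == "-":  # a dash means the category does not apply
            continue
        parts.append(codes.get(char, f"?{position}:{char}"))
    return " ".join(parts)


def anki_form_to_description(form, lemma, postag, tag="grmorf01"):
    """Question: the form. Answer: lemma plus expanded description."""
    return f"{form} ; {lemma}, {expand_postag(postag)} ; {tag}"
```

For instance, expand_postag("v3spia---") yields "verb 3rd person singular present indicative active". Unknown codes are not dropped silently, so gaps in the assumed key show up in the exercises.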

From one text to vocabulary reoccurring in other texts

Get vocabulary of one text: vocabularyOneText.xq
Find lemmata reoccurring in other texts: vocabularyRepeatedInOtherTexts.xq
Prepare Anki exercises for such lemmata: vocabularyRepeatedInOtherTexts.xq
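
Finding the vocabulary of one text that reoccurs elsewhere is, at its core, a set intersection. A minimal Python sketch follows, assuming the lemmata of each text have already been extracted into lists; the text names, the sample lemmata and the function name are invented for the illustration.

```python
# Hypothetical lemma lists for three texts.
TEXTS = {
    "text1": ["λέγω", "ἄνθρωπος", "καί"],
    "text2": ["καί", "ὁράω", "ἄνθρωπος"],
    "text3": ["καί", "θεός"],
}


def reoccurring_lemmata(texts, target):
    """Map each lemma of the target text to the other texts in which it occurs."""
    own = set(texts[target])
    found = {}
    for name, lemmata in texts.items():
        if name == target:
            continue
        for lemma in own & set(lemmata):  # lemmata shared with this other text
            found.setdefault(lemma, []).append(name)
    return {lemma: sorted(names) for lemma, names in sorted(found.items())}


shared = reoccurring_lemmata(TEXTS, "text1")
```

Keeping the names of the other texts alongside each lemma makes it possible to tell a reader where a word will be met again.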

From vocabulary to the DCC Greek list

Find all DCC lemmata occurring in our texts: findWordsInDCCore.xq
Produce a set of Anki exercises for these lemmata: DCCoreToAnki.xq
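
Matching text lemmata against the core list can be sketched as follows in Python. The sketch makes several assumptions that the README does not confirm: that a core entry's headword may list several forms (as in the Anki example in this README) with the treebank lemma as its first token, that both sides should be Unicode-normalized before comparison, and that entries are available as (headword, gloss) pairs. The structure of the grclatcore XML is not described here, so the sketch does not attempt to read it.

```python
import unicodedata

# Entries copied from the Anki example in this README: (headword, Croatian gloss).
CORE = [
    ("αὐτός αὐτή αὐτό", "on, isti"),
    ("καί", "i"),
    ("δέ", "a"),
    ("οὗτος αὕτη τοῦτο", "ovaj"),
]

# Hypothetical lemmata found in our texts.
TEXT_LEMMATA = ["καί", "αὐτός", "λέγω"]


def nfc(text):
    """Normalize so that precomposed and decomposed accents compare equal."""
    return unicodedata.normalize("NFC", text)


def core_exercises(text_lemmata, core, tag="grmorf01"):
    """Build 'headword ; gloss ; tag' lines for core entries present in the texts."""
    present = {nfc(lemma) for lemma in text_lemmata}
    lines = []
    for headword, gloss in core:
        key = nfc(headword.split()[0])  # assumption: first token is the lemma
        if key in present:
            lines.append(f"{headword} ; {gloss} ; {tag}")
    return lines


core_lines = core_exercises(TEXT_LEMMATA, CORE)
```

The normalization step matters in practice because polytonic Greek typed or exported by different tools may encode the same accented letter in different ways.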

Anki

About the program: the Anki User Manual

Form of exercises to be imported into Anki (no field names necessary; the "tag" field can be omitted):


question ; answer ; tag
αὐτός αὐτή αὐτό ; on, isti ; grmorf01
καί ; i ; grmorf01
δέ ; a ; grmorf01
οὗτος αὕτη τοῦτο ; ovaj ; grmorf01

The results of BaseX scripts (...ToAnki) can be saved as text files (extension is not important), edited in a text editor (recommended, but just for pedagogical reasons -- to select what we want to teach and learn), and then imported into the Anki database (File / Import).
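
If the exercise lines are assembled outside BaseX, saving them can look like the Python sketch below. It is only an illustration under two assumptions: that fields are separated by " ; " as in the example above, and that a field containing a semicolon should be rejected rather than silently splitting into extra fields on import. The function names are hypothetical.

```python
def to_anki_text(rows):
    """Join each row's fields with ' ; ', one exercise per line."""
    lines = []
    for row in rows:
        for field in row:
            if ";" in field:
                # A stray semicolon would be read as a field separator.
                raise ValueError(f"field contains ';': {field!r}")
        lines.append(" ; ".join(row))
    return "\n".join(lines) + "\n"


def write_anki_file(rows, path):
    """Save the exercises as UTF-8 text so the Greek survives the import."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_anki_text(rows))
```

A call such as write_anki_file([("καί", "i", "grmorf01")], "exercises.txt") would then produce a file that can be reviewed in a text editor before it is imported.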

For better control, it is recommended to first add a new user to Anki (Add / Open on the welcome screen).

License

CC-BY

Repository: nevenjovanovic/sunoikisis2019zg-eklogai
