Add/use @lemma to <w> tokens in corpus #114
Labels
enhancement
final output goals
Goals for tasks to do to achieve best possible output of project and contribution to community
to-do
This will greatly enhance the content of the corpus; however, major decisions have to be made about which form to reference as the lemma. Because tone produces homographs (the adopted orthography does not represent tone), this would probably require tone diacritics as minimal distinguishing markers so that every form in @lemma is entirely unique.
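As a rough illustration of the idea, here is a minimal Python sketch of attaching a tone-marked @lemma to each <w> token. Everything in it is an assumption made for illustration: the sample forms (`ba`, `bá`, `bà`) are invented placeholders and not taken from the corpus, the `n` attribute and the `LEMMAS` table are hypothetical, and the sample omits the TEI namespace that the real files would most likely declare. It also treats every combining mark as a tone mark, which may not hold for the actual orthography.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Invented placeholder data: two tokens spelled identically because the
# orthography does not mark tone. Real files would likely use the TEI namespace.
SAMPLE = '<s><w n="1">ba</w> <w n="2">ba</w></s>'

# Hypothetical hand-made lemma assignments, keyed by the token's @n value.
# Tone diacritics are the only thing keeping the two lemmas apart.
LEMMAS = {"1": "bá", "2": "bà"}


def strip_tone(form):
    """Drop combining marks to recover the toneless orthographic form.

    Assumption: all combining marks are tone marks.
    """
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def add_lemmas(xml_text, lemmas):
    """Parse the XML and set @lemma on every <w> that has an entry in lemmas."""
    root = ET.fromstring(xml_text)
    for w in root.iter("w"):
        lemma = lemmas.get(w.get("n"))
        if lemma is not None:
            # Store lemmas in NFC so identical lemmas compare equal as strings.
            w.set("lemma", unicodedata.normalize("NFC", lemma))
    return root


root = add_lemmas(SAMPLE, LEMMAS)
tokens = root.findall("w")
print(ET.tostring(root, encoding="unicode"))
```

In this sketch the two tokens keep the same surface text while their @lemma values differ only by the tone diacritic. Which form to use as the lemma, and how tokens would be disambiguated in practice, remain the open decisions.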
More study and planning needed.