Engine

Niko Partanen edited this page Mar 20, 2017 · 1 revision

This page documents the conventions, standards and workflows used for annotating our corpus data with the Giellatekno toolkit. The toolkit serves as an annotation engine that covers preprocessing and tokenization, morphological analysis, and disambiguation.

Intro

FST is…

Workflows

ELAN → FST → ELAN

Scripts

Sending ELAN data to the analyzer
Sending analyzed data back into ELAN
Creating new tiers, and annotations on them, based on search results on a higher-level tier
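
The three script tasks above can be illustrated with a minimal Python sketch. It is not the project's actual script. It assumes that the ELAN data is read directly from the `.eaf` XML with the standard library, and that the analyzer can be wrapped in a function taking one token and returning one analysis string. The tier names (`word`, `word-analysis`, `word-match`), the linguistic type `analysisT`, the tokens, and the `analyze` stub are all hypothetical placeholders.

```python
import re
import xml.etree.ElementTree as ET

# A tiny EAF-like document. Real ELAN files also carry a HEADER, a TIME_ORDER
# and LINGUISTIC_TYPE definitions, which are left out here for brevity.
EAF = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="word" LINGUISTIC_TYPE_REF="wordT">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>foo</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>bar</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tier(root, tier_id):
    """Return (annotation_id, text) pairs for every annotation on one tier."""
    tier = root.find(f"./TIER[@TIER_ID='{tier_id}']")
    pairs = []
    # Each ANNOTATION wraps either an ALIGNABLE_ANNOTATION or a REF_ANNOTATION.
    for ann in tier.findall("./ANNOTATION/*"):
        pairs.append((ann.get("ANNOTATION_ID"),
                      ann.findtext("ANNOTATION_VALUE", default="")))
    return pairs


def analyze(token):
    """Hypothetical stand-in for the FST analyzer; replace with the real call."""
    return f"{token}+TAG"


def add_dependent_tier(root, parent_id, new_id, make_value, pattern=None):
    """Add a tier whose annotations refer to annotations on the parent tier.

    If `pattern` is given, only parent annotations whose text matches the
    regular expression receive a child annotation (the "search result" case).
    """
    parent = root.find(f"./TIER[@TIER_ID='{parent_id}']")
    tier = ET.Element("TIER", TIER_ID=new_id, PARENT_REF=parent_id,
                      LINGUISTIC_TYPE_REF="analysisT")  # hypothetical type name
    for n, (ann_id, text) in enumerate(read_tier(root, parent_id), start=1):
        if pattern is not None and not re.search(pattern, text):
            continue
        wrapper = ET.SubElement(tier, "ANNOTATION")
        ref = ET.SubElement(wrapper, "REF_ANNOTATION",
                            ANNOTATION_ID=f"{new_id}-{n}", ANNOTATION_REF=ann_id)
        ET.SubElement(ref, "ANNOTATION_VALUE").text = make_value(text)
    # Place the new tier directly after its parent to keep tiers together.
    root.insert(list(root).index(parent) + 1, tier)
    return tier


root = ET.fromstring(EAF)
# ELAN -> analyzer -> ELAN: every word receives an analysis on a new tier.
add_dependent_tier(root, "word", "word-analysis", analyze)
# Search-based tier: only words matching the pattern receive an annotation.
add_dependent_tier(root, "word", "word-match", lambda text: "match", pattern=r"^f")
```

Before a file produced this way can be opened in ELAN, the referenced linguistic type has to be defined in the document, and the annotation IDs have to be unique across the whole file. The sketch does not handle either point.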