Annotating tone #92

iljackb · 2019-11-03T13:59:24Z

By default I annotate the orthography. Because many Mixtec forms don't explicitly mark certain features, the annotations are underspecified as to what expresses a given feature. E.g., in the example below the verb "sketa" is present tense and 1sg, neither of which shows up in the orthography, but the entire form is simply tagged for those features:

            <u who="#TS" xml:id="d1e112" n="2" start="1.48" end="2.98" xml:lang="mix">
               <seg xml:lang="mix" xml:id="d1e113" notation="orth" type="S">
                  <w xml:id="d1e114" synch="#T14">sketa</w>
                  <w xml:id="d1e116" synch="#T19">ntikii</w>
               </seg>
               ......
            </u>
            <spanGrp type="annotations">
                ....
               <span type="translation" target="#d1e114" xml:lang="en" ana="#INFL">I run</span>
               <span type="translation" target="#d1e114" xml:lang="es" ana="#INFL">corro</span>
               <span type="gram" target="#d1e114" ana="#V #INTRANS #INCOMPL #1PERS #SG"/>
               .........
            </spanGrp>

If, however, a phonetic transcription is included, I tag the orthographic forms (as above) and also explicitly tag the tone contours (encoded as <m> with @xml:id's), which ties each feature to the element that expresses it:

            <u who="#TS" xml:id="d1e112" n="2" start="1.48" end="2.98" xml:lang="mix">
               <seg xml:lang="mix" xml:id="d1e113" notation="orth" type="S">
                  <w xml:id="d1e114" synch="#T14">sketa</w>
                  <w xml:id="d1e116" synch="#T19">ntikii</w>
               </seg>
               <seg xml:lang="mix" xml:id="d1e118" notation="ipa" type="S" sameAs="#d1e113">
                  <w xml:id="d1e119" synch="#T14" sameAs="#d1e114">skɛ<m xml:id="d1e225">˥</m>t̪a<m xml:id="d1e120">↘</m></w>
                  <w xml:id="d1e132" synch="#T19" sameAs="#d1e116">nd̪i↘kiː↘↗ꜛ</w>
               </seg>
            </u>
            <spanGrp type="annotations">
                 ....
               <span type="translation" target="#d1e114" xml:lang="en" ana="#INFL">I run</span>
               <span type="translation" target="#d1e114" xml:lang="es" ana="#INFL">corro</span>
               <span type="gram" target="#d1e114" ana="#V #INTRANS #INCOMPL #1PERS #SG"/>
               <span type="gram" target="#d1e225" ana="#INCOMPL"/>
               <span type="gram" target="#d1e120" ana="#1PERS #SG"/>
                 ....
            </spanGrp>

However, I'm not sure what value of @type to give it. I'm currently labeling it "gram", the same as the general grammatical annotations, but I'm wondering if I should call it "tone" or something similar, so that a retrieval script can just look for that @type value rather than checking whether the target is an <m> that is a descendant of //seg[@notation='ipa'].
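To make the cost of the structural check concrete, here is a minimal Python sketch of what a retrieval script would have to do if tone annotations stay as plain type="gram". It is only an illustration: the sample document is a hypothetical, cut-down version of the example above, the TEI namespace is left out for brevity, and the function name `tone_spans_by_structure` is made up.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, cut-down sample modelled on the example above
# (no TEI namespace, only one <m>).
SAMPLE = """
<div>
  <u>
    <seg notation="orth" xml:id="d1e113">
      <w xml:id="d1e114">sketa</w>
    </seg>
    <seg notation="ipa" xml:id="d1e118">
      <w xml:id="d1e119">sketa<m xml:id="d1e120">&#x2198;</m></w>
    </seg>
  </u>
  <spanGrp type="annotations">
    <span type="gram" target="#d1e114" ana="#V #INTRANS #INCOMPL #1PERS #SG"/>
    <span type="gram" target="#d1e120" ana="#1PERS #SG"/>
  </spanGrp>
</div>
"""


def tone_spans_by_structure(root):
    """Return type="gram" spans whose target is an <m> inside an IPA <seg>."""
    # Collect the xml:id of every <m> that sits under seg[@notation='ipa'].
    ipa_m_ids = {
        m.get(XML_ID)
        for seg in root.iter("seg")
        if seg.get("notation") == "ipa"
        for m in seg.iter("m")
    }
    hits = []
    for span in root.iter("span"):
        if span.get("type") != "gram":
            continue
        # @target may hold several space-separated pointers of the form "#id".
        targets = [t.lstrip("#") for t in span.get("target", "").split()]
        if any(t in ipa_m_ids for t in targets):
            hits.append(span)
    return hits


root = ET.fromstring(SAMPLE)
tone_spans = tone_spans_by_structure(root)
```

Note that this check matches any <m> in the IPA tier, whether or not it encodes tone, which is another argument for an explicit label on the annotation itself.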

@laurent, what do you think?


iljackb · 2019-11-04T15:52:32Z

The solution is to use @subtype; this requires a schema alteration so that @subtype is added to att.typed.

I am thinking that there should be at least two possible values of @subtype: "tone" (for the case discussed above in this issue) and possibly "morph", for pointing to a morphological unit in an inflected, or maybe derived, form.

Here is an example showing both uses of @subtype, to tag:

the presence of the future/potentive prefix "kun-" (realized phonetically as "ũː↗↘") in front of the verb, which is tagged only in the phonetic transcription (annotated below with subtype="morph"),
and
the presence of the tone inflection marking 1st person singular on the verb, which isn't marked in the orthography (annotated below with subtype="tone"):

               <seg xml:lang="mix" xml:id="d1e41" notation="orth" type="phrase">
                  <w xml:id="d1e42" synch="#T2">kunkanta</w>
               </seg>
               <seg xml:lang="mix" xml:id="d1e46" notation="ipa" type="phrase" sameAs="#d1e41">
                  <w xml:id="d1e47" synch="#T1" sameAs="#d1e42"><m xml:id="d1e157">ũː↗↘</m>k̬a˩nd̪a<m xml:id="d1e172">˩</m></w>
               </seg>
            </u>
            <spanGrp type="annotations">
               <span type="translation" target="#d1e42" xml:lang="en" ana="#INFL">I will jump</span>
               <span type="translation" target="#d1e42" xml:lang="es" ana="#INFL">saltaré</span>
               <span type="translation" target="#d1e42" xml:lang="es" ana="#INFL">voy a saltar</span>
               <span type="gram" target="#d1e42" ana="#V #INTRANS #FUT #1PERS #SG">
                  <gloss type="igt">fut- jump\1s</gloss>
               </span>
               <span type="gram" subtype="morph" target="#d1e157" ana="#FUT"/>
               <span type="gram" subtype="tone" target="#d1e172" ana="#1PERS #SG"/>
            </spanGrp>

Note that (in relation to issue #93), the <gloss type="igt"> will still be placed only in the <span>s annotating the orthographic content.
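With @subtype in place, retrieval reduces to a single attribute test. The sketch below assumes the annotation spans from the example above, copied into a small stand-alone sample without the TEI namespace; the helper name `spans_by_subtype` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Annotation spans copied from the example above (TEI namespace omitted).
SAMPLE = """
<spanGrp type="annotations">
  <span type="gram" target="#d1e42" ana="#V #INTRANS #FUT #1PERS #SG"/>
  <span type="gram" subtype="morph" target="#d1e157" ana="#FUT"/>
  <span type="gram" subtype="tone" target="#d1e172" ana="#1PERS #SG"/>
</spanGrp>
"""


def spans_by_subtype(root, subtype):
    """Return all type="gram" spans carrying the given @subtype value."""
    return root.findall(f".//span[@type='gram'][@subtype='{subtype}']")


root = ET.fromstring(SAMPLE)
tone = spans_by_subtype(root, "tone")
morph = spans_by_subtype(root, "morph")

# Map each tone target to the features it expresses.
tone_features = {s.get("target"): s.get("ana").split() for s in tone}
```

The general span on the orthographic word has no @subtype, so it is skipped by both queries and the script never needs to inspect what the target points to.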

iljackb added help wanted to-do linguistic issues pertaining to linguistic description labels Nov 3, 2019
