treebank: option to show all inflections, highlighting the one from the tree #540

Closed
balmas opened this issue Oct 14, 2020 · 12 comments

Member

balmas commented Oct 14, 2020

What Alpheios does is disambiguate the lexemes identified by the parser with the lexeme in the treebank.

For the lexemes themselves (i.e. the distinct lemma entries, as far as we can determine them) it identifies which lexeme is the one chosen by the treebank with the little red triangle, but still shows the other lexemes.

However, for the inflections within the disambiguated lexeme, it filters the inflections to show only the one that was chosen by the treebank.

@vgorman1 requests that we have the possibility to show the various possible inflections and mark the one in the tree with a similar triangle.

balmas added the enhancement and treebank labels Oct 14, 2020
balmas self-assigned this Dec 9, 2020
balmas pushed a commit that referenced this issue Jan 5, 2021
keep  all inflections, display disambiguated inflection separately
balmas pushed a commit that referenced this issue Jan 5, 2021
added classes to inflection sets in morph for display flexibility
balmas pushed a commit that referenced this issue Jan 6, 2021
@balmas balmas mentioned this issue Jan 6, 2021
Member Author

balmas commented Jan 6, 2021

This is fixed in Alpheios Components 3.3.1-qa.20210106574.

For now, the default has been changed to always display all inflections, and to highlight the selected inflection with the same icons we use for the lemma:

[Screenshot]

It's not perfect -- when Morpheus identifies more than one possible inflection, those are displayed below the selected one from the tree, and that list includes (again) the one from the tree. Ideally, I would like to dedupe the inflection from the tree out of the list of inflections identified by Morpheus, but that is too big a change for me to make right now.
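
For illustration only, the dedupe described above could look roughly like the sketch below. It is not the Alpheios implementation: the `Inflection` shape and the names `inflectionKey` and `mergeInflections` are hypothetical, and it assumes two inflections count as "the same" only when every feature matches exactly.

```typescript
// Hypothetical, simplified shape; the real Alpheios data model is richer.
interface Inflection {
  partOfSpeech: string;
  features: Record<string, string>; // e.g. { case: "ablative", number: "plural" }
}

// Build an order-independent key so identical feature sets compare equal.
function inflectionKey(infl: Inflection): string {
  const feats = Object.keys(infl.features)
    .sort()
    .map((name) => `${name}=${infl.features[name]}`)
    .join("|");
  return `${infl.partOfSpeech}|${feats}`;
}

// List the treebank's choice first (flagged as selected) and drop any
// parser inflection that is identical to it, so it appears only once.
function mergeInflections(
  fromTreebank: Inflection,
  fromParser: Inflection[]
): { inflection: Inflection; selected: boolean }[] {
  const selectedKey = inflectionKey(fromTreebank);
  const others = fromParser.filter(
    (infl) => inflectionKey(infl) !== selectedKey
  );
  return [
    { inflection: fromTreebank, selected: true },
    ...others.map((inflection) => ({ inflection, selected: false })),
  ];
}
```

Exact matching is the simplest choice, but it would not catch cases where the two sources describe the same inflection with different feature values.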

Still to be determined is whether we want this changed default behavior to apply to the treebanked texts at texts.alpheios.net -- need to verify that with @abrasax - but it's ready to test otherwise.

You can use treebanked texts at texts-test.alpheios.net to test as well as the treebank-test page at https://alpheios-misc-dev.s3.us-east-2.amazonaws.com/treebank-test-page/test.html

balmas closed this as completed Jan 6, 2021
balmas assigned monzug and unassigned balmas Jan 6, 2021
Contributor

monzug commented Jan 7, 2021

I really like this enhancement!!!

Contributor

monzug commented Jan 11, 2021

In the case of nullis (first sentence of https://texts-test.alpheios.net/text/urn:cts:latinLit:phi0620.phi001.alpheios-text-lat1/passage/1.1), we are adding the inflection again. See the screenshots.
As Bridget said: "It's not perfect -- when there are more than 1 possible inflection identified by morpheus, those are displayed below the selected one from the tree, including (again) the one from the tree. Ideally, I would like to dedupe the inflection from the tree out of the list of the inflections identified by morpheus, but that is too big of a change for me to make right now."

The texts-test version vs. the live one:
[Screenshot: nullis-cynthia]

[Screenshot: Screen Shot 2021-01-11 at 3 41 38 PM]

Contributor

monzug commented Jan 11, 2021

In the case of fugiendo (line 9), we are adding too much, I believe.

[Screenshot: fugiendo]

This is the line from the treebank XML file:
<word id="3" form="fugiendo" lemma="fugio1" postag="v-spgpnb-" head="10" relation="ADV"/>

So my question is: are we OK adding the pres. pass. inflection to a verb that is only fut. pass.?
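
To make the `postag` value in that line easier to read, here is a small decoding sketch. It assumes the nine-position layout of the Latin Dependency Treebank tagset (part of speech, person, number, tense, mood, voice, gender, case, degree). The value tables are abridged, and `POSTAG_POSITIONS` and `decodePostag` are hypothetical names, not part of Alpheios.

```typescript
// Assumed positional layout of the 9-character postag; value tables abridged.
const POSTAG_POSITIONS: { name: string; values: Record<string, string> }[] = [
  { name: "partOfSpeech", values: { n: "noun", v: "verb", a: "adjective", d: "adverb", p: "pronoun" } },
  { name: "person", values: { "1": "first", "2": "second", "3": "third" } },
  { name: "number", values: { s: "singular", p: "plural" } },
  { name: "tense", values: { p: "present", i: "imperfect", r: "perfect", l: "pluperfect", t: "future perfect", f: "future" } },
  { name: "mood", values: { i: "indicative", s: "subjunctive", n: "infinitive", m: "imperative", p: "participle", d: "gerund", g: "gerundive", u: "supine" } },
  { name: "voice", values: { a: "active", p: "passive" } },
  { name: "gender", values: { m: "masculine", f: "feminine", n: "neuter" } },
  { name: "case", values: { n: "nominative", g: "genitive", d: "dative", a: "accusative", b: "ablative", v: "vocative", l: "locative" } },
  { name: "degree", values: { c: "comparative", s: "superlative" } },
];

// Decode each position, skipping "-" (feature not specified).
// Codes missing from the abridged tables are passed through unchanged.
function decodePostag(postag: string): Record<string, string> {
  const decoded: Record<string, string> = {};
  POSTAG_POSITIONS.forEach((position, index) => {
    const code = postag.charAt(index);
    if (code !== "" && code !== "-") {
      decoded[position.name] = position.values[code] ?? code;
    }
  });
  return decoded;
}
```

Under these assumptions, `decodePostag("v-spgpnb-")` reads as verb, singular, present, gerundive, passive, neuter, ablative, which is where the present passive in the popup comes from.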

monzug assigned balmas and unassigned monzug Jan 11, 2021
monzug added the question label and removed the waiting_verification label Jan 11, 2021
Member Author

balmas commented Jan 11, 2021

Well, our requirements right now call for us to treat the treebank data, which is manually annotated, as more correct than the parser output. So the code is doing the expected thing here -- it's adding the present passive inflection provided by the treebank to the form, and saying that's the "correct" one.

Contributor

monzug commented Jan 11, 2021 via email

Member Author

balmas commented Jan 11, 2021

When you say "form is in morphology", do you mean checking that the form is one that was returned by the morphology parser (in this case Whitaker)?

If so, the problem is that we don't have a way to tell which source is right -- i.e. the parser or the treebank. Ultimately this is why we need active annotation support in Alpheios, which is the subject of our next major release (see the lengthy ongoing discussions of the design to support that at alpheios-project/documentation#40).


vgorman1 commented Jan 11, 2021 via email

Member Author

balmas commented Jan 11, 2021

Another interesting point here, which @rgorman helped clarify -- the treebank reports this as mood=gerundive. The Whitaker parser used by Alpheios reports all gerunds as verb participles (see alpheios-project/morphsvc#11).

I think perhaps we should make a change to the Alpheios treebank adapter to consider a Latin verb with mood=gerundive as being the same as a verb participle. That way we will at least be comparing apples to apples. We'd still have a disconnect here, because the tense in the treebank is present. But we would at least be matching the part of speech.
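
A rough sketch of what that adapter change might look like, purely as an illustration. The `Morph` shape, the `"lat"` language code, the `"verb participle"` label, and the function name `normalizeTreebankMorph` are all assumptions here, not the actual Alpheios treebank adapter API.

```typescript
// Hypothetical, simplified morphology record.
interface Morph {
  partOfSpeech: string;
  mood?: string;
  tense?: string;
  voice?: string;
}

// Treat a Latin treebank verb with mood=gerundive as a verb participle so
// that its part of speech lines up with what the parser reports.
// Other features (including tense) are deliberately left untouched.
function normalizeTreebankMorph(morph: Morph, languageCode: string): Morph {
  const isLatinGerundive =
    languageCode === "lat" &&
    morph.partOfSpeech === "verb" &&
    morph.mood === "gerundive";
  return isLatinGerundive
    ? { ...morph, partOfSpeech: "verb participle" }
    : morph;
}
```

This only aligns the part of speech. As noted above, the tense reported by the treebank (present) would still differ from the parser's.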

Contributor

monzug commented Jan 12, 2021

Gerundive verbs have been haunting me forever! Issue #608 is very welcome.

Member Author

balmas commented Jan 12, 2021

With regard to #540 (comment), there are actually two issues here, one of which I didn't see at first.

(1)
Morphology Service says the lemma has inflection A and inflection B
Treebank says it is inflection B
We should show inflection B only once in the popup.

Originally I thought that was what was going on with nullis. That has to wait for the changes we're making to support annotations. However, looking more closely at the output, I realize that is actually a different scenario:

(2)
Morphology service says inflection A m, f, n
Treebank service says inflection A f
We should recognize that inflection A f is not different from inflection A m, f, n

This scenario is more similar to #608 and #609. Will enter a new issue for it.
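
Scenario (2) amounts to a subset check rather than an equality check. A minimal sketch follows, assuming each feature holds a list of possible values. The `MultiValueInflection` type and the `covers` function are hypothetical names, not existing Alpheios code.

```typescript
// Hypothetical shape: each feature maps to the values it may take,
// e.g. gender: ["masculine", "feminine", "neuter"].
interface MultiValueInflection {
  features: Record<string, string[]>;
}

// True when every feature value chosen by the treebank is among the values
// the parser allows for that feature. The parser inflection then "covers"
// the treebank one, and the two need not be displayed separately.
function covers(
  fromParser: MultiValueInflection,
  fromTreebank: MultiValueInflection
): boolean {
  return Object.keys(fromTreebank.features).every((name) => {
    const parserValues = fromParser.features[name];
    if (parserValues === undefined) {
      return false; // parser says nothing about this feature
    }
    return fromTreebank.features[name].every(
      (value) => parserValues.indexOf(value) !== -1
    );
  });
}
```

With this check, a parser inflection with gender m, f, n would cover a treebank inflection with gender f, provided the remaining features agree.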

monzug added the verified label and removed the question label Jan 12, 2021
Contributor

monzug commented Jan 12, 2021

Verified. All comments have been addressed in separate issues.


3 participants