This repository has been archived by the owner on May 7, 2021. It is now read-only.

strip header, extra attributes from published treebank files #19

Open
balmas opened this issue May 2, 2016 · 0 comments

balmas commented May 2, 2016

From alpheios-project/arethusa#748:

When I try to import a treebank file from the Perseus Latin repository into Arethusa (e.g. phi1221.phi007.perseus-lat1.tb.xml from https://github.com/PerseusDL/treebank_data/blob/master/v2.1/Latin/texts/phi1221.phi007.perseus-lat1.tb.xml) using the "Upload Base XML Treebank / from file" button, I get the message: ERROR!! CHANGES NOT SAVED! errorunexpected attribute "oldId". When I change the file by removing the header element (with all its children) and body, and putting in the annotator element from one of my exported treebank annotations, the file is read and displayed OK.
Users often want to review or change already annotated trees from the "gold standard", and Arethusa is the logical environment for doing so. Perhaps this could be achieved with an XSL or XQuery stylesheet layer that, on import, strips out of the base XML treebank file everything above the sentence element and adds a treebank element that is acceptable to Arethusa.
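
A rough sketch of the stripping step described above, written in Python rather than XSL or XQuery purely for illustration. It assumes the file has no XML namespace on its elements, that a bare `treebank` root is enough for the importer (the attributes Arethusa actually requires are not stated here), and that `oldId` is the only offending attribute, since it is the only one the error message names. The function name `strip_treebank`, the `EXTRA_ATTRIBUTES` set, and the sample document are all hypothetical. Adding an annotator element is left out.

```python
import xml.etree.ElementTree as ET

# Assumption: only "oldId" is named in the error report; extend as needed.
EXTRA_ATTRIBUTES = {"oldId"}


def strip_treebank(xml_text, extra_attributes=EXTRA_ATTRIBUTES):
    """Keep only the <sentence> elements, re-wrapped in a fresh <treebank> root,
    and drop the listed attributes from every remaining element."""
    source_root = ET.fromstring(xml_text)

    # Hypothetical minimal root; real files may need version/format attributes.
    new_root = ET.Element("treebank")

    # Collect sentences wherever they sit, so a wrapper element above them
    # (header siblings, a body wrapper) is discarded.
    for sentence in list(source_root.iter("sentence")):
        new_root.append(sentence)

    # Remove the unwanted attributes from every element that was kept.
    for element in new_root.iter():
        for name in extra_attributes:
            element.attrib.pop(name, None)

    return ET.tostring(new_root, encoding="unicode")


# Made-up input for illustration; not taken from the Perseus repository.
sample = """<treebank>
  <header><note>release information</note></header>
  <body>
    <sentence id="1">
      <word id="1" form="arma" oldId="7"/>
    </sentence>
  </body>
</treebank>"""

cleaned = strip_treebank(sample)
print(cleaned)
```

The same filtering could equally be expressed as an XSL identity transform with templates that suppress the header and the extra attributes; the Python version is only meant to make the intended result concrete.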

@balmas balmas self-assigned this May 2, 2016