Releases: bowersd/otw
Initial suite of analyzers for Nishnaabemwin
This is a set of six finite-state transducers for analyzing Ojibwe, covering both the syncopated Nishnaabemwin dialects and the unsyncopated Anishinaabemowin dialects. Most of the development focus has been on the Nishnaabemwin transducers; the unsyncopated Anishinaabemowin transducers are created by simply turning off the syncope rules and recompiling.
The six transducers read the language and output narrow grammatical tags. See https://bowersd.github.io/textAnalysis/explanation.html for descriptions of the narrow grammatical tags. The analyzers differ in the dialects and spelling conventions they expect, and in how strictly they hold the input to those conventions. Specifically, there are:
- Syncopated/vowel deletion analyzers
  - Base "Rhodes-style" analyzer (reflecting the spelling conventions used in the 1985 dictionary by Richard Rhodes)
  - Relaxed "Rhodes-style" analyzer (relaxing the spelling conventions used in the Rhodes dictionary)
  - Base "Corbiere-style" analyzer (reflecting the spelling conventions popularized by Mary Ann Corbiere, and used in the Nishnaabemwin Online Dictionary)
  - Relaxed "Corbiere-style" analyzer
- Unsyncopated/full vowel analyzers
  - Base "Unsyncopated" analyzer (primarily lacking the syncope/vowel deletion rule, as in the texts of Angeline Williams and many Western Ojibwe dialects)
  - Relaxed "Unsyncopated" analyzer (this relaxes the spelling conventions in a way that is most intuitive for syncopating dialects, and so may be of marginal usefulness)
The finite-state transducers are compiled in the HFST optimized lookup format (hfstol). As such, they can be used with the HFST command line tools, or in scripts using the HFST Python API. They can also be used with PyHFST (which does not use plain hfst files). They cannot be inverted; inverted versions will be added in later releases.
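As a rough illustration of the command line route, the sketch below pipes word forms through `hfst-optimized-lookup` from Python and groups the analyses by input form. It rests on several assumptions that are not part of this release: that the tool is installed and on `PATH`, that its output is tab-separated with the input form in the first column and the analysis in the second, and that unrecognized forms are marked with `+?`. The function names `parse_lookup_output` and `analyze` are hypothetical helpers, and the transducer path is whatever file you downloaded from the release.

```python
import shutil
import subprocess


def parse_lookup_output(text):
    """Group tab-separated lookup lines as {input form: [analysis, ...]}.

    Assumed layout: column 1 is the input form, column 2 is one analysis,
    and an optional column 3 holds a weight. Unrecognized forms are assumed
    to be flagged with "+?" and are recorded with an empty list.
    """
    analyses = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank separator lines between input forms
        form, analysis = fields[0], fields[1]
        readings = analyses.setdefault(form, [])
        unknown = analysis.endswith("+?") or (len(fields) > 2 and fields[2] == "+?")
        if not unknown:
            readings.append(analysis)
    return analyses


def analyze(words, transducer_path, tool="hfst-optimized-lookup"):
    """Run the lookup tool on a list of word forms, one form per input line."""
    if shutil.which(tool) is None:
        raise RuntimeError(f"{tool} was not found on PATH; install the HFST tools first")
    result = subprocess.run(
        [tool, transducer_path],
        input="\n".join(words) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_lookup_output(result.stdout)
```

Calling `analyze(words, "path/to/analyzer.hfstol")` with a list of word forms would then return a dictionary in which a form the analyzer does not recognize maps to an empty list. That makes it easy to try a base analyzer first and fall back to the matching relaxed analyzer only for the forms that came back empty.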