Skip to content

Latest commit

 

History

History
60 lines (46 loc) · 1.67 KB

CHANGELOG.md

File metadata and controls

60 lines (46 loc) · 1.67 KB

Change log

All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning. It follows some conventions.

[1.7.1] - 20221209

Fixed

  • fixed analysis of ewts string starting with space returning no token

[1.5.0] - 20190614

Added

  • added PaBaFilter and use it in the constructor

Changed

  • remove bo from the list of stop words to avoid side effects with the PaBaFilter

[1.4.4] - 20181203

Fixed

  • fixed fetching of jar file

[1.4.3] - 20181023

Fixed

  • fixed offsets of ewts conversion
  • fixed resource inclusion

Added

  • possibility to index DTS and ALALC encodings

[1.2.0] - 20180118

Fixed

  • fixed offsets of ewts conversion

[1.2.0] - 20180118

Fixed

  • fixed offsets of ewts conversion

Added

  • Tibetan lexicon now included in the .jar file
  • maven option -DincludeDeps=true to include ewts-converter
  • possibility to specify a stop word list file in the constructor
  • fromEwts replaced by inputMode in constructor, see README.md, allowing DTS and ALA-LC Transliteration schemas

[1.1.1] - 2017-09-20

Fixed

  • fixed constructor

[1.1.0] - 2017-09-20

Added

  • possibility to index EWTS text

[1.0.0] - 2017-04-14

Added

  • Maven packaging
  • TibAffixedFilter: handle affixed འིའོ, འམ and འང

Fixed

  • TibAffixedFilter: when removing an affixed particle, keep the suffix འ if it was in the original syllable (ex: དགའི -> དགའ)

Changed

  • all: adaptation to Lucene 6.4.1
  • TibSyllableTokenizer: consider characters in range Ux0F84 - Ux0F8F to be part of the syllable