
1.3.3: Custom Word Tokenizer

@SoluMilken SoluMilken released this 13 Mar 06:47
· 106 commits to master since this release

Add a new operation: CustomWordTokenizer

This operation tokenizes the input string according to a user-supplied list of words.
Any substring of the input that matches one of the user words is kept as a single token.
All remaining text is split into individual characters.
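
The behaviour described above can be illustrated with a minimal Python sketch. It is not the library's implementation or API: the function name `tokenize_with_user_words` and its parameters are hypothetical, and the choice to scan left to right and prefer the longest matching user word is an assumption, since the release note does not say how overlapping matches are resolved.

```python
from typing import Iterable, List


def tokenize_with_user_words(text: str, user_words: Iterable[str]) -> List[str]:
    """Hypothetical helper: keep user words whole, split the rest into characters."""
    # Assumption: longer user words take priority over shorter ones.
    words = sorted({w for w in user_words if w}, key=len, reverse=True)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for word in words:
            # Does a user word start at the current position?
            if text.startswith(word, i):
                tokens.append(word)
                i += len(word)
                break
        else:
            # No user word matched here, so emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize_with_user_words("I like apple", ["like", "apple"]))
# ['I', ' ', 'like', ' ', 'apple']
```

In this sketch whitespace is treated like any other character, so spaces appear as their own tokens; the actual operation may handle them differently.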