v2.7.0-rc1
Pre-release
Release 2.7.0-rc1
Major Features and Improvements
- Added a new tokenizer, FastWordpieceTokenizer, which is considerably faster than the original WordpieceTokenizer (see the usage sketch after this list)
- WhitespaceTokenizer was rewritten to increase speed and reduce kernel size
- Ability to convert WhitespaceTokenizer & FastWordpieceTokenizer to TF Lite (a conversion sketch follows the changes list below)
- Added Keras layers for tokenizers: UnicodeScript, Whitespace, & Wordpiece
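As a rough illustration of the new tokenizer, the sketch below runs FastWordpieceTokenizer alongside WhitespaceTokenizer. The vocabulary and the expected outputs are illustrative assumptions, not taken from this release; real models would load their own wordpiece vocab.

```python
import tensorflow as tf
import tensorflow_text as tf_text

# Toy vocabulary for illustration only; real models load their own wordpiece vocab.
vocab = ["the", "great", "##est", "token", "##izer", "[UNK]"]

fast_wp = tf_text.FastWordpieceTokenizer(vocab=vocab, token_out_type=tf.string)
ws = tf_text.WhitespaceTokenizer()

sentences = tf.constant(["the greatest tokenizer"])

# Whitespace splitting returns a RaggedTensor of tokens per sentence.
print(ws.tokenize(sentences).to_list())
# e.g. [[b'the', b'greatest', b'tokenizer']]

# FastWordpieceTokenizer splits further into subword pieces using the vocab.
print(fast_wp.tokenize(sentences).to_list())
# e.g. [[b'the', b'great', b'##est', b'token', b'##izer']]
```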
Bug Fixes and Other Changes
- (Generated change) Update tf.Text versions and/or docs.
- Tiny change to a variable name in the transformer tutorial
- Update nmt_with_attention.ipynb
- Add vocab_size for the wordpiece tokenizer for consistency with sentencepiece.
- General cleanup of the build files. The previous tf_deps paradigm was confusing; encapsulating everything into a single call lib should make the build easier to understand and follow.
- This adds the builder for the new WhitespaceTokenizer config cache. This is the first in a series of changes to update the WST for mobile.
- C++ API for the new WhitespaceTokenizer. The updated API is more useful (it accepts strings instead of ints), faster, and smaller in size.
- Adds pywrap for WhitespaceTokenizer config builder.
- Simplify configure.bzl. Since we build with C++14 on every platform, default to it across the board; this is easier to understand and maintain.
- Remove most of the default oss deps for kernels as they are no longer required for building.
- Update the BERT tutorial to use model subclassing (easier for students to hack on).
- Adds kernels for TF & TFLite for the new WhitespaceTokenizer.
- Fix a problem with the WST template that was causing members to be exported as undefined symbols. After this change, they become unique global symbols in the shared object file.
- Update the whitespace op to use the new kernel. The old kernel can still be built so current users can continue to use it, even though new calls to it cannot be made.
- Convert the TFLite kernel for ngram with STRING_JOIN mode to use tfshim so the same code is now used for TF and TFLite kernels.
- fix: masked_ids -> masked_lm_ids
- Save the transformer.
- Remove the sentencepiece patch in OSS
- Fix: the vocab_table arg was not used in bert_pretrain_preprocess()
- Disable TSAN for one more tutorial test that may run for >900 sec when TSAN is enabled.
- Internal change
- Update deps to fix broken build.
- Remove --gen_report flag.
- Fix a small typo
- Explain that all heads are handled with a single Dense layer
- internal change, should be a noop in github.
- Creates the TF Lite registrar and adds TF Lite tests for mobile ops.
- Fix nmt_with_attention start_index
- Export LD_LIBRARY_PATH when configuring for build.
- Update the TF Lite test to use the function directly rather than globally sharing the linked library symbols so the interpreter can find the name, since that approach is only available on Linux.
- Temporarily switch to the definition of REGISTER_TF_OP_SHIM while it is being updated.
- Update REGISTER_TF_OP_SHIM macro to remove unnecessary parameter.
- Remove temporary code and set back to using the op shim macro.
- Updated import statement
- Internal change
- Pushed back the forward compatibility date for tf_text.WhitespaceTokenizer.
- Add .gitignore
- The --keep_going flag will make bazel run all tests instead of stopping at the first failure.
- Add missing blank line between test and doctest.
- Adds a regression test for model server for the replaced WST op. This ensures that current models using the old kernel will continue to work.
- Fix the build by adding a new dependency required by TF to kernel targets.
- Add sentencepiece detokenize op to stateful allowlist.
- Fix broken build. This occurred because of a change on TF that updated the compiler infra version (tensorflow/tensorflow@e0940f2).
- Clean up code now that the build horizon has passed.
- Add pywrap dependency for tflite ops.
- Update TextVectorization layer
- Allows overridden get_selectable to be used.
- Update version
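For the TF Lite support mentioned in the feature list, a minimal conversion sketch might look like the following. This is an assumption-laden illustration, not a documented recipe from this release: the TokenizerModel wrapper is hypothetical, and running the converted model additionally requires an interpreter built with the TF Text TF Lite registrar, which is not shown here.

```python
import tensorflow as tf
import tensorflow_text as tf_text

class TokenizerModel(tf.Module):
  """Hypothetical wrapper exposing WhitespaceTokenizer as a concrete function."""

  def __init__(self):
    super().__init__()
    self.tokenizer = tf_text.WhitespaceTokenizer()

  @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
  def tokenize(self, text):
    # Return a dense tensor; TF Lite signatures cannot return RaggedTensors directly.
    return self.tokenizer.tokenize(text).to_tensor()

model = TokenizerModel()
concrete_fn = model.tokenize.get_concrete_function()

converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn], model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                       tf.lite.OpsSet.SELECT_TF_OPS]
# The tokenizer ships as a TF Lite custom op, so custom ops must be allowed;
# at inference time the interpreter must link in the TF Text registrar.
converter.allow_custom_ops = True
tflite_model = converter.convert()
```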
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Aaron Siddhartha Mondal, Abhijeet Manhas, Dominik Schlösser, jaymessina3, Mao, Xiaoquan Kong, Yasir Modak