v2.12.0
Release 2.12.0
Major Features and Improvements
- New PhraseTokenizer.
- New ByteSplitter.split_by_offsets which splits a string using byte offsets.
- New concatenate_segments op.
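To illustrate what splitting a string by byte offsets means, here is a minimal plain-Python sketch. It is not the tf.Text implementation: the helper name `split_by_byte_offsets` and its arguments are hypothetical, and it assumes the offsets index into the UTF-8 encoding with exclusive end offsets. The actual `ByteSplitter.split_by_offsets` operates on tensors, so check its signature in the API docs.

```python
def split_by_byte_offsets(text, starts, ends):
    """Return the substrings of `text` delimited by byte offsets.

    Hypothetical helper for illustration only. Offsets index into the
    UTF-8 encoding of `text`; each end offset is exclusive.
    """
    data = text.encode("utf-8")
    pieces = []
    for start, end in zip(starts, ends):
        # Slice the raw bytes, then decode each piece back to str.
        pieces.append(data[start:end].decode("utf-8"))
    return pieces


# ASCII input: byte offsets and character offsets coincide.
print(split_by_byte_offsets("hello world", [0, 6], [5, 11]))
```

For non-ASCII text, byte offsets and character offsets differ, so offsets must fall on character boundaries for each piece to decode cleanly.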
Bug Fixes and Other Changes
- Updated kernel code and Python API for BoiseTagsToOffsets op
- Fix a bug where the config was rebuilt in the create function.
- Register kernel and ops for phrase tokenizer.
- Fix a conversion issue.
- Fix typos in nmt_with_attention.ipynb
- MacOS TF library was renamed. Update build configuration.
- Update tokenization_layers_test.py
- (Generated change) Update tf.Text versions and/or docs.
- Update TF Text's TF Lite guide with ops that are convertible to TF Lite.
- Update transformer test size.
- Fix typos in uncertainty_quantification_with_sngp_bert.ipynb
- (Generated change) Update tf.Text versions and/or docs.
- Adds LastNItemSelector, an ItemSelector that selects the last n items in the batch.
- Temporarily remove tests for EOS offset since this is being changed in SP.
- Update test files for new ICU version.
- New helper function in the Op Kernel Shim for writing out data to the output tensors.
- Adds configuration flags to enable switching to the alternative Fast Wordpiece Tokenizer implementation for on-device use.
- New kernels to enable TF Lite conversion for SentenceFragmenterV2 op.
- Fix possible heap overflow bug in sentence fragmenter op.
- Deprecate PY37 support for TF-Text
- Fix BUILD file by moving tf dep in the appropriate place for FBN to prevent conflicts when building on mobile.
- Clean up a couple of dependencies in the kernel BUILD file.
- C++ API for the new RoundRobinTrimmer kernel, which fixes a bug and makes the trimmer available for conversion to TF Lite.
- New kernels for the RoundRobinTrimmer, which fix a bug and make it available for conversion to TF Lite.
- Add two functions to implementations of the OpKernelShim for accessing the name and doc string. Accessing internals directly causes problems when using techniques such as object composition as the op template. In particular, this change is needed for improvements to the polymorphic wrapper.
- Allow int32 or int64 as types for RoundRobinTrimmer ops' splits.
- Extend RoundRobinTrimmer kernels to allow any type as the value.
- Return empty results if get_offsets is false.
- Skip uncompressing Bazel to try to locate the error in the Mac CI tests.
- Fix scraping full commit from short commit sha
- Update tensorflow-text notebooks from 2.8 to 2.11
- Fix bazel version scraping logic for .bazelversion in install_bazel.sh
- Fix conditional so it works better with Apple silicon. See issue #1077 for more details.
- Force osname check to always be in lower-case. See #1077
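Several entries above concern the RoundRobinTrimmer. As a rough illustration of the round-robin idea only, the plain-Python sketch below hands out a length budget one item at a time across segments, left to right, until the budget runs out. The function name `round_robin_trim` and its parameters are hypothetical, and the real tf.Text op works on ragged tensors and may differ in details.

```python
def round_robin_trim(segments, max_seq_length):
    """Trim `segments` so their combined length fits `max_seq_length`.

    Hypothetical illustration: the budget is allocated one item at a
    time, cycling over the segments in order, until it is exhausted or
    every segment is fully kept.
    """
    kept = [0] * len(segments)  # number of items kept per segment
    budget = max_seq_length
    while budget > 0:
        progressed = False
        for i, seg in enumerate(segments):
            if budget == 0:
                break
            if kept[i] < len(seg):
                kept[i] += 1
                budget -= 1
                progressed = True
        if not progressed:
            # All segments are fully kept; leftover budget is unused.
            break
    return [seg[:n] for seg, n in zip(segments, kept)]


print(round_robin_trim([[1, 2, 3], [4, 5], [6]], 4))
```

With a budget of 4, the first pass keeps one item from each of the three segments and the second pass gives the remaining slot to the first segment.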
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
synandi, tilakrayal