Releases · tensorflow/text

08 Nov 20:22

cc52342

v2.0.0

Major Updates

Added a regex_split op.
Fixed a bug in the case_fold_utf8 and normalize_utf8 ops where they were unable to locate the ICU data file.
Fixed a problem in the BertTokenizer, which used merge_dims, an op not yet released in the corresponding version of TensorFlow.
Updated the BertTokenizer to use regex_split to match the exact regex used by original BERT.
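
To illustrate the kind of splitting a regex-based op performs, here is a minimal pure-Python sketch. It is not the TF.Text implementation: the function name regex_split_sketch and its parameters are hypothetical. It splits on a delimiter pattern and optionally keeps the delimiters that match a second pattern, which is the behavior BERT-style punctuation splitting needs.

```python
import re

def regex_split_sketch(text, delim_pattern, keep_delim_pattern=None):
    """Split text at delim_pattern matches (hypothetical helper).

    Delimiters that fully match keep_delim_pattern are kept as their
    own pieces; all other delimiters are dropped.
    """
    pieces = []
    pos = 0
    for match in re.finditer(delim_pattern, text):
        # Emit the text between the previous delimiter and this one.
        if match.start() > pos:
            pieces.append(text[pos:match.start()])
        delim = match.group(0)
        # Keep the delimiter only if it matches the keep pattern.
        if delim and keep_delim_pattern and re.fullmatch(keep_delim_pattern, delim):
            pieces.append(delim)
        pos = match.end()
    # Emit whatever follows the last delimiter.
    if pos < len(text):
        pieces.append(text[pos:])
    return pieces

# Split on whitespace and punctuation, keeping only the punctuation.
print(regex_split_sketch("hello, world!", r"\s|[,!]", r"[,!]"))
# → ['hello', ',', 'world', '!']
```

Keeping punctuation as separate pieces while discarding whitespace is what lets a tokenizer treat "world!" as two tokens.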

08 Nov 20:21

2885536

v1.15.0

Major Updates

Added a regex_split op.
Fixed a bug in the case_fold_utf8 and normalize_utf8 ops where they were unable to locate the ICU data file.
Fixed a problem in the BertTokenizer, which used merge_dims, an op not yet released in the corresponding version of TensorFlow.
Updated the BertTokenizer to use regex_split to match the exact regex used by original BERT.
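
The case_fold_utf8 and normalize_utf8 ops rely on ICU data. As a rough illustration of what case folding and normalization do to a string, the following sketch uses only Python's standard library rather than TF.Text. The helper names are hypothetical, and exact parity with the ICU-backed ops is an assumption.

```python
import unicodedata

def case_fold(text):
    # str.casefold applies Unicode case folding, which is more
    # aggressive than lower(): for example, "ß" becomes "ss".
    return text.casefold()

def normalize(text, form="NFKC"):
    # form is one of "NFC", "NFD", "NFKC", "NFKD".
    return unicodedata.normalize(form, text)

print(case_fold("Straße"))      # → strasse
print(normalize("\ufb01ne"))    # → fine (the "fi" ligature is expanded)
```

Folding and normalizing before vocabulary lookup makes visually equivalent strings compare equal.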

19 Oct 00:08

0bfb819

v2.0.0-rc0

Please note that, moving forward, our releases and branches will match the major and minor versions of core TensorFlow. This should prevent future confusion. As such, this release (previously 1.0) is 2.0, and we will be skipping straight to 1.15 for the next 1.x release to support TF 1.15.
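
The versioning policy above can be expressed as a small check. This is a minimal sketch with hypothetical helper names; it assumes version strings shaped like the tags on this page (an optional leading "v" and an optional suffix such as "-rc0").

```python
def major_minor(version):
    """Return (major, minor) from a string such as 'v2.0.0-rc0'."""
    core = version.lstrip("v").split("-")[0]   # 'v2.0.0-rc0' -> '2.0.0'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)

def matches_tensorflow(text_version, tf_version):
    # Under the policy, major and minor versions must agree.
    return major_minor(text_version) == major_minor(tf_version)

print(matches_tensorflow("v2.0.0-rc0", "2.0.0"))   # → True
print(matches_tensorflow("v1.15.0", "2.0.0"))      # → False
```

Releases that predate the policy, such as v0.1.0 (which targets TF 1.14), do not follow this rule.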

Major Updates:

SentencepieceTokenizer has been added. Please see https://github.com/google/sentencepiece for more information on Sentencepiece.
New ToDense Keras layer for RaggedTensor conversion
Pipeline for generating a Wordpiece Vocabulary has been added to tools.
New Rouge-L metric op for measuring text similarity. A new colab has been added to the examples directory which provides usage examples.
New BertTokenizer which mimics the preprocessing performed in the original BERT model.
New Detokenizer abstract class has been added to the TF.Text Tokenizer API.
Many previously released ops have been added to the TensorFlow Serving model server. Please see https://github.com/tensorflow/serving for more information.
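
For readers unfamiliar with Rouge-L, the metric is based on the longest common subsequence (LCS) between a hypothesis and a reference. The sketch below is plain Python, not the TF.Text op. The function names are hypothetical, and the F-measure shown is the balanced harmonic mean, which may differ from the weighting the op applies.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_sketch(hypothesis, reference):
    """Return (f1, precision, recall) based on LCS length."""
    lcs = lcs_length(hypothesis, reference)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    precision = lcs / len(hypothesis)
    recall = lcs / len(reference)
    f1 = 2 * precision * recall / (precision + recall)
    return f1, precision, recall

hyp = ["the", "cat", "sat"]
ref = ["the", "cat", "sat", "down"]
print(rouge_l_sketch(hyp, ref))
```

Here every hypothesis token appears in order in the reference, so precision is 1.0 while recall is 3/4.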

Minor Updates:

API docs have received an update that should make finding relevant information easier.
Wordpiece: Add support for splitting unknown characters
Wordpiece: Add support for max characters per token
Wordshape: Fix finding of currency symbols.
Update Whitespace & UnicodeScript Tokenizers to accept scalar values.
Build includes CC library targets. Useful for statically linking in TF.Text custom ops. Specifically useful for building into TF.Serving's model server.
Build environment: Updated to match core TF's update.
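
To make the Wordpiece items above concrete, here is a minimal sketch of greedy longest-match-first subword splitting with a cap on characters per token. It is plain Python, not the TF.Text op. The function name, the parameter names, the "##" suffix marker, and the "[UNK]" token are assumptions borrowed from common BERT conventions, and the option for splitting unknown characters is not modeled.

```python
def wordpiece_sketch(word, vocab, unknown_token="[UNK]",
                     suffix_indicator="##", max_chars_per_token=100):
    """Greedily split one word into the longest matching vocab pieces."""
    # Overlong words are mapped straight to the unknown token.
    if len(word) > max_chars_per_token:
        return [unknown_token]
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = suffix_indicator + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        # If no piece matches, the whole word is unknown.
        if match is None:
            return [unknown_token]
        pieces.append(match)
        start = end
    return pieces

vocab = {"un", "##aff", "##able"}
print(wordpiece_sketch("unaffable", vocab))   # → ['un', '##aff', '##able']
print(wordpiece_sketch("xyz", vocab))         # → ['[UNK]']
```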

19 Oct 00:13

50b4df4

v1.15.0-rc0

Please note that moving forward our releases and branches will match the major & minor versions of core TensorFlow. This should prevent future confusion. As such, we are skipping straight to v1.15 for our TF 1.15 support.

Major Updates:

SentencepieceTokenizer has been added. Please see https://github.com/google/sentencepiece for more information on Sentencepiece.
New ToDense Keras layer for RaggedTensor conversion
Pipeline for generating a Wordpiece Vocabulary has been added to tools.
New Rouge-L metric op for measuring text similarity. A new colab has been added to the examples directory which provides usage examples.
New BertTokenizer which mimics the preprocessing performed in the original BERT model.
New Detokenizer abstract class has been added to the TF.Text Tokenizer API.
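
As an illustration of the tokenizer/detokenizer split, the sketch below defines two abstract base classes and one concrete class that implements both. All class names here are hypothetical and written in plain Python; they are not the TF.Text classes, which operate on tensors.

```python
import abc

class TokenizerSketch(abc.ABC):
    @abc.abstractmethod
    def tokenize(self, text):
        """Turn a string into a list of tokens."""

class DetokenizerSketch(abc.ABC):
    @abc.abstractmethod
    def detokenize(self, tokens):
        """Turn a list of tokens back into a string."""

class WhitespaceSketch(TokenizerSketch, DetokenizerSketch):
    def tokenize(self, text):
        return text.split()

    def detokenize(self, tokens):
        return " ".join(tokens)

tok = WhitespaceSketch()
tokens = tok.tokenize("never tell me the odds")
print(tokens)
print(tok.detokenize(tokens))
```

Separating the two interfaces lets a tokenizer opt in to detokenization only when the transformation is reversible.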

Minor Updates:

API docs have received an update that should make finding relevant information easier.
Update Whitespace & UnicodeScript Tokenizers to accept scalar values.
Build includes CC library targets. Useful for statically linking in TF.Text custom ops. Specifically useful for building into TF.Serving's model server.
Build environment: Updated to match core TF's update.
Many previously released ops have been added to the TensorFlow Serving model server, and should be in a coming 1.x release. Please see https://github.com/tensorflow/serving for more information.

09 Oct 00:03

bc60c2e

v0.1.0 (TF 1.14)

Minor Updates:

Wordpiece: Add support for splitting unknown characters
Wordpiece: Add support for max characters per token
Wordshape: Fix finding of currency symbols
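
As a rough illustration of what detecting currency symbols involves, the sketch below checks for characters in the Unicode "Sc" (Symbol, currency) category using Python's standard library. The helper name is hypothetical, and whether the Wordshape op uses the same category test is an assumption.

```python
import unicodedata

def has_currency_symbol(text):
    # "Sc" is the Unicode general category for currency symbols.
    return any(unicodedata.category(ch) == "Sc" for ch in text)

print(has_currency_symbol("$5"))     # → True
print(has_currency_symbol("€20"))    # → True
print(has_currency_symbol("five"))   # → False
```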

01 Aug 19:40

b20aa71

v0.1.0-rc2 Pre-release

This is a TF 1.14-compatible release that has everything in TF.Text 1.0.0-beta2. Most ongoing TF.Text development is for TensorFlow 2.0, but we wanted to provide a library for those who have not transitioned yet.

01 Aug 19:16

3a85d45

v1.0.0-beta2 Pre-release

Major updates:

Fixes a build problem from beta1.

23 Jul 18:53

76002bd

v1.0.0-beta1 Pre-release

Major updates:

Include data necessary for normalize & case folding ops.

Minor updates:

Update unit tests to work with Python3.
Fix some wordpiece corner case bugs.
Wordpiece efficiency improvements.
Add missing shape_fn to Wordpiece's tokenizeWithOffsets.
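
For context on what a tokenize-with-offsets operation returns, here is a minimal plain-Python sketch of whitespace tokenization that also reports where each token starts and ends. The function name is hypothetical, it is not the Wordpiece op, and it reports character offsets as a simplification.

```python
import re

def tokenize_with_offsets_sketch(text):
    """Return (tokens, start_offsets, end_offsets) for whitespace tokens."""
    tokens, starts, ends = [], [], []
    for match in re.finditer(r"\S+", text):
        tokens.append(match.group(0))
        starts.append(match.start())   # inclusive start index
        ends.append(match.end())       # exclusive end index
    return tokens, starts, ends

print(tokenize_with_offsets_sketch("hi there"))
# → (['hi', 'there'], [0, 3], [2, 8])
```

The offsets let a caller map each token back to its exact span in the original string.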

10 Jun 04:29

f49331f

v1.0.0-beta0 Pre-release

Initial prerelease for TF.Text library.
