Release 0.8.0 · tensorflow/transform

Major Features and Improvements

Add TFTransformOutput utility class that wraps the output of tf.Transform for
use in training. This makes it easier to consume the output written by
tf.Transform (see the updated examples for usage).
Increase efficiency of quantiles (and therefore bucketize).
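
The link between quantiles and bucketize noted above can be sketched in plain Python: bucket boundaries are quantile cut points, so a faster quantile computation speeds up bucketizing too. This is only an illustration using the standard library, not tf.Transform's implementation; the helper names quantile_boundaries and bucketize_value are hypothetical.

```python
import bisect
import statistics


def quantile_boundaries(values, num_buckets):
    """Return the num_buckets - 1 cut points that split values into
    roughly equal-count buckets."""
    return statistics.quantiles(values, n=num_buckets)


def bucketize_value(value, boundaries):
    """Return the index of the bucket that value falls into."""
    return bisect.bisect_right(boundaries, value)


# Hypothetical data: the integers 1..100 split into 4 buckets.
data = list(range(1, 101))
boundaries = quantile_boundaries(data, num_buckets=4)
buckets = [bucketize_value(x, boundaries) for x in (1, 26, 51, 100)]
```

Computing the boundaries is the expensive step because it needs a pass over the whole dataset; assigning a bucket afterwards is a binary search per value.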

Bug Fixes and Other Changes

Change tft.sum/tft.mean/tft.var to only support basic numeric types.
Widen the output type of tft.sum for some input types to avoid overflow
and/or to preserve precision.
For int32 and int64 input types, change the output type of
tft.mean/tft.var/tft.scale_to_z_score from float64 to float32.
Change the output type of tft.size to always be int64.
Context now accepts passthrough_keys, which can be used to attach additional
information to dataset instances in the pipeline without making it part of
the transformation graph (for example, instance keys).
In addition to using TFTransformOutput, the examples demonstrate new workflows
where a vocabulary is computed, but not applied, in the preprocessing_fn.
Added dependency on the absl-py package.
TransformTestCase test cases can now be parameterized.
Add support for partitioned variables when loading a model.
Export the coders subpackage so that users can access it as tft.coders,
e.g. tft.coders.ExampleProtoCoder.
Set dtypes for NumPy arrays in tft.coders.ExampleProtoCoder and
tft.coders.CsvCoder.
tft.mean, tft.max and tft.var now support tf.SparseTensor.
Update examples to use "core" TensorFlow estimator API (tf.estimator).
Depends on protobuf>=3.6.0,<4.
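
The reason for widening the output type of tft.sum can be illustrated without TensorFlow. The notes do not say which input types are widened or to what, so the int32-to-int64 pairing below is an assumption chosen for illustration, and both helper functions are hypothetical.

```python
import ctypes


def sum_with_int32_accumulator(values):
    """Sum values in a signed 32-bit accumulator that wraps on overflow."""
    total = 0
    for v in values:
        # c_int32 does no overflow checking, so the result wraps around.
        total = ctypes.c_int32(total + v).value
    return total


def sum_with_int64_accumulator(values):
    """Sum the same values in a signed 64-bit accumulator."""
    total = 0
    for v in values:
        total = ctypes.c_int64(total + v).value
    return total


# The largest int32 value plus one no longer fits in 32 bits.
values = [2**31 - 1, 1]
narrow = sum_with_int32_accumulator(values)
wide = sum_with_int64_accumulator(values)
```

Here narrow wraps around to a negative number while wide holds the true total, which is the kind of overflow a wider output type avoids.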

Breaking changes

apply_saved_transform is removed. See note on
partially_apply_saved_transform in the Deprecations section.
No longer set vocabulary_file in IntDomain when using
tft.compute_and_apply_vocabulary or tft.apply_vocabulary.
Requires pre-installed TensorFlow >=1.8,<2.

Deprecations

The expected_asset_file_contents argument of
TransformTestCase.assertAnalyzeAndTransformResults is deprecated; use
expected_vocab_file_contents instead.
transform_fn_io.TRANSFORMED_METADATA_DIR and
transform_fn_io.TRANSFORM_FN_DIR should not be used; they are now aliases
for TFTransformOutput.TRANSFORMED_METADATA_DIR and
TFTransformOutput.TRANSFORM_FN_DIR respectively.
partially_apply_saved_transform is deprecated; users should use the
transform_raw_features method of TFTransformOutput instead. The two differ
in that partially_apply_saved_transform can also return both the input
placeholders and the outputs. Users do not need this functionality, however,
because they will typically create the input placeholders themselves based
on the feature spec.
Renamed tft.uniques to tft.vocabulary, tft.string_to_int to
tft.compute_and_apply_vocabulary, and tft.apply_vocab to
tft.apply_vocabulary. The existing names will remain for a few more minor
releases but are now deprecated; callers should migrate to the new names.
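
The renames above are mechanical, so a migration can be sketched as a small source rewriter. This is a minimal sketch, not an official tool: the helper name migrate_source is hypothetical, and it assumes the package is imported under the alias tft.

```python
import re

# Old name -> new name, as listed in the rename note above.
RENAMES = {
    "uniques": "vocabulary",
    "string_to_int": "compute_and_apply_vocabulary",
    "apply_vocab": "apply_vocabulary",
}

# Word boundaries keep tft.apply_vocab from matching inside
# the already-migrated tft.apply_vocabulary.
_PATTERN = re.compile(
    r"\btft\.(" + "|".join(re.escape(name) for name in RENAMES) + r")\b"
)


def migrate_source(text):
    """Rewrite deprecated tft.* names in a source string to the new names."""
    return _PATTERN.sub(lambda m: "tft." + RENAMES[m.group(1)], text)


old_code = "ids = tft.string_to_int(x)\nvocab = tft.uniques(x)"
new_code = migrate_source(old_code)
```

Code that already uses the new names passes through unchanged, so the rewrite is safe to run more than once.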