
Tags: sputney13/datasets

v3.2.1

Update TFDS to 3.2.1

v3.2.0

Update TFDS version to 3.2.0

API:

 * Add `tfds.ImageFolder` and `tfds.TranslateFolder` to easily create datasets from your own data.
 * Add `tfds.ReadConfig(input_context=)` to shard datasets, for better multi-worker compatibility (tensorflow#1426).
 * The default `data_dir` can be controlled by the `TFDS_DATA_DIR` environment variable.
 * Better usability when developing datasets outside TFDS:
   * Downloads are always cached
   * Checksums are optional
 * Add `tfds.show_statistics(ds_info)` to display [Facets Overview](https://pair-code.github.io/facets/). Note: this requires the dataset to have been generated with statistics.
 * Open-source various scripts to help with deployment and documentation (generate catalog documentation, export all metadata files, ...)
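
As a rough illustration of two of the API items above, the sketch below resolves the data directory and loads one worker's shard of a dataset. `resolve_data_dir` and `load_worker_shard` are hypothetical helper names, not part of TFDS. The precedence shown (explicit argument, then `TFDS_DATA_DIR`, then `~/tensorflow_datasets`) is an assumption about how the default is applied, not a copy of the library's code.

```python
import os

# Assumed fallback location when nothing else is configured.
DEFAULT_DATA_DIR = os.path.join("~", "tensorflow_datasets")


def resolve_data_dir(data_dir=None):
    """Hypothetical helper: pick a data directory.

    Assumed precedence: explicit argument, then the TFDS_DATA_DIR
    environment variable, then the default location.
    """
    chosen = data_dir or os.environ.get("TFDS_DATA_DIR") or DEFAULT_DATA_DIR
    return os.path.expanduser(chosen)


def load_worker_shard(name, num_workers, worker_id):
    """Hypothetical helper: load only this worker's part of the train split."""
    # Imported lazily so resolve_data_dir() works without TensorFlow installed.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    # Describe this worker's position among all input pipelines.
    input_context = tf.distribute.InputContext(
        num_input_pipelines=num_workers,
        input_pipeline_id=worker_id,
    )
    read_config = tfds.ReadConfig(input_context=input_context)
    return tfds.load(
        name,
        split="train",
        data_dir=resolve_data_dir(),
        read_config=read_config,
    )
```

A worker with index 0 out of 2 would call something like `load_worker_shard("mnist", num_workers=2, worker_id=0)`. The dataset name here is only an example.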

Documentation:

 * The catalog displays images ([example](https://www.tensorflow.org/datasets/catalog/sun397#sun397standard-part2-120k))
 * The catalog shows which datasets have been recently added and are only available in `tfds-nightly`

Breaking compatibility change:

 * Fix deterministic example order on Windows when a path was used as key (this only impacts a few datasets). Example order should now be the same on all platforms.
 * Remove `tfds.load('image_label_folder')` in favor of the more user-friendly `tfds.ImageFolder`
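
For readers migrating from `image_label_folder`, here is a minimal sketch of how `tfds.ImageFolder` might be used. It assumes the images are laid out as `<root>/<split>/<label>/<image file>`. `summarize_image_folder` and `load_image_folder` are hypothetical helper names introduced only for illustration.

```python
import pathlib


def summarize_image_folder(root_dir):
    """Hypothetical helper: count files per split and label.

    Assumes a `<root>/<split>/<label>/<file>` directory tree.
    """
    summary = {}
    for split_dir in sorted(pathlib.Path(root_dir).iterdir()):
        if not split_dir.is_dir():
            continue
        summary[split_dir.name] = {
            label_dir.name: sum(1 for f in label_dir.iterdir() if f.is_file())
            for label_dir in sorted(split_dir.iterdir())
            if label_dir.is_dir()
        }
    return summary


def load_image_folder(root_dir, split="train"):
    """Hypothetical helper: build a dataset from the folder."""
    # Imported lazily so the summary helper works without TFDS installed.
    import tensorflow_datasets as tfds

    builder = tfds.ImageFolder(str(root_dir))
    return builder.as_dataset(split=split, shuffle_files=True)
```

Running `summarize_image_folder` first is a cheap way to check that the split and label folders look as expected before building the dataset.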

Other:

 * Various performance improvements for both generation and reading (e.g. use `__slots__`, fix parallelisation bug in `tf.data.TFRecordReader`, ...)
 * Various fixes (typos, type annotations, better error messages, dead links, better Windows compatibility, ...)

PiperOrigin-RevId: 320672697

v3.1.0

Update version to `3.1.0`

PiperOrigin-RevId: 309069766

v3.0.0

Update TFDS version

Breaking changes:
* Legacy mode `tfds.experiment.S3` has been removed
* New `tfds.image_classification` section; some datasets were moved there from `tfds.images`.
* `in_memory` argument removed from `as_dataset`/`tfds.load` (small datasets are auto-cached).
* `DownloadConfig` does not append the dataset name anymore (manual data should be in `<manual_dir>/` instead of `<manual_dir>/<dataset_name>/`).
* Tests now check that all `dl_manager.download` URLs have registered checksums. To opt out, add `SKIP_CHECKSUMS = True` to your `DatasetBuilderTestCase`.
* `tfds.load` now always returns `tf.compat.v2.data.Dataset`. If you're still using `tf.compat.v1`:
   * Use `tf.compat.v1.data.make_one_shot_iterator(ds)` rather than `ds.make_one_shot_iterator()`
   * Use `isinstance(ds, tf.compat.v2.data.Dataset)` instead of `isinstance(ds, tf.data.Dataset)`
* `tfds.Split.ALL` has been removed from the API.
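
With `tfds.Split.ALL` gone, one possible way to read every split is to join the split names with `+`, which the split API accepts as a merge. The sketch below also shows the `tf.compat.v1` iterator call from the list above. `all_splits_spec` and `one_shot_iterator_over_all` are hypothetical helper names, and the default split names are an assumption to adjust per dataset.

```python
def all_splits_spec(split_names):
    """Hypothetical helper: join split names into one merged split spec."""
    names = list(split_names)
    if not names:
        raise ValueError("At least one split name is required.")
    return "+".join(names)


def one_shot_iterator_over_all(name, split_names=("train", "test")):
    """Hypothetical helper: iterate every split, tf.compat.v1 style."""
    # Imported lazily so all_splits_spec() works without TensorFlow installed.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    ds = tfds.load(name, split=all_splits_spec(split_names))
    # The method form ds.make_one_shot_iterator() is not available on
    # v2 datasets, so use the compat function instead.
    return tf.compat.v1.data.make_one_shot_iterator(ds)
```

Taking the split names as a parameter keeps the helper explicit about which splits are merged, since not every dataset defines the same ones.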

Future breaking change:
* The `tfds.features.text` encoding API is deprecated. Please use [tensorflow_text](https://www.tensorflow.org/tutorials/tensorflow_text/intro) instead.
* `num_shards` argument of `tfds.core.SplitGenerator` is currently ignored and will be removed in the next version.

Features:
* `DownloadManager` is now picklable (can be used inside Beam pipelines)
* `tfds.features.Audio`:
  * Support float as return value
  * Expose `sample_rate` through `info.features['audio'].sample_rate`
  * Support for encoding audio features from file objects
* Various bug fixes, better error messages, documentation improvements
* More datasets

Thank you to all our contributors for helping us make TFDS better for everyone!

PiperOrigin-RevId: 306768189

v2.1.0

Update TFDS to 2.1.0

PiperOrigin-RevId: 297186194

v2.0.0

Update tfds to 2.0.0

PiperOrigin-RevId: 291409971

v1.3.2

Version 1.3.2

v1.3.0

Bump version to 1.3.0

PiperOrigin-RevId: 275899385

v1.2.0

TFDS: cut new version `1.2.0`. Update documentation.

PiperOrigin-RevId: 264236299

v1.1.0

Add dataset_versioning section to doc

PiperOrigin-RevId: 259525915