Update add dataset instructions
PiperOrigin-RevId: 229694746
Ryan Sepassi authored and Copybara-Service committed Jan 17, 2019
1 parent 8d38ad6 commit b1acbeb
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions docs/add_dataset.md
@@ -356,15 +356,15 @@ when their module is imported such that they can be accessed through
 If you're contributing the dataset to `tensorflow/datasets`, add the module
 import to `tensorflow_datasets/__init__.py`.

-### 2. Package `DatasetInfo` and metadata files
+### 2. Run `download_and_prepare` locally.

-All datasets that ship with `tensorflow-datasets` have their
-`dataset_info.json` and metadata files packaged in so that users can access
-statistics and other information without needing to generate the dataset.
+Run `download_and_prepare` locally to ensure that data generation works:
+
+```
+python -m tensorflow_datasets.scripts.download_and_prepare \
+  --datasets=my_new_dataset
+```

-Run [`tensorflow_datasets/scripts/download_and_prepare`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/scripts/download_and_prepare.py)
-to generate the dataset and then copy in the `dataset_info.json` and other
-metadata files to [`tensorflow_datasets/dataset_info`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/dataset_info/).

 ### 3. Double-check the citation

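The `download_and_prepare` command introduced in step 2 of this change could also be driven from a small Python wrapper, for example in a local pre-submit check. The following is a minimal sketch, not part of `tensorflow_datasets`: the helper names `build_command` and `run_download_and_prepare` are hypothetical, and only the module path and the `--datasets` flag are taken from the instructions in this commit.

```python
import subprocess
import sys


def build_command(dataset_name):
    """Return the argv list for the command shown in the updated instructions.

    Hypothetical helper: only the module path and the --datasets flag
    come from the documentation change itself.
    """
    return [
        sys.executable,  # reuse the interpreter running this script
        "-m",
        "tensorflow_datasets.scripts.download_and_prepare",
        f"--datasets={dataset_name}",
    ]


def run_download_and_prepare(dataset_name):
    """Run data generation; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(dataset_name), check=True)
```

Calling `run_download_and_prepare("my_new_dataset")` assumes `tensorflow-datasets` is installed and that the dataset module has been imported as described in step 1. With `check=True`, a failed generation surfaces as an exception rather than passing silently.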