add tests for prototype builtin datasets #4682
Conversation
Generally looks OK to me, but I'm slightly worried that this might add some overhead for users adding datasets, no?
Also, is there a way to disable shufflers for testing? These tend to make things quite slow.
```python
        raise AssertionError(f"Loading the dataset should return an IterDataPipe, but got {type(dataset)} instead.")

    @builtin_datasets
    def test_sample(self, dataset, mock_info):
```
Is it OK to test just a single sample?
In the function below we test `num_samples`, but not the validity of the samples themselves.
I feel like it is reasonable to assume that if the first sample is valid, all the others will be as well. Otherwise we would need some weird branching within the dataset to cause this, and that should be caught during code review.
Of course, we could also merge `test_sample` and `test_num_samples` to check the validity of every sample. But then the question arises: do we need to check all samples for every single test? For example, what about the decoding tests? Should we also check every sample there? If not, what is the difference to `test_sample`?
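For concreteness, a merged test could look roughly like the sketch below. It is only a sketch: `builtin_datasets` and the `(dataset, mock_info)` fixtures come from the diff above, while the per-sample check and the `num_samples` key on `mock_info` are assumptions about what the real tests verify.

```python
@builtin_datasets
def test_samples_and_num_samples(self, dataset, mock_info):
    # Hypothetical merge of `test_sample` and `test_num_samples`: validate every
    # sample while counting them, at the cost of iterating the full mock dataset.
    num_samples = 0
    for sample in dataset:
        assert isinstance(sample, dict) and sample, "Samples should be non-empty dicts."
        num_samples += 1

    assert num_samples == mock_info["num_samples"]
```

The trade-off is exactly the CI cost discussed in the next comment, since every test that iterates the full datapipe pays for generating and processing all mock samples.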
I feel like it's hard to do these without clogging up the CI. I'd say the first one is OK for 90% of the cases, but I think we should re-think this for the video datasets with clips, to somehow include a randomly drawn sample, because these can be a bit unstable.
If you are thinking about the video utilities you added in #4838, I don't think we need to change anything here. Since they are separate from the datasets, we can also test them separately.
f"No mock data available for dataset '{name}'. " | ||
f"Did you add a new dataset, but forget to provide mock data for it? " | ||
f"Did you register the mock data function with `@DatasetMocks.register_mock_data_fn`?" | ||
) |
Perhaps I don't understand the decorators that well, but it looks like you put most of the mock data functions here. Won't this make this file huge and somewhat unreadable once we add more and more datasets?
This also adds one more hoop to jump through when adding datasets. Should we perhaps add this mock data in the datasets themselves?
Would it make more sense to add …
> Perhaps I don't understand the decorators that well, but it looks like you put most of the mock data functions here. Won't this make this file huge and somewhat unreadable once we add more and more datasets?

Unfortunately, yes.

> Should we perhaps add this mock data in the datasets themselves?

That is a fair question. In general I like the idea of having the mock data close to the "actual" data. The problem I see is that we would need to pull some pure test utilities such as `make_tar` or `make_image_folder` into the `torchvision` package. I'm not opposed to that, but it would break the current paradigm that our tests are kind of "standalone".
PyTorch core also has a `testing` namespace with an `_internal` submodule where a lot of this functionality lives. We could do that as well. In light of #4721 we could also place some specialized comparison / generation functions there for downstream libraries to use. @NicolasHug
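For context, a standalone `make_tar` test utility of the kind mentioned above could look roughly like this minimal sketch; the signature and return value are assumptions, not the actual helper from the test suite.

```python
import pathlib
import tarfile


def make_tar(root, name, *files):
    # Bundle pre-generated mock files into an uncompressed tar archive under `root`,
    # mimicking the archive layout a dataset implementation expects to find.
    root = pathlib.Path(root)
    archive = root / name
    with tarfile.open(archive, "w") as fh:
        for file in files:
            file = pathlib.Path(file)
            fh.add(file, arcname=file.name)
    return archive
```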
Overhead as discussed in #4682 (comment), or overhead in general, because the contributor also has to provide mock data along with the dataset? If you mean the latter, then yes, this is "overhead", but it is well spent. If you have a look at the changes in this PR, I've found bugs in ImageNet, MNIST (and variants), as well as Caltech101 that were not caught by me or anyone else during code review. Plus, we currently also require a contributor to add tests, including mock data, for our legacy datasets API. In fact, IMO, it will be easier than it currently is, since users only need to add a single function for the mock data, rather than provide a test case and override custom methods.
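To make the "single function" workflow concrete, registering mock data could look roughly like the sketch below. The `@DatasetMocks.register_mock_data_fn` decorator comes from the error message earlier in this thread; the dataset name, the files written, and the convention of returning the number of mock samples are illustrative assumptions rather than the PR's actual contract.

```python
@DatasetMocks.register_mock_data_fn
def fakedata(info, root):
    # Hypothetical mock data function: create just enough fake files on disk
    # (assuming `root` is a pathlib.Path) for the dataset implementation to parse,
    # and report how many samples they contain.
    num_samples = 3
    for idx in range(num_samples):
        (root / f"image_{idx}.jpg").touch()  # real mock data would write valid image files
    return num_samples
```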
I don't think the option is already available. IIRC, it will be available through the …
Correct. I think we will provide a switch in …
One should never use a dataset without shuffling, and shuffling has to happen inside our implementation, i.e. before decoding, for memory reasons. So unless we have a switch to turn it off through the data loader, we need to keep it in.
Understood. And DataLoader does have an option to optionally attach a …
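For reference, once such a switch exists, disabling shuffling from the loader side could look roughly like this. The sketch assumes the prototype `datasets.load` entry point and that the switch ends up being exposed through `DataLoader` (or through something like `torch.utils.data.graph_settings.apply_shuffle_settings`); neither behaviour was available at the time of this discussion.

```python
from torch.utils.data import DataLoader
from torchvision.prototype import datasets

# Hypothetical: load a prototype dataset (an IterDataPipe with a Shuffler inside)
# and ask the DataLoader not to shuffle, turning the internal Shuffler into a no-op.
dataset = datasets.load("mnist", split="train")
loader = DataLoader(dataset, batch_size=32, shuffle=False)
```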
LGTM, thanks!
Let me know if you want to get this merged now, or wait to refactor following the comment from Bruno.
One thing to keep in mind: ideally, our datasets wouldn't be stochastic in the standard case (so …
```python
def _decoder(self, dataset_type):
    if dataset_type == datasets.utils.DatasetType.RAW:
        return datasets.decoder.raw
    else:
        return lambda file: file.close()
```
@bjuncek This is the decoder we are passing to each dataset during the tests unless we manually specify a different one. Thus, although the video datasets do not use a decoder by default through the normal API, they will use one during the tests.
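For readers following along, the wiring is roughly as sketched below. The `datasets.info(name)` lookup, the `info.type` attribute, and the `decoder` keyword of `datasets.load` are assumptions inferred from the snippet above, not verbatim code from the test suite.

```python
def _load_with_test_decoder(self, name, **config):
    # Pick the test decoder based on the dataset type (RAW datasets keep the raw
    # decoder, everything else just closes the file handles), then load the dataset.
    info = datasets.info(name)
    decoder = self._decoder(info.type)
    return datasets.load(name, decoder=decoder, **config)
```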
Summary: Addresses #69 for `torchvision`. Note that the workflow will fail for another ~24 hours, since pytorch/vision#4682 will only then be included in the nightly release.
Pull Request resolved: #96
Reviewed By: wenleix
Differential Revision: D32171429
Pulled By: ejguan
fbshipit-source-id: 3b5cd0a56bbcc86672a62d047e4491a433811d6a
Summary:
* add tests for builtin prototype datasets
* fix caltech101
* fix emnist
* fix mnist and variants
* add iopath as test requirement
* fix MNIST warning
* fix qmnist data generation
* fix cifar data generation
* add tests for imagenet
* cleanup

Reviewed By: kazhang
Differential Revision: D32216667
fbshipit-source-id: 4efc2b61574f4523ce31d70c85cc2d5150f3a721
cc @pmeier @bjuncek