Added CUB200-2010 and 2011 version #279

vishwakftw · 2017-10-01T19:18:16Z

Following this pull request, I have decided to first add the CUB200-2010 version of the CUB200 dataset. Details can be found here.

Hope this fits in well with the current setup. This code doesn't leverage any existing datasets.

If this works out well, I am also considering adding the CUB200-2011 version of the CUB200 dataset.

vishwakftw · 2017-10-02T05:16:30Z

The next commit will include the CUB200-2011 version of the CUB200 dataset. More information can be found here.

fmassa · 2017-10-02T08:52:41Z

Hi,
Thanks for the PR!

I believe both datasets are the same except for the downloadable files. Could you merge both into a single class and have a year argument in the constructor?

Also, there is a lot of copy-paste in the download function. I think it would be better to clean this up or reuse it from another dataset. Maybe refactor it and put it somewhere the other datasets that use it can call it from?

One last thing: I suppose you put all the images in a single file because the dataset is small and can fit in memory. But are the original images 64x64, or does the dataset documentation explicitly state that we should use this setup? If not, I'd rather avoid doing this resize, and instead load the raw images from disk at every __getitem__.
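A minimal sketch of the two requests above (one class with a `year` argument, and images read from disk inside `__getitem__`) might look like the following. The class name `CUB200`, the `_LAYOUTS` table with its directory names, and the `samples` argument are hypothetical and not taken from the PR. The default loader returns raw bytes only to keep the sketch dependency-free; a real dataset would decode the image, for example with PIL.

```python
import os


def read_bytes(path):
    """Placeholder loader: return the raw file contents.

    A real dataset would decode the image here (for example with PIL).
    """
    with open(path, "rb") as f:
        return f.read()


class CUB200:
    """Hypothetical single class covering both dataset years."""

    # Assumed per-year settings; the directory names are illustrative only.
    _LAYOUTS = {
        2010: {"image_dir": "images_2010"},
        2011: {"image_dir": "images_2011"},
    }

    def __init__(self, root, samples, year=2011, transform=None, loader=read_bytes):
        if year not in self._LAYOUTS:
            raise ValueError("year must be one of %s" % sorted(self._LAYOUTS))
        self.year = year
        self.image_dir = os.path.join(root, self._LAYOUTS[year]["image_dir"])
        # Only (relative path, class index) pairs are kept in memory.
        self.samples = list(samples)
        self.transform = transform
        self.loader = loader

    def __getitem__(self, index):
        rel_path, target = self.samples[index]
        # The image is read from disk on every access, at its original size.
        image = self.loader(os.path.join(self.image_dir, rel_path))
        if self.transform is not None:
            image = self.transform(image)
        return image, target

    def __len__(self):
        return len(self.samples)
```

With this shape, any resizing becomes the caller's choice through `transform` rather than something fixed when the dataset is built.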

vishwakftw · 2017-10-02T09:29:19Z

Thanks for the feedback. I have a few queries:

1. The images in the CUB200 datasets (both years) are variably sized and generally large. I don't think variably sized images would work with a typical neural network, which is why I resized them to 64x64 and stored them in one file. If you have any suggestions for improving this, let me know.
2. The CUB200-2010 and CUB200-2011 datasets have different file arrangements. For example, CUB200-2011 has a single file that separates the train and test images, whereas CUB200-2010 has separate files listing the train and test images. This is why I thought I would not be able to use pre-written download functions. Any suggestions or improvements are welcome here too.
3. In your second remark, are you suggesting adding a utils.py file that holds the canonical functions, such as download, __len__ and __getitem__?
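One way to hide the layout difference described above is to normalise both years into the same train/test mapping before the dataset class sees them. The sketch below assumes the 2010 release ships one list file per split with one image path per line, and the 2011 release ships an image index (`<id> <path>` per line) plus a split file (`<id> <0 or 1>` per line). These file formats and both function names are assumptions for illustration and should be checked against the actual archives.

```python
def split_from_separate_lists(train_lines, test_lines):
    """2010-style layout (assumed): one list of image paths per split."""
    split = {}
    for line in train_lines:
        if line.strip():
            split[line.strip()] = True
    for line in test_lines:
        if line.strip():
            split[line.strip()] = False
    return split


def split_from_flag_file(image_lines, flag_lines):
    """2011-style layout (assumed): '<id> <path>' and '<id> <0 or 1>' lines."""
    id_to_path = {}
    for line in image_lines:
        if line.strip():
            image_id, path = line.split(None, 1)
            id_to_path[image_id] = path.strip()
    split = {}
    for line in flag_lines:
        if line.strip():
            image_id, flag = line.split()
            # A flag of "1" is assumed to mark a training image.
            split[id_to_path[image_id]] = flag == "1"
    return split
```

Both helpers return a dictionary from image path to a boolean that is true for training images, so the code that follows them does not need to know which year it is handling.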

fmassa · 2017-10-02T09:50:09Z

1. If the images have different sizes, I don't think it's a good idea to restrict them to 64x64, as other users might want to use a larger size, keep the original aspect ratio, etc. I'd suggest not resizing the images, and loading them from disk as is done in ImageFolder, or in the original PR that you referred to.
2. I haven't looked in detail, but it should be possible to factor out the differences into common code, and use the same interface for both datasets once those (small) differences are addressed.
3. I was thinking about having a utils.py file, for example, which contains a generic download function that is used by all the other datasets that have a download function. This might require some refactoring, but from a quick look I have the impression that we would end up with much cleaner code.
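As a rough illustration of the third point, a shared download helper could be as small as the sketch below. The function names `check_integrity` and `download_url`, their signatures, and the use of MD5 checksums are assumptions made for this example, not the code from this PR.

```python
import hashlib
import os
import urllib.request


def check_integrity(path, md5=None):
    """Return True if `path` exists and, when `md5` is given, matches it."""
    if not os.path.isfile(path):
        return False
    if md5 is None:
        return True
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large archives are not read into memory at once.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == md5


def download_url(url, root, filename, md5=None):
    """Download `url` to `root/filename` unless a verified copy already exists."""
    os.makedirs(root, exist_ok=True)
    path = os.path.join(root, filename)
    if check_integrity(path, md5):
        return path
    urllib.request.urlretrieve(url, path)
    if not check_integrity(path, md5):
        raise RuntimeError("checksum mismatch for " + path)
    return path
```

Each dataset would then keep only its own URL, filename, and checksum, and call the helper from its `download` method.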

vishwakftw · 2017-10-02T12:02:21Z

@fmassa I have changed a few things after your suggestions. Please let me know what you think.

vishwakftw · 2017-10-12T12:34:42Z

@fmassa My PR helps cover issues like these too w.r.t. documentation. #288

vishwakftw

Resolve possible merge conflicts

fmassa

Hi,

I did a quick pass, and there seems to be a problem with the current code.
Once you address it, I'll have a more in-depth look.
Thanks!

README.rst

-Contributing
-============
-We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion. If you plan to contribute new features, utility functions or extensions, please first open an issue and discuss the feature with us.
+You can find the API documentation on the pytorch website: http://pytorch.org/docs/master/torchvision/


torchvision/datasets/cub200.py

+        Returns:
+            tuple: (image, target) where target is index of the target class.
+        """
+        path, target = self.data_set[index]


vishwakftw added 5 commits October 2, 2017 00:42

Added CUB200-2010 version

4d97d58

fix indent errors

68c7c17

fix indent errors - 2 for build pass

011ddbc

fix indent errors - 3 for build pass

c85530b

add CUB200-2011 version

e28b55d

vishwakftw changed the title ~~Added CUB200-2010 version~~ Added CUB200-2010 and 2011 version Oct 2, 2017

minor indent changes and add CUB2002011 in __init__.py

5301601

vishwakftw added 2 commits October 2, 2017 17:27

refactor code after suggestions

8b19dbb

update __init.py__ and minor indent changes

fa0f648

vishwakftw added 2 commits October 2, 2017 17:39

fix build issues

41744e7

match imagefolder characteristics and README consistency changes

92a3213

Merge branch 'master' into master

9356d87

vishwakftw commented Nov 8, 2017


fmassa requested changes Nov 12, 2017


vishwakftw added 3 commits November 13, 2017 00:02

fix naming issues

3d4e9a5

fix lint issue

d3c4095

fix lint issue

4873df6

vishwakftw closed this Jan 1, 2018
