Skip to content

Country dataset #5138

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jan 10, 2022
Merged

Country dataset #5138

merged 12 commits into from
Jan 10, 2022

Conversation

puhuk
Copy link
Contributor

@puhuk puhuk commented Dec 30, 2021

cc @pmeier

@facebook-github-bot
Copy link

facebook-github-bot commented Dec 30, 2021

💊 CI failures summary and remediations

As of commit f0883a5 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @puhuk, thanks a lot for the PR. I've added a few comments inline. There are three more issues:

  1. You used find_classes and make_dataset to create the dataset. This is far better than doing it yourself, but it would be even better to simply subclass ImageFolderDataset so you don't have to worry about that at all. Basically, you only need to handle the input parsing and download and simply pass os.path.join(root, "country", self._split) to super().__init__() (along with the transforms of course). If you do that, you don't need to define __len__ and __getitem__ at all.
  2. Could you add the dataset to our documentation? For that, simply add the class name to the list in docs/source/datasets.rst.
  3. Could you add tests for the dataset? You can create a Country211TestCase in this file. You basically only need to provide a minimal configuration as well as provide fake data. You can read this docstring or look at the other test cases for examples. Due to the very easy structure, all the fake data can be created by datasets_utils.create_image_folder.

@puhuk
Copy link
Contributor Author

puhuk commented Jan 3, 2022

@pmeier Thank you for your kind review. Let me check and correct it soon :)

@puhuk
Copy link
Contributor Author

puhuk commented Jan 6, 2022

@pmeier I resend PR with your suggestion and test code.
But there are errors, could you advise me about how to resolve this.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@puhuk I suggested some changes inline. After that, my comment for inheriting from ImageFolderDataset (#5138 (review)) is still open. Let me know if you need help with that.

Don't worry about the merge conflicts yet, we can fix them before merge. Plus, there are some unrelated failing test that you also don't need to worry about.

Update with review
@puhuk
Copy link
Contributor Author

puhuk commented Jan 8, 2022

@pmeier I resend the PR with your review.
But I don't exactly understand how to implement by inheriting ImageFolderDataset.
May I ask for your help with this.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@puhuk I pushed the changes to inherit from ImageFolder. There are two things left to fix in the docstring. Afterwards this is good to go.

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @puhuk for the PR and @pmeier for the review!

In 9eb4f11 I pushed some minor changes to address @pmeier 's last comments, and to make the tests slightly more robust. LGTM, I'll merge when green. Thanks again!

@puhuk
Copy link
Contributor Author

puhuk commented Jan 10, 2022

@pmeier Really thanks to your review and re-creation of my humble code.
Hope to contribute with better code in next time.

@NicolasHug NicolasHug merged commit ffd0d23 into pytorch:main Jan 10, 2022
facebook-github-bot pushed a commit that referenced this pull request Jan 17, 2022
Summary:
* Add Country211 dataset

To addresses issue #5108.

* Add Country211 dataset

To addresses issue #5108.

* Update country211.py

* Update country211.py

* Code review reflected

Reflect code review

* Update test_datasets.py

* Update with review

Update with review

* inherit from ImageFolder

* Update test/test_datasets.py

* Docstring + minor test update

Reviewed By: NicolasHug

Differential Revision: D33618167

fbshipit-source-id: 04de3c5290b966ff97f21ea32b2f678079aa2a6c

Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Nicolas Hug <nicolashug@fb.com>
@puhuk puhuk deleted the country_dataset branch May 3, 2022 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants