add CUB200 prototype datasets #5154

Merged 3 commits on Jan 17, 2022

@pmeier pmeier commented Jan 3, 2022

This is the only popular vision dataset (according to this list) that we are currently missing.

cc @pmeier @bjuncek

facebook-github-bot commented Jan 3, 2022

💊 CI failures summary and remediations

As of commit a3a7f12 (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚



🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet.



@vadimkantorov

will also fix this: #1654

@vadimkantorov

Another frequent dataset for metric learning is Stanford Online Products...

Member

@NicolasHug NicolasHug left a comment

Thanks @pmeier , some minor comments / questions but LGTM

bounding_box=BoundingBox(
[int(content["bbox"][coord]) for coord in ("left", "bottom", "right", "top")], format="xyxy"
),
segmentation=Feature(content["seg"]),
Member

Could you help me understand why we use the decoder for the Feature(...) in _2011_decode_ann, but not in this function?

Collaborator Author

The decoder is used to turn an open file handle into the pixel values as a tensor. In this case the raw pixel values are already present as a numpy.ndarray, so there is nothing left to decode.
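
To illustrate the distinction, here is a minimal, dependency-free sketch. It is not the code from this PR: `toy_decoder`, `load_segmentation`, and the two-byte header format are hypothetical, and nested Python lists stand in for the tensor / numpy.ndarray.

```python
import io
from typing import BinaryIO, Callable, List, Optional, Union

# Stand-in for a tensor / numpy.ndarray (assumption made to avoid dependencies).
Pixels = List[List[int]]


def toy_decoder(handle: BinaryIO) -> Pixels:
    """Hypothetical format: byte 0 = height, byte 1 = width, rest = row-major pixels."""
    raw = handle.read()
    height, width = raw[0], raw[1]
    body = raw[2:]
    if len(body) != height * width:
        raise ValueError("payload size does not match the header")
    return [list(body[row * width:(row + 1) * width]) for row in range(height)]


def load_segmentation(
    source: Union[BinaryIO, Pixels],
    decoder: Optional[Callable[[BinaryIO], Pixels]],
) -> Pixels:
    # Pixel values are already in memory: there is nothing left to decode.
    if isinstance(source, list):
        return source
    # Otherwise the source is an open file handle and needs a decoder.
    if decoder is None:
        raise ValueError("a decoder is required for file handles")
    return decoder(source)


# Case 1: an open (in-memory) file handle that still has to be decoded.
encoded = io.BytesIO(bytes([2, 3, 0, 1, 1, 0, 1, 0]))
from_handle = load_segmentation(encoded, toy_decoder)

# Case 2: the pixel values are already available, so no decoder is involved.
in_memory = load_segmentation([[0, 1, 1], [0, 1, 0]], None)
```

Both calls end up with the same pixel values; only the first one needs the decoder.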

@pmeier pmeier linked an issue Jan 17, 2022 that may be closed by this pull request

pmeier commented Jan 17, 2022

Test failure is unrelated.

@pmeier pmeier merged commit 28f72f1 into pytorch:main Jan 17, 2022
@pmeier pmeier deleted the datasets/cub-200-2011 branch January 17, 2022 08:02
vadimkantorov commented Jan 17, 2022

I guess it would be nice to include a decent metric learning tutorial / reference impl using CUB/Stanford Online Products. This could be a useful example of new-style dataset usage (I have a somewhat simplified but working impl of https://arxiv.org/abs/1706.07567 in https://github.com/vadimkantorov/metriclearningbench).

facebook-github-bot pushed a commit that referenced this pull request Jan 17, 2022
Summary:
* add CUB200 prototype datasets

* address review comments

Reviewed By: NicolasHug

Differential Revision: D33618169

fbshipit-source-id: d2212728e21578eacdad9a186a5638750fbe3f79
@pmeier pmeier mentioned this pull request May 2, 2022
17 tasks
Successfully merging this pull request may close these issues.

[proposal] CUB 200-2011 dataset