🚀 Feature
This is a proposal to add more highly cited datasets. Thanks to the Papers with Code datasets listing, which made this search easy.
Motivation
These datasets are used frequently and would benefit both researchers and practitioners working in computer vision. I'm not sure how reliable the citation metric is, but we can verify the paper counts.
Pitch
The following datasets can be considered. Paper counts are taken from Papers with Code and cover the last 5 years. They may be inaccurate, so feel free to edit. I'm also listing datasets that were previously approved or proposed.
- KITTI (1709 papers): Added KITTI dataset #3640
- iNaturalist: Add iNaturalist dataset #3292; Add iNaturalist dataset #4123
- LFW, Labeled Faces in the Wild (640 papers): Added LFW Dataset #4255
- Caltech-UCSD Birds-200-2011 (839 papers): Add support for Caltech-UCSD Birds-200-2011 dataset #4128; New dataset added (Caltech-UCSD Birds 200) regarding issue #147 #60829 #4126; add CUB200 prototype datasets #5154
- ADE20K
- Tiny ImageNet
- Ego4D
- Market-1501 (492 papers)
- MPII Human Pose
- VGGFace2: Requested earlier in Add VGGface2 dataset #1193 and Add vggface2 dataset #2910. A tar.gz file is available, so hopefully we can add it.
- MovingMNIST: Previously approved in toy video datasets for generative models #2676; see also Add MovingMNIST #2690.
- LVIS
- CamVid
- DIV2K
- Berkeley Segmentation Dataset (BSD)
- ffhq-dataset
- SmallNORB
See #5108
We should probably consider these and add them one by one. Each dataset should also support downloading, not just loading.
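To make the "downloading, not just loading" point concrete, here is a minimal sketch of a dataset wrapper with a `download` flag. Everything in it is an assumption for illustration: `SketchDataset`, the `fetch` hook, `fake_fetch`, and the `data.txt` file name are hypothetical and are not part of any existing library. A real implementation would plug in an actual downloader and checksum verification.

```python
import os
import tempfile
from typing import Callable, List, Optional


def fake_fetch(dest: str) -> None:
    """Stand-in downloader (hypothetical): writes two dummy samples to dest."""
    with open(dest, "w") as f:
        f.write("sample-0\nsample-1\n")


class SketchDataset:
    """Hypothetical dataset wrapper illustrating a download-or-load pattern."""

    filename = "data.txt"  # hypothetical file name, not a real dataset file

    def __init__(
        self,
        root: str,
        download: bool = False,
        fetch: Optional[Callable[[str], None]] = None,
    ) -> None:
        self.root = root
        self._fetch = fetch
        if download:
            self.download()
        # Loading alone fails with a clear message when the files are missing.
        if not self._check_exists():
            raise RuntimeError(
                "Dataset not found. Pass download=True to fetch it."
            )
        with open(self._path()) as f:
            self.samples: List[str] = [ln.strip() for ln in f if ln.strip()]

    def _path(self) -> str:
        return os.path.join(self.root, self.filename)

    def _check_exists(self) -> bool:
        return os.path.isfile(self._path())

    def download(self) -> None:
        if self._check_exists():
            return  # already on disk, nothing to do
        if self._fetch is None:
            raise NotImplementedError("No downloader was provided.")
        os.makedirs(self.root, exist_ok=True)
        self._fetch(self._path())

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, index: int) -> str:
        return self.samples[index]


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    ds = SketchDataset(root, download=True, fetch=fake_fetch)
    print(len(ds), ds[0])
```

Passing the downloader in as a callable keeps the sketch testable without network access; a dataset PR would instead hard-code the official URL and verify the archive before extracting it.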
Additional context
Please feel free to discuss datasets before opening PRs!
cc @pmeier