|
| 1 | +--- |
| 2 | +# Date the article was last updated like this: |
| 3 | +date: 2021-04-27 # YYYY-MM-DD |
| 4 | +# Article's title: |
| 5 | +title: Open Source Datasets |
| 6 | +--- |
| 7 | +This article explains how to build your own labeled dataset and where to find open-source datasets that are free to download and use. |
| 8 | + |
| 9 | +## Creating a Custom Dataset |
| 10 | +Capture your own images with a camera, then create a label for each image that indicates the bounding boxes and class IDs of the objects captured. |
| 11 | + |
| 12 | +*Option 1:* |
| 13 | +Create labels for all of the images using Yolo_mark [1]. The repository and usage instructions can be found [here](https://github.com/AlexeyAB/Yolo_mark). Yolo_mark writes the labels in the darknet format. |
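For reference, a darknet label file holds one line per object: the integer class ID followed by the box center x, center y, width, and height, each normalized by the image size. The sketch below shows that arithmetic for a box given in pixel corners. The function name `to_darknet_label` and the six-decimal formatting are illustrative assumptions, not part of Yolo_mark.

```python
def to_darknet_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to one darknet label line.

    Hypothetical helper for illustration; Yolo_mark produces these
    lines for you when you annotate through its GUI.
    """
    # Box center, normalized to [0, 1] by the image dimensions.
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    # Box size, normalized the same way.
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box in a 640x480 image, class ID 0.
print(to_darknet_label(0, 100, 200, 300, 400, 640, 480))
# prints: 0 0.312500 0.625000 0.312500 0.416667
```

Because the values are normalized, the same label stays valid if the image is later resized without cropping.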
| 14 | + |
| 15 | +*Option 2:* |
| 16 | +Use Innotescus, a Pittsburgh startup working on high-performance image annotation. They offer free academic accounts to CMU students. You can upload datasets and have multiple people work on the annotations. Task metrics track how many of each class are annotated and show heat maps of their relative locations within an image, so you can check that the data is properly distributed. |
| 17 | + |
| 18 | +Create a free beta account [here](https://innotescus.io/demo/) |
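Whichever annotation tool you use, a rough class-balance check can also be run locally once the labels exist as darknet text files. The sketch below is a minimal example under that assumption: it is not Innotescus functionality, and the function names and the one-folder-of-`.txt`-files layout are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_label_lines(lines):
    """Count objects per class ID in an iterable of darknet label lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        try:
            counts[int(parts[0])] += 1
        except ValueError:
            continue  # first field is not an integer class ID
    return counts


def count_label_dir(label_dir):
    """Sum per-class counts over every .txt file in label_dir (assumed layout)."""
    total = Counter()
    for txt in sorted(Path(label_dir).glob("*.txt")):
        total += count_label_lines(txt.read_text().splitlines())
    return total


# Example with three in-memory label lines: two objects of class 0, one of class 1.
sample = ["0 0.5 0.5 0.1 0.1", "1 0.2 0.2 0.1 0.1", "0 0.3 0.3 0.2 0.2"]
print(count_label_lines(sample))
```

A heavily skewed count is a hint to capture or annotate more images of the rare classes before training.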
| 19 | + |
| 20 | + |
| 21 | +## Open-Source Datasets: |
| 22 | +### General Datasets |
| 23 | +[OpenImages](https://storage.googleapis.com/openimages/web/index.html) |
| 24 | + |
| 25 | +[MS COCO](https://cocodataset.org/#home) |
| 26 | + |
| 27 | +[Labelme](http://labelme.csail.mit.edu/Release3.0/browserTools/php/dataset.php) |
| 28 | + |
| 29 | +[ImageNet](http://image-net.org/) |
| 30 | + |
| 31 | +[COIL100](http://www1.cs.columbia.edu/CAVE/software/softlib/coil-100.php) |
| 32 | + |
| 33 | +Image to Language: |
| 34 | +[Visual Genome](http://visualgenome.org/) |
| 35 | +[Visual QA](http://www.visualqa.org/) |
| 36 | + |
| 37 | +[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) |
| 38 | + |
| 39 | + |
| 40 | +### Specific Application Datasets: |
| 41 | + |
| 42 | +[Chess Pieces](https://public.roboflow.com/object-detection/chess-full) |
| 43 | + |
| 43 | +[BCCD (Blood Cell Count and Detection)](https://public.roboflow.com/object-detection/bccd) |
| 45 | + |
| 46 | +[Mountain Dew](https://public.roboflow.com/object-detection/mountain-dew-commercial) |
| 47 | + |
| 48 | +[Pistols](https://public.roboflow.com/object-detection/pistols) |
| 49 | + |
| 50 | +[Packages](https://public.roboflow.com/object-detection/packages-dataset) |
| 51 | + |
| 52 | +[6-sided dice](https://public.roboflow.com/object-detection/dice) |
| 53 | + |
| 54 | +[Boggle board](https://public.roboflow.com/object-detection/boggle-boards) |
| 55 | + |
| 56 | +[Uno Cards](https://public.roboflow.com/object-detection/uno-cards) |
| 57 | + |
| 58 | +[Lego Bricks](https://www.kaggle.com/joosthazelzet/lego-brick-images) |
| 59 | + |
| 59 | +[YouTube-8M](https://research.google.com/youtube8m/index.html) |
| 61 | + |
| 62 | +[Synthetic Fruit](https://public.roboflow.com/object-detection/synthetic-fruit) |
| 63 | + |
| 64 | +[Fruit](https://public.roboflow.com/classification/fruits-dataset) |
| 65 | + |
| 66 | +Flowers: |
| 67 | +[Flower Classification 1](https://public.roboflow.com/classification/flowers_classification) |
| 68 | +[Flower Classification 2](https://public.roboflow.com/classification/flowers) |
| 69 | +[Flower Classification 3](http://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) |
| 70 | + |
| 71 | +Plants: |
| 72 | +[Plant Doc](https://public.roboflow.com/object-detection/plantdoc) |
| 73 | +[Plant Analysis](https://www.plant-image-analysis.org/dataset) |
| 74 | + |
| 75 | +[Wildfire smoke](https://public.roboflow.com/object-detection/wildfire-smoke) |
| 76 | + |
| 77 | +[Aerial Maritime Drone](https://public.roboflow.com/object-detection/aerial-maritime) |
| 78 | + |
| 79 | +[Anki Vector Robot](https://public.roboflow.com/object-detection/robot) |
| 80 | + |
| 81 | +[Home Objects](http://www.vision.caltech.edu/pmoreels/Datasets/Home_Objects_06/) |
| 82 | + |
| 83 | +Indoor Room Scenes: |
| 83 | +[Princeton LSUN](http://lsun.cs.princeton.edu/2016/) |
| 84 | +[MIT Torralba](http://web.mit.edu/torralba/www/indoor.html) |
| 86 | + |
| 87 | +[Places](http://places.csail.mit.edu/index.html) |
| 88 | + |
| 89 | +[Parking Lot](https://public.roboflow.com/object-detection/pklot) |
| 90 | + |
| 91 | +[Car Models](http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/index.html) |
| 92 | + |
| 93 | +[Improved Udacity Self Driving Car](https://public.roboflow.com/object-detection/self-driving-car) |
| 94 | + |
| 95 | +[Pothole](https://public.roboflow.com/object-detection/pothole) |
| 96 | + |
| 97 | +[Hard Hat](https://public.roboflow.com/object-detection/hard-hat-workers) |
| 98 | + |
| 99 | +[Masks](https://public.roboflow.com/object-detection/mask-wearing) |
| 100 | + |
| 101 | +#### People and Animals: |
| 102 | +[Aquarium](https://public.roboflow.com/object-detection/aquarium) |
| 103 | + |
| 104 | +[Brackish Underwater](https://public.roboflow.com/object-detection/brackish-underwater) |
| 105 | + |
| 105 | +[Raccoon](https://public.roboflow.com/object-detection/raccoon) |
| 107 | + |
| 108 | +[Thermal Cheetah](https://public.roboflow.com/object-detection/thermal-cheetah) |
| 109 | + |
| 109 | +[American Sign Language (ASL) Letters](https://public.roboflow.com/object-detection/american-sign-language-letters) |
| 111 | + |
| 111 | +[Rock Paper Scissors (RPS)](https://public.roboflow.com/classification/rock-paper-scissors) |
| 113 | + |
| 114 | +[Human Hands](https://public.roboflow.com/object-detection/hands) |
| 115 | + |
| 116 | +[Human Faces](http://vis-www.cs.umass.edu/lfw/) |
| 117 | + |
| 118 | +[Celebrity Faces](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) |
| 119 | + |
| 120 | +[Thermal Dogs and People](https://public.roboflow.com/object-detection/thermal-dogs-and-people) |
| 121 | + |
| 122 | +[Dogs](http://vision.stanford.edu/aditya86/ImageNetDogs/) |
| 123 | + |
| 124 | +[Dogs and Cats](https://public.roboflow.com/object-detection/oxford-pets) |
| 125 | + |
| 126 | + |
| 127 | +## Summary |
| 128 | +We reviewed how to create labels for custom images to build a dataset. We also reviewed where to access specific and general open-source datasets depending on your application. |
| 129 | + |
| 130 | +## See Also: |
| 131 | +- Using your [custom dataset to train YOLO on darknet for object detection](https://github.com/RoboticsKnowledgebase/roboticsknowledgebase.github.io.git/wiki/machine-learning/train-darknet-on-custom-dataset) |
| 132 | + |
| 133 | +## References |
| 134 | +[1] AlexeyAB (2019) Yolo_mark (Version ea049f3). <https://github.com/AlexeyAB/Yolo_mark>. |
| 135 | + |
| 136 | + |