Commit e741798

Updated Yolo guide with suggested changes
1 parent 5b4cab8 commit e741798

1 file changed: +2 -9 lines changed
guide/14-deep-learning/yolov3_object_detector.ipynb

Lines changed: 2 additions & 9 deletions
@@ -15,7 +15,7 @@
 "\n",
 "In the earlier works for object detection, models used to either use a sliding window technique or region proposal network. Sliding window, as the name suggests choses a Region of Interest (RoI) by sliding a window across the image and then performs classification in the chosen RoI to detect an object. Region proposal networks work in two steps - first, they extract region proposals and then using CNN features, classify the proposed regions. Sliding window method is not very precise and accurate and though some of the region-based networks can be highly accurate they tend to be slower.\n",
 "\n",
-"Then came along the one-shot object detectors such as [SSD](https://arxiv.org/abs/1512.02325), [YOLO](https://arxiv.org/pdf/1506.02640.pdf) and [RetinaNet](https://arxiv.org/abs/1708.02002). These models detect objects in a single pass of the image and, thus, are considerably faster, and can match up the accuracy of region-based detectors. The [SSD guide](https://developers.arcgis.com/python/guide/how-ssd-works/) explains the essentials components of a one-shot object detection model. You can also read up the RetinaNet guide [here](https://developers.arcgis.com/python/guide/how-retinanet-works/). These models are already a part of arcgis python API and the addition of [**YOLOv3**](https://arxiv.org/abs/1804.02767) provides another tool in our deep learning toolbox.\n",
+"Then came along the one-shot object detectors such as [SSD](https://arxiv.org/abs/1512.02325), [YOLO](https://arxiv.org/pdf/1506.02640.pdf) and [RetinaNet](https://arxiv.org/abs/1708.02002). These models detect objects in a single pass of the image and, thus, are considerably faster, and can match up the accuracy of region-based detectors. The [SSD guide](https://developers.arcgis.com/python/guide/how-ssd-works/) explains the essentials components of a one-shot object detection model. You can also read up the RetinaNet guide [here](https://developers.arcgis.com/python/guide/how-retinanet-works/). These models are already a part of ArcGIS API for Python and the addition of [**YOLOv3**](https://arxiv.org/abs/1804.02767) provides another tool in our deep learning toolbox.\n",
 "\n",
 "The biggest advantage of YOLOv3 in `arcgis.learn` is that it comes preloaded with weights pretrained on the [COCO dataset](https://cocodataset.org/). This makes it ready-to-use for the 80 common objects (car, truck, person, etc.) that are part of the COCO dataset."
 ]
@@ -41,7 +41,7 @@
 "source": [
 "YOLOv3 uses **Darknet-53** as its backbone. This contrasts with the use of popular ResNet family of backbones by other models such as SSD and RetinaNet. Darknet-53 is a deeper version of Darknet-19 which was used in [YOLOv2](https://arxiv.org/pdf/1612.08242.pdf), a prior version. As the name suggests, this backbone architecture has 53 convolutional layers. Adapting the ResNet style residual layers has improved its accuracy but still maintaining the speed advantage. This feature extractor performs better than ResNet101 and similar to ResNet152 while being about 1.5x and 2x faster, respectively [2].\n",
 "\n",
-"YOLOv3 has incremental improvements over its prior versions [2]. It uses upsampling and concatenation of feature layers with earlier feature layers which preserves fine-grained features. Another improvement is using three scales for detection. This has made the model good at detecting objects of varying scales in an image. There are other improvements in anchor box selections, loss function, etc. For a detailed analysis of the YOLOv3 architecture, please refer to this [blog](https://towardsdatascience.com/yolo-v3-object-detection-53fb7d3bfe6b)."
+"YOLOv3 has incremental improvements over its prior versions [2]. It uses upsampling and concatenation of feature layers with earlier feature layers which preserve fine-grained features. Another improvement is using three scales for detection. This has made the model good at detecting objects of varying scales in an image. There are other improvements in anchor box selections, loss function, etc. For a detailed analysis of the YOLOv3 architecture, please refer to this [blog](https://towardsdatascience.com/yolo-v3-object-detection-53fb7d3bfe6b)."
 ]
 },
 {
@@ -120,13 +120,6 @@
 "* [2] Joseph Redmon, Ali Farhadi: \"YOLOv3: An Incremental Improvement\", 2018; [https://arxiv.org/abs/1804.02767 arXiv:1804.02767].\n",
 "* [3] Ayoosh Katuria, \"What’s new in YOLO v3?\", https://towardsdatascience.com/yolo-v3-object-detection-53fb7d3bfe6b."
 ]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": []
 }
 ],
 "metadata": {

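Note on the "three scales for detection" sentence in the second hunk: the sketch below illustrates what that means in terms of grid sizes. It is not part of this commit or of `arcgis.learn`. The function name `detection_grids` is hypothetical, and the strides (32, 16, 8) and three anchors per scale are assumptions taken from the YOLOv3 paper [2], not from the notebook text.

```python
def detection_grids(input_size=416, strides=(32, 16, 8), anchors_per_scale=3):
    """Return the grid size at each detection scale and the total box count.

    Assumed from the YOLOv3 paper, not from the notebook: predictions are
    made at strides 32, 16 and 8, with 3 anchor boxes per grid cell.
    """
    if any(input_size % s for s in strides):
        raise ValueError("input_size must be divisible by every stride")
    # Coarse grid (large stride) targets large objects; fine grid targets small ones.
    grids = [input_size // s for s in strides]
    total_boxes = sum(g * g * anchors_per_scale for g in grids)
    return grids, total_boxes


grids, total = detection_grids(416)
print(grids, total)  # [13, 26, 52] 10647
```

Under these assumptions, a 416x416 input yields 13x13, 26x26 and 52x52 grids, which is why the finer scales help with small objects.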