|
1 |
| -# Grad-CAM-in-TensorFlow |
| 1 | +## Grad-CAM implementation in TensorFlow |
| 2 | + |
| 3 | +This repo is a TensorFlow implementation of Gradient-weighted Class Activation Mapping (Grad-CAM [1]), |
| 4 | +a visualization technique for deep learning networks. |
| 5 | + |
| 6 | +This repo is based on the [Torch](https://github.com/ramprs/grad-cam) and [Keras](https://github.com/jacobgil/keras-grad-cam) implementations of Grad-CAM. |
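As a rough illustration of the computation described in [1], the sketch below averages the class-score gradients over each feature map to get one weight per channel, forms the weighted sum of the feature maps, and applies a ReLU. It uses plain Python lists and made-up numbers; the function name `grad_cam` and the toy inputs are assumptions for illustration and are not taken from `grad-cam-tf.py`.

```python
def grad_cam(feature_maps, gradients):
    """Compute a Grad-CAM heatmap from K feature maps and their gradients.

    feature_maps, gradients: lists of K maps, each an H x W list of floats.
    """
    # Channel weight: global average of the class-score gradient per map.
    weights = []
    for grad in gradients:
        values = [g for row in grad for g in row]
        weights.append(sum(values) / len(values))

    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    # Weighted sum of the feature maps.
    for alpha, fmap in zip(weights, feature_maps):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * fmap[i][j]

    # ReLU keeps only locations with a positive influence on the class.
    return [[max(0.0, v) for v in row] for row in heatmap]


# Toy input (hypothetical numbers): two 2x2 feature maps, constant gradients.
maps = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
cam = grad_cam(maps, grads)
```

In a real network the feature maps would come from the last convolutional layer and the gradients from backpropagating the chosen class score; the resulting heatmap is then upsampled to the input image size.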
| 7 | + |
| 8 | +### Requirements |
| 9 | +* Python 3.x |
| 10 | +* TensorFlow 1.x |
| 11 | +* [tensorflow-vgg](https://github.com/machrisaa/tensorflow-vgg), |
| 12 | +which provides the VGG16 classification model pretrained on the ImageNet dataset as the file `VGG16.npy` (see its README for how to download it) |
| 13 | + |
| 14 | +### Usage |
| 15 | + |
| 16 | +```bash |
| 17 | +python grad-cam-tf.py <path_to_image> <path_to_VGG16_npy> [top_n] |
| 18 | +``` |
| 19 | +* `path_to_image`: path to the image for which Grad-CAM is calculated. |
| 20 | +* `path_to_VGG16_npy`: path to the pretrained VGG16 model data provided in [`tensorflow-vgg`](https://github.com/machrisaa/tensorflow-vgg). |
| 21 | +* `top_n`: Optional. Grad-CAM is calculated for each of the top `top_n` classes predicted by VGG16. |
| 22 | + |
| 23 | +The following images are saved to the same directory as `path_to_image`: |
| 24 | +* the input image overlaid with the Grad-CAM heatmap (suffix: `gradcam`) |
| 25 | +* the black-and-white Grad-CAM heatmap (suffix: `heatmap`) |
| 26 | +* the image segmented by the Grad-CAM heatmap (suffix: `segmented`) |
| 27 | +* Guided Backpropagation [2] (suffix: `guided_bprop`) |
| 28 | +* The rank of the predicted class, the class name, and the probability are also appended as suffixes to each of the file names above. |
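The overlay and segmented outputs listed above can be pictured with a small sketch: normalize the heatmap to [0, 1], blend it with the image, and mask out pixels where the heatmap is low. This is only an illustration on grayscale nested lists; the function names, the blend factor `alpha`, and the `threshold` are assumptions, and the actual script may produce its images differently.

```python
def normalize(heatmap):
    """Scale a 2-D heatmap to the range [0, 1]."""
    flat = [v for row in heatmap for v in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1.0  # avoid division by zero for a flat map
    return [[(v - low) / span for v in row] for row in heatmap]


def overlay(image, heatmap, alpha=0.5):
    """Blend a grayscale image with a normalized heatmap, pixel by pixel."""
    return [
        [(1 - alpha) * p + alpha * h for p, h in zip(img_row, heat_row)]
        for img_row, heat_row in zip(image, heatmap)
    ]


def segment(image, heatmap, threshold=0.5):
    """Keep pixels whose heatmap value reaches the threshold; zero the rest."""
    return [
        [p if h >= threshold else 0 for p, h in zip(img_row, heat_row)]
        for img_row, heat_row in zip(image, heatmap)
    ]


# Hypothetical 2x2 example.
heat = normalize([[0.0, 2.0], [4.0, 8.0]])
blended = overlay([[1.0, 1.0], [0.0, 0.0]], heat)
masked = segment([[10, 20], [30, 40]], heat)
```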
| 29 | + |
| 30 | + |
| 31 | +### Examples |
| 32 | + |
| 33 | +* Example image from the [original implementation](https://github.com/ramprs/grad-cam): |
| 34 | + |
| 35 | +the most probable class: 'boxer'(242) |
| 36 | + |
| 37 | + |
| 38 | + |
| 39 | + |
| 40 | + |
| 41 | +* Image from [3]: |
| 42 | + |
| 43 | +the most probable class: 'desktop computer'(527) |
| 44 | + |
| 45 | + |
| 46 | + |
| 47 | + |
| 48 | + |
| 49 | +ground truth (the second most probable class): 'desk'(526) |
| 50 | + |
| 51 | + |
| 52 | + |
| 53 | + |
| 54 | +* Image from [fatal crash by Uber self-driving SUV](https://www.wthr.com/article/police-release-video-of-fatal-crash-by-uber-self-driving-suv): |
| 55 | + |
| 56 | +the most probable class: 'traffic_light'(920) |
| 57 | + |
| 58 | + |
| 59 | + |
| 60 | + |
| 61 | + |
| 62 | +the 6th most probable class: 'motor_scooter'(670) |
| 63 | + |
| 64 | + |
| 65 | + |
| 66 | + |
| 67 | +the 19th most probable class: 'moped'(665) |
| 68 | + |
| 69 | + |
| 70 | + |
| 71 | + |
| 72 | +The lines below are the top 20 classes predicted by the VGG16 model. |
| 73 | + |
| 74 | +As you can see, VGG16 fails to detect bicycle-related objects in this picture with any confidence (`motor_scooter`, `moped`, and `bicycle-built-for-two` appear only with low probability). |
| 75 | + |
| 76 | +```text |
| 77 | +0: class id: 920, class name: traffic_light, probability: 0.105, synset: n06874185 traffic light, traffic signal, stoplight |
| 78 | +1: class id: 627, class name: limousine, probability: 0.092, synset: n03670208 limousine, limo |
| 79 | +2: class id: 475, class name: car_mirror, probability: 0.039, synset: n02965783 car mirror |
| 80 | +3: class id: 818, class name: spotlight, probability: 0.038, synset: n04286575 spotlight, spot |
| 81 | +4: class id: 971, class name: bubble, probability: 0.037, synset: n09229709 bubble |
| 82 | +5: class id: 670, class name: motor_scooter, probability: 0.027, synset: n03791053 motor scooter, scooter |
| 83 | +6: class id: 656, class name: minivan, probability: 0.024, synset: n03770679 minivan |
| 84 | +7: class id: 845, class name: syringe, probability: 0.024, synset: n04376876 syringe |
| 85 | +8: class id: 864, class name: tow_truck, probability: 0.022, synset: n04461696 tow truck, tow car, wrecker |
| 86 | +9: class id: 867, class name: trailer_truck, probability: 0.018, synset: n04467665 trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi |
| 87 | +10: class id: 555, class name: fire_engine, probability: 0.016, synset: n03345487 fire engine, fire truck |
| 88 | +11: class id: 468, class name: cab, probability: 0.016, synset: n02930766 cab, hack, taxi, taxicab |
| 89 | +12: class id: 407, class name: ambulance, probability: 0.015, synset: n02701002 ambulance |
| 90 | +13: class id: 736, class name: pool_table, probability: 0.013, synset: n03982430 pool table, billiard table, snooker table |
| 91 | +14: class id: 836, class name: sunglass, probability: 0.013, synset: n04355933 sunglass |
| 92 | +15: class id: 772, class name: safety_pin, probability: 0.010, synset: n04127249 safety pin |
| 93 | +16: class id: 837, class name: sunglasses, probability: 0.010, synset: n04356056 sunglasses, dark glasses, shades |
| 94 | +17: class id: 754, class name: radio, probability: 0.010, synset: n04041544 radio, wireless |
| 95 | +18: class id: 665, class name: moped, probability: 0.010, synset: n03785016 moped |
| 96 | +19: class id: 444, class name: bicycle-built-for-two, probability: 0.009, synset: n02835271 bicycle-built-for-two, tandem bicycle, tandem |
| 97 | +``` |
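A listing like the one above can be produced by sorting the class probabilities and keeping the first `top_n` entries. The sketch below is a minimal, assumed version of that step (the helper name `top_n_classes` is hypothetical); it uses 0-based ranks as in the listing and fills in only three of the probabilities shown above.

```python
def top_n_classes(probabilities, n):
    """Return (rank, class_id, probability) for the n most probable classes."""
    ranked = sorted(
        enumerate(probabilities), key=lambda item: item[1], reverse=True
    )
    return [
        (rank, class_id, prob)
        for rank, (class_id, prob) in enumerate(ranked[:n])
    ]


# 1000 ImageNet classes; only three values from the listing are filled in.
probs = [0.0] * 1000
probs[920] = 0.105  # traffic_light
probs[627] = 0.092  # limousine
probs[475] = 0.039  # car_mirror

top3 = top_n_classes(probs, 3)
```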
| 98 | + |
| 99 | +Uber has presumably built in-house classification models for its self-driving cars. |
| 100 | +But detecting people on the road using visible light alone can be difficult, especially at night. |
| 101 | +As reported in [Report: Uber's Self-Driving Car Sensors Ignored Cyclist In Fatal Accident](https://gizmodo.com/report-ubers-self-driving-car-sensors-ignored-cyclist-1825832504), |
| 102 | +it seems that a woman crossing the street with a bicycle was ignored. |
| 103 | + |
| 104 | +A detector using an infrared camera would also be needed to avoid such tragic accidents. |
| 105 | + |
| 106 | +## Note |
| 107 | + |
| 108 | +For your information, |
| 109 | +`Grad-CAM++`, a variant of `Grad-CAM`, is presented in [4]. The authors provide their [implementation](https://github.com/adityac94/Grad_CAM_plus_plus). |
| 110 | +As pointed out in [Questions for computing of derivatives](https://github.com/adityac94/Grad_CAM_plus_plus/issues/1), |
| 111 | +their implementation looks questionable. For example, the second derivative (`d2f/dx2`) in [4] is coded as the square of the first derivative (`(df/dx)^2`), |
| 112 | +whereas second derivatives would normally be computed with [`tf.hessians`](https://www.tensorflow.org/api_docs/python/tf/hessians). |
| 113 | + |
| 114 | +So I am not sure whether the experimental results in [4] are correct. |
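To make the distinction concrete, here is a small numerical check, independent of TensorFlow and of the Grad-CAM++ code, showing that for a generic function the second derivative and the squared first derivative are different quantities. The test function `cube` and the finite-difference helpers are chosen purely for illustration.

```python
def first_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of d2f/dx2."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def cube(x):
    return x ** 3


x = 2.0
d2 = second_derivative(cube, x)              # analytically 6x = 12
d1_squared = first_derivative(cube, x) ** 2  # analytically (3x^2)^2 = 144
```

For `cube` at `x = 2.0` the two values are roughly 12 and 144, so substituting one for the other is not valid in general.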
| 115 | + |
| 116 | +## References |
| 117 | + |
| 118 | +[1] |
| 119 | +Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra. |
| 120 | +Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, |
| 121 | +[arXiv](https://arxiv.org/abs/1610.02391), 2016 |
| 122 | + |
| 123 | +[2] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for Simplicity: |
| 124 | +The All Convolutional Net, [arXiv](https://arxiv.org/abs/1412.6806), 2014 |
| 125 | + |
| 126 | +[3] https://pair-code.github.io/saliency/ |
| 127 | + |
| 128 | +[4] Aditya Chattopadhyay, Anirban Sarkar, Prantik Howlader, Vineeth N Balasubramanian. |
| 129 | +Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks, |
| 130 | +[arXiv](https://arxiv.org/abs/1710.11063), 2017 |