|
1 | 1 | # Computer Pointer Controller
|
2 |
| -https://youtu.be/ZJ8y--zcBag |
3 |
| -*TODO:* Write a short introduction to your project |
| 2 | +_______________ |
| 3 | +## Introduction |
| 4 | +Computer Pointer Controller is an application that uses a gaze estimation model to control the mouse pointer of your computer. |
| 5 | +The mouse pointer moves to follow the user's gaze. The [Gaze Estimation](https://docs.openvinotoolkit.org/latest/_models_intel_gaze_estimation_adas_0002_description_gaze_estimation_adas_0002.html) model estimates the direction of the user's gaze, and the result is fed into the `pyautogui` module to change the position of the mouse pointer. |
| 6 | + |
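The hand-off from the gaze estimate to `pyautogui` can be sketched roughly as follows. This is a minimal illustration, not the project's `mouse_controller.py`: the helper names `gaze_to_offset` and `move_pointer`, the `scale` factor, and the sign convention for the y axis are assumptions made for the example.

```python
from typing import Sequence, Tuple


def gaze_to_offset(gaze_vector: Sequence[float], scale: float = 100.0) -> Tuple[float, float]:
    """Convert a 3D gaze vector (x, y, z) into a relative pointer offset in pixels.

    Assumption: gaze y grows upwards while screen y grows downwards,
    so the y component is flipped. `scale` is a hypothetical sensitivity factor.
    """
    x, y = gaze_vector[0], gaze_vector[1]
    return x * scale, -y * scale


def move_pointer(gaze_vector: Sequence[float], scale: float = 100.0, duration: float = 0.1) -> None:
    """Move the mouse pointer relative to its current position."""
    # Imported lazily so gaze_to_offset can be used on machines without a display.
    import pyautogui

    dx, dy = gaze_to_offset(gaze_vector, scale)
    pyautogui.moveRel(dx, dy, duration=duration)
```

Keeping the coordinate conversion separate from the `pyautogui` call makes the mapping easy to test without actually moving the pointer.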
| 7 | +The pipeline of the application is shown below: |
| 8 | + |
| 9 | + |
| 10 | +### Live Demo: |
| 11 | +Recorded video of running the project: [Demo video of project](https://youtu.be/ZJ8y--zcBag) |
| 12 | + |
| 13 | +### Screenshot: |
| 14 | + |
4 | 15 |
|
5 | 16 | ## Project Set Up and Installation
|
6 |
| -*TODO:* Explain the setup procedures to run your project. For instance, this can include your project directory structure, the models you need to download and where to place them etc. Also include details about how to install the dependencies your project requires. |
| 17 | +_______________ |
| 18 | +**1.Prerequisites** |
| 19 | +- [Install Intel® Distribution of OpenVINO™ toolkit for Windows* 10](https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_windows.html#model_optimizer_configuration_steps), or install it on a Linux system instead. |
| 20 | +- Install the dependencies listed in `requirements.txt` in the project directory with: `pip3 install -r requirements.txt` |
| 21 | + |
| 22 | +**2.Environment setup** |
| 23 | +Initialize the OpenVINO environment (run this command in `cmd`) |
| 24 | +**Important!!!** |
| 25 | +```sh |
| 26 | +cd C:\Program Files (x86)\IntelSWTools\openvino\bin\ && setupvars.bat |
| 27 | +``` |
| 28 | +**3.Download the required model** |
| 29 | +- Download the required models: |
| 30 | + - [Face Detection](https://docs.openvinotoolkit.org/latest/_models_intel_face_detection_adas_binary_0001_description_face_detection_adas_binary_0001.html) |
| 31 | + - [Head Pose Estimation](https://docs.openvinotoolkit.org/latest/_models_intel_head_pose_estimation_adas_0001_description_head_pose_estimation_adas_0001.html) |
| 32 | + - [Facial Landmarks Detection](https://docs.openvinotoolkit.org/latest/_models_intel_landmarks_regression_retail_0009_description_landmarks_regression_retail_0009.html) |
| 33 | + - [Gaze Estimation Model](https://docs.openvinotoolkit.org/latest/_models_intel_gaze_estimation_adas_0002_description_gaze_estimation_adas_0002.html) |
| 34 | + |
| 35 | +- These can be downloaded using the `model downloader`. |
| 36 | +- `cd` to the project directory and run the commands below to download the models. |
| 37 | + ```sh |
| 38 | + python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name face-detection-adas-binary-0001 |
| 39 | + |
| 40 | + python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name head-pose-estimation-adas-0001 |
| 41 | + |
| 42 | + python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name landmarks-regression-retail-0009 |
| 43 | + |
| 44 | + python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name gaze-estimation-adas-0002 |
| 45 | +``` |
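The four downloader calls differ only in the model name, so they can also be generated from a list. The sketch below assumes the default Windows install path used in the commands above; the helper names `download_commands` and `download_all` are made up for this example, and only the `--name` option shown above is used.

```python
import subprocess
import sys
from pathlib import PureWindowsPath

# Assumed default install location, copied from the commands above.
DOWNLOADER = PureWindowsPath(
    r"C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools"
    r"\open_model_zoo\tools\downloader\downloader.py"
)

MODELS = [
    "face-detection-adas-binary-0001",
    "head-pose-estimation-adas-0001",
    "landmarks-regression-retail-0009",
    "gaze-estimation-adas-0002",
]


def download_commands(python: str = sys.executable) -> list:
    """Build one downloader command (as an argument list) per required model."""
    return [[python, str(DOWNLOADER), "--name", model] for model in MODELS]


def download_all() -> None:
    """Run the downloader for every model; call this from the project directory."""
    for command in download_commands():
        subprocess.run(command, check=True)
```

Passing the command as an argument list avoids having to quote the space in `Program Files` by hand.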
| 46 | + |
| 47 | + |
| 48 | +The source structure of the project is shown below: |
| 49 | +``` |
| 50 | +E:\Intel-AI\Computer-Pointer-Controller-OpenVINO>tree /a /f |
| 51 | +Folder PATH listing for volume entertainment |
| 52 | +Volume serial number is 0003-B93B |
| 53 | +E:. |
| 54 | +| .Instructions.md.swp |
| 55 | +| README.md |
| 56 | +| requirements.txt |
| 57 | +| |
| 58 | ++---bin |
| 59 | +| .gitkeep |
| 60 | +| demo.mp4 |
| 61 | +| pipeline.png |
| 62 | +| show_app.PNG |
| 63 | +| |
| 64 | ++---intel |
| 65 | +| +---face-detection-adas-binary-0001 |
| 66 | +| | \---FP32-INT1 |
| 67 | +| | face-detection-adas-binary-0001.bin |
| 68 | +| | face-detection-adas-binary-0001.xml |
| 69 | +| | |
| 70 | +| +---gaze-estimation-adas-0002 |
| 71 | +| | +---FP16 |
| 72 | +| | | gaze-estimation-adas-0002.bin |
| 73 | +| | | gaze-estimation-adas-0002.xml |
| 74 | +| | | |
| 75 | +| | +---FP16-INT8 |
| 76 | +| | | gaze-estimation-adas-0002.bin |
| 77 | +| | | gaze-estimation-adas-0002.xml |
| 78 | +| | | |
| 79 | +| | \---FP32 |
| 80 | +| | gaze-estimation-adas-0002.bin |
| 81 | +| | gaze-estimation-adas-0002.xml |
| 82 | +| | |
| 83 | +| +---head-pose-estimation-adas-0001 |
| 84 | +| | +---FP16 |
| 85 | +| | | head-pose-estimation-adas-0001.bin |
| 86 | +| | | head-pose-estimation-adas-0001.xml |
| 87 | +| | | |
| 88 | +| | +---FP16-INT8 |
| 89 | +| | | head-pose-estimation-adas-0001.bin |
| 90 | +| | | head-pose-estimation-adas-0001.xml |
| 91 | +| | | |
| 92 | +| | \---FP32 |
| 93 | +| | head-pose-estimation-adas-0001.bin |
| 94 | +| | head-pose-estimation-adas-0001.xml |
| 95 | +| | |
| 96 | +| \---landmarks-regression-retail-0009 |
| 97 | +| +---FP16 |
| 98 | +| | landmarks-regression-retail-0009.bin |
| 99 | +| | landmarks-regression-retail-0009.xml |
| 100 | +| | |
| 101 | +| +---FP16-INT8 |
| 102 | +| | landmarks-regression-retail-0009.bin |
| 103 | +| | landmarks-regression-retail-0009.xml |
| 104 | +| | |
| 105 | +| \---FP32 |
| 106 | +| landmarks-regression-retail-0009.bin |
| 107 | +| landmarks-regression-retail-0009.xml |
| 108 | +| |
| 109 | +\---src |
| 110 | + | face_detection.py |
| 111 | + | facial_landmarks_detection.py |
| 112 | + | file_explain.md |
| 113 | + | gaze_estimation.py |
| 114 | + | head_pose_estimation.py |
| 115 | + | input_feeder.py |
| 116 | + | main.py |
| 117 | + | model.py |
| 118 | + | mouse_controller.py |
| 119 | + | Project_log.log |
| 120 | + | |
| 121 | + \---__pycache__ |
| 122 | + face_detection.cpython-37.pyc |
| 123 | + facial_landmarks_detection.cpython-37.pyc |
| 124 | + gaze_estimation.cpython-37.pyc |
| 125 | + head_pose_estimation.cpython-37.pyc |
| 126 | + input_feeder.cpython-37.pyc |
| 127 | + mouse_controller.cpython-37.pyc |
| 128 | +
|
| 129 | +``` |
7 | 130 |
|
8 | 131 | ## Demo
|
9 |
| -*TODO:* Explain how to run a basic demo of your model. |
10 |
| -FP32 |
| 132 | +_______________ |
| 133 | +**1. `cd` to the `src` folder first** |
| 134 | + |
| 135 | +**2. Template for running `main.py`** |
| 136 | +``` |
| 137 | +python main.py -f <Path of xml file for face detection model> -fl <Path of xml file for facial landmarks detection model> -hp <Path of xml file for head pose estimation model> -g <Path of xml file for gaze estimation model> -i <Path of input video file or enter cam for feeding input from webcam> -d <choose the device to run the model (default CPU)> -flags <select the visualization: fd fld hp ge> |
| 138 | +``` |
| 139 | + |
| 140 | +**3. Examples of running `main.py`:** |
| 141 | +- Running the models with precision **FP32** on the **CPU**: |
11 | 142 | ```sh
|
12 | 143 | python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP32\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP32\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP32\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
|
13 | 144 | ```
|
14 |
| -FP16 |
| 145 | + |
| 146 | +- Running the models with precision **FP16** on the **CPU**: |
15 | 147 | ```sh
|
16 | 148 | python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP16\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP16\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP16\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
|
17 | 149 | ```
|
18 |
| -FP16-INT8 |
| 150 | + |
| 151 | +- Running the models with precision **FP16-INT8** on the **CPU**: |
19 | 152 | ```sh
|
20 | 153 | python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP16-INT8\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP16-INT8\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP16-INT8\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
|
21 | 154 | ```
|
| 155 | + |
| 156 | +- Running the models with precision **FP32** on the **CPU**, with input from the **webcam**: |
| 157 | +```sh |
| 158 | +python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP32\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP32\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP32\gaze-estimation-adas-0002.xml" -i cam -d CPU -flags fd fld hp ge |
| 159 | +``` |
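The example commands above differ only in the precision folder and the input source, so they can be assembled programmatically. This is a sketch under the assumption that the models sit in the `intel` folder layout shown in the project tree; `model_xml` and `demo_command` are hypothetical helper names, not part of the project.

```python
from pathlib import PureWindowsPath


def model_xml(root: str, name: str, precision: str) -> str:
    """Path to a model's .xml file inside <root>\\intel\\<name>\\<precision>."""
    return str(PureWindowsPath(root) / "intel" / name / precision / f"{name}.xml")


def demo_command(root: str, precision: str, source: str = "cam", device: str = "CPU") -> list:
    """Build the argument list for main.py for one precision.

    The face detection model is only available as FP32-INT1, so it ignores `precision`.
    """
    return [
        "python", "main.py",
        "-f", model_xml(root, "face-detection-adas-binary-0001", "FP32-INT1"),
        "-fl", model_xml(root, "landmarks-regression-retail-0009", precision),
        "-hp", model_xml(root, "head-pose-estimation-adas-0001", precision),
        "-g", model_xml(root, "gaze-estimation-adas-0002", precision),
        "-i", source,
        "-d", device,
        "-flags", "fd", "fld", "hp", "ge",
    ]
```

The resulting list can be passed to `subprocess.run` from the `src` folder, for example to loop over `FP32`, `FP16` and `FP16-INT8` when collecting benchmark logs.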
| 160 | + |
22 | 161 | ## Documentation
|
23 |
| -*TODO:* Include any documentation that users might need to better understand your project code. For instance, this is a good place to explain the command line arguments that your project supports. |
| 162 | +_______________ |
24 | 163 |
|
| 164 | +### Command line arguments |
| 165 | +**Run `python main.py -h` to get help on the application's command line arguments.** |
| 166 | +``` |
| 167 | +E:\Intel-AI\Computer-Pointer-Controller-OpenVINO\src>python main.py -h |
| 168 | +usage: main.py [-h] -f FACEDETECTIONMODEL -fl FACIALLANDMARKMODEL -hp |
| 169 | + HEADPOSEMODEL -g GAZEESTIMATIONMODEL -i INPUT |
| 170 | + [-flags PREVIEWFLAGS [PREVIEWFLAGS ...]] [-l CPU_EXTENSION] |
| 171 | + [-prob PROB_THRESHOLD] [-d DEVICE] |
| 172 | +
|
| 173 | +optional arguments: |
| 174 | + -h, --help show this help message and exit |
| 175 | + -f FACEDETECTIONMODEL, --facedetectionmodel FACEDETECTIONMODEL |
| 176 | + Specify Path to .xml file of Face Detection model. |
| 177 | + -fl FACIALLANDMARKMODEL, --faciallandmarkmodel FACIALLANDMARKMODEL |
| 178 | + Specify Path to .xml file of Facial Landmark Detection |
| 179 | + model. |
| 180 | + -hp HEADPOSEMODEL, --headposemodel HEADPOSEMODEL |
| 181 | + Specify Path to .xml file of Head Pose Estimation |
| 182 | + model. |
| 183 | + -g GAZEESTIMATIONMODEL, --gazeestimationmodel GAZEESTIMATIONMODEL |
| 184 | + Specify Path to .xml file of Gaze Estimation model. |
| 185 | + -i INPUT, --input INPUT |
| 186 | + Specify Path to video file or enter cam for webcam |
| 187 | + -flags PREVIEWFLAGS [PREVIEWFLAGS ...], --previewFlags PREVIEWFLAGS [PREVIEWFLAGS ...] |
| 188 | + Specify the flags from fd, fld, hp, ge like --flags fd |
| 189 | + hp fld (Seperate each flag by space)for see the |
| 190 | + visualization of different model outputs of each |
| 191 | + frame,fd for Face Detection, fld for Facial Landmark |
| 192 | + Detectionhp for Head Pose Estimation, ge for Gaze |
| 193 | + Estimation. |
| 194 | + -l CPU_EXTENSION, --cpu_extension CPU_EXTENSION |
| 195 | + MKLDNN (CPU)-targeted custom layers.Absolute path to a |
| 196 | + shared library with thekernels impl. |
| 197 | + -prob PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD |
| 198 | + Probability threshold for model to detect the face |
| 199 | + accurately from the video frame. |
| 200 | + -d DEVICE, --device DEVICE |
| 201 | + Specify the target device to infer on: CPU, GPU, FPGA |
| 202 | + or MYRIAD is acceptable. Sample will look for a |
| 203 | + suitable plugin for device specified (CPU by default) |
| 204 | +``` |
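A parser that accepts the same options as the help text above could be set up roughly like this. It is a reconstruction from the help output, not the code in `main.py`: the help strings are shortened, and the default of `0.6` for `-prob` is an assumption, since the help text does not state one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser mirroring the options listed in the help output."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--facedetectionmodel", required=True, type=str,
                        help="Path to .xml file of Face Detection model.")
    parser.add_argument("-fl", "--faciallandmarkmodel", required=True, type=str,
                        help="Path to .xml file of Facial Landmark Detection model.")
    parser.add_argument("-hp", "--headposemodel", required=True, type=str,
                        help="Path to .xml file of Head Pose Estimation model.")
    parser.add_argument("-g", "--gazeestimationmodel", required=True, type=str,
                        help="Path to .xml file of Gaze Estimation model.")
    parser.add_argument("-i", "--input", required=True, type=str,
                        help="Path to video file, or 'cam' for webcam.")
    parser.add_argument("-flags", "--previewFlags", required=False, nargs="+", default=[],
                        help="Visualizations to show: fd, fld, hp, ge (space separated).")
    parser.add_argument("-l", "--cpu_extension", required=False, type=str, default=None,
                        help="Absolute path to a shared library with custom CPU layers.")
    # The default threshold below is an assumption; the help text does not state it.
    parser.add_argument("-prob", "--prob_threshold", required=False, type=float, default=0.6,
                        help="Probability threshold for face detection.")
    parser.add_argument("-d", "--device", required=False, type=str, default="CPU",
                        help="Target device: CPU, GPU, FPGA or MYRIAD.")
    return parser
```

With this layout, `-flags fd hp` yields `args.previewFlags == ["fd", "hp"]`, and omitting `-d` falls back to `CPU`, matching the documented default.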
25 | 205 | ## Benchmarks
|
26 |
| -*TODO:* Include the benchmark results of running your model on multiple hardwares and multiple model precisions. Your benchmarks can include: model loading time, input/output processing time, model inference time etc. |
27 |
| -FP32 |
| 206 | +_______________ |
| 207 | + |
| 208 | +- FP32, FP16 and FP16-INT8 were tested on my CPU: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz |
| 209 | +- Measured: total model load time, total inference time, FPS, and model size |
| 210 | + |
| 211 | +#### FP32 Project_log |
28 | 212 | ```
|
29 | 213 | INFO:root:Model Load time: 0.9885485172271729
|
30 | 214 | INFO:root:Inference time: 28.26241898536682
|
31 | 215 | INFO:root:FPS: 2.0877565463552723
|
32 | 216 | ERROR:root:VideoStream ended...
|
33 | 217 | ```
|
34 |
| -FP16 |
| 218 | + |
| 219 | +#### FP16 Project_log |
35 | 220 | ```
|
36 | 221 | INFO:root:Model Load time: 0.9749979972839355
|
37 | 222 | INFO:root:Inference time: 28.06306004524231
|
38 | 223 | INFO:root:FPS: 2.1026372059871705
|
39 | 224 | ERROR:root:VideoStream ended...
|
40 | 225 | ```
|
41 |
| -FP16-INT8 |
| 226 | +#### FP16-INT8 Project_log |
42 | 227 | ```
|
43 | 228 | INFO:root:Model Load time: 1.3344995975494385
|
44 |
| -INFO:root:Inference time: 28.29212784767151 |
45 |
| -INFO:root:FPS: 2.0855425945563804 |
| 229 | +INFO:root:Inference time: 27.815059900283813 |
| 230 | +INFO:root:FPS: 2.12077641984184 |
46 | 231 | ERROR:root:VideoStream ended...
|
47 | 232 | ```
|
| 233 | + |
| 234 | +### FP32 |
| 235 | +|Model|Type|Size| |
| 236 | +| ------ | ------ | ------ | |
| 237 | +|face-detection-adas-binary-0001|FP32-INT1|1.86 MB| |
| 238 | +|head-pose-estimation-adas-0001 |FP32|7.34 MB| |
| 239 | +|landmarks-regression-retail-0009|FP32|786 KB| |
| 240 | +|gaze-estimation-adas-0002|FP32|7.24 MB| |
| 241 | + |
| 242 | +### FP16 |
| 243 | +|Model|Type|Size| |
| 244 | +| ------ | ------ | ------ | |
| 245 | +|face-detection-adas-binary-0001|FP32-INT1|1.86 MB| |
| 246 | +|head-pose-estimation-adas-0001 |FP16|3.69 MB| |
| 247 | +|landmarks-regression-retail-0009|FP16|413 KB| |
| 248 | +|gaze-estimation-adas-0002|FP16|3.69 MB| |
| 249 | + |
| 250 | +### FP16-INT8 |
| 251 | +|Model|Type|Size| |
| 252 | +| ------ | ------ | ------ | |
| 253 | +|face-detection-adas-binary-0001|FP32-INT1|1.86 MB| |
| 254 | +|head-pose-estimation-adas-0001 |FP16-INT8|2.05 MB| |
| 255 | +|landmarks-regression-retail-0009|FP16-INT8|314 KB| |
| 256 | +|gaze-estimation-adas-0002|FP16-INT8|2.09 MB| |
| 257 | + |
| 258 | +### Comparison |
| 259 | +||Total Model Load time (sec)|Total Inference Time (sec)|FPS| |
| 260 | +| ------ | ------ | ------ | ------ | |
| 261 | +|FP32|0.9885485172271729|28.26241898536682|2.0877565463552723| |
| 262 | +|FP16|0.9749979972839355|28.06306004524231|2.1026372059871705| |
| 263 | +|FP16-INT8|1.3344995975494385|27.815059900283813|2.12077641984184| |
| 264 | + |
48 | 265 | ## Results
|
49 |
| -*TODO:* Discuss the benchmark results and explain why you are getting the results you are getting. For instance, explain why there is difference in inference time for FP32, FP16 and INT8 models. |
| 266 | +_______________ |
| 267 | +- Across precisions, model size decreases in the order FP32 > FP16 > FP16-INT8. In this test the inference time follows the same order: FP16-INT8 is faster than FP16, and FP16 is faster than FP32, although the differences are small. Lower-precision models use less memory, but lowering the precision can also cost accuracy. |
| 268 | +- FP16 values take half the memory of FP32 values, which reduces the memory footprint of a neural network. FP16 data transfers are also faster than FP32 transfers, which improves throughput and overall performance. |
| 269 | +- FPS is almost the same for all three precisions (about 2.1). Model load time is similar for FP32 and FP16 and somewhat higher for FP16-INT8 (about 1.33 s versus about 0.98 s), but the models are loaded only once, at initialization. |
| 270 | +- To reach a reasonable trade-off, we want neither an excessively long inference time nor too low an accuracy. Moreover, in some scenarios, such as a low-budget deployment, we do not want to spend storage on chasing very high accuracy. Striking the balance means deciding how much accuracy we are willing to sacrifice. |
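To put the differences in perspective, the values from the Comparison table can be expressed relative to FP32. The sketch below only reuses the numbers reported in the table; the names `RESULTS`, `relative_change` and `compare` are introduced for this example.

```python
# Values copied from the Comparison table (seconds, seconds, frames per second).
RESULTS = {
    "FP32": {"load_s": 0.9885485172271729, "inference_s": 28.26241898536682, "fps": 2.0877565463552723},
    "FP16": {"load_s": 0.9749979972839355, "inference_s": 28.06306004524231, "fps": 2.1026372059871705},
    "FP16-INT8": {"load_s": 1.3344995975494385, "inference_s": 27.815059900283813, "fps": 2.12077641984184},
}


def relative_change(value: float, baseline: float) -> float:
    """Percentage change of `value` with respect to `baseline`."""
    return (value - baseline) / baseline * 100.0


def compare(results: dict, baseline: str = "FP32", metric: str = "inference_s") -> dict:
    """Percentage change of one metric for every precision, relative to the baseline."""
    base = results[baseline][metric]
    return {precision: relative_change(row[metric], base) for precision, row in results.items()}


if __name__ == "__main__":
    for metric in ("load_s", "inference_s", "fps"):
        print(metric, {p: round(v, 2) for p, v in compare(RESULTS, metric=metric).items()})
```

On these numbers the inference time changes by less than 2% between precisions, while the FP16-INT8 load time is roughly a third higher than the FP32 load time. A single run per precision is a small sample, so differences of this size should be read with caution.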
50 | 271 |
|
51 |
| -## Stand Out Suggestions |
52 |
| -This is where you can provide information about the stand out suggestions that you have attempted. |
53 | 272 |
|
54 |
| -### Async Inference |
55 |
| -If you have used Async Inference in your code, benchmark the results and explain its effects on power and performance of your project. |
| 273 | + |
| 274 | +## Stand Out Suggestions |
| 275 | +_______________ |
| 276 | +- Use the VTune Amplifier to find hotspots in the inference engine pipeline. |
| 277 | +- Build an inference pipeline for both video file and webcam feed as input. Allow the user to select their input option in the command line arguments. |
| 278 | +- Benchmark the running times of different parts of the preprocessing and inference pipeline and let the user specify a CLI argument if they want to see the benchmark timing. Use the get_perf_counts API to print the time it takes for each layer in the model. |
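Timing the individual stages of the preprocessing and inference pipeline can be sketched with a small context manager. This uses only the Python standard library and is an illustration of the idea, not the project's implementation; the `StageTimer` class and the stage names are hypothetical. Per-layer timings from `get_perf_counts` are not covered here.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)  # total seconds per stage
        self.counts = defaultdict(int)    # number of measurements per stage

    @contextmanager
    def stage(self, name: str):
        """Time the enclosed block and add it to the stage's running total."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self) -> dict:
        """Average seconds per call for every stage seen so far."""
        return {name: self.totals[name] / self.counts[name] for name in self.totals}


if __name__ == "__main__":
    timer = StageTimer()
    for _ in range(3):
        with timer.stage("preprocess"):
            sum(range(10_000))  # stand-in for real preprocessing work
    print(timer.report())
```

Wrapping each stage (for example preprocessing, inference and output handling) in `timer.stage(...)` and printing `timer.report()` only when a CLI flag is set keeps the benchmark output optional.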
56 | 279 |
|
57 | 280 | ### Edge Cases
|
58 |
| -There will be certain situations that will break your inference flow. For instance, lighting changes or multiple people in the frame. Explain some of the edge cases you encountered in your project and how you solved them to make your project more robust. |
| 281 | +_______________ |
| 282 | +- If there are multiple faces in the frame, the application extracts only one face to control the mouse pointer and ignores the others. |
| 283 | +- If, for any reason, the model cannot detect a face, the application moves on to the next frame and keeps going until it detects a face or a keyboard interrupt exits the program. |
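Both edge cases can be handled in one selection step. The sketch below assumes detections in the `[image_id, label, confidence, x_min, y_min, x_max, y_max]` layout and picks the most confident face; which face the project actually keeps is not stated above, so that choice, the `pick_face` name and the `0.6` threshold are assumptions.

```python
from typing import Optional, Sequence, Tuple


def pick_face(
    detections: Sequence[Sequence[float]], threshold: float = 0.6
) -> Optional[Tuple[float, float, float, float]]:
    """Return the box (x_min, y_min, x_max, y_max) of the most confident face.

    Returns None when no detection reaches the threshold, so the caller
    can skip the frame and continue with the next one.
    """
    best = None
    for detection in detections:
        confidence = detection[2]
        if confidence < threshold:
            continue
        if best is None or confidence > best[2]:
            best = detection
    return None if best is None else tuple(best[3:7])
```

In the frame loop, a `None` result means "no face in this frame": the loop simply continues with the next frame instead of passing an empty crop to the downstream models.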
59 | 284 |
|