
Commit 93071ab

committed
finished README
1 parent a04c7b6 commit 93071ab

File tree

3 files changed

+289
-20
lines changed


README.md

Lines changed: 245 additions & 20 deletions
# Computer Pointer Controller
_______________
## Introduction
Computer Pointer Controller is an application that uses a gaze detection model to control the mouse pointer of your computer. The position of the mouse pointer changes to follow the user's gaze. The [Gaze Estimation](https://docs.openvinotoolkit.org/latest/_models_intel_gaze_estimation_adas_0002_description_gaze_estimation_adas_0002.html) model estimates the gaze of the user's eyes, and the result is fed into the `pyautogui` module to change the position of the mouse pointer.

The pipeline of the application is shown below:

![pipeline](https://github.com/mayujie/Computer-Pointer-Controller-OpenVINO/blob/master/bin/pipeline.png?raw=true)

### LiveDemo:
Recorded video of running the project: [Demo video of project](https://youtu.be/ZJ8y--zcBag)

### Screenshot:
![show_app](https://github.com/mayujie/Computer-Pointer-Controller-OpenVINO/blob/master/bin/show_app.PNG?raw=true)

## Project Set Up and Installation
_______________
**1. Prerequisites**
- [Install the Intel® Distribution of OpenVINO™ toolkit for Windows* 10](https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_windows.html#model_optimizer_configuration_steps), or choose to install it on a Linux system.
- The dependencies in `requirements.txt` in the project directory need to be installed, using the command: `pip3 install -r requirements.txt`

**2. Environment setup**
Initialize the OpenVINO environment (command in cmd).
**Important!!!**
```sh
cd C:\Program Files (x86)\IntelSWTools\openvino\bin\ && setupvars.bat
```
**3. Download the required models**
- Download the required models:
  - [Face Detection](https://docs.openvinotoolkit.org/latest/_models_intel_face_detection_adas_binary_0001_description_face_detection_adas_binary_0001.html)
  - [Head Pose Estimation](https://docs.openvinotoolkit.org/latest/_models_intel_head_pose_estimation_adas_0001_description_head_pose_estimation_adas_0001.html)
  - [Facial Landmarks Detection](https://docs.openvinotoolkit.org/latest/_models_intel_landmarks_regression_retail_0009_description_landmarks_regression_retail_0009.html)
  - [Gaze Estimation Model](https://docs.openvinotoolkit.org/latest/_models_intel_gaze_estimation_adas_0002_description_gaze_estimation_adas_0002.html)
- These can be downloaded using the `model downloader`.
- `cd` to the project directory and use the commands below to download the models:
```sh
python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name face-detection-adas-binary-0001

python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name head-pose-estimation-adas-0001

python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name landmarks-regression-retail-0009

python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name gaze-estimation-adas-0002
```

The source structure of the project is shown below:
```
E:\Intel-AI\Computer-Pointer-Controller-OpenVINO>tree /a /f
Folder PATH listing for volume entertainment
Volume serial number is 0003-B93B
E:.
|   .Instructions.md.swp
|   README.md
|   requirements.txt
|
+---bin
|       .gitkeep
|       demo.mp4
|       pipeline.png
|       show_app.PNG
|
+---intel
|   +---face-detection-adas-binary-0001
|   |   \---FP32-INT1
|   |           face-detection-adas-binary-0001.bin
|   |           face-detection-adas-binary-0001.xml
|   |
|   +---gaze-estimation-adas-0002
|   |   +---FP16
|   |   |       gaze-estimation-adas-0002.bin
|   |   |       gaze-estimation-adas-0002.xml
|   |   |
|   |   +---FP16-INT8
|   |   |       gaze-estimation-adas-0002.bin
|   |   |       gaze-estimation-adas-0002.xml
|   |   |
|   |   \---FP32
|   |           gaze-estimation-adas-0002.bin
|   |           gaze-estimation-adas-0002.xml
|   |
|   +---head-pose-estimation-adas-0001
|   |   +---FP16
|   |   |       head-pose-estimation-adas-0001.bin
|   |   |       head-pose-estimation-adas-0001.xml
|   |   |
|   |   +---FP16-INT8
|   |   |       head-pose-estimation-adas-0001.bin
|   |   |       head-pose-estimation-adas-0001.xml
|   |   |
|   |   \---FP32
|   |           head-pose-estimation-adas-0001.bin
|   |           head-pose-estimation-adas-0001.xml
|   |
|   \---landmarks-regression-retail-0009
|       +---FP16
|       |       landmarks-regression-retail-0009.bin
|       |       landmarks-regression-retail-0009.xml
|       |
|       +---FP16-INT8
|       |       landmarks-regression-retail-0009.bin
|       |       landmarks-regression-retail-0009.xml
|       |
|       \---FP32
|               landmarks-regression-retail-0009.bin
|               landmarks-regression-retail-0009.xml
|
+---src
|       face_detection.py
|       facial_landmarks_detection.py
|       file_explain.md
|       gaze_estimation.py
|       head_pose_estimation.py
|       input_feeder.py
|       main.py
|       model.py
|       mouse_controller.py
|       Project_log.log
|
\---__pycache__
        face_detection.cpython-37.pyc
        facial_landmarks_detection.cpython-37.pyc
        gaze_estimation.cpython-37.pyc
        head_pose_estimation.cpython-37.pyc
        input_feeder.cpython-37.pyc
        mouse_controller.cpython-37.pyc
```

## Demo
_______________
**1. `cd` to the `src` folder first**

**2. Template for running `main.py`**
```
python main.py -f <path of the .xml file for the face detection model> -fl <path of the .xml file for the facial landmarks detection model> -hp <path of the .xml file for the head pose estimation model> -g <path of the .xml file for the gaze estimation model> -i <path of the input video file, or enter cam to feed input from the webcam> -d <device to run the models on (default CPU)> -flags <select the visualizations: fd fld hp ge>
```

**3. Examples of running `main.py`:**
- Running the models with precision **FP32** on **CPU**:
```sh
python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP32\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP32\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP32\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
```

- Running the models with precision **FP16** on **CPU**:
```sh
python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP16\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP16\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP16\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
```

- Running the models with precision **FP16-INT8** on **CPU**:
```sh
python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP16-INT8\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP16-INT8\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP16-INT8\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
```

- Running the models with precision **FP32** on **CPU**, with input from the **webcam**:
```sh
python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP32\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP32\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP32\gaze-estimation-adas-0002.xml" -i cam -d CPU -flags fd fld hp ge
```

## Documentation
_______________

### Command line arguments
**Run `python main.py -h` to get help on the command line arguments of the application.**
```
E:\Intel-AI\Computer-Pointer-Controller-OpenVINO\src>python main.py -h
usage: main.py [-h] -f FACEDETECTIONMODEL -fl FACIALLANDMARKMODEL -hp
               HEADPOSEMODEL -g GAZEESTIMATIONMODEL -i INPUT
               [-flags PREVIEWFLAGS [PREVIEWFLAGS ...]] [-l CPU_EXTENSION]
               [-prob PROB_THRESHOLD] [-d DEVICE]

optional arguments:
  -h, --help            show this help message and exit
  -f FACEDETECTIONMODEL, --facedetectionmodel FACEDETECTIONMODEL
                        Specify Path to .xml file of Face Detection model.
  -fl FACIALLANDMARKMODEL, --faciallandmarkmodel FACIALLANDMARKMODEL
                        Specify Path to .xml file of Facial Landmark Detection
                        model.
  -hp HEADPOSEMODEL, --headposemodel HEADPOSEMODEL
                        Specify Path to .xml file of Head Pose Estimation
                        model.
  -g GAZEESTIMATIONMODEL, --gazeestimationmodel GAZEESTIMATIONMODEL
                        Specify Path to .xml file of Gaze Estimation model.
  -i INPUT, --input INPUT
                        Specify Path to video file or enter cam for webcam
  -flags PREVIEWFLAGS [PREVIEWFLAGS ...], --previewFlags PREVIEWFLAGS [PREVIEWFLAGS ...]
                        Specify the flags from fd, fld, hp, ge like --flags fd
                        hp fld (separate each flag by a space) to see the
                        visualization of different model outputs of each
                        frame: fd for Face Detection, fld for Facial Landmark
                        Detection, hp for Head Pose Estimation, ge for Gaze
                        Estimation.
  -l CPU_EXTENSION, --cpu_extension CPU_EXTENSION
                        MKLDNN (CPU)-targeted custom layers. Absolute path to
                        a shared library with the kernels impl.
  -prob PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Probability threshold for model to detect the face
                        accurately from the video frame.
  -d DEVICE, --device DEVICE
                        Specify the target device to infer on: CPU, GPU, FPGA
                        or MYRIAD is acceptable. Sample will look for a
                        suitable plugin for device specified (CPU by default)
```
## Benchmarks
_______________

- FP32, FP16, and FP16-INT8 were tested on my CPU: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz
- Checked: total model load time, total inference time, FPS, and model size

#### FP32 Project_log
```
INFO:root:Model Load time: 0.9885485172271729
INFO:root:Inference time: 28.26241898536682
INFO:root:FPS: 2.0877565463552723
ERROR:root:VideoStream ended...
```

#### FP16 Project_log
```
INFO:root:Model Load time: 0.9749979972839355
INFO:root:Inference time: 28.06306004524231
INFO:root:FPS: 2.1026372059871705
ERROR:root:VideoStream ended...
```

#### FP16-INT8 Project_log
```
INFO:root:Model Load time: 1.3344995975494385
INFO:root:Inference time: 27.815059900283813
INFO:root:FPS: 2.12077641984184
ERROR:root:VideoStream ended...
```

### FP32
|Model|Type|Size|
| ------ | ------ | ------ |
|face-detection-adas-binary-0001|FP32-INT1|1.86 MB|
|head-pose-estimation-adas-0001|FP32|7.34 MB|
|landmarks-regression-retail-0009|FP32|786 KB|
|gaze-estimation-adas-0002|FP32|7.24 MB|

### FP16
|Model|Type|Size|
| ------ | ------ | ------ |
|face-detection-adas-binary-0001|FP32-INT1|1.86 MB|
|head-pose-estimation-adas-0001|FP16|3.69 MB|
|landmarks-regression-retail-0009|FP16|413 KB|
|gaze-estimation-adas-0002|FP16|3.69 MB|

### FP16-INT8
|Model|Type|Size|
| ------ | ------ | ------ |
|face-detection-adas-binary-0001|FP32-INT1|1.86 MB|
|head-pose-estimation-adas-0001|FP16-INT8|2.05 MB|
|landmarks-regression-retail-0009|FP16-INT8|314 KB|
|gaze-estimation-adas-0002|FP16-INT8|2.09 MB|

### Comparison
||Total Model Load Time (sec)|Total Inference Time (sec)|FPS|
| ------ | ------ | ------ | ------ |
|FP32|0.9885485172271729|28.26241898536682|2.0877565463552723|
|FP16|0.9749979972839355|28.06306004524231|2.1026372059871705|
|FP16-INT8|1.3344995975494385|27.815059900283813|2.12077641984184|
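
The load/inference timings above come from simple wall-clock measurements around the pipeline. A minimal sketch of that measurement pattern follows; `load_models` and `run_pipeline` are illustrative stand-ins for the project's actual functions, and the frame count is a placeholder:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)

def load_models():
    # Stand-in for loading the four OpenVINO models onto the device.
    time.sleep(0.01)

def run_pipeline():
    # Stand-in for the per-frame inference loop; returns frames processed.
    time.sleep(0.01)
    return 59  # placeholder frame count

start = time.time()
load_models()
load_time = time.time() - start
logging.info("Model Load time: %s", load_time)

start = time.time()
frame_count = run_pipeline()
inference_time = time.time() - start
logging.info("Inference time: %s", inference_time)
logging.info("FPS: %s", frame_count / inference_time)
```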
## Results
_______________
- Across precisions, model size decreases in the order FP32 > FP16 > FP16-INT8. Inference time follows the same order in this case: FP16-INT8 is faster than FP16, which in turn is faster than FP32. A lower-precision model uses less memory; however, remember that lower precision also costs some model accuracy.
- FP16 values need half the memory of FP32, which reduces the memory usage of a neural network. FP16 data transfers are also faster than FP32, which improves speed (TFLOPS) and performance.
- Model load time and FPS are almost the same across precisions; note also that the models are loaded only once, when they are initialized.
- To achieve the most reasonable combination, we want neither too long an inference time nor too low an accuracy. Moreover, in some scenarios, such as a low budget, we do not want to waste storage trying to reach very high accuracy. To strike the balance, we need to consider how much of each we are willing to sacrifice.

## Stand Out Suggestions
_______________
- Use the VTune Amplifier to find hotspots in the inference engine pipeline.
- Build an inference pipeline that accepts both a video file and a webcam feed as input, and allow the user to select their input option in the command line arguments.
- Benchmark the running times of the different parts of the preprocessing and inference pipeline, and let the user pass a CLI argument if they want to see the benchmark timing. Use the get_perf_counts API to print the time each layer in the model takes.

### Edge Cases
_______________
- If there are multiple faces in the frame, the model extracts only one face to control the mouse pointer and ignores the other faces.
- If for some reason the model cannot detect the face, it continues processing subsequent frames until it detects a face or a keyboard interrupt exits the program.
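
The multiple-face edge case can be handled by keeping only the most confident, largest detection. A minimal sketch, assuming detections arrive as `(confidence, x_min, y_min, x_max, y_max)` tuples (the tuple layout and function name are illustrative, not the project's exact code):

```python
def pick_one_face(detections, prob_threshold=0.6):
    """Return the single detection to track, or None.

    Keeps only boxes above the confidence threshold, then picks the
    largest box by area, ignoring any other faces in the frame.
    """
    candidates = [d for d in detections if d[0] >= prob_threshold]
    if not candidates:
        return None  # caller logs the failure and moves to the next frame

    def area(d):
        _, x_min, y_min, x_max, y_max = d
        return (x_max - x_min) * (y_max - y_min)

    return max(candidates, key=area)
```

Returning `None` for an empty frame is what lets the loop skip ahead until a face appears again.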

src/Project_log.log

Lines changed: 26 additions & 0 deletions
```
INFO:root:Model Load time: 1.616499900817871
INFO:root:Inference time: 28.38741636276245
INFO:root:FPS: 2.0781965480803097
ERROR:root:VideoStream ended...
ERROR:root:Unable to detect the face.
ERROR:root:Unable to detect the face.
INFO:root:Model Load time: 1.3345434665679932
INFO:root:Inference time: 20.2422878742218
INFO:root:FPS: 0.691699604743083
ERROR:root:VideoStream ended...
INFO:root:Model Load time: 1.4301807880401611
INFO:root:Inference time: 27.815059900283813
INFO:root:FPS: 2.12077641984184
ERROR:root:VideoStream ended...
INFO:root:Model Load time: 1.2041051387786865
INFO:root:Inference time: 2.994135618209839
INFO:root:FPS: 2.341137123745819
ERROR:root:VideoStream ended...
INFO:root:Model Load time: 1.014500379562378
INFO:root:Inference time: 27.922627449035645
INFO:root:FPS: 2.113180515759312
ERROR:root:VideoStream ended...
INFO:root:Model Load time: 0.9559900760650635
INFO:root:Inference time: 27.895241737365723
INFO:root:FPS: 2.1146953405017923
ERROR:root:VideoStream ended...
INFO:root:Model Load time: 1.3229994773864746
INFO:root:Inference time: 27.82180690765381
INFO:root:FPS: 2.12077641984184
ERROR:root:VideoStream ended...
```

src/file_explain.md

Lines changed: 18 additions & 0 deletions
## requirements.txt
The requirements file lists some of the packages and frameworks that you might need to complete your project. This is not a complete list, and you might need more depending on how you solve the project.

## src folder
The source folder contains some code files that will help you get started with your project.

## input_feeder.py:
Contains an InputFeeder class that you can use to get input from either a video file or a webcam. The class has three methods: a load_data method that initializes an OpenCV video capture object with either a video file or the webcam; a next_batch method, a generator that returns successive frames from the video file or the webcam feed; and a close method that closes the video file or the webcam.
At the top of this file you will find an example of how to incorporate it into your project. First, initialize an InputFeeder object with the input_type; if you are using a video file, you also need to provide the input_file. After that, call the load_data method to initialize the video capture object. Finally, use the next_batch function in a loop. Each batch it returns will be a single image, though you can edit the code to make it return multiple images.
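
The load_data / next_batch / close flow described above can be sketched without OpenCV. A pure-Python stand-in follows (the real class wraps `cv2.VideoCapture`; the frame strings here are placeholders):

```python
class MiniFeeder:
    """Pure-Python stand-in for InputFeeder's generator pattern."""

    def __init__(self, input_type, input_file=None):
        self.input_type = input_type  # 'video' or 'cam'
        self.input_file = input_file
        self._frames = None

    def load_data(self):
        # Real class: cv2.VideoCapture(0) for 'cam', else the video path.
        self._frames = [f"frame-{i}" for i in range(3)]

    def next_batch(self):
        # Generator yielding one frame at a time, like the real next_batch.
        for frame in self._frames:
            yield frame

    def close(self):
        # Real class: release the capture object.
        self._frames = None

feeder = MiniFeeder("video", "demo.mp4")
feeder.load_data()
frames = list(feeder.next_batch())
feeder.close()
```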

## model.py:
This file contains a skeleton class with methods that will help you load your model, pre-process the model's inputs, post-process its outputs, and run inference. Since each model has different requirements for its inputs and outputs, you will need to create a full copy of this file for each model and then finish the to-dos in each copy.
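
As a rough illustration, the skeleton amounts to something like the following; the method names are typical of such templates but the stub bodies are purely illustrative, not the project's exact code:

```python
class Model:
    """Skeleton to be copied once per model (stub bodies are illustrative)."""

    def __init__(self, model_path, device="CPU", extensions=None):
        self.model_path = model_path
        self.device = device
        self.extensions = extensions
        self.net = None

    def load_model(self):
        # Real version: read the .xml/.bin pair and load it onto the
        # target device via the OpenVINO inference engine.
        self.net = object()

    def preprocess_input(self, image):
        # Real version: resize/transpose the frame to the model's input shape.
        return image

    def preprocess_output(self, outputs):
        # Real version: unpack the model-specific output blobs.
        return outputs

    def predict(self, image):
        # Real version: run inference on the preprocessed frame.
        return self.preprocess_output({"out": self.preprocess_input(image)})
```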

## mouse_controller.py:
This file contains a class that uses the pyautogui package to move the mouse. In the init method, you can set the precision and the speed of the mouse movement: the higher the precision, the more minute each movement will be, and the higher the speed, the faster the mouse motion will happen. You can play around with these values to see what gives you the best results. Calling the move method with the x and y output of the gaze estimation model will move your mouse pointer based on your speed and precision settings.
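
The precision/speed mapping can be sketched as follows; the pyautogui call is replaced by a returned delta so the sketch stays self-contained, and the value tables are illustrative, not the project's exact numbers:

```python
class MiniMouseController:
    """Stand-in for the pyautogui-based mouse controller."""

    PRECISION = {"high": 100, "low": 1000, "medium": 500}
    SPEED = {"fast": 1, "slow": 10, "medium": 5}

    def __init__(self, precision, speed):
        self.precision = self.PRECISION[precision]
        self.speed = self.SPEED[speed]

    def move(self, x, y):
        # Real class: pyautogui.moveRel(dx, dy, duration=self.speed)
        dx = x * self.precision
        dy = -1 * y * self.precision  # screen y grows downward
        return dx, dy

mc = MiniMouseController("low", "fast")
```

The gaze vector's x and y are scaled by the precision factor to get a relative pointer move, and the speed value would set how long that move takes.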

## bin folder
Contains the video file that you can use if you do not have access to a webcam.
