finished README

mayujie · mayujie · commit 93071ab46ee7 · 2020-08-10T04:42:15.000+02:00
diff --git a/README.md b/README.md
@@ -1,59 +1,284 @@
 # Computer Pointer Controller
-https://youtu.be/ZJ8y--zcBag
-*TODO:* Write a short introduction to your project
+_______________
+## Introduction
+Computer Pointer Controller is an application that uses a gaze estimation model to control your computer's mouse pointer.
+The mouse pointer moves by following the user's gaze. The [Gaze Estimation](https://docs.openvinotoolkit.org/latest/_models_intel_gaze_estimation_adas_0002_description_gaze_estimation_adas_0002.html) model estimates the gaze direction of the user's eyes, and the result is fed into the `pyautogui` module to change the position of the mouse pointer.
+
+The application pipeline is shown below:
+![pipeline](https://github.com/mayujie/Computer-Pointer-Controller-OpenVINO/blob/master/bin/pipeline.png?raw=true)
+
+### Live Demo:
+Recorded video of running the project: [Demo video of project](https://youtu.be/ZJ8y--zcBag)
+
+### Screenshot:
+![show_app](https://github.com/mayujie/Computer-Pointer-Controller-OpenVINO/blob/master/bin/show_app.PNG?raw=true)
 
 ## Project Set Up and Installation
-*TODO:* Explain the setup procedures to run your project. For instance, this can include your project directory structure, the models you need to download and where to place them etc. Also include details about how to install the dependencies your project requires.
+_______________
+**1. Prerequisites**
+- [Install Intel® Distribution of OpenVINO™ toolkit for Windows* 10](https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_windows.html#model_optimizer_configuration_steps), or install it on a Linux system instead.
+- Install the dependencies listed in `requirements.txt` in the project directory with `pip3 install -r requirements.txt`.
+
+**2. Environment setup**
+Initialize the OpenVINO environment (run in `cmd`):
+**Important!!!**
+```sh
+cd C:\Program Files (x86)\IntelSWTools\openvino\bin\ && setupvars.bat
+```
+**3. Download the required models**
+- Download the required models:
+    - [Face Detection](https://docs.openvinotoolkit.org/latest/_models_intel_face_detection_adas_binary_0001_description_face_detection_adas_binary_0001.html)
+    - [Head Pose Estimation](https://docs.openvinotoolkit.org/latest/_models_intel_head_pose_estimation_adas_0001_description_head_pose_estimation_adas_0001.html)
+    - [Facial Landmarks Detection](https://docs.openvinotoolkit.org/latest/_models_intel_landmarks_regression_retail_0009_description_landmarks_regression_retail_0009.html)
+    - [Gaze Estimation Model](https://docs.openvinotoolkit.org/latest/_models_intel_gaze_estimation_adas_0002_description_gaze_estimation_adas_0002.html)
+
+- These can be downloaded with the OpenVINO Model Downloader (`downloader.py`).
+- `cd` to the project directory and run the commands below to download the models.
+ ```sh
+    python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name face-detection-adas-binary-0001
+    
+    python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name head-pose-estimation-adas-0001
+    
+    python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name landmarks-regression-retail-0009
+    
+    python "C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\open_model_zoo\tools\downloader\downloader.py" --name gaze-estimation-adas-0002
+```
+
+
+The project's directory structure is shown below:
+```
+E:\Intel-AI\Computer-Pointer-Controller-OpenVINO>tree /a /f
+Folder PATH listing for volume entertainment
+Volume serial number is 0003-B93B
+E:.
+|   .Instructions.md.swp
+|   README.md
+|   requirements.txt
+|
++---bin
+|       .gitkeep
+|       demo.mp4
+|       pipeline.png
+|       show_app.PNG
+|
++---intel
+|   +---face-detection-adas-binary-0001
+|   |   \---FP32-INT1
+|   |           face-detection-adas-binary-0001.bin
+|   |           face-detection-adas-binary-0001.xml
+|   |
+|   +---gaze-estimation-adas-0002
+|   |   +---FP16
+|   |   |       gaze-estimation-adas-0002.bin
+|   |   |       gaze-estimation-adas-0002.xml
+|   |   |
+|   |   +---FP16-INT8
+|   |   |       gaze-estimation-adas-0002.bin
+|   |   |       gaze-estimation-adas-0002.xml
+|   |   |
+|   |   \---FP32
+|   |           gaze-estimation-adas-0002.bin
+|   |           gaze-estimation-adas-0002.xml
+|   |
+|   +---head-pose-estimation-adas-0001
+|   |   +---FP16
+|   |   |       head-pose-estimation-adas-0001.bin
+|   |   |       head-pose-estimation-adas-0001.xml
+|   |   |
+|   |   +---FP16-INT8
+|   |   |       head-pose-estimation-adas-0001.bin
+|   |   |       head-pose-estimation-adas-0001.xml
+|   |   |
+|   |   \---FP32
+|   |           head-pose-estimation-adas-0001.bin
+|   |           head-pose-estimation-adas-0001.xml
+|   |
+|   \---landmarks-regression-retail-0009
+|       +---FP16
+|       |       landmarks-regression-retail-0009.bin
+|       |       landmarks-regression-retail-0009.xml
+|       |
+|       +---FP16-INT8
+|       |       landmarks-regression-retail-0009.bin
+|       |       landmarks-regression-retail-0009.xml
+|       |
+|       \---FP32
+|               landmarks-regression-retail-0009.bin
+|               landmarks-regression-retail-0009.xml
+|
+\---src
+    |   face_detection.py
+    |   facial_landmarks_detection.py
+    |   file_explain.md
+    |   gaze_estimation.py
+    |   head_pose_estimation.py
+    |   input_feeder.py
+    |   main.py
+    |   model.py
+    |   mouse_controller.py
+    |   Project_log.log
+    |
+    \---__pycache__
+            face_detection.cpython-37.pyc
+            facial_landmarks_detection.cpython-37.pyc
+            gaze_estimation.cpython-37.pyc
+            head_pose_estimation.cpython-37.pyc
+            input_feeder.cpython-37.pyc
+            mouse_controller.cpython-37.pyc
+
+```
 
 ## Demo
-*TODO:* Explain how to run a basic demo of your model.
-FP32
+_______________
+**1. `cd` to the `src` folder first**
+
+**2. Template for running `main.py`**
+```
+python main.py -f <Path of xml file for face detection model> -fl <Path of xml file for facial landmarks detection model> -hp <Path of xml file for head pose estimation model> -g <Path of xml file for gaze estimation model> -i <Path of input video file or enter cam for feeding input from webcam> -d <choose the device to run the model (default CPU)> -flags <select the visualization: fd fld hp ge>
+```
+
+**3. Examples of running `main.py`:**
+- Running the models with **FP32** precision on **CPU**:
 ```sh
 python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP32\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP32\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP32\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
 ```
-FP16
+
+- Running the models with **FP16** precision on **CPU**:
 ```sh
 python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP16\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP16\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP16\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
 ```
-FP16-INT8
+
+- Running the models with **FP16-INT8** precision on **CPU**:
 ```sh
 python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP16-INT8\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP16-INT8\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP16-INT8\gaze-estimation-adas-0002.xml" -i "\Intel-AI\Computer-Pointer-Controller-OpenVINO\bin\demo.mp4" -d CPU -flags fd fld hp ge
 ```
+
+- Running the models with **FP32** precision on **CPU**, with input from the **webcam**:
+```sh
+python main.py -f "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\face-detection-adas-binary-0001\FP32-INT1\face-detection-adas-binary-0001.xml" -fl "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\landmarks-regression-retail-0009\FP32\landmarks-regression-retail-0009.xml" -hp "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\head-pose-estimation-adas-0001\FP32\head-pose-estimation-adas-0001.xml" -g "\Intel-AI\Computer-Pointer-Controller-OpenVINO\intel\gaze-estimation-adas-0002\FP32\gaze-estimation-adas-0002.xml" -i cam -d CPU -flags fd fld hp ge
+```
+
 ## Documentation
-*TODO:* Include any documentation that users might need to better understand your project code. For instance, this is a good place to explain the command line arguments that your project supports.
+_______________
 
+### Command line arguments
+**Run `python main.py -h` to see help for the application's command line arguments:**
+```
+E:\Intel-AI\Computer-Pointer-Controller-OpenVINO\src>python main.py -h
+usage: main.py [-h] -f FACEDETECTIONMODEL -fl FACIALLANDMARKMODEL -hp
+               HEADPOSEMODEL -g GAZEESTIMATIONMODEL -i INPUT
+               [-flags PREVIEWFLAGS [PREVIEWFLAGS ...]] [-l CPU_EXTENSION]
+               [-prob PROB_THRESHOLD] [-d DEVICE]
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -f FACEDETECTIONMODEL, --facedetectionmodel FACEDETECTIONMODEL
+                        Specify Path to .xml file of Face Detection model.
+  -fl FACIALLANDMARKMODEL, --faciallandmarkmodel FACIALLANDMARKMODEL
+                        Specify Path to .xml file of Facial Landmark Detection
+                        model.
+  -hp HEADPOSEMODEL, --headposemodel HEADPOSEMODEL
+                        Specify Path to .xml file of Head Pose Estimation
+                        model.
+  -g GAZEESTIMATIONMODEL, --gazeestimationmodel GAZEESTIMATIONMODEL
+                        Specify Path to .xml file of Gaze Estimation model.
+  -i INPUT, --input INPUT
+                        Specify Path to video file or enter cam for webcam
+  -flags PREVIEWFLAGS [PREVIEWFLAGS ...], --previewFlags PREVIEWFLAGS [PREVIEWFLAGS ...]
+                        Specify the flags from fd, fld, hp, ge like --flags fd
+                        hp fld (Seperate each flag by space)for see the
+                        visualization of different model outputs of each
+                        frame,fd for Face Detection, fld for Facial Landmark
+                        Detectionhp for Head Pose Estimation, ge for Gaze
+                        Estimation.
+  -l CPU_EXTENSION, --cpu_extension CPU_EXTENSION
+                        MKLDNN (CPU)-targeted custom layers.Absolute path to a
+                        shared library with thekernels impl.
+  -prob PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
+                        Probability threshold for model to detect the face
+                        accurately from the video frame.
+  -d DEVICE, --device DEVICE
+                        Specify the target device to infer on: CPU, GPU, FPGA
+                        or MYRIAD is acceptable. Sample will look for a
+                        suitable plugin for device specified (CPU by default)
+```
 ## Benchmarks
-*TODO:* Include the benchmark results of running your model on multiple hardwares and multiple model precisions. Your benchmarks can include: model loading time, input/output processing time, model inference time etc.
-FP32
+_______________
+
+- FP32, FP16, and FP16-INT8 were tested on my CPU: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz
+- Measured total model load time, total inference time, FPS, and model size
+
+#### FP32 Project_log 
 ```
 INFO:root:Model Load time: 0.9885485172271729
 INFO:root:Inference time: 28.26241898536682
 INFO:root:FPS: 2.0877565463552723
 ERROR:root:VideoStream ended...
 ```
-FP16
+
+#### FP16 Project_log 
 ```
 INFO:root:Model Load time: 0.9749979972839355
 INFO:root:Inference time: 28.06306004524231
 INFO:root:FPS: 2.1026372059871705
 ERROR:root:VideoStream ended...
 ```
-FP16-INT8
+#### FP16-INT8 Project_log 
 ```
 INFO:root:Model Load time: 1.3344995975494385
-INFO:root:Inference time: 28.29212784767151
-INFO:root:FPS: 2.0855425945563804
+INFO:root:Inference time: 27.815059900283813
+INFO:root:FPS: 2.12077641984184
 ERROR:root:VideoStream ended...
 ```
+
+### FP32
+|Model|Type|Size|
+| ------ | ------ | ------ |
+|face-detection-adas-binary-0001|FP32-INT1|1.86 MB|
+|head-pose-estimation-adas-0001|FP32|7.34 MB|
+|landmarks-regression-retail-0009|FP32|786 KB|
+|gaze-estimation-adas-0002|FP32|7.24 MB|
+
+### FP16
+|Model|Type|Size|
+| ------ | ------ | ------ |
+|face-detection-adas-binary-0001|FP32-INT1|1.86 MB|
+|head-pose-estimation-adas-0001|FP16|3.69 MB|
+|landmarks-regression-retail-0009|FP16|413 KB|
+|gaze-estimation-adas-0002|FP16|3.69 MB|
+
+### FP16-INT8
+|Model|Type|Size|
+| ------ | ------ | ------ |
+|face-detection-adas-binary-0001|FP32-INT1|1.86 MB|
+|head-pose-estimation-adas-0001|FP16-INT8|2.05 MB|
+|landmarks-regression-retail-0009|FP16-INT8|314 KB|
+|gaze-estimation-adas-0002|FP16-INT8|2.09 MB|
+
+### Comparison
+|Precision|Total Model Load Time (sec)|Total Inference Time (sec)|FPS|
+| ------ | ------ | ------ | ------ |
+|FP32|0.9885485172271729|28.26241898536682|2.0877565463552723|
+|FP16|0.9749979972839355|28.06306004524231|2.1026372059871705|
+|FP16-INT8|1.3344995975494385|27.815059900283813|2.12077641984184|
+
 ## Results
-*TODO:* Discuss the benchmark results and explain why you are getting the results you are getting. For instance, explain why there is difference in inference time for FP32, FP16 and INT8 models.
+_______________
+- Model size decreases with precision in the order FP32 > FP16 > FP16-INT8. Inference time follows the same order in this case: FP16-INT8 is faster than FP16, which is faster than FP32, although the differences are small (about 27.82 s, 28.06 s, and 28.26 s respectively). Lower-precision models use less memory, but they can also lose accuracy.
+- FP16 values take half the memory of FP32 values, which reduces a neural network's memory usage. FP16 data transfers are also faster than FP32 transfers, which can improve speed and overall performance.
+- Model load time (about 1.0 to 1.3 s) and FPS (about 2.1) are almost the same across the three precisions, and models are loaded only once, when they are initialized.
+- To find the most reasonable combination, we want neither a long inference time nor low accuracy. In some scenarios, such as a low budget, we also do not want to spend storage chasing very high accuracy. Striking a balance means deciding how much accuracy we are willing to sacrifice.
 
-## Stand Out Suggestions
-This is where you can provide information about the stand out suggestions that you have attempted.
 
-### Async Inference
-If you have used Async Inference in your code, benchmark the results and explain its effects on power and performance of your project.
+
+## Stand Out Suggestions
+_______________
+- Use the VTune Amplifier to find hotspots in the Inference Engine pipeline.
+- Build an inference pipeline for both video file and webcam feed as input. Allow the user to select their input option in the command line arguments. 
+- Benchmark the running times of different parts of the preprocessing and inference pipeline and let the user specify a CLI argument if they want to see the benchmark timing. Use the get_perf_counts API to print the time it takes for each layer in the model. 
 
 ### Edge Cases
-There will be certain situations that will break your inference flow. For instance, lighting changes or multiple people in the frame. Explain some of the edge cases you encountered in your project and how you solved them to make your project more robust.
+_______________
+- If multiple faces are in the frame, the model uses only one face to control the mouse pointer and ignores the others.
+- If the model cannot detect a face for some reason, it moves on to the next frame until a face is detected or the program is stopped with a keyboard interrupt.
 
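
The single-face behaviour described under Edge Cases can be sketched as follows. This is a minimal illustration, not the code in `face_detection.py`, which this commit does not show. It assumes detections arrive as rows of `[image_id, label, confidence, x_min, y_min, x_max, y_max]` with normalized coordinates, which is the documented output layout of `face-detection-adas-binary-0001`. The function name `select_face` and the rule of keeping the highest-confidence face are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Pixel-space bounding box: (x_min, y_min, x_max, y_max).
Box = Tuple[int, int, int, int]


def select_face(
    detections: Sequence[Sequence[float]],
    prob_threshold: float,
    frame_width: int,
    frame_height: int,
) -> Optional[Box]:
    """Return one face box in pixels, or None if no face passes the threshold.

    Hypothetical helper: keeps the single most confident detection and
    ignores every other face in the frame.
    """
    best = None
    best_conf = prob_threshold
    for det in detections:
        conf = det[2]  # assumed layout: index 2 holds the confidence
        if conf >= best_conf:
            best_conf = conf
            best = det
    if best is None:
        return None

    def to_px(value: float, size: int) -> int:
        # Scale a normalized coordinate and clamp it to the frame.
        return max(0, min(size, int(value * size)))

    x_min, y_min, x_max, y_max = best[3:7]
    return (
        to_px(x_min, frame_width),
        to_px(y_min, frame_height),
        to_px(x_max, frame_width),
        to_px(y_max, frame_height),
    )
```

In a frame loop, a `None` result would simply skip to the next frame, which matches the second edge case: processing continues until a face is found or the user interrupts the program.
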
diff --git a/src/Project_log.log b/src/Project_log.log
@@ -14,3 +14,29 @@ INFO:root:Model Load time: 1.616499900817871
 INFO:root:Inference time: 28.38741636276245
 INFO:root:FPS: 2.0781965480803097
 ERROR:root:VideoStream ended...
+ERROR:root:Unable to detect the face.
+ERROR:root:Unable to detect the face.
+INFO:root:Model Load time: 1.3345434665679932
+INFO:root:Inference time: 20.2422878742218
+INFO:root:FPS: 0.691699604743083
+ERROR:root:VideoStream ended...
+INFO:root:Model Load time: 1.4301807880401611
+INFO:root:Inference time: 27.815059900283813
+INFO:root:FPS: 2.12077641984184
+ERROR:root:VideoStream ended...
+INFO:root:Model Load time: 1.2041051387786865
+INFO:root:Inference time: 2.994135618209839
+INFO:root:FPS: 2.341137123745819
+ERROR:root:VideoStream ended...
+INFO:root:Model Load time: 1.014500379562378
+INFO:root:Inference time: 27.922627449035645
+INFO:root:FPS: 2.113180515759312
+ERROR:root:VideoStream ended...
+INFO:root:Model Load time: 0.9559900760650635
+INFO:root:Inference time: 27.895241737365723
+INFO:root:FPS: 2.1146953405017923
+ERROR:root:VideoStream ended...
+INFO:root:Model Load time: 1.3229994773864746
+INFO:root:Inference time: 27.82180690765381
+INFO:root:FPS: 2.12077641984184
+ERROR:root:VideoStream ended...
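
The log entries above have a regular shape: each run writes `Model Load time`, `Inference time`, and `FPS` lines and ends with `VideoStream ended...`. A minimal sketch of collecting those runs into rows, for example to fill the README's Comparison table, might look like this. `parse_runs` and `SAMPLE_LOG` are hypothetical names and not part of the repository; the sample lines are copied from the log above.

```python
import re
from typing import Dict, List

# Matches the three metric lines that main.py writes through the root logger.
_METRIC = re.compile(r"^INFO:root:(Model Load time|Inference time|FPS): ([0-9.]+)$")


def parse_runs(log_text: str) -> List[Dict[str, float]]:
    """Group metric lines into one dict per run.

    A run is closed by the 'VideoStream ended...' line; other lines
    (such as 'Unable to detect the face.') are ignored.
    """
    runs: List[Dict[str, float]] = []
    current: Dict[str, float] = {}
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = _METRIC.match(line)
        if match:
            current[match.group(1)] = float(match.group(2))
        elif line.startswith("ERROR:root:VideoStream ended"):
            if current:
                runs.append(current)
            current = {}
    return runs


SAMPLE_LOG = """\
ERROR:root:Unable to detect the face.
INFO:root:Model Load time: 1.3345434665679932
INFO:root:Inference time: 20.2422878742218
INFO:root:FPS: 0.691699604743083
ERROR:root:VideoStream ended...
INFO:root:Model Load time: 1.4301807880401611
INFO:root:Inference time: 27.815059900283813
INFO:root:FPS: 2.12077641984184
ERROR:root:VideoStream ended...
"""

runs = parse_runs(SAMPLE_LOG)
for run in runs:
    print(run["Model Load time"], run["Inference time"], run["FPS"])
```

The log does not record which precision each run used, so the rows still have to be labelled by hand.
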
diff --git a/src/file_explain.md b/src/file_explain.md
@@ -0,0 +1,18 @@
+## requirements.txt
+The requirements file consists of a list of some of the packages and frameworks that you might need to complete your project. This is not a complete list and you might need more depending on how you solve the project.
+
+## src folder
+The source folder contains some code files that will help you get started with your project. 
+
+## input_feeder.py: 
+Contains an input feeder class that you can use to get input from either a video file or a webcam. The class has three methods. The `load_data` method initializes an OpenCV video capture object with either a video file or the webcam. Next, the `next_batch` function is a generator that returns successive frames from either the video file or the webcam feed. Finally, the `close` method closes the video file or the webcam. 
+At the top of this file, you will find an example of how you can incorporate this file into your project. First, initialize an input feeder object with the `input_type`. If you’re using a video file, you will also need to provide the `input_file`. After that, call the `load_data` method to initialize the video capture object. Finally, you can use the `next_batch` function in a loop. Each batch it returns will be a single image. However, you can edit the code to make it return multiple images. 
+
+## model.py: 
+This file contains a skeleton class with methods that will help you load your model, pre-process the inputs of the model and the outputs from the model and also contains a method to run inference on your model. Since each model has different requirements for its inputs and outputs, you will need to create full copies of this file for each model and then finish the to-dos in this file. 
+
+## mouse_controller.py: 
+This file contains a class that uses the `pyautogui` package to help you move the mouse. In the init method, you can set the precision and the speed of the mouse movement. The higher the precision, the more minute the movement will be, and the faster the speed, the faster the mouse motion will happen. You can play around with these values to see what gives you the best results. Calling the move method with the x and y output of the gaze estimation model will move your mouse pointer based on your speed and precision settings. 
+
+## bin folder 
+Contains the video file that you can use if you do not have access to a webcam.
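
The load, iterate, close pattern described for `input_feeder.py` can be sketched as below. This is not the repository's `InputFeeder` class, whose code is not shown in this commit. `FrameFeeder` and `FakeCapture` are hypothetical stand-ins so the example runs without OpenCV or a camera; with OpenCV installed, the factory passed in would be something like `lambda: cv2.VideoCapture(path)`, whose `read()` and `release()` methods have the same shape as the fake used here.

```python
from typing import Any, Callable, Iterator, List, Tuple


class FrameFeeder:
    """Stand-in with the same three methods the notes describe."""

    def __init__(self, open_capture: Callable[[], Any]) -> None:
        # open_capture is an assumed factory that returns a capture-like
        # object exposing read() -> (ok, frame) and release().
        self._open_capture = open_capture
        self._cap: Any = None

    def load_data(self) -> None:
        self._cap = self._open_capture()

    def next_batch(self) -> Iterator[Any]:
        """Yield successive frames until the source is exhausted."""
        if self._cap is None:
            raise RuntimeError("call load_data() before next_batch()")
        while True:
            ok, frame = self._cap.read()
            if not ok:
                break
            yield frame

    def close(self) -> None:
        if self._cap is not None:
            self._cap.release()
            self._cap = None


class FakeCapture:
    """Tiny in-memory replacement for a video capture object."""

    def __init__(self, frames: List[Any]) -> None:
        self._frames = list(frames)
        self.released = False

    def read(self) -> Tuple[bool, Any]:
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self) -> None:
        self.released = True


feeder = FrameFeeder(lambda: FakeCapture(["frame-0", "frame-1"]))
feeder.load_data()
seen = [frame for frame in feeder.next_batch()]
feeder.close()
```

Keeping `next_batch` a generator means the caller's loop ends naturally when the video runs out, which is the point at which the application logs `VideoStream ended...`.
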