# Human Pose Estimation Python\* Demo

This demo showcases the work of multi-person 2D pose estimation algorithms. The task is to predict a pose, i.e. a body skeleton that consists of a predefined set of keypoints and the connections between them, for every person in an input image or video.

The demo application supports inference in both sync and async modes. Please refer to the [Optimization Guide](https://docs.openvinotoolkit.org/latest/_docs_optimization_guide_dldt_optimization_guide.html) and the [Object Detection SSD, Async API performance showcase](../../object_detection_demo_ssd_async/README.md) demo for more information about the Async API and its use.

Other demo objectives are:
* Video as input support via OpenCV\*
* Visualization of the resulting poses
* Demonstration of the Async API in action. For this, the demo features two modes toggled by the **Tab** key:
  - "User specified" mode, where you can set the number of Infer Requests, throughput streams and threads.
    Inference, starting new requests and displaying the results of completed requests are all performed asynchronously.
    The purpose of this mode is to get the highest FPS by fully utilizing all available devices.
  - "Min latency" mode, which uses only one Infer Request. The purpose of this mode is to get the lowest latency (see the sketch below).
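
The sketch below gives a rough, hypothetical illustration of the difference between the two modes, written against the legacy `openvino.inference_engine` Python API; the model files, the device and the number of requests are placeholders, and the real demo wraps the inference calls with pre- and post-processing.
```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="hpe.xml", weights="hpe.bin")    # placeholder model files
input_name = next(iter(net.input_info))                      # net.inputs on older API versions
_, c, h, w = net.input_info[input_name].input_data.shape

# "Min latency" mode: a single Infer Request, so each frame waits for the previous one.
latency_net = ie.load_network(network=net, device_name="CPU", num_requests=1)

# "User specified" mode: several Infer Requests are kept in flight at the same time.
throughput_net = ie.load_network(network=net, device_name="CPU", num_requests=4)

def run_async(blobs):
    """Round-robin preprocessed frames over the request pool, collecting results as requests finish."""
    results, started = [], []
    for i, blob in enumerate(blobs):
        rid = i % len(throughput_net.requests)
        request = throughput_net.requests[rid]
        if rid in started:
            request.wait(-1)                                  # reuse the request once it completes
            results.append(request.output_blobs)             # dict: output name -> Blob
        else:
            started.append(rid)
        request.async_infer({input_name: blob})
    for rid in started:                                       # drain requests that are still in flight
        throughput_net.requests[rid].wait(-1)
        results.append(throughput_net.requests[rid].output_blobs)
    return results

# Example: feed eight dummy frames through the asynchronous pool.
run_async(np.zeros((1, c, h, w), dtype=np.float32) for _ in range(8))
```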

## How It Works

On start-up, the application reads the command-line parameters and loads a network to the Inference Engine. Upon getting a frame from the OpenCV VideoCapture, it performs inference and displays the results.
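
As an illustration only, a minimal synchronous version of that loop could look like the following sketch; the model files, the video path and the window title are placeholders, and the decoding of network outputs into poses is omitted because it depends on the chosen architecture type.
```python
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="hpe.xml", weights="hpe.bin")    # placeholder model files
input_name = next(iter(net.input_info))                      # net.inputs on older API versions
_, _, h, w = net.input_info[input_name].input_data.shape
exec_net = ie.load_network(network=net, device_name="CPU")

cap = cv2.VideoCapture("inputVideo.mp4")                     # a video path or a numeric camera ID
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)[np.newaxis]  # HWC (BGR) -> NCHW
    outputs = exec_net.infer({input_name: blob})
    # ... decode the outputs into poses and draw them on the frame here ...
    cv2.imshow("Human Pose Estimation Demo", frame)
    if cv2.waitKey(1) == 27:                                 # Esc stops the loop
        break
cap.release()
cv2.destroyAllWindows()
```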

> **NOTE**: By default, Open Model Zoo demos expect input with BGR channels order. If you trained your model to work
with RGB order, you need to manually rearrange the default channels order in the demo application or reconvert your
model using the Model Optimizer tool with the `--reverse_input_channels` argument specified. For more information about
the argument, refer to the **When to Reverse Input Channels** section of
[Converting a Model Using General Conversion Parameters](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_Converting_Model_General.html).
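
If reconverting the model is not an option, one way to rearrange the channels manually in the demo code is to reverse the last axis of every frame that OpenCV returns; a hypothetical one-liner (the file name is a placeholder):
```python
import cv2

frame = cv2.imread("input.jpg")    # OpenCV returns images and video frames in BGR order
frame = frame[:, :, ::-1]          # reverse the channel order (BGR -> RGB) before filling the input blob
```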

## Running

Run the application with the `-h` option to see the usage message:
```
python3 human_pose_estimation.py -h
```
The command yields the following usage message:
```
usage: human_pose_estimation.py [-h] -i INPUT -m MODEL -at {ae,openpose}
                                [--tsize TSIZE] [-t PROB_THRESHOLD] [-r]
                                [-d DEVICE] [-nireq NUM_INFER_REQUESTS]
                                [-nstreams NUM_STREAMS]
                                [-nthreads NUM_THREADS] [-loop LOOP]
                                [-no_show] [-u UTILIZATION_MONITORS]

Options:
  -h, --help            Show this help message and exit.
  -i INPUT, --input INPUT
                        Required. Path to an image, video file or a numeric
                        camera ID.
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model.
  -at {ae,openpose}, --architecture_type {ae,openpose}
                        Required. Type of the network, either "ae" for
                        Associative Embedding or "openpose" for OpenPose.
  --tsize TSIZE         Optional. Target input size. This demo implements
                        image pre-processing pipeline that is common to human
                        pose estimation approaches. Image is resized first to
                        some target size and then the network is reshaped to
                        fit the input image shape. By default target image
                        size is determined based on the input shape from IR.
                        Alternatively it can be manually set via this
                        parameter. Note that for OpenPose-like nets image is
                        resized to a predefined height, which is the target
                        size in this case. For Associative Embedding-like nets
                        target size is the length of a short image side.
  -t PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Optional. Probability threshold for poses filtering.
  -r, --raw_output_message
                        Optional. Output inference results raw values showing.
  -d DEVICE, --device DEVICE
                        Optional. Specify the target device to infer on; CPU,
                        GPU, FPGA, HDDL or MYRIAD is acceptable. The sample
                        will look for a suitable plugin for device specified.
                        Default value is CPU.
  -nireq NUM_INFER_REQUESTS, --num_infer_requests NUM_INFER_REQUESTS
                        Optional. Number of infer requests
  -nstreams NUM_STREAMS, --num_streams NUM_STREAMS
                        Optional. Number of streams to use for inference on
                        the CPU or/and GPU in throughput mode (for HETERO and
                        MULTI device cases use format
                        <device1>:<nstreams1>,<device2>:<nstreams2> or just
                        <nstreams>)
  -nthreads NUM_THREADS, --num_threads NUM_THREADS
                        Optional. Number of threads to use for inference on
                        CPU (including HETERO cases)
  -loop LOOP, --loop LOOP
                        Optional. Number of times to repeat the input.
  -no_show, --no_show   Optional. Don't show output
  -u UTILIZATION_MONITORS, --utilization_monitors UTILIZATION_MONITORS
                        Optional. List of monitors to show initially.
```

Running the application with an empty list of options yields a short usage message and an error message.
You can use the following command to do inference on CPU with a pre-trained human pose estimation model:
```
python3 human_pose_estimation.py -i <path_to_video>/inputVideo.mp4 -m <path_to_model>/hpe.xml -at openpose -d CPU
```

To run the demo, you can use public or pre-trained models. You can download the pre-trained models with the OpenVINO
[Model Downloader](../../../tools/downloader/README.md) or from
[https://download.01.org/opencv/](https://download.01.org/opencv/).

> **NOTE**: Before running the demo with a trained model, make sure the model is converted to the Inference Engine
format (\*.xml + \*.bin) using the
[Model Optimizer tool](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html).

The only GUI control is the **Tab** key, which switches between the synchronous execution ("Min latency" mode)
and the asynchronous mode configured with the provided command-line parameters ("User specified" mode).

## Demo Output

The demo uses OpenCV to display the resulting frame with estimated poses.
The demo reports:
* **FPS**: average rate of video frame processing (frames per second)
* **Latency**: average time required to process one frame (from reading the frame to displaying the results)

You can use both of these metrics to measure application-level performance.
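
As an illustration, both metrics can be estimated at the application level roughly as in the sketch below; `process_frame` is a stand-in for reading, inference and rendering of a single frame.
```python
from time import perf_counter, sleep

def process_frame(frame_id):
    """Stand-in for reading a frame, running inference and rendering the result."""
    sleep(0.01)

frame_times = []
total_start = perf_counter()
for frame_id in range(100):                               # stand-in for the frames of the input video
    start = perf_counter()
    process_frame(frame_id)
    frame_times.append(perf_counter() - start)
total_time = perf_counter() - total_start

latency_ms = 1000 * sum(frame_times) / len(frame_times)  # average time to process one frame
fps = len(frame_times) / total_time                       # average frame processing rate
print("Latency: {:.1f} ms, FPS: {:.1f}".format(latency_ms, fps))
```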

## See Also
* [Using Open Model Zoo demos](../../README.md)
* [Model Optimizer](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html)
* [Model Downloader](../../../tools/downloader/README.md)