Human pose estimation models update #1718

Merged: 58 commits, Nov 26, 2020
Commits
4acbc05 add hpe python demo (druzhkov-paul, Oct 27, 2020)
8844834 add models description (druzhkov-paul, Oct 30, 2020)
3bd5a5e fix (druzhkov-paul, Oct 30, 2020)
1ea0532 fix post-processing (druzhkov-paul, Nov 5, 2020)
d2553bf fix net reshape (druzhkov-paul, Nov 5, 2020)
aa76d7c add initial AC support and config (druzhkov-paul, Nov 5, 2020)
cd0563e fix typo (druzhkov-paul, Nov 5, 2020)
e62ef4b update configs (druzhkov-paul, Nov 5, 2020)
8a43ca8 clean up (druzhkov-paul, Nov 5, 2020)
af96059 fix padding (druzhkov-paul, Nov 5, 2020)
629ebc5 rename dataset (druzhkov-paul, Nov 6, 2020)
fcc343c remove empty sections in readme (druzhkov-paul, Nov 6, 2020)
3c6112d change hpe visualization (druzhkov-paul, Nov 6, 2020)
18f4dea style fixes (druzhkov-paul, Nov 9, 2020)
8ad309b initial openpose support (druzhkov-paul, Nov 9, 2020)
396d8d2 optimize grouping (druzhkov-paul, Nov 9, 2020)
6b5db64 optimize postprocessing (Nov 10, 2020)
6e7cb20 add OpenPose AC adapter draft (druzhkov-paul, Nov 11, 2020)
180ca3c up (druzhkov-paul, Nov 12, 2020)
312ec71 add various nms impls to ac adapter (Nov 12, 2020)
4384209 add openvino nms to demo model (Nov 12, 2020)
7d21e4f clean up (Nov 13, 2020)
da298ff align decoder in demo and AC (Nov 13, 2020)
12e6434 refactor visualization (Nov 13, 2020)
5e18824 clean up (Nov 13, 2020)
7af051b add cpp postprocessing support (Nov 13, 2020)
59d9bb4 refactor openpose decoder (wip) (Nov 13, 2020)
fafd7b2 moved nms out of decoder (Nov 13, 2020)
89053cf refactor openpose related components (Nov 16, 2020)
63c56c9 update readme and model.lst for hpe demo (Nov 16, 2020)
78fea41 make target size configurable (Nov 16, 2020)
c182441 fix trailing spaces (vladimir-dudnik, Nov 16, 2020)
af6452b fix trailing spaces (vladimir-dudnik, Nov 16, 2020)
6f71ddc revert changes in HumanPoseAdapter (Nov 16, 2020)
e009bc0 fix linter comments (Nov 16, 2020)
35e3cb2 moved pose estimation adapters to separate modules (Nov 17, 2020)
aca1ccc fixes (druzhkov-paul, Nov 17, 2020)
cc24fe5 fix GPU inference in demo (Nov 18, 2020)
f56b5ed improve performance of ae decoder (Nov 18, 2020)
15bae7b add license headers (Nov 18, 2020)
3f92178 add adapters description to readme (Nov 18, 2020)
a37d6e9 handle scipy absence gently (Nov 18, 2020)
991a13a fix models doc pages (Nov 18, 2020)
81bc936 fix indent (Nov 18, 2020)
2257add move AC configs (Nov 18, 2020)
679b8a4 remove pseudo links (Nov 18, 2020)
4946f64 fix typo (Nov 19, 2020)
cb1af2e update arguments description (Nov 19, 2020)
86544c2 move image resize to base Model class (Nov 20, 2020)
0972357 extend documentation of --tsize arg (Nov 20, 2020)
187d39b support old numpy versions (Nov 20, 2020)
230b52e fix docs (Nov 20, 2020)
2116b47 removed cpp postprocessing (Nov 20, 2020)
38efe54 add symlinks to AC config files (druzhkov-paul, Nov 21, 2020)
a980b1e add demo to the list of all demos in readme (Nov 23, 2020)
3d8b5f3 fix readme (Nov 23, 2020)
1c0954c change CLI arg name (Nov 23, 2020)
190a85d fix demo readme... (Nov 23, 2020)
240 changes: 240 additions & 0 deletions demos/python_demos/human_pose_estimation_demo/README.md
@@ -0,0 +1,240 @@
# Object Detection SSD Python* Demo, Async API performance showcase

This demo showcases Object Detection with SSD and Async API.

Using the Async API can improve the overall frame rate of the application: rather than waiting for inference to complete,
the app can keep working on the host while the accelerator is busy.
Specifically, this demo maintains the number of Infer Requests that you set with the `-nireq` flag.
While some of the Infer Requests are being processed by the Inference Engine, the others can be filled with new frame data
and started asynchronously, or their completed outputs can be retrieved and displayed.

The technique can be generalized to any available parallel slack, for example, doing inference and simultaneously
encoding the resulting (previous) frames or running further inference, like emotion detection on top of
the face detection results.
There are important performance caveats though: tasks that run in parallel should try to avoid
oversubscribing the shared compute resources.
For example, if inference is performed on the FPGA and the CPU is essentially idle,
then it makes sense to do things on the CPU in parallel. But if inference is performed, say, on the GPU,
there is little gain from encoding the resulting video on the same GPU in parallel,
because the device is already busy.

This and other performance implications and tips for the Async API are covered in the
[Optimization Guide](https://docs.openvinotoolkit.org/latest/_docs_optimization_guide_dldt_optimization_guide.html).

Other demo objectives are:
* Video as input support via OpenCV\*
* Visualization of the resulting bounding boxes and text labels (from the `.labels` file)
or class number (if no file is provided)
* Demonstration of the Async API in action. For this, the demo features two modes toggled by the **Tab** key:
- "User specified" mode, where you can set the number of Infer Requests, throughput streams and threads.
Inference, starting new requests and displaying the results of completed requests are all performed asynchronously.
The purpose of this mode is to get the highest FPS by fully utilizing all available devices.
- "Min latency" mode, which uses only one Infer Request. The purpose of this mode is to get the lowest latency.

## How It Works

At start-up, the application reads the command-line parameters and loads a network into the Inference
Engine. Upon getting a frame from the OpenCV VideoCapture, it performs inference and displays the results.

The Async API operates with the notion of an "Infer Request" that encapsulates the inputs/outputs and separates
*scheduling* from *waiting for the result*.

The pipeline is the same for both modes. The difference is in the number of Infer Requests used.
```
while True:
    if (Infer Request containing the next video frame has completed):
        get inference results
        process inference results
        display the frame
    elif (one of the Infer Requests is idle and it is not the end of the input video):
        capture frame
        populate empty Infer Request
        start Infer Request
```
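
A minimal Python sketch of this loop is given below. It assumes an `exec_net` created with several Infer Requests, as shown in the Async API section below; `input_name`, `preprocess`, and `render_results` are hypothetical placeholders for the demo's real preprocessing and visualization logic.

```python
import cv2

cap = cv2.VideoCapture('input.mp4')
busy = []                       # requests with a frame in flight (oldest first)
idle = list(exec_net.requests)  # requests ready to accept a new frame

while cap.isOpened() or busy:
    if busy and busy[0].wait(0) == 0:  # wait(0) polls; 0 == StatusCode.OK
        request = busy.pop(0)          # taking the oldest request keeps frames in order
        results = {name: blob.buffer for name, blob in request.output_blobs.items()}
        render_results(results)        # process and display the completed frame
        idle.append(request)
    elif idle and cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            cap.release()              # end of the input video
            continue
        request = idle.pop(0)
        request.async_infer(inputs={input_name: preprocess(frame)})
        busy.append(request)
    else:
        busy[0].wait(-1)               # nothing to submit; block until the oldest request is done
```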

> **NOTE**: By default, Open Model Zoo demos expect input with BGR channels order. If you trained your model to work
with RGB order, you need to manually rearrange the default channels order in the demo application or reconvert your
model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about
the argument, refer to **When to Reverse Input Channels** section of
[Converting a Model Using General Conversion Parameters](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_Converting_Model_General.html).
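
For example, a conversion command with this flag might look like the following (an illustrative invocation; the model path is a placeholder):
```
python3 mo.py --input_model <path_to_model>/ssd.caffemodel --reverse_input_channels
```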

### Async API

The Inference Engine offers the Async API, based on the notion of Infer Requests. You can execute an Infer Request
asynchronously (in the background) and wait until its result is actually needed.
In the meantime, your app can continue:

```python
from openvino.inference_engine import IECore

# load network as usual
ie = IECore()
net = ie.read_network(model='Model.xml', weights='Model.bin')
# load network to the plugin, setting the maximal number of concurrent Infer Requests to be used
exec_net = ie.load_network(network=net, device_name='GPU', num_requests=2)
# start concurrent Infer Requests (put requests to the queue and immediately return);
# `imgs` is assumed to be a list of preprocessed input images
for i, request in enumerate(exec_net.requests):
    request.async_infer(inputs={'data': imgs[i]})
# here you can continue execution on the host until results of requests are really needed
# ...
# wait(-1) blocks until the request completes and returns a status code
statuses = [request.wait(-1) for request in exec_net.requests]
# the actual results can then be read from the output blobs
outputs = [request.output_blobs for request in exec_net.requests]
```
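
Note that `wait(-1)` blocks until the request completes, while `wait(0)` returns the current request status immediately without blocking, which lets an application poll for completed requests.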

Another option is to set a callback on Infer Request completion:

```python
from openvino.inference_engine import IECore

# load network as usual
ie = IECore()
net = ie.read_network(model='Model.xml', weights='Model.bin')
# load network to the plugin, setting the maximal number of concurrent Infer Requests to be used
exec_net = ie.load_network(network=net, device_name='GPU', num_requests=2)

# define a callback function that is invoked when a request completes
def callback(status, py_data):
    request, request_id = py_data
    print(request_id, {key: blob.buffer for key, blob in request.output_blobs.items()})

# start concurrent Infer Requests and set their completion callbacks;
# `imgs` is assumed to be a list of preprocessed input images
for i, request in enumerate(exec_net.requests):
    request.set_completion_callback(py_callback=callback, py_data=(request, i))
    request.async_infer(inputs={'data': imgs[i]})

# here you can continue execution on the host until results of requests are really needed
# ...
```
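
With completion callbacks, results are handled as soon as each request finishes, so the host code does not have to poll request statuses in a loop.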

For more details on the requests-based Inference Engine API, including the Async execution, refer to
[Integrate the Inference Engine New Request API with Your Application](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Integrate_with_customer_application_new_API.html).


## Running

Run the application with the `-h` option:
```
python3 object_detection_demo_ssd_async.py -h
```
The command yields the following usage message:
```
usage: object_detection_demo_ssd_async.py [-h] -m MODEL -i INPUT [-d DEVICE]
                                          [--labels LABELS]
                                          [-t PROB_THRESHOLD] [-r]
                                          [-nireq NUM_INFER_REQUESTS]
                                          [-nstreams NUM_STREAMS]
                                          [-nthreads NUM_THREADS]
                                          [-loop LOOP] [-no_show]
                                          [-u UTILIZATION_MONITORS]
                                          [--keep_aspect_ratio]

Options:
  -h, --help            Show this help message and exit.
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model.
  -i INPUT, --input INPUT
                        Required. Path to an image, video file or a numeric
                        camera ID.
  -d DEVICE, --device DEVICE
                        Optional. Specify the target device to infer on; CPU,
                        GPU, FPGA, HDDL or MYRIAD is acceptable. The sample
                        will look for a suitable plugin for device specified.
                        Default value is CPU.
  --labels LABELS       Optional. Labels mapping file.
  -t PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Optional. Probability threshold for detections
                        filtering.
  -r, --raw_output_message
                        Optional. Output inference results raw values showing.
  -nireq NUM_INFER_REQUESTS, --num_infer_requests NUM_INFER_REQUESTS
                        Optional. Number of infer requests
  -nstreams NUM_STREAMS, --num_streams NUM_STREAMS
                        Optional. Number of streams to use for inference on
                        the CPU or/and GPU in throughput mode (for HETERO and
                        MULTI device cases use format
                        <device1>:<nstreams1>,<device2>:<nstreams2> or just
                        <nstreams>)
  -nthreads NUM_THREADS, --num_threads NUM_THREADS
                        Optional. Number of threads to use for inference on
                        CPU (including HETERO cases)
  -loop LOOP, --loop LOOP
                        Optional. Number of times to repeat the input.
  -no_show, --no_show   Optional. Don't show output
  -u UTILIZATION_MONITORS, --utilization_monitors UTILIZATION_MONITORS
                        Optional. List of monitors to show initially.
  --keep_aspect_ratio   Optional. Keeps aspect ratio on resize.
```

The number of Infer Requests is specified by the `-nireq` flag. Increasing this number usually improves performance
(throughput), since several Infer Requests can then be processed simultaneously if the device supports
parallelization. However, a large number of Infer Requests increases latency because each frame still
has to wait before being sent for inference.

For higher FPS, it is recommended that you set `-nireq` to slightly exceed the `-nstreams` value,
summed across all devices used.
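
For example, the following illustrative invocation (the values are hypothetical; tune them for your hardware) uses two CPU streams and three Infer Requests:
```
python3 object_detection_demo_ssd_async.py -i <path_to_video>/inputVideo.mp4 -m <path_to_model>/ssd.xml -d CPU -nstreams 2 -nireq 3
```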

> **NOTE**: This demo is based on the callback functionality from the Inference Engine Python API.
The selected approach makes the execution in multi-device mode optimal by preventing wait delays caused by
the differences in device performance. However, the internal organization of the callback mechanism in the Python API
leads to an FPS decrease. Please keep this in mind and use the C++ version of this demo for performance-critical cases.

Running the application with an empty list of options yields the usage message given above and an error message.
You can use the following command to do inference on GPU with a pre-trained object detection model:
```
python3 object_detection_demo_ssd_async.py -i <path_to_video>/inputVideo.mp4 -m <path_to_model>/ssd.xml -d GPU
```

To run the demo, you can use public or pre-trained models. You can download the pre-trained models with the OpenVINO
[Model Downloader](../../../tools/downloader/README.md) or from
[https://download.01.org/opencv/](https://download.01.org/opencv/).

> **NOTE**: Before running the demo with a trained model, make sure the model is converted to the Inference Engine
format (\*.xml + \*.bin) using the
[Model Optimizer tool](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html).

The only GUI control is the **Tab** key, which switches between the synchronous execution ("Min latency" mode)
and the asynchronous mode configured with the provided command-line parameters ("User specified" mode).

## Demo Output

The demo uses OpenCV to display the resulting frame with detections (rendered as bounding boxes and labels, if provided).
The demo reports:
* **FPS**: average rate of video frame processing (frames per second)
* **Latency**: average time required to process one frame (from reading the frame to displaying the results)

You can use both of these metrics to measure application-level performance.

## See Also
* [Using Open Model Zoo demos](../../README.md)
* [Model Optimizer](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html)
* [Model Downloader](../../../tools/downloader/README.md)