Benchmark

We compare our results with some popular frameworks and official releases in terms of speed.

Settings

Hardware

  • 8 NVIDIA Tesla V100 (16G) GPUs
  • Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

Software Environment

  • Python 3.7
  • PaddlePaddle develop (version to be determined)
  • CUDA 10.1
  • cuDNN 7.6.3
  • NCCL 2.1.15
  • gcc 8.2.0

Metrics

The time we measure is the average training time, including data processing and model training. Training speed is reported in ips (instances per second); higher is better. Note that we skip the first 50 iterations, as they may include device warm-up time.
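The metric above can be sketched as follows. This is a minimal illustration, not the repo's actual measurement code: the function name `average_ips`, its parameters, and the assumption that each entry of `batch_times` is the wall-clock time of one full iteration (data processing plus model training) are all hypothetical. Whether `batch_size` should be the per-GPU or the global batch size is not stated in the text, so the caller has to decide.

```python
def average_ips(batch_times, batch_size, warmup_iters=50):
    """Average instances per second, ignoring the first `warmup_iters` iterations.

    batch_times: per-iteration wall-clock times in seconds (assumed to cover
                 both data processing and the training step).
    batch_size:  instances processed in each timed iteration (assumption).
    """
    measured = batch_times[warmup_iters:]
    if not measured:
        raise ValueError("need more than `warmup_iters` iterations to measure")
    total_time = sum(measured)
    # instances processed after warm-up, divided by the time they took
    return batch_size * len(measured) / total_time


# Synthetic example: 50 slow warm-up iterations, then 100 steady ones.
times = [1.0] * 50 + [0.5] * 100
print(average_ips(times, batch_size=16))  # 32.0
```

Dividing the total instance count by the total time, rather than averaging per-iteration rates, keeps a few unusually fast or slow iterations from skewing the result.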

Comparison Rules

Here we compare our Paddle Video repo with other video understanding toolboxes under the same data and model settings.

To keep the comparison fair, all experiments were run on the same hardware and with the same dataset. The dataset was generated by the data preparation step. As the table below shows, Paddle Video is noticeably faster than the other video understanding frameworks for which numbers are available; in particular, the Slowfast model is more than 2x faster than its PySlowFast counterpart (99.5 vs. 43.2 ips).

For each model setting, we kept the same data preprocessing methods to ensure identical feature inputs.

Main Results

Recognizers

| Model | batch size x gpus | Paddle (ips) | Reference (ips) | MMAction2 (ips) | PySlowFast (ips) |
| :---: | :---: | :---: | :---: | :---: | :---: |
| TSM | 16x8 | 58.1 | 46.04 (temporal-shift-module) | To do | X |
| PPTSM | 16x8 | 57.6 | X | X | X |
| TSN | 16x8 | 841.1 | To do (tsn-pytorch) | To do | X |
| Slowfast | 16x8 | 99.5 | X | To do | 43.2 |
| Attention_LSTM | 128x8 | 112.6 | X | X | X |
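The speedups implied by the two rows that have a comparison number can be checked with a few lines of Python. The figures are copied from the table; the dictionary layout and names are only for illustration.

```python
# (Paddle ips, competitor ips) copied from the Recognizers table.
results = {
    "TSM vs. temporal-shift-module": (58.1, 46.04),
    "Slowfast vs. PySlowFast": (99.5, 43.2),
}

# Speedup = Paddle throughput divided by the competitor's throughput.
speedups = {name: paddle / other for name, (paddle, other) in results.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
# TSM vs. temporal-shift-module: 1.26x
# Slowfast vs. PySlowFast: 2.30x
```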

Localizers

| Model | Paddle (ips) | MMAction2 (ips) | BMN (boundary matching network) (ips) |
| :---: | :---: | :---: | :---: |
| BMN | To do | x | x |