Easy to use SOTA Top-Down Multi-person Pose Estimation Models in PyTorch

sithu31296/pose-estimation

Top-Down Multi-person Pose Estimation

Introduction

Pose estimation finds the keypoints that belong to each person in an image. There are two main approaches:

  • Bottom-Up first finds all keypoints and then groups them into the individual people in the image. (Generally faster but less accurate)
  • Top-Down first detects the people in the image and then estimates the keypoints of each detected person. (Generally more computationally intensive but more accurate)

This repo will only include top-down pose estimation models.
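The top-down flow described above can be sketched roughly as follows. This is only an illustration, not the code in this repo: `top_down_pose`, `detect_people` and `estimate_keypoints` are hypothetical names, the detector is assumed to return integer pixel boxes, and the pose model is assumed to return keypoints in crop coordinates.

```python
from typing import Callable, List, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, hypothetical format


def top_down_pose(
    image: np.ndarray,
    detect_people: Callable[[np.ndarray], List[Box]],
    estimate_keypoints: Callable[[np.ndarray], np.ndarray],
) -> List[np.ndarray]:
    """Detect people first, then run a single-person pose model on each crop."""
    poses = []
    for x1, y1, x2, y2 in detect_people(image):
        # Step 1 output: one bounding box per person; crop that person out.
        crop = image[y1:y2, x1:x2]
        # Step 2: the pose model is assumed to return (K, 2) keypoints
        # as (x, y) pairs relative to the crop.
        keypoints = np.asarray(estimate_keypoints(crop), dtype=np.float32)
        # Shift the keypoints back into full-image coordinates.
        poses.append(keypoints + np.array([x1, y1], dtype=np.float32))
    return poses
```

Because the pose model runs once per detected person, the cost grows with the number of people in the frame, which is one reason top-down methods tend to be more computationally intensive.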

Model Zoo

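COCO-val with 56.4 Detector AP

| Model | Backbone | Image Size | AP | AP50 | AP75 | Params (M) | FLOPs (B) | FPS | Weights |
|---|---|---|---|---|---|---|---|---|---|
| PoseHRNet | HRNet-w32 | 256x192 | 74.4 | 90.5 | 81.9 | 29 | 7 | 25 | download |
| PoseHRNet | HRNet-w48 | 256x192 | 75.1 | 90.6 | 82.2 | 64 | 15 | 24 | download |
| SimDR | HRNet-w32 | 256x192 | 75.3 | - | - | 31 | 7 | 25 | download |
| SimDR | HRNet-w48 | 256x192 | 75.9 | 90.4 | 82.7 | 66 | 15 | 24 | download |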

Note: FPS is measured on a GTX 1660 Ti with one person per frame and includes pre-processing, model inference and post-processing. Both the detection and pose models run in PyTorch FP32.
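An end-to-end FPS figure of this kind can be measured along the following lines. This is a minimal sketch under assumptions, not the procedure used for the table: `measure_fps` and `process_frame` are hypothetical names, and the warm-up phase is an added convention rather than something the note specifies.

```python
import time
from typing import Callable, Iterable


def measure_fps(
    process_frame: Callable[[object], object],
    frames: Iterable[object],
    warmup: int = 5,
) -> float:
    """Average frames per second of a full per-frame pipeline.

    `process_frame` is assumed to cover pre-processing, model inference
    and post-processing for one frame.
    """
    frames = list(frames)
    # Warm-up runs are excluded so one-off start-up costs do not skew the result.
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("need more frames than warm-up iterations")
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed
```

When the models run on a GPU, `process_frame` would also need to wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning, otherwise the clock only captures the time to queue the work.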

COCO-test with 60.9 Detector AP

| Model | Backbone | Image Size | AP | AP50 | AP75 | Params (M) | FLOPs (B) | Weights |
|---|---|---|---|---|---|---|---|---|
| SimDR* | HRNet-w48 | 256x192 | 75.4 | 92.4 | 82.7 | 66 | 15 | download |
| RLEPose | HRNet-w48 | 384x288 | 75.7 | 92.3 | 82.9 | - | - | - |
| UDP+PSA | HRNet-w48 | 256x192 | 78.9 | 93.6 | 85.8 | 70 | 16 | - |

Download Backbone Models' Weights

| Model | Weights |
|---|---|
| HRNet-w32 | download |
| HRNet-w48 | download |

Requirements

  • torch >= 1.8.1
  • torchvision >= 0.9.1

Other requirements can be installed with pip install -r requirements.txt.
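To check an installed version against the minimums listed above, a small helper along these lines may be useful. It is a sketch only: `parse_version` and `meets_minimum` are hypothetical helpers and not part of this repo. Versions are compared numerically because a plain string comparison would rank "1.10.0" below "1.8.1".

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as '1.10.0+cu113' into (1, 10, 0).

    Only the leading major.minor[.patch] numbers are kept; any local or
    pre-release suffix is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if `installed` is at least `minimum`, compared number by number."""
    return parse_version(installed) >= parse_version(minimum)
```

With PyTorch installed, the check would be called as `meets_minimum(torch.__version__, "1.8.1")`, and likewise with `torchvision.__version__` and `"0.9.1"`.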

Clone the repository recursively:

$ git clone --recursive https://github.com/sithu31296/pose-estimation.git

Inference

$ python infer.py --source TEST_SOURCE --det-model DET_MODEL_PATH --pose-model POSE_MODEL_PATH --img-size 640

Arguments:

  • source: The input to run inference on
    • To test an image, set it to the image file path. (For example, assests/test.jpg)
    • To test a folder of images, set it to the folder path. (For example, assests/)
    • To test a video, set it to the video file path. (For example, assests/video.mp4)
    • To test with a webcam, set it to 0.
  • det-model: Path to the YOLOv5 detection model's weights
  • pose-model: Path to the pose estimation model's weights
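The four kinds of `source` listed above could be told apart with logic like the following. This is a hypothetical sketch and not the actual code in infer.py: `classify_source` and the extension sets are assumptions, as is treating any all-digit string (not just 0) as a webcam index.

```python
from pathlib import Path

# Assumed extension lists; the repo may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}
VIDEO_SUFFIXES = {".mp4", ".avi", ".mov", ".mkv"}


def classify_source(source: str) -> str:
    """Return 'webcam', 'folder', 'image' or 'video' for a --source value."""
    # A bare number such as "0" is read as a webcam index.
    if source.isdigit():
        return "webcam"
    path = Path(source)
    # An existing directory is treated as a folder of images.
    if path.is_dir():
        return "folder"
    suffix = path.suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return "image"
    if suffix in VIDEO_SUFFIXES:
        return "video"
    raise ValueError(f"cannot determine source type for {source!r}")
```

Checking for a webcam index first keeps a directory that happens to be named with digits from shadowing the webcam case; whether that ordering matches the repo's behavior is not confirmed here.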

Example inference results (image credit: [1, 2]):

[Image: infer_result]

References

Citations

@article{WangSCJDZLMTWLX19,
  title={Deep High-Resolution Representation Learning for Visual Recognition},
  author={Jingdong Wang and Ke Sun and Tianheng Cheng and 
          Borui Jiang and Chaorui Deng and Yang Zhao and Dong Liu and Yadong Mu and 
          Mingkui Tan and Xinggang Wang and Wenyu Liu and Bin Xiao},
  journal={TPAMI},
  year={2019}
}

@misc{li20212d,
  title={Is 2D Heatmap Representation Even Necessary for Human Pose Estimation?}, 
  author={Yanjie Li and Sen Yang and Shoukui Zhang and Zhicheng Wang and Wankou Yang and Shu-Tao Xia and Erjin Zhou},
  year={2021},
  eprint={2107.03332},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}