A memo on 3D human pose estimation: a record of datasets, papers, and code.
related works
Note: I leave out some papers that have no code.
2019
- Learning 3D Human Shape and Pose from Dense Body Parts
- CVPR, 19. Learning 3D Human Dynamics from Video
- ICCV, 19. TexturePose: Supervising Human Mesh Estimation with Texture Consistency
- ICCV, 19. SPIN - SMPL oPtimization IN the loop
- ICCV, 19. Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild
- ICCV, 19. Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image
- CVPR, 19. Exploiting temporal context for 3D human pose estimation in the wild
- CVPR, 19. Learning Joint Reconstruction of Hands and Manipulated Objects - Demo, Training Code and Models
- ICCV, 19. MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation
- SIGGRAPH Asia, 18. Motion Reconstruction Code and Data for Skills from Videos (SFV)
- CVPR, 19. Monocular Total Capture: Posing Face, Body and Hands in the Wild
- CVPR, 19. Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation
- CVPR, 19. Convolutional Mesh Regression for Single-Image Human Shape Reconstruction
- CVPR, 19. Self-Supervised Learning of 3D Human Pose using Multi-view Geometry
- CVPR, 19. 3D human pose estimation in video with temporal convolutions and semi-supervised training
2018
2019
2018
2019
- ICCV 19, PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization:[code]
- Learning Nonparametric Human Mesh Reconstruction from a Single Image without Ground Truth Meshes: image => 2D pose + part seg ==Graph-CNN==> mesh
- PeelNet: Textured 3D reconstruction of human body using single view RGB image
- CVPR, 19. Dense Intrinsic Appearance Flow for Human Pose Transfer
- ICCV, 19. Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis
- CVPR, 19. Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision
- ICCV, 19. Multi-Garment Net: Learning to Dress 3D People from Images
2018
2019
2018
keywords: human, motion, tracking, person, pose
- Combining Detection and Tracking for Human Pose Estimation in Videos
- MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation
- HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
- The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation
- Distribution-Aware Coordinate Representation for Human Pose Estimation
- CVPR 20, Hierarchical Human Parsing with Typed Part-Relation Reasoning:[code]
- VIBE: Video Inference for Human Body Pose and Shape Estimation
- 3D Human Mesh Regression with Dense Correspondence [code]
- Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation:[code]
- Deep Kinematics Analysis for Monocular 3D Human Pose Estimation
- Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction[oral, code]
- Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild
- Coherent Reconstruction of Multiple Humans From a Single Image
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis[oral, project]
- Cascaded Deep Monocular 3D Human Pose Estimation With Evolutionary Training Data[oral]
- GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models[oral]
- Generating 3D People in Scenes Without People[oral]
- Bodies at Rest: 3D Human Pose and Shape Estimation From a Pressure Image Using Synthetic Data[oral]
- Multiview-Consistent Semi-Supervised Learning for 3D Human Pose Estimation
- Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation
- UniPose: Unified Human Pose Estimation in Single Images and Videos
- Three-Dimensional Reconstruction of Human Interactions
- Sequential 3D Human Pose and Shape Estimation From Point Clouds
- Object-Occluded Human Shape and Pose Estimation From a Single Color Image[oral]
- PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
- Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data:[code]
- ActiveMoCap: Optimized Viewpoint Selection for Active Human Motion Capture
- Multi-View Neural Human Rendering
- Fusing Wearable IMUs With Multi-View Images for Human Pose Estimation: A Geometric Approach
- Cross-View Tracking for Multi-Human 3D Pose Estimation at Over 100 FPS
- 4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras
- Deep 3D Capture: Geometry and Reflectance From Sparse Multi-View Images
- Lightweight Multi-View 3D Pose Estimation Through Camera-Disentangled Representation
- PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization:[code]
- Self-Supervised Human Depth Estimation From Monocular Videos
- ARCH: Animatable Reconstruction of Clothed Humans
- DeepCap: Monocular Human Performance Capture Using Weak Supervision
- TetraTSDF: 3D Human Reconstruction From a Single Image With a Tetrahedral Outer Shell
- Learning to Transfer Texture From Clothing Images to 3D Humans
- TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style[oral]
- Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera
- 4D Visualization of Dynamic Events From Unconstrained Multi-View Videos
- Discovering Human Interactions With Novel Objects via Zero-Shot Learning
- Mixture Dense Regression for Object Detection and Human Pose Estimation
- VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
- PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection
- Learning Human-Object Interaction Detection Using Interaction Points
- Cascaded Human-Object Interaction Recognition
- GanHand: Predicting Human Grasp Affordances in Multi-Object Scenes
- Detailed 2D-3D Joint Representation for Human-Object Interaction
- Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction
- Active Vision for Early Recognition of Human Actions
- Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition
- Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction [code]
- Reciprocal Learning Networks for Human Trajectory Prediction
- PaStaNet: Toward Human Activity Knowledge Engine
- A Stochastic Conditioning Scheme for Diverse Human Motion Prediction
- Bayesian Adversarial Human Motion Synthesis[oral]
- Learning Dynamic Relationships for 3D Human Motion Prediction
- Context-Aware Human Motion Prediction
- Learning a Neural Solver for Multiple Object Tracking[oral]
- Skeleton-Based Action Recognition With Shift Graph Convolutional Network
- Understanding Human Hands in Contact at Internet Scale
- AvatarMe: Realistically Renderable 3D Facial Reconstruction “In-the-Wild”
- Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild
- Deep Facial Non-Rigid Multi-View Stereo
- Can Facial Pose and Expression Be Separated With Weak Perspective Camera?
- HUMBI: A Large Multiview Dataset of Human Body Expressions
- PANDA: A Gigapixel-Level Human-Centric Video Dataset
- HOnnotate: A Method for 3D Annotation of Hand and Object Poses
some interesting works
- End-to-End Camera Calibration for Broadcast Videos
- Transferring Dense Pose to Proximal Animal Classes
- Dynamic Graph Message Passing Networks
- Self-Learning Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence
- Learning to Optimize Non-Rigid Tracking
- SuperGlue: Learning Feature Matching With Graph Neural Networks
- Spatial-Temporal Graph Convolutional Network for Video-Based Person Re-Identification
- Minimal Solutions to Relative Pose Estimation From Two Views Sharing a Common Direction With Unknown Focal Length
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis: [code],[code-PyTorch]
- Learning Character-Agnostic Motion for Motion Retargeting in 2D: decomposes and recomposes the video; could be used for motion retrieval.
- Peeking into occluded joints: A novel framework for crowd pose estimation [code]
- Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation
- Whole-Body Human Pose Estimation in the Wild
- Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos
- SimPose: Effectively Learning DensePose and Surface Normals of People from Simulated Data
- Contact and Human Dynamics from Monocular Video
- HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization
- HMOR: Hierarchical Multi-person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
- 3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning
- I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image [code]
- Full-Body Awareness from Partial Observations
- Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach
- Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry
- End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras[code]
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [project]
- Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition
- Hidden Footprints: Learning Contextual Walkability from 3D Human Trails
- MotionSqueeze: Neural Motion Feature Learning for Video Understanding
- Structure-Aware Human-Action Generation
- Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency
- Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction
Other
- Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition
- Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation
- Long-term Human Motion Prediction with Scene Context
- Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
- Appearance Consensus Driven Self-Supervised Human Mesh Recovery
- End-to-end Dynamic Matching Network for Multi-view Multi-person 3d Pose Estimation
- Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers
- Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion
- Aligning Videos in Space and Time
- Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
- DeepSFM: Structure From Motion Via Deep Bundle Adjustment
- A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem
- Multi-View Optimization of Local Feature Geometry
- DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
- Human Mesh Recovery from Multiple Shots
- NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
- Reconstructing Hand-Object Interactions in the Wild
- oral, Reconstructing 3D Human Pose by Watching Humans in the Mirror | Project Page
- Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans | Project Page
- Deep Dual Consecutive Network for Human Pose Estimation
- Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
- oral, Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization
- Monocular Real-time Full Body Capture with Inter-part Correlations
- End-to-End Human Pose and Mesh Reconstruction with Transformers
- Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
- Graph Stacked Hourglass Networks for 3D Human Pose Estimation
- Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
- Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans
- oral, SimPoE: Simulated Character Control for 3D Human Pose Estimation
- PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
- Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks | Code
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo
- Body Meshes as Points
- AGORA: Avatars in Geography Optimized for Regression Analysis
- SMPLicit: Topology-aware Generative Model for Clothed People
- oral, POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture
- oral, SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
- oral, Pixel Codec Avatars
- SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration
- StylePeople: A Generative Model of Fullbody Human Avatars
- Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction from Monocular Video
- Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
- LASR: Learning Articulated Shape Reconstruction from a Monocular Video | Code
- We are More than Our Joints: Predicting how 3D Bodies Move
- Motion Representations for Articulated Animation | Code
- 3D Human Action Representation Learning via Cross-View Consistency Pursuit
- NeRD: Neural 3D Reflection Symmetry Detector
- CVPR21, oral, Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos: self-supervised learning from TikTok videos to estimate high-fidelity depth of dressed humans from a single-view image.
- CVPR21, Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors: recovers a person's position and pose in the scene using IMUs and a head-mounted camera.
Other
- Mixamo
- fairmotion: Tools to load, process and visualize motion capture data
- Deep-motion-editing: contains code for visualization in Blender
You can contribute to this repo by forking it and opening a pull request.
See also: Awesome Human Pose Estimation and awesome-3d-human.