
Commit b1e316c

Merge pull request #15 from ttxskk/refactor
1. Fix bug (#10). 2. Refactor code. 3. Add HuggingFace demo. 4. Update the documentation for the training dataset.
2 parents 7743efb + d07064b commit b1e316c


68 files changed: +1733 additions, −4616 deletions

.gitignore

Lines changed: 8 additions & 4 deletions
```diff
@@ -3,12 +3,13 @@ __pycache__/
 *.py[cod]
 *$py.class
 **/*.pyc
-osx/output
 # C extensions
 *.so
 
 *.pkl
 # Distribution / packaging
+data/
+demo/
 .Python
 build/
 develop-eggs/
@@ -112,16 +113,19 @@ logs/
 !detrsmpl/data
 !mmhuman3d/data
 !osx/data
+ssc_script/
+work_dirs/
+logs/
+test_result/
+
 # data for pytest moved to http server
 # !tests/data
 .vscode
 .idea
 *.pkl
 *.pkl.json
 *.log.json
-work_dirs/
-logs/
-test_result/
+
 # Pytorch
 *.pth
 *.pt
```

README.md

Lines changed: 111 additions & 29 deletions
````diff
@@ -1,13 +1,58 @@
+
 <div align="center">
 <h2>
 AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation
 </h2>
 </div>
-
 <div align="center">
-<a href="https://ttxskk.github.io/AiOS/" class="button"><b>[Homepage]</b></a> &nbsp;&nbsp;&nbsp;&nbsp;
-<a href="https://arxiv.org/abs/2403.17934" class="button"><b>[arXiv]</b></a> &nbsp;&nbsp;&nbsp;&nbsp;
-<a href="https://ttxskk.github.io/AiOS/" class="button"><b>[Code]</b></a> &nbsp;&nbsp;&nbsp;&nbsp;
+<span>
+<a href="https://github.com/ttxskk">Qingping Sun</a><sup>1, 2</sup>,&nbsp;
+</span>
+<span>
+<a href="https://github.com/WYJSJTU">Yanjun Wang</a><sup>1</sup>,&nbsp;
+</span>
+<span>
+<a href="https://ailingzeng.site/">Ailing Zeng</a><sup>3</sup>,&nbsp;
+</span>
+<span>
+<a href="https://scholar.google.com/citations?view_op=list_works&hl=en&user=zlIJwBEAAAAJ">Wanqi Yin</a><sup>1</sup>,&nbsp;
+</span>
+<span>
+<a href="https://www.linkedin.com/in/chen-wei-weic0006/">Chen Wei</a><sup>1</sup>,&nbsp;
+</span>
+<span>
+<a href="https://wenjiawang0312.github.io/">Wenjia Wang</a><sup>5</sup>,&nbsp;
+</span>
+<br>
+<span>
+<a href="https://haiyi-mei.com">Haiyi Mei</a><sup>1</sup>,&nbsp;
+</span>
+<span>
+<a href="https://ttxskk.github.io/AiOS/">Chi Sing Leung</a><sup>2</sup>,&nbsp;
+<span>
+<a href="https://liuziwei7.github.io/">Ziwei Liu</a><sup>4</sup>,&nbsp;
+</span>
+</span>
+<span>
+<a href="https://yanglei.me/">Lei Yang</a><sup>1, 5</sup>,&nbsp;
+</span>
+<span>
+<a href="https://caizhongang.github.io/">Zhongang Cai</a><sup>✉, 1, 4, 5</sup>,&nbsp;
+</span>
+</div>
+<div align="center">
+<span><sup>1</sup>SenseTime Research</span>,
+<span><sup>2</sup>City University of Hong Kong</span>,
+<br>
+<span><sup>3</sup>International Digital Economy Academy (IDEA)</span>,
+<br>
+<span><sup>4</sup>S-Lab, Nanyang Technological University</span>,
+<span><sup>5</sup>Shanghai AI Laboratory</span>
+</div>
+<div align="center">
+<a href="https://ttxskk.github.io/AiOS/"><img src='https://img.shields.io/badge/Project-Page-Green'></a>
+<a href="https://arxiv.org/abs/2403.17934"><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
+<a href="https://huggingface.co/spaces/ttxskk/AiOS"><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-blue'></a>
 </div>
 
 ---
````
````diff
@@ -29,42 +74,70 @@
 
 
 
+
+
+
 ## Preparation
-- download datasets for evaluation
+- download all datasets
 - [AGORA](https://agora.is.tue.mpg.de/index.html)
-- [BEDLAM](https://bedlam.is.tue.mpg.de/index.html)
-- download [SMPL-X](https://smpl-x.is.tue.mpg.de/) body models.
-- download SMPL body models `SMPL_FEMALE.pkl`, `SMPL_MALE.pkl`, `SMPL_NEUTRAL.pkl` provided by [SMPLer-X](https://huggingface.co/camenduru/SMPLer-X/tree/main).
-- download other SMPL-X dependent files: `SMPLX_to_J14.pkl`, `MANO_SMPLX_vertex_ids.pkl`, `SMPL-X__FLAME_vertex_ids.npy`, `SMPLX_NEUTRAL.pkl`
-provided by [SMPLer-X](https://huggingface.co/camenduru/SMPLer-X/tree/main).
-- download AiOS [checkpoint](https://drive.google.com/file/d/1arUq25YMpgrTCKFKsQQy1LAaNgVwlL99/view?usp=sharing)
-- download AGORA validation set [Humandata](https://drive.google.com/file/d/1cjCVwrFdZ9qMXsA_yaZa3_plYYK8uyPU/view?usp=sharing)
-Organize them according to this datastructure:
+- [BEDLAM](https://bedlam.is.tue.mpg.de/index.html)
+- [MSCOCO](https://cocodataset.org/#home)
+- [UBody](https://github.com/IDEA-Research/OSX)
+- [ARCTIC](https://arctic.is.tue.mpg.de/)
+- [EgoBody](https://sanweiliti.github.io/egobody/egobody.html)
+- [EHF](https://smpl-x.is.tue.mpg.de/index.html)
+- process all datasets into [HumanData](https://github.com/open-mmlab/mmhuman3d/blob/main) format. We provide the processed npz files, which can be downloaded from [here](https://huggingface.co/datasets/ttxskk/AiOS_Train_Data).
+- download [SMPL-X](https://smpl-x.is.tue.mpg.de/)
+- download AiOS [checkpoint](https://huggingface.co/ttxskk/AiOS/tree/main)
+
+The file structure should be like:
 ```text
 AiOS/
 ├── config/
 └── data
 ├── body_models
-── smplx
-| ├──MANO_SMPLX_vertex_ids.pkl
-| ├──SMPL-X__FLAME_vertex_ids.npy
-| ├──SMPLX_NEUTRAL.pkl
-| ├──SMPLX_to_J14.pkl
-| ├──SMPLX_NEUTRAL.npz
-| ├──SMPLX_MALE.npz
-| └──SMPLX_FEMALE.npz
-└── smpl
+| ── smplx
+| | ├──MANO_SMPLX_vertex_ids.pkl
+| | ├──SMPL-X__FLAME_vertex_ids.npy
+| | ├──SMPLX_NEUTRAL.pkl
+| | ├──SMPLX_to_J14.pkl
+| | ├──SMPLX_NEUTRAL.npz
+| | ├──SMPLX_MALE.npz
+| | └──SMPLX_FEMALE.npz
+| └── smpl
 | ├──SMPL_FEMALE.pkl
 | ├──SMPL_MALE.pkl
 | └──SMPL_NEUTRAL.pkl
-├── cache
+├── preprocessed_npz
+│ └── cache
+| ├──agora_train_3840_w_occ_cache_2010.npz
+| ├──bedlam_train_cache_080824.npz
+| ├──...
+| └──coco_train_cache_080824.npz
 ├── checkpoint
 │ └── aios_checkpoint.pth
 ├── datasets
 │ ├── agora
-│ └── bedlam
-└── multihuman_data
-└── agora_validation_multi_3840_1010.npz
+| │ └──3840x2160
+│ │ ├──train
+│ │ └──test
+│ ├── bedlam
+│ │ ├──train_images
+│ │ └──test_images
+│ ├── ARCTIC
+│ │ ├──s01
+│ │ ├──s02
+│ │ ├──...
+│ │ └──s10
+│ ├── EgoBody
+│ │ ├──egocentric_color
+│ │ └──kinect_color
+│ └── UBody
+| └──images
+└── checkpoint
+├── edpose_r50_coco.pth
+└── aios_checkpoint.pth
+
 ```
 # Installation
 
````
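The preprocessed annotations referenced in the hunk above ship as HumanData-style `.npz` archives. A minimal sketch for inspecting one before training (the file name is taken from the cache listing above; the exact keys depend on how each dataset was converted, so treat the layout as an assumption):

```python
import numpy as np

# Load one of the preprocessed HumanData-style caches listed above.
# allow_pickle is required because HumanData archives store dict-valued
# fields (e.g. nested SMPL-X parameter groups).
cache = np.load('data/preprocessed_npz/cache/coco_train_cache_080824.npz',
                allow_pickle=True)

# List every stored key with its shape so the annotation layout is visible
# before wiring the file into a training config.
for key in cache.files:
    value = cache[key]
    print(key, getattr(value, 'shape', type(value)))
```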

````diff
@@ -106,11 +179,20 @@ cd ../../..
 - Inference output will be saved in `AiOS/demo/{INPUT_VIDEO}_out`
 
 ```bash
-cd main
-sh scripts/inference.sh {INPUT_VIDEO} {OUTPUT_DIR}
+# CHECKPOINT: checkpoint path
+# INPUT_VIDEO: input video path
+# OUTPUT_DIR: output path
+# NUM_PERSON: number of persons. This parameter sets the expected number of persons to be detected in the input (image or video).
+# The default value is 1, meaning the algorithm will try to detect at least one person. If you know the maximum number of persons
+# that can appear simultaneously, you can set this variable to that number to optimize the detection process (a lower threshold is recommended as well).
+# THRESHOLD: score threshold. This parameter sets the score threshold for person detection. The default value is 0.5.
+# If the confidence score of a detected person is lower than this threshold, the detection will be discarded.
+# Adjusting this threshold can help in filtering out false positives or ensuring only high-confidence detections are considered.
+# GPU_NUM: number of GPUs.
+sh scripts/inference.sh {CHECKPOINT} {INPUT_VIDEO} {OUTPUT_DIR} {NUM_PERSON} {THRESHOLD} {GPU_NUM}
 
 # For running inference on short_video.mp4 with output directory demo/short_video_out
-sh scripts/inference.sh short_video demo
+sh scripts/inference.sh data/checkpoint/aios_checkpoint.pth short_video.mp4 demo 2 0.1 8
 ```
 # Test
 
````
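The `NUM_PERSON` and `THRESHOLD` arguments documented in the comments above amount to a score filter followed by a top-k cap. A minimal sketch of that selection rule (the `select_detections` helper is hypothetical, not the repository's actual post-processing code):

```python
import numpy as np

def select_detections(scores: np.ndarray, num_person: int = 1,
                      threshold: float = 0.5) -> np.ndarray:
    """Mirror the NUM_PERSON / THRESHOLD semantics described above:
    drop detections scoring below the threshold, then keep at most
    num_person of the survivors, highest score first."""
    keep = np.where(scores >= threshold)[0]       # score filter
    order = keep[np.argsort(scores[keep])[::-1]]  # sort survivors, descending
    return order[:num_person]                     # cap at num_person

# With the demo values above (NUM_PERSON=2, THRESHOLD=0.1), only the two
# most confident detections survive.
print(select_detections(np.array([0.9, 0.05, 0.4, 0.8]),
                        num_person=2, threshold=0.1))  # -> [0 3]
```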

config/aios_smplx.py

Lines changed: 5 additions & 13 deletions
```diff
@@ -210,21 +210,13 @@
 pretrained_model_path = '../output/train_gta_synbody_ft_20230410_132110/model_dump/snapshot_2.pth.tar'
 
 # dataset setting
-# dataset_list = ['AGORA_MM','BEDLAM', 'COCO_NA']
-# trainset_3d = ['AGORA_MM','BEDLAM', 'COCO_NA']
-dataset_list = ['AGORA_MM','BEDLAM', 'COCO_NA']
-trainset_3d = ['AGORA_MM','BEDLAM', 'COCO_NA']
+
+dataset_list = []
+trainset_3d = []
 trainset_2d = []
-trainset_partition = {
-    'AGORA_MM': 0.4,
-    'BEDLAM': 0.7,
-    'COCO_NA': 1,
-
-    # 'EgoBody_Egocentric': 1,
-    # 'EgoBody_Kinect': 1.0,
-}
+trainset_partition = {}
 trainset_humandata = []
-testset = 'INFERENCE_AGORA'
+testset = 'INFERENCE_BEDLAM'
 train_sizes=[480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
 train_max_size=1333
 test_sizes=[800]
```
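The `trainset_partition` values removed in this hunk (0.4, 0.7, 1) read like cumulative sampling boundaries over [0, 1): roughly 40% of draws from AGORA_MM, 30% from BEDLAM, and 30% from COCO_NA. A sketch of per-iteration dataset selection under that assumption (the interpretation is inferred from the values, not confirmed by the config):

```python
import random

# Boundaries as they stood before this change; a uniform draw in
# [0, 0.4) selects AGORA_MM, [0.4, 0.7) selects BEDLAM, and
# [0.7, 1.0) selects COCO_NA. Interpretation assumed, not confirmed.
trainset_partition = {'AGORA_MM': 0.4, 'BEDLAM': 0.7, 'COCO_NA': 1.0}

def sample_dataset(partition):
    """Pick a dataset name by comparing one uniform draw against the
    cumulative boundaries, relying on dict insertion order."""
    u = random.random()
    for name, bound in partition.items():
        if u < bound:
            return name
    return name  # numerical fall-through: return the last dataset

print(sample_dataset(trainset_partition))
```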

config/aios_smplx_agora_val.py

Lines changed: 5 additions & 14 deletions
```diff
@@ -1,8 +1,8 @@
 
 num_classes = 2
-lr = 0.0001*1.414/10
+lr = 1e-04
 param_dict_type = 'default'
-lr_backbone = 1e-05*1.414/10
+lr_backbone = 1e-05
 lr_backbone_names = ['backbone.0']
 lr_linear_proj_names = ['reference_points', 'sampling_offsets']
 lr_linear_proj_mult = 0.1
@@ -210,19 +210,10 @@
 pretrained_model_path = '../output/train_gta_synbody_ft_20230410_132110/model_dump/snapshot_2.pth.tar'
 
 # dataset setting
-# dataset_list = ['AGORA_MM','BEDLAM', 'COCO_NA']
-# trainset_3d = ['AGORA_MM','BEDLAM', 'COCO_NA']
-dataset_list = ['AGORA_MM','BEDLAM', 'COCO_NA']
-trainset_3d = ['AGORA_MM','BEDLAM', 'COCO_NA']
+dataset_list = []
+trainset_3d = []
 trainset_2d = []
-trainset_partition = {
-    'AGORA_MM': 0.4,
-    'BEDLAM': 0.7,
-    'COCO_NA': 1,
-
-    # 'EgoBody_Egocentric': 1,
-    # 'EgoBody_Kinect': 1.0,
-}
+trainset_partition = {}
 trainset_humandata = []
 testset = 'INFERENCE_AGORA'
 train_sizes=[480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
```

config/aios_smplx_bedlam.py

Lines changed: 4 additions & 12 deletions
```diff
@@ -210,19 +210,11 @@
 pretrained_model_path = '../output/train_gta_synbody_ft_20230410_132110/model_dump/snapshot_2.pth.tar'
 
 # dataset setting
-# dataset_list = ['AGORA_MM','BEDLAM', 'COCO_NA']
-# trainset_3d = ['AGORA_MM','BEDLAM', 'COCO_NA']
-dataset_list = ['AGORA_MM','BEDLAM', 'COCO_NA']
-trainset_3d = ['AGORA_MM','BEDLAM', 'COCO_NA']
+
+dataset_list = []
+trainset_3d = []
 trainset_2d = []
-trainset_partition = {
-    'AGORA_MM': 0.4,
-    'BEDLAM': 0.7,
-    'COCO_NA': 1,
-
-    # 'EgoBody_Egocentric': 1,
-    # 'EgoBody_Kinect': 1.0,
-}
+trainset_partition = {}
 trainset_humandata = []
 testset = 'INFERENCE_BEDLAM'
 train_sizes=[480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
```
