This is the official repository for the paper: Scene Synthesis from Human Motion [Paper] [Project page]
We highly recommand you to create a Conda environment to better manage all the Python packages needed.
conda create -n summon python=3.8
conda activate summon
After you create the environment, please install pytorch with CUDA. You can do it by running
conda install pytorch pytorch-cuda=11.6 -c pytorch -c nvidia
The other dependencies needed for this work is listed in the requirements.txt. We recommend you to use pip to install them:
pip install -r requirements.txt
We provide our preprocessed PROXD dataset and several testing motion sequences from the AMASS dataset. Please use this link to download. After downloading, please unzip it in the project root directory.
We provide a pretrained model checkpoint
for ContactFormer. Please download and unzip it in the project root directory.
After doing that, you will get a training/
folder with two subfolders:
contactformer/
and posa/
.
We also provide a small subset of 3D_Future for you to test. Please use this link to download and unzip it at the root directory.
To generate contact label predictions for all motion sequences stored in the
.npy
format in a folder (e.g. amass/
in our provided data folder),
you can run
cd contactFormer
python predict_contact.py ../data/amass --load_model ../training/contactformer/model_ckpt/best_model_recon_acc.pt --output_dir PATH_OF_OUTPUT
Please replace the PATH_OF_OUTPUT
to any path you want. If you want to
generate predictions for the PROXD dataset, you can try
python predict_contact.py ../data/proxd_valid/vertices_can --load_model ../training/contactformer/model_ckpt/best_model_recon_acc.pt --output_dir PATH_OF_OUTPUT
The above example command generate predictions for the validation split of PROXD.
If you want save the probability for each contact object category in order
to generate more diverse scenes, you can add a --save_probability
flag
in addition to the above command.
You can train and test your own model. To train the model, still under contactformer/
, you can run
python train_contactformer.py --train_data_dir ../data/proxd_train --valid_data_dir ../data/proxd_valid --fix_ori --epochs 1000 --out_dir ../training --experiment default_exp
Replace the train and validation dataset paths after --train_data_dir
and --valid_data_dir
with the path of your downloaded data. --fix_ori
normalizes the orientation of
all motion sequences in the dataset: for each motion sequence, rotate all poses in that sequence
so that the first pose faces towards some canonical orientation (e.g. pointing out of the screen)
and the motion sequence continues with the rotated first pose. --experiment
specifies the name
of the current experiment. Running the above command, all model checkpoints and
training logs will be saved at <path_to_project_root>/training/default_exp
.
To test a model checkpoint, you can run
python test_contactformer.py ../data/proxd_valid/ --load_model ../training/contactformer/model_ckpt/best_model_recon_acc.pt --model_name contactformer --fix_ori --test_on_valid_set --output_dir PATH_OF_OUTPUT
The above command tests ContactFormer on the validation split of PROXD dataset.
The first argument is location of the validation set folder.
--model_name
is an arbitrary name you can set for disguishing the model you are testing.
It can also help you pinpoint the location of the test result
since the result will be saved in a text file at the location PATH_OF_OUTPUT/validation_results_<model_name>.txt
.
You can also run add a --save_video
flag to save the visualization of contact label prediction
for some specific motion sequence. For example, you can run
python test_contactformer.py ../data/proxd_valid/ --load_model ../training/contactformer/model_ckpt/best_model_recon_acc.pt --model_name contactformer --fix_ori --single_seq_name MPH112_00151_01 --save_video --output_dir PATH_OF_OUTPUT
to save the visualization for predicted contact labels along with rendered body and scene meshes
for each frame in the MPH112_00151_01 motion sequence. The rendered frames will be saved at
PATH_OF_OUTPUT/MPH112_00151_01/
. Note that you need a screen to run this command.
There are other parameters you can set to change the training scheme or the model architecture. Check
train_contactformer.py
and test_contactformer.py
for more details.
To fit best object based on the most probossible contact prediction, you can run the following, under the root directory of this repository:
python fit_best_obj.py --sequence_name <sequence_name> --vertices_path <path_to_human_vertices> --contact_labels_path <path_to_contact_predictions> --output_dir <output_directory>
The fitting results for all candidate objects will be saved under
<output_directory>/<sequence_name>/fit_best_obj
.
For each predicted object category, there is a json file called best_obj_id.json
which stores the object ID that achieves the lowest optimization loss.
For example, suppose you have human vertices for motion sequence MPH11_00150_01 saved at
data/proxd_valid/vertices/MPH11_00150_01_verts.npy
and contact predictions saved at
predictions/proxd_valid/MPH11_00150_01.npy
, you can run
python fit_best_obj.py --sequence_name MPH11_00150_01 --vertices_path data/proxd_valid/vertices/MPH11_00150_01_verts.npy --contact_labels_path predictions/proxd_valid/MPH11_00150_01.npy --output_dir fitting_results
Fitting results will be saved under fitting_results/MPH11_00150_01/fit_best_obj
.
There are several things noteworthy here:
- If it is the first time you run this optimization for some human motion sequence, the script can take several minutes since it needs to estimate the SDF for human meshes in all frames. The estimation will be saved to accelerate future inference.
- If you want to run this script on a server/cluster without a screen, please add
os.environ['PYOPENGL_PLATFORM'] = 'egl'
infit_best_obj.py
after the lineimport os
. - If you want to use contact probability for each object category as input (i.e.
if you used
--save_probability
flag when generating the contact prediction), you can need to--input_probability
flag at the end.
If you want to visualize the fitting result (i.e. recovered objects along with the human motion), using the same example as mentioned above, you can run
python vis_fitting_results.py --fitting_results_path fitting_results/MPH11_00150_01 --vertices_path data/proxd_valid/vertices/MPH11_00150_01_verts.npy
The script will save rendered frames in fitting_results/MPH11_00150_01/rendering
.
Note that you need a screen to run this command. In case you are testing the project on a server
which doesn't have a display service, you can still load the saved objects and human meshes and
use other approaches to visualize them. To get the human meshes, you can still run the above
command and wait until the program automatically exits. The script will save the human meshes
of your specified motion sequence in fitting_results/<sequence name>/human/mesh
.
Best fitting objects are stored in fitting_results/<sequence name>/fit_best_obj/<object category>/<object index>/<best_obj_id>/opt_best.obj
.
As mentioned before, you can get <best_obj_id>
in fitting_results/<sequence name>/fit_best_obj/<object category>/<object index>/best_obj_id.json
.
Note that the candidate objects for fitting will be from the 3D_Future
directory in this repository, which is a subset of the 3D Future dataset. You can modify the candidate objects by changing the contents of the 3D_Future
directory.
Our work allow you to further populate the scene with non-contact objects. To achieve this, we adopted a in-door scene generation model called ATISS. To use it, please first install additional dependencies required for ATISS:
pip install -r atiss/requirements.txt
conda install torchvision -c pytorch
Then, please download our pretrained ATISS model checkpoint here. Then, you can do scene completion by running
python scene_completion.py --fitting_results_path <fitting_results_path> --path_to_model <path_to_atiss_ckpt> --obj_dataset_path 3D_Future/models --spare_length 1 --num_iter 2
Here fitting_results_path
should be the object fitting results you just ran for some sequence.
In the above example, this path should be fitting_results/MPH11_00150_01
. path_to_model
should be
the path to the ATISS model checkpoint. spare_length
is the extra size you want to add to the scene/room.
Setting it to 0 will make the scene/room exactly bound existed objects. num_iter
is the number
of non-contact objects you want to add to the scene.
After completing the scene, you can visualize the results using exactly the same visualization script provided in the previous section.
If you find this work helpful, please consider citing:
@inproceedings{10.1145/3550469.3555426,
author = {Ye, Sifan and Wang, Yixing and Li, Jiaman and Park, Dennis and Liu, C. Karen and Xu, Huazhe and Wu, Jiajun},
title = {Scene Synthesis from Human Motion},
year = {2022},
isbn = {9781450394703},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3550469.3555426},
doi = {10.1145/3550469.3555426},
booktitle = {SIGGRAPH Asia 2022 Conference Papers},
articleno = {26},
numpages = {9},
keywords = {Scene synthesis, activity understanding, motion analysis},
location = {Daegu, Republic of Korea},
series = {SA '22}
}
}