Sirui Xu*
Dongting Li*
Yucheng Zhang*
Xiyan Xu*
Qi Long*
Ziyin Wang*
Yunzhi Lu
Shuchang Dong
Hezi Jiang
Akshat Gupta
Yu-Xiong Wang
Liang-Yan Gui
University of Illinois Urbana-Champaign
*Equal contribution
CVPR 2025
- [2025-04-20] Initial release of the InterAct dataset.
- [2025-07-08] Release the processing code for unified SMPL-H representation.
- [2025-09-12] Publish the paper on arXiv.
- [2025-10-06] Release the HOI correction pipeline.
- [2025-10-06] Release the evaluation pipeline for text-to-HOI.
- [2025-10-29] Release corrected OMOMO data.
- [2025-11-23] Add support for ARCTIC and ParaHome.
- [2025-11-26] Release training code, pretrained models, and evaluator checkpoints.
- [2025-11-26] Release augmented data for InterAct-X.
- [2025-12-07] Release the data conversion pipeline for bringing InterAct into simulation, specifically for InterMimic use.
- [2026-02-03] Release the pipeline for object-to-human.
- [2026-02-03] Release corrected OMOMO data V2.
- Release comprehensive text descriptions, data processing workflows, visualization tools, and usage guidelines
- Release the processing code for unified SMPL-H representation
- Publish the paper on arXiv
- Release the evaluation pipeline for the benchmark
- Release HOI correction pipeline
- Release HOI correction data
- Release augmentation data
- Release baseline constructions for text2HOI.
- Release the pipeline for constructing simulation ready data
- Release baseline constructions for object-to-human
- Release HOI correction data V2
- Release baseline constructions for the other HOI generative tasks
- Release the dataset with unified SMPL representation
- Release retargeted HOI dataset with unified human shape
We introduce InterAct, a comprehensive large-scale 3D human-object interaction (HOI) dataset. It originally comprises 21.81 hours of HOI data consolidated from diverse sources, which we meticulously refine by correcting contact artifacts and augment with varied motion patterns, extending the total duration to approximately 30 hours. The dataset also includes 34.1K sequence-level detailed text descriptions.
The InterAct dataset is consolidated according to the licenses of its original data sources. For data approved for redistribution, direct download links are provided; for others, we supply processing code to convert the raw data into our standardized format.
Please follow the steps below to download, process, and organize the data.
Please fill out this form to request non-commercial access to InterAct and InterAct-X. Once authorized, you'll receive the download links. Organize the data from NeuralDome, IMHD, CHAIRS, and OMOMO, along with the corrected and augmented data, according to the following directory structure.
Data structure
data
├── neuraldome
│   ├── objects
│   │   ├── baseball
│   │   │   ├── baseball.obj        # object mesh
│   │   │   └── sample_points.npy   # sampled object point cloud
│   │   └── ...
│   ├── objects_bps
│   │   ├── baseball
│   │   │   └── baseball.npy        # static BPS representation
│   │   └── ...
│   ├── sequences
│   │   ├── subject01_baseball_0
│   │   │   ├── action.npy
│   │   │   ├── action.txt
│   │   │   ├── human.npz
│   │   │   ├── markers.npy
│   │   │   ├── joints.npy
│   │   │   ├── motion.npy
│   │   │   ├── object.npz
│   │   │   └── text.txt
│   │   └── ...
│   └── sequences_canonical
│       ├── subject01_baseball_0
│       │   ├── action.npy
│       │   ├── action.txt
│       │   ├── human.npz
│       │   ├── markers.npy
│       │   ├── joints.npy
│       │   ├── motion.npy
│       │   ├── object.npz
│       │   └── text.txt
│       └── ...
├── imhd
├── chairs
├── omomo
└── annotations
The GRAB, BEHAVE, InterCap, and ARCTIC datasets are available for academic research under custom licenses from the Max Planck Institute for Intelligent Systems. Note that we do not distribute the original motion data; instead, we provide the processing code and annotations. In addition, we support ParaHome beyond our original dataset. To download these datasets, please visit their respective websites and agree to the terms of their licenses:
Please follow these steps to get started:
- Download SMPL+H, SMPL-X, and DMPL models.

  Download the SMPL+H model from SMPL+H (choose the Extended SMPL+H model used in the AMASS project), the DMPL model from DMPL (choose DMPLs compatible with SMPL), and the SMPL-X model from SMPL-X. Then place all the models under ./models/. The ./models/ folder tree should be:

  models
  ├── smplh
  │   ├── female
  │   │   └── model.npz
  │   ├── male
  │   │   └── model.npz
  │   ├── neutral
  │   │   └── model.npz
  │   ├── SMPLH_FEMALE.pkl
  │   ├── SMPLH_MALE.pkl
  │   └── SMPLH_NEUTRAL.pkl
  └── smplx
      ├── SMPLX_FEMALE.npz
      ├── SMPLX_FEMALE.pkl
      ├── SMPLX_MALE.npz
      ├── SMPLX_MALE.pkl
      ├── SMPLX_NEUTRAL.npz
      └── SMPLX_NEUTRAL.pkl

  Please follow the smplx tools to merge SMPL-H and MANO parameters.
- Prepare Environment
- Create and activate a fresh environment:

  conda create -n interact python=3.8
  conda activate interact
  pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
To install PyTorch3D, please follow the official instructions: PyTorch3D.

Install the remaining packages:

pip install -r requirements.txt
python -m spacy download en_core_web_sm
bash install_human_body_prior.sh
- Prepare raw data
- BEHAVE

  Download the motion data from this link, and put them into ./data/behave/sequences. Download object data from this link, and put them into ./data/behave/objects.

  Expected File Structure:

  data/behave/
  ├── sequences
  │   └── data_name
  │       ├── object_fit_all.npz   # object's pose sequences
  │       └── smpl_fit_all.npz     # human's pose sequences
  └── objects
      └── object_name
          ├── object_name.jpg      # one photo of the object
          ├── object_name.obj      # reconstructed 3D scan of the object
          ├── object_name.obj.mtl  # mesh material property
          ├── object_name_tex.jpg  # mesh texture
          └── object_name_fxxx.ply # simplified object mesh
- OMOMO

  Download the dataset from this link, and download the text annotations from this link.

  Expected File Structure:

  data/omomo/raw
  ├── omomo_text_anno_json_data              # annotation JSON data
  ├── captured_objects
  │   └── object_name_cleaned_simplified.obj # simplified object mesh
  ├── test_diffusion_manip_seq_joints24.p    # test sequences
  └── train_diffusion_manip_seq_joints24.p   # train sequences
- InterCap

  Download InterCap from the project website. Please download the version with "new results via newly trained LEMO hand models".

  Expected File Structure:

  data/intercap/raw
  └── 01
      └── 01
          └── Seg_id
              ├── res.pkl                  # human and object motion
              └── Mesh
                  ├── 00000_second_obj.ply # object mesh
                  └── ...
- GRAB
Download GRAB from the project website.
Expected File Structure:
data/grab/raw
├── grab
│   ├── s1
│   │   └── seq_name.npz  # human and object motion
│   └── ...
└── tools
    ├── object_meshes     # object meshes
    ├── object_settings
    ├── subject_meshes    # subject meshes
    └── subject_settings
- ParaHome

  Download ParaHome from the project website.

  Expected File Structure:

  data/parahome/raw
  ├── seq
  │   ├── s1
  │   │   ├── text_annotations.json
  │   │   ├── object_transformations.pkl
  │   │   ├── object_in_scene.json
  │   │   ├── joint_states.pkl
  │   │   ├── joint_positions.pkl
  │   │   ├── head_tips.pkl
  │   │   ├── hand_joint_orientations.pkl
  │   │   ├── bone_vectors.pkl
  │   │   ├── body_joint_orientations.pkl
  │   │   └── body_global_transform.pkl
  │   └── ...
  ├── scan
  │   ├── book
  │   │   └── simplified
  │   │       └── base.obj
  │   └── ...
  └── smplx_seq
      ├── s1
      │   ├── smplx_params.pkl
      │   └── smplx_pose.pkl
      └── ...
- ARCTIC

  Download the raw sequences and meta files from the project website. Download the text annotations from this project website.

  Expected File Structure:

  data/arctic
  ├── description
  │   ├── s01
  │   │   ├── box_grab_01
  │   │   │   └── description.txt
  │   │   └── ...
  │   └── ...
  └── raw
      ├── meta
      │   ├── object_vtemplates
      │   │   ├── box
      │   │   │   ├── bottom_keypoints_300.json
      │   │   │   ├── bottom.obj
      │   │   │   ├── mesh.obj
      │   │   │   ├── parts.json
      │   │   │   ├── top_keypoints_300.json
      │   │   │   ├── top.obj
      │   │   │   └── ...
      │   │   └── ...
      │   └── subject_vtemplates
      │       ├── s01.obj
      │       ├── ...
      │       └── s10.obj
      └── raw_seqs
          ├── s01
          │   ├── box_grab_01.smplx.npy
          │   └── ...
          └── ...
-
Data Processing
After organizing the raw data, execute the following steps to process the datasets into our standard representations.
- Run the processing scripts for each dataset:

  python process/process_behave.py
  python process/process_grab.py
  python process/process_intercap.py
  python process/process_omomo.py
  python process/process_parahome.py
  python process/process_arctic.py
- Canonicalize the object mesh:

  python process/canonicalize_obj.py
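Conceptually, object canonicalization brings each mesh into a normalized reference frame. A minimal sketch of the core idea (centering the vertices at their centroid and keeping the offset for later restoration); the actual canonicalize_obj.py may additionally normalize orientation or scale:

```python
import numpy as np

def canonicalize_object(vertices):
    """Center an object mesh at the origin.

    Returns the canonical vertices and the translation that restores the
    original pose. Simplified sketch; the real script may do more.
    """
    vertices = np.asarray(vertices, dtype=np.float64)
    centroid = vertices.mean(axis=0)      # center of the vertex cloud
    return vertices - centroid, centroid

verts = np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 5.0, 2.0]])
canon, offset = canonicalize_object(verts)
print(canon.mean(axis=0))  # ~[0, 0, 0]
```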
- Segment the sequences according to annotations and generate the associated text files:

  python process/process_text.py
  python process/process_text_omomo.py
After processing, the directory structure under data/ should include all sub-datasets, including:
data
├── annotation
├── behave
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
├── omomo
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
├── intercap
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
├── grab
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
├── parahome
│   ├── objects
│   │   └── object_name
│   │       └── base.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           └── object_{object_name}_{part}.npz
└── arctic
    ├── objects
    │   └── object_name
    │       └── base.obj
    └── sequences
        └── id
            ├── human.npz
            └── object.npz
- Canonicalize the human data by running:

  python process/canonicalize_human.py
  # or use the multi-threaded version for speedup
  python process/canonicalize_human_multi_thread.py
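Canonicalizing a human sequence means expressing every frame relative to the first frame's root position and heading. A hedged NumPy sketch of that transform on a bare root trajectory (the real script operates on full SMPL parameters, and the choice of z as the up axis is an assumption here):

```python
import numpy as np

def canonicalize_trajectory(root_pos, heading0):
    """Re-express a root trajectory relative to its first frame.

    root_pos : (T, 3) root translations
    heading0 : initial yaw angle (radians) about the up (z) axis
    Simplified sketch; the actual script canonicalizes full SMPL poses.
    """
    c, s = np.cos(-heading0), np.sin(-heading0)
    # Rotation about z that undoes the initial heading.
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    centered = root_pos - root_pos[0]  # first frame moved to the origin
    return centered @ R.T              # heading aligned to a fixed axis

traj = np.array([[1.0, 1.0, 0.9], [2.0, 1.0, 0.9], [3.0, 2.0, 1.0]])
canon = canonicalize_trajectory(traj, heading0=np.pi / 2)
print(canon[0])  # [0. 0. 0.]
```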
- Sample object keypoints:

  python process/sample_obj.py
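Object keypoints are commonly sampled with farthest point sampling (FPS) so that a fixed number of points covers the surface evenly. A self-contained sketch of FPS; whether sample_obj.py uses FPS or another sampling scheme is an assumption:

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedily pick k points that maximize coverage of the point set."""
    points = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    idx = np.zeros(k, dtype=int)
    idx[0] = rng.integers(len(points))                 # random start point
    dist = np.linalg.norm(points - points[idx[0]], axis=1)
    for i in range(1, k):
        idx[i] = int(dist.argmax())                    # farthest from chosen set
        dist = np.minimum(dist, np.linalg.norm(points - points[idx[i]], axis=1))
    return points[idx]

cloud = np.random.default_rng(0).normal(size=(500, 3))
keypoints = farthest_point_sampling(cloud, k=32)
print(keypoints.shape)  # (32, 3)
```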
- Extract motion representations:

  python process/motion_representation.py
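Motion representations for HOI generation are typically built from root-relative joint positions plus frame-to-frame velocities. A minimal sketch of that idea; the exact feature layout used here is defined in process/motion_representation.py, not by this snippet:

```python
import numpy as np

def motion_features(joints):
    """Build a simple motion feature from global joint positions.

    joints : (T, J, 3) global joint positions.
    Returns (T-1, J*6): root-relative positions concatenated with
    per-joint finite-difference velocities. Simplified sketch only.
    """
    root = joints[:, :1, :]          # pelvis treated as the root joint
    local = joints - root            # root-relative positions
    vel = joints[1:] - joints[:-1]   # per-frame velocities
    T = joints.shape[0]
    return np.concatenate([local[1:].reshape(T - 1, -1),
                           vel.reshape(T - 1, -1)], axis=1)

joints = np.random.default_rng(0).normal(size=(8, 22, 3))
feats = motion_features(joints)
print(feats.shape)  # (7, 132)
```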
- Process the object BPS for training:

  python process/process_bps.py
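The Basis Point Set (BPS) encoding represents an object as distances from a fixed set of random basis points to its nearest surface point, giving a fixed-length vector regardless of mesh size. A sketch of the idea behind process_bps.py; the basis size and radius below are illustrative assumptions:

```python
import numpy as np

def bps_encode(points, n_basis=1024, radius=1.0, seed=0):
    """Encode a point cloud as its distance field sampled at fixed basis points.

    The basis is drawn once with a fixed seed, so encodings of different
    objects are directly comparable. Simplified sketch of the BPS idea.
    """
    rng = np.random.default_rng(seed)
    # Draw basis points uniformly inside a ball of the given radius.
    basis = rng.normal(size=(n_basis, 3))
    basis /= np.linalg.norm(basis, axis=1, keepdims=True)
    basis *= radius * rng.uniform(size=(n_basis, 1)) ** (1 / 3)
    # Distance from every basis point to its nearest object point.
    d = np.linalg.norm(basis[:, None, :] - points[None, :, :], axis=-1)
    return d.min(axis=1)

cloud = np.random.default_rng(1).normal(size=(256, 3)) * 0.2
code = bps_encode(cloud, n_basis=64)
print(code.shape)  # (64,)
```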
Optional: HOI Correction
- To obtain the corrected OMOMO data, please fill out this form to request non-commercial access, or process it from scratch following the scripts below.
- Step 1: Correct the full-body HOI:

  python ./hoi_correction/optimize_fullbody.py --dataset behave
  python ./hoi_correction/optimize_fullbody_intercap.py --dataset intercap
- Step 2: Correct the wrists:

  python ./hoi_correction/scan_diff.py --dataset omomo
  python ./hoi_correction/correct_wrist.py --dataset omomo
- Step 3: Correct the hands:

  python ./hoi_correction/optimize.py --dataset omomo
  python ./hoi_correction/optimize_hand_behave.py --dataset behave
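The correction stages above are contact optimizations at heart: they push interpenetrating body vertices back to the object surface while preserving intended contacts. A toy sketch of the kind of penetration term one might minimize, using a spherical object proxy as a stand-in for a signed distance field (the actual objectives in hoi_correction/ are richer, with contact and pose-prior terms):

```python
import numpy as np

def penetration_loss(body_verts, sphere_center, sphere_radius):
    """Penalize body vertices that fall inside a spherical object proxy.

    Toy stand-in for a signed-distance-field term; the real pipeline uses
    the actual object mesh plus contact terms and pose priors.
    """
    d = np.linalg.norm(body_verts - sphere_center, axis=1)
    depth = np.maximum(sphere_radius - d, 0.0)  # penetration depth per vertex
    return float((depth ** 2).sum())

verts = np.array([[0.0, 0.0, 0.0],   # deep inside the sphere
                  [0.9, 0.0, 0.0],   # slightly inside
                  [2.0, 0.0, 0.0]])  # outside: contributes nothing
loss = penetration_loss(verts, sphere_center=np.zeros(3), sphere_radius=1.0)
print(round(loss, 4))  # 1.01
```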
Optional: Unify the motion to SMPL-H
- Data

  Register on the SMPL-X website and go to the downloads section to get the correspondences and sample data by clicking on the "Model correspondences" button. Create a folder named transfer_data and extract the downloaded zip there. You should then have the following folder structure:

  process/smpl_conversion/transfer_data
  ├── meshes
  │   ├── smpl
  │   └── smplx
  ├── smpl2smplh_def_transfer.pkl
  ├── smpl2smplx_deftrafo_setup.pkl
  ├── smplh2smpl_def_transfer.pkl
  ├── smplh2smplx_deftrafo_setup.pkl
  ├── smplx2smpl_deftrafo_setup.pkl
  ├── smplx2smplh_deftrafo_setup.pkl
  └── smplx_mask_ids.npy
- Unify the SMPL representation:

  cd ./process/smpl_conversion
  python -m transfer_model --exp-cfg config_files/smplx2smplh.yaml --dataset grab

  --dataset: dataset in [grab, omomo, chairs, intercap]

  We adapt the SMPL conversion code from https://github.com/vchoutas/smplx.git; special thanks to them!
To load and explore our data, please refer to the demo notebook.
This pipeline depends on the requirements listed in the InterMimic project. Please make sure all dependencies are installed before running the script.
After completing the data preparation steps above, run the following to generate the simulation assets:
cd simulation
python interact2mimic.py --dataset_name [dataset]

After processing, the generated files will be organized as follows:
- Motion files (.pt) are stored in simulation/intermimic/InterAct/{dataset}
- SMPL humanoid files (.xml) are stored in simulation/intermimic/data/assets/{model_type}
- Object files (.urdf) are stored in simulation/intermimic/data/assets/objects/{dataset}
For details on data loading, replaying, and training with the processed data, please refer to the InterMimic repository. We adapt the conversion code from PHC, special thanks to them!
Additional dependency: pointnet2_ops is required for this module. Run bash install_pointnet2_ops.sh from the project root to install it.
Prepare
Download the pretrained model and evaluator checkpoints:
Training
To train on our benchmark, execute the following steps:
cd text2interaction
python -m train.hoi_diff --save_dir ./save/t2m_interact --dataset interact
Evaluation
To evaluate on our benchmark, execute the following steps:
- Evaluate on the marker representation:

  cd text2interaction
  bash ./scripts/eval.sh
- Evaluate on the marker representation with contact guidance:

  cd text2interaction
  bash ./scripts/eval_wguide.sh
Inference
To run inference with the trained model, execute the following steps:
- Inference with contact guidance:

  cd text2interaction
  bash ./scripts/run_sample_guide_contact.sh
- Inference without contact guidance:

  cd text2interaction
  bash ./scripts/run_sample_nonguide.sh
Training
To train on our benchmark, execute the following steps:
cd object2human
bash ./scripts/Train_markerContact_VecDist.sh
Evaluation
To evaluate on our benchmark, execute the following steps:
cd object2human
bash ./scripts/Eval.sh
To visualize the dataset, execute the following steps:
- Run the visualization script:

  python visualization/visualize.py [dataset_name]
Replace [dataset_name] with one of the following: behave, neuraldome, intercap, omomo, grab, imhd, chairs.
- To visualize markers, run:

  python visualization/visualize_markers.py
If you find this repository useful for your work, please cite:
@inproceedings{xu2025interact,
title = {{InterAct}: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation},
author = {Xu, Sirui and Li, Dongting and Zhang, Yucheng and Xu, Xiyan and Long, Qi and Wang, Ziyin and Lu, Yunzhi and Dong, Shuchang and Jiang, Hezi and Gupta, Akshat and Wang, Yu-Xiong and Gui, Liang-Yan},
booktitle = {CVPR},
year = {2025},
}

Please also consider citing the specific sub-dataset you used from InterAct as follows:
@inproceedings{taheri2020grab,
title = {{GRAB}: A Dataset of Whole-Body Human Grasping of Objects},
author = {Taheri, Omid and Ghorbani, Nima and Black, Michael J. and Tzionas, Dimitrios},
booktitle = {ECCV},
year = {2020},
}
@inproceedings{brahmbhatt2019contactdb,
title = {{ContactDB}: Analyzing and Predicting Grasp Contact via Thermal Imaging},
author = {Brahmbhatt, Samarth and Ham, Cusuh and Kemp, Charles C. and Hays, James},
booktitle = {CVPR},
year = {2019},
}
@inproceedings{bhatnagar2022behave,
title = {{BEHAVE}: Dataset and Method for Tracking Human Object Interactions},
author = {Bhatnagar, Bharat Lal and Xie, Xianghui and Petrov, Ilya and Sminchisescu, Cristian and Theobalt, Christian and Pons-Moll, Gerard},
booktitle = {CVPR},
year = {2022},
}
@article{huang2024intercap,
title = {{InterCap}: Joint Markerless {3D} Tracking of Humans and Objects in Interaction from Multi-view {RGB-D} Images},
author = {Huang, Yinghao and Taheri, Omid and Black, Michael J. and Tzionas, Dimitrios},
journal = {IJCV},
year = {2024}
}
@inproceedings{huang2022intercap,
title = {{InterCap}: {J}oint Markerless {3D} Tracking of Humans and Objects in Interaction},
author = {Huang, Yinghao and Taheri, Omid and Black, Michael J. and Tzionas, Dimitrios},
booktitle = {GCPR},
year = {2022},
}
@inproceedings{jiang2023full,
title = {Full-body articulated human-object interaction},
author = {Jiang, Nan and Liu, Tengyu and Cao, Zhexuan and Cui, Jieming and Zhang, Zhiyuan and Chen, Yixin and Wang, He and Zhu, Yixin and Huang, Siyuan},
booktitle = {ICCV},
year = {2023}
}
@inproceedings{zhang2023neuraldome,
title = {{NeuralDome}: A Neural Modeling Pipeline on Multi-View Human-Object Interactions},
author = {Zhang, Juze and Luo, Haimin and Yang, Hongdi and Xu, Xinru and Wu, Qianyang and Shi, Ye and Yu, Jingyi and Xu, Lan and Wang, Jingya},
booktitle = {CVPR},
year = {2023},
}
@article{li2023object,
title = {Object Motion Guided Human Motion Synthesis},
author = {Li, Jiaman and Wu, Jiajun and Liu, C Karen},
journal = {ACM Trans. Graph.},
year = {2023}
}
@inproceedings{zhao2024imhoi,
author = {Zhao, Chengfeng and Zhang, Juze and Du, Jiashen and Shan, Ziwei and Wang, Junye and Yu, Jingyi and Wang, Jingya and Xu, Lan},
title = {{I'M HOI}: Inertia-aware Monocular Capture of 3D Human-Object Interactions},
booktitle = {CVPR},
year = {2024},
}
@inproceedings{kim2025parahome,
title = {{ParaHome}: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions},
author = {Kim, Jeonghwan and Kim, Jisoo and Na, Jeonghyeon and Joo, Hanbyul},
booktitle = {CVPR},
year = {2025}
}
@inproceedings{fan2023arctic,
title = {{ARCTIC}: A Dataset for Dexterous Bimanual Hand-Object Manipulation},
author = {Fan, Zicong and Taheri, Omid and Tzionas, Dimitrios and Kocabas, Muhammed and Kaufmann, Manuel and Black, Michael J. and Hilliges, Otmar},
booktitle = {CVPR},
year = {2023}
}
