Open Set Video HOI detection from Action-centric Chain-of-Look Prompting.

Code for the paper Open Set Video HOI detection from Action-centric Chain-of-Look Prompting.
Nan Xi, Jingjing Meng, Junsong Yuan, ICCV 2023.

Paper Link: link

Human-Object Interaction (HOI) detection is essential for understanding and modeling real-world events. Existing works on HOI detection mainly focus on static images and a closed setting, where all HOI classes are provided in the training set. In comparison, detecting HOIs in videos in open set scenarios is more challenging. First, in the open set setting, HOI detectors are expected to generalize well enough to recognize unseen HOIs that are not included in the training data. Second, accurately capturing temporal context from videos is difficult, yet it is crucial for detecting temporally dependent actions such as open, close, pull, and push. To this end, we propose ACoLP, a model of Action-centric Chain-of-Look Prompting for open set video HOI detection. ACoLP regards actions as the carriers of semantics in videos and uses them to capture the essential semantic information across frames. To make the model generalize to unseen classes, and inspired by chain-of-thought prompting in natural language processing, we introduce a chain-of-look prompting scheme that decomposes prompt generation from a large-scale vision-language model into a series of intermediate visual reasoning steps. Consequently, our model captures the complex visual reasoning processes underlying HOI events in videos, providing essential guidance for detecting unseen classes. Extensive experiments on two video HOI datasets, VidHOI and CAD120, demonstrate that ACoLP achieves competitive performance compared with state-of-the-art methods in the conventional closed setting, and outperforms existing methods by a large margin in the open set setting.
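To make the closed versus open set distinction above concrete, the following is a minimal sketch of how test-time HOI classes could be partitioned into seen and unseen relative to a training set. It is an illustration only and is not taken from this repository: the function name `split_open_set`, the `(action, object)` tuple representation, and the toy labels are all assumptions, and the actual split protocol used for VidHOI and CAD120 may differ.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: an HOI class is an (action, object) pair.
HOI = Tuple[str, str]


def split_open_set(
    train_labels: Iterable[HOI], test_labels: Iterable[HOI]
) -> Tuple[List[HOI], List[HOI]]:
    """Partition test-time HOI classes into seen and unseen w.r.t. training."""
    seen_in_train: Set[HOI] = set(train_labels)
    # Deduplicate and sort so the result is deterministic.
    test_classes = sorted(set(test_labels))
    seen = [c for c in test_classes if c in seen_in_train]
    unseen = [c for c in test_classes if c not in seen_in_train]
    return seen, unseen


# Toy labels (assumed for illustration, not drawn from either dataset).
train = [("open", "door"), ("push", "chair"), ("hold", "cup")]
test = [("open", "door"), ("pull", "door"), ("close", "microwave"), ("hold", "cup")]

seen, unseen = split_open_set(train, test)
print("seen:", seen)
print("unseen:", unseen)
```

In a closed setting the unseen list would be empty; in the open set setting a detector is additionally scored on the unseen classes, which it never observed during training.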

Citation

If you find this repository helpful, please cite it as

@InProceedings{Xi_2023_ICCV,
    author    = {Xi, Nan and Meng, Jingjing and Yuan, Junsong},
    title     = {Open Set Video HOI detection from Action-Centric Chain-of-Look Prompting},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {3079-3089}
}
