The ActivityNet-QA dataset contains 58,000 human-annotated QA pairs on 5,800 videos derived from the popular ActivityNet dataset. It provides a benchmark for testing the performance of VideoQA models on long-term spatio-temporal reasoning.
The dataset folder contains the JSON files for the questions and answers. We do not maintain the raw video files; they can be obtained from the official website: ActivityNet 200 (v1.3).
We provide a simple script and an example prediction JSON file under the evaluation folder to calculate the accuracy per question type:
python evaluation/eval.py --pred_file evaluation/pred_val_example.json --gt_file dataset/val_a.json
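To illustrate what a per-type accuracy computation might look like, here is a minimal sketch. It is not the code in evaluation/eval.py. It assumes, hypothetically, that both files hold a list of records, that ground-truth records carry `question_id`, `answer`, and `type` fields, that prediction records carry `question_id` and `answer`, and that answers are compared by exact string match. Check the actual JSON files before relying on any of these names.

```python
from collections import defaultdict


def accuracy_per_type(predictions, ground_truth):
    """Return {question_type: accuracy} using exact-match scoring.

    The field names ("question_id", "answer", "type") are assumptions
    made for illustration, not a documented schema.
    """
    # Index predictions by question id for constant-time lookup.
    pred_by_id = {p["question_id"]: p["answer"] for p in predictions}

    correct = defaultdict(int)
    total = defaultdict(int)
    for item in ground_truth:
        qtype = item["type"]
        total[qtype] += 1
        # A question with no prediction counts as incorrect.
        if pred_by_id.get(item["question_id"]) == item["answer"]:
            correct[qtype] += 1

    return {t: correct[t] / total[t] for t in total}


if __name__ == "__main__":
    # Toy records standing in for the real JSON contents.
    gt = [
        {"question_id": "q1", "answer": "yes", "type": 0},
        {"question_id": "q2", "answer": "no", "type": 0},
        {"question_id": "q3", "answer": "red", "type": 1},
    ]
    pred = [
        {"question_id": "q1", "answer": "yes"},
        {"question_id": "q2", "answer": "yes"},
        {"question_id": "q3", "answer": "red"},
    ]
    print(accuracy_per_type(pred, gt))
```

With real data, the two lists would be loaded with `json.load` from the prediction and ground-truth files passed on the command line.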
The code and the dataset are distributed under the MIT License. They are only allowed for non-commercial use.
If the project is helpful for your research, please cite:
@inproceedings{yu2019activityqa,
author = {Yu, Zhou and Xu, Dejing and Yu, Jun and Yu, Ting and Zhao, Zhou and Zhuang, Yueting and Tao, Dacheng},
title = {ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering},
booktitle = {AAAI},
pages = {9127--9134},
year = {2019}
}