QFVS

📝 QFVS Data Preparation

The QFVS dataset uses the same raw videos as the UT Egocentric (UT Ego) dataset. We download the UT Ego raw videos and the QFVS annotations. For quick-start and easy access, we provide the preprocessed videos and annotations:

mkdir Datasets && cd Datasets
wget https://www.cis.jhu.edu/~shraman/EgoVLPv2/datasets/QFVS.tgz
tar -xvzf QFVS.tgz && rm QFVS.tgz
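As a quick sanity check after extraction, the following sketch lists the extracted contents. It assumes the archive unpacks into a Datasets/QFVS/ directory, which is not confirmed by this README; adjust the path to whatever tar produced.

# Hypothetical sanity check: list what QFVS.tgz extracted.
# Assumption: the archive unpacks into Datasets/QFVS/ (name not confirmed here).
from pathlib import Path

root = Path("Datasets/QFVS")
if not root.is_dir():
    raise SystemExit(f"{root} not found; check the extraction path.")
for entry in sorted(root.iterdir()):
    kind = "dir " if entry.is_dir() else "file"
    print(f"{kind}  {entry.name}")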

📊 Results

Method      Video-1   Video-2   Video-3   Video-4   Average
EgoVLPv2    53.30     54.13     62.64     38.25     52.08

🎯 Fine-tuning on QFVS

Download the EgoVLPv2 checkpoint and set load_checkpoint in qfvs.json to its path. The QFVS dataset contains only 4 videos: use 3 videos for training and the remaining one for evaluation. Perform 4 separate training runs, each evaluating on a different held-out video. Change device_ids based on the available GPUs.
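If you prefer to set the checkpoint path programmatically, a minimal sketch is shown below. It assumes load_checkpoint is a top-level key in qfvs.json (the key nesting is not specified in this README), and the checkpoint path is a placeholder to replace with your own.

# Hypothetical helper: point load_checkpoint in qfvs.json at the downloaded checkpoint.
# Assumptions: load_checkpoint is a top-level key; the path below is a placeholder.
import json

CONFIG = "qfvs.json"
CHECKPOINT = "checkpoints/EgoVLPv2.pth"  # placeholder: path to the downloaded EgoVLPv2 checkpoint

with open(CONFIG) as f:
    cfg = json.load(f)
cfg["load_checkpoint"] = CHECKPOINT
with open(CONFIG, "w") as f:
    json.dump(cfg, f, indent=4)
print(f"load_checkpoint -> {cfg['load_checkpoint']}")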

mkdir multimodal_features/
python main.py --train_videos 2,3,4 --test_video 1 --cuda_base cuda:0 --device_ids 0,1,2,3
python main.py --train_videos 1,3,4 --test_video 2 --cuda_base cuda:0 --device_ids 0,1,2,3
python main.py --train_videos 1,2,4 --test_video 3 --cuda_base cuda:0 --device_ids 0,1,2,3
python main.py --train_videos 1,2,3 --test_video 4 --cuda_base cuda:0 --device_ids 0,1,2,3
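The four runs above can also be launched from a small driver script. The sketch below simply shells out to main.py with the same flags as the commands above; the cuda_base and device_ids values are the placeholder GPU settings used in this README.

# Hypothetical driver for the four leave-one-out runs above: each run trains on
# three videos and evaluates on the held-out one, mirroring the commands in this README.
# cuda_base/device_ids are placeholders; adjust them to your available GPUs.
import subprocess

VIDEOS = [1, 2, 3, 4]

for test_video in VIDEOS:
    train_videos = ",".join(str(v) for v in VIDEOS if v != test_video)
    cmd = [
        "python", "main.py",
        "--train_videos", train_videos,
        "--test_video", str(test_video),
        "--cuda_base", "cuda:0",
        "--device_ids", "0,1,2,3",
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)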

Due to the tiny training set, results on the QFVS dataset vary significantly across runs. We performed multiple runs and report the best results.

🙏 Acknowledgement

The QFVS implementation partially uses the VASNet codebase.