
Downloading the CC-Neg Dataset

CC-Neg: Images

The images for CC-Neg come from the ImageLabels split of CC-3M, which we prepare and provide here. Find the compressed file ccneg_images.zip in this directory, download it, and extract the images. After extraction, verify that the ccneg_images folder has the following structure:

ccneg_images
|___ cc3m_subset_images_extracted_final
     |___ image1.jpg
     |___ image2.jpg
     ...
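
The layout above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `count_ccneg_images` is hypothetical, and it assumes the images carry a `.jpg` extension as in the listing above.

```python
from pathlib import Path

# Subfolder name taken from the directory listing in this README.
IMAGE_SUBDIR = "cc3m_subset_images_extracted_final"


def count_ccneg_images(images_root):
    """Return the number of .jpg files found under the expected subfolder.

    Raises FileNotFoundError if the extracted folder does not have the
    expected layout (hypothetical helper, for illustration only).
    """
    image_dir = Path(images_root) / IMAGE_SUBDIR
    if not image_dir.is_dir():
        raise FileNotFoundError(f"expected folder not found: {image_dir}")
    # Count only regular files with a .jpg extension (case-insensitive).
    return sum(
        1
        for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".jpg"
    )
```

Calling `count_ccneg_images("ccneg_images")` from this directory should return a positive count once the archive has been extracted correctly.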

CC-Neg: Annotations

The annotations, which contain the true caption and the negated (false) caption for each image in CC-Neg, can be downloaded from here. This file, named ccneg_preprocessed.pt, must be downloaded into this directory. The helper for using distractor images during fine-tuning, named distractor_image_mapping.pt, is provided here.
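
To confirm that both files ended up in the right place, a check along these lines may help. It is a sketch under assumptions: the helper name `missing_annotation_files` is hypothetical, and only the two file names mentioned above are taken from this README.

```python
from pathlib import Path

# File names as given in this README.
EXPECTED_FILES = ("ccneg_preprocessed.pt", "distractor_image_mapping.pt")


def missing_annotation_files(dataset_dir):
    """Return the expected file names that are absent from dataset_dir.

    An empty list means both files are present (hypothetical helper,
    for illustration only).
    """
    root = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Running `missing_annotation_files(".")` from this directory should return an empty list once both downloads are in place. The `.pt` extension suggests the files were serialized with PyTorch, so `torch.load` would presumably open them, but their internal layout is not described here.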

Paths for Data Configs

The path to this directory is specified in the configs in src/configs/__init__.py, which are read by the code in the src/data folder. There, src/data/evaluation_datasets.py and src/data/finetuning_datasets.py use the configs to load the datasets. To fine-tune CLIP into CoN-CLIP, we use MS-COCO along with CC-Neg. The root folders for both datasets must be specified in src/configs/__init__.py. Be sure to check this before running any code that uses CC-Neg.