This repository contains the official source code for our paper:
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim, Namyup Kim, and Suha Kwak
POSTECH CSE
CVPR (Highlight), Vancouver, 2023.
Parts of our code are adapted from the following repositories:
- https://github.com/yalesong/pvse
- https://github.com/fartashf/vsepp
- https://github.com/lucidrains/perceiver-pytorch
data
├─ coco_download.sh
├─ coco # can be downloaded with coco_download.sh
│  ├─ images
│  │  └─ ......
│  └─ annotations
│     └─ ......
├─ coco_butd
│  └─ precomp
│     ├─ train_ids.txt
│     ├─ train_caps.txt
│     └─ ......
├─ f30k
│  ├─ images
│  │  └─ ......
│  ├─ dataset_flickr30k.json
│  └─ ......
└─ f30k_butd
   └─ precomp
      ├─ train_ids.txt
      ├─ train_caps.txt
      └─ ......

vocab # included in this repo
├─ coco_butd_vocab.pkl
└─ ......
- coco_butd and f30k_butd: datasets used for the Faster R-CNN image backbone. We use the pre-computed features provided by SCAN, which can be downloaded via https://github.com/kuanghuei/SCAN#download-data.
- coco and f30k: datasets used for the CNN backbones. Please refer to the COCO download script and the Flickr30K website (and the Flickr30K .json) to download the images and captions.
Note: Downloaded datasets should be placed according to the directory structure presented above.
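As a concrete, hedged example of arranging the data: the sketch below runs the provided COCO download script and unpacks the SCAN pre-computed features into place. The archive name (data.zip) and the inner folder names (coco_precomp, f30k_precomp) follow the SCAN repository's instructions and are assumptions here, not something this repo prescribes; adjust them to match what you actually downloaded.

```bash
# Sketch only; run from the repository root and adjust paths as needed.
cd data
sh coco_download.sh            # fetches COCO images and annotations

# Archive/folder names below are assumptions based on the SCAN repo.
unzip /path/to/data.zip -d scan_data
mkdir -p coco_butd f30k_butd
mv scan_data/coco_precomp coco_butd/precomp
mv scan_data/f30k_precomp f30k_butd/precomp
```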
You can install the required packages using conda:
conda create --name <env> --file requirements.txt
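A quick sanity check after creating the environment can save time. This assumes PyTorch is among the pinned requirements (which this codebase needs) and that `<env>` is whatever name you chose above:

```bash
conda activate <env>
# Prints the installed PyTorch version and whether CUDA is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```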
To train and evaluate on COCO, run:
sh train_eval_coco.sh
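If you need to restrict training to a particular GPU, the standard CUDA environment variable works; this is a generic sketch, not an option documented by the script itself:

```bash
# CUDA_VISIBLE_DEVICES is a generic CUDA/PyTorch mechanism, not a flag
# defined by train_eval_coco.sh; GPU index 0 is just an example.
CUDA_VISIBLE_DEVICES=0 sh train_eval_coco.sh
```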