"a future in the past" -- Assassin's Creed
This repository contains the reference code for the paper Efficient Modeling of Future Context for Image Captioning. In this paper, we aim to use a mask-based non-autoregressive image captioning (NAIC) model to improve a conventional image captioning model through dynamic distribution calibration. Since the NAIC model is only applied to calibrate the generated sentence, its length predictor is dropped.
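As a rough illustration of the idea (not the repository's actual implementation), the calibration can be thought of as a KL term that pulls the autoregressive captioner's token distributions toward those of the NAIC teacher, which has access to future context. All function and variable names, and the exact loss weighting, are assumptions for this sketch:

```python
# Minimal sketch (assumed names, not the repository's actual code):
# calibrating the autoregressive captioner with a mask-based NAIC teacher.
import torch
import torch.nn.functional as F

def calibration_loss(student_logits, teacher_logits, captions, pad_id, alpha=0.5, tau=1.0):
    """student_logits, teacher_logits: (batch, seq_len, vocab); captions: (batch, seq_len)."""
    # Standard cross-entropy against the ground-truth caption.
    ce = F.cross_entropy(student_logits.transpose(1, 2), captions, ignore_index=pad_id)
    # KL term pulling the autoregressive distribution toward the NAIC teacher's,
    # whose predictions are conditioned on future context.
    kl = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits.detach() / tau, dim=-1),
        reduction="none",
    ).sum(-1)
    mask = (captions != pad_id).float()
    kl = (kl * mask).sum() / mask.sum()
    return (1 - alpha) * ce + alpha * (tau ** 2) * kl
```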
torch==1.10.1
transformers==4.11.3
clip
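These dependencies can be installed, for example, with pip install torch==1.10.1 transformers==4.11.3 followed by pip install git+https://github.com/openai/CLIP.git (the official CLIP package from OpenAI).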
To run the code, annotations and detection features for the COCO dataset are needed. Please download the annotations file annotations.zip and extract it. Image representations are first computed with the pre-trained model provided by CLIP.
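As a minimal sketch of this feature-extraction step (the CLIP backbone and output handling below are assumptions; the repository may use a different variant), images can be encoded as follows:

```python
# Sketch of extracting image representations with a pre-trained CLIP model.
# The ViT-B/32 backbone is an assumption made for this example.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def encode_image(path):
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        features = model.encode_image(image)  # (1, 512) for ViT-B/32
    return features.float().cpu()
```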
First, run python train_NAIC.py
to obtain the non-autoregressive image captioning model, which serves as the teacher model. Then, run python train_combine.py
to perform distribution calibration of the conventional transformer image captioning model.
Training arguments are as follows:
Argument | Possible values |
---|---|
--batch_size | Batch size (default: 10) |
--workers | Number of workers (default: 0) |
--warmup | Warmup value for learning rate scheduling (default: 10000) |
--resume_last | If used, the training will be resumed from the last checkpoint. |
--data_path | Path to COCO dataset file |
--annotation_folder | Path to folder with COCO annotations |
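For example, a training run might look like the following, where the paths are placeholders for your local setup:
python train_combine.py --batch_size 10 --workers 0 --data_path /path/to/coco_features --annotation_folder /path/to/annotations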
To reproduce the results reported in our paper, download the pretrained model file from Google Drive and place it in the ckpt folder.
Run python inference.py
using the following arguments:
Argument | Possible values |
---|---|
--batch_size | Batch size (default: 10) |
--workers | Number of workers (default: 0) |
--data_path | Path to COCO dataset file |
--annotation_folder | Path to folder with COCO annotations |
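For example, with placeholder paths:
python inference.py --batch_size 10 --data_path /path/to/coco_features --annotation_folder /path/to/annotations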
This repository is based on M2T and Huggingface, and you may refer to them for more details about the code.