guptakhil/show-tell

PyTorch implementation of the paper "Show and Tell: A Neural Image Caption Generator" by Vinyals et al. (CVPR 2015).

Done as part of the course project for IE-534/CS-547 (Deep Learning).

Contributors:

  • Akhil Gupta
  • Gregory Romanchek
  • Heba Flemban
  • Moitreya Chatterjee
  • Yan Zhang
  • Zihao Yang

Dependencies (with version numbers):

  • pycocotools 2.0.0
  • python 3.7.5
  • python-dateutil 2.8.1
  • pytorch 1.1.0
  • qt 5.9.7
  • readline 7.0
  • scikit-learn 0.21.3
  • scipy 1.3.1
  • six 1.13.0
  • torchvision 0.3.0
  • nltk
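
A minimal environment-setup sketch, assuming a fresh conda environment with pip packages (note that the PyPI package for the pytorch entry above is named torch; qt and readline are pulled in by conda itself, and the exact install channel is an assumption, so adjust to your setup):

  $ conda create -n show-tell python=3.7.5
  $ conda activate show-tell
  $ pip install torch==1.1.0 torchvision==0.3.0 pycocotools==2.0.0 scikit-learn==0.21.3 scipy==1.3.1 six==1.13.0 python-dateutil==2.8.1 nltk
  $ python -c "import nltk; nltk.download('punkt')"   # tokenizer data; whether this repo needs it is an assumption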

Dataset Pre-processing

  • Clone this repository
  • Download the COCO training and validation image sets, along with the annotation files. Extract them into a folder called COCO; this folder is the data folder passed via --data_dir (a download sketch follows).
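
A minimal download sketch, assuming the 2014 COCO split (the split year and the exact layout expected by the data loader are assumptions):

  $ mkdir COCO && cd COCO
  $ wget http://images.cocodataset.org/zips/train2014.zip
  $ wget http://images.cocodataset.org/zips/val2014.zip
  $ wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
  $ unzip train2014.zip && unzip val2014.zip && unzip annotations_trainval2014.zip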

Execution Instructions:

The code has three main components. We provide instructions for running each of them here; the full set of arguments may be obtained from the argparse setup in the main files. A minimal sketch of the underlying encoder-decoder architecture follows these instructions.

  • The GRU-based model:

    • Training Regime:
      $ python main.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --embedding_length 512 --optimizer_type Adam --lr 0.0001 --num_epochs 100 --shuffle True --num_workers 16 --is_training 1 --is_testing 0
    • Testing Regime:
      $ python main.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --load_model_test <model_name> --num_workers 16 --is_training 0 --is_testing 1
  • The LSTM-based model (inside the LSTM folder):

    • Training Regime:
      $ python main_lstm.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --embedding_length 512 --optimizer_type Adam --lr 0.0001 --num_epochs 100 --shuffle True --num_workers 16 --is_training 1 --is_testing 0
    • Testing Regime:
      $ python main_lstm.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --load_model_test <model_name> --num_workers 16 --is_training 0 --is_testing 1
  • The Attention-based model (inside the Attention folder):

    • Training Regime (Attention Model with GRU decoder):
      $ python main_attn.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --embedding_length 512 --optimizer_type Adam --lr 0.0001 --num_epochs 100 --alpha_c 1.0 --shuffle True --num_workers 16 --is_training 1 --is_testing 0

    • Testing Regime:
      $ python main_attn.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --load_model_test <model_name> --num_workers 16 --is_training 0 --is_testing 1

    • Training Regime (Attention Model with LSTM decoder):
      $ python main_attn_LSTM.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --embedding_length 512 --optimizer_type Adam --lr 0.0001 --num_epochs 100 --alpha_c 1.0 --shuffle True --num_workers 16 --is_training 1 --is_testing 0

    • Testing Regime:
      $ python main_attn_LSTM.py --data_dir <path/to/dataset/root> --output_dir <path/to/output/directory> --load_model_test <model_name> --num_workers 16 --is_training 0 --is_testing 1
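
For orientation, here is a minimal sketch of the encoder-decoder structure behind the GRU/LSTM variants above, following the Show and Tell paper; the class names, the ResNet-18 backbone, and the hyperparameter defaults are illustrative assumptions, not this repository's actual code. (In the attention commands, --alpha_c presumably weights an attention-regularization term in the loss, as in "Show, Attend and Tell".)

  import torch
  import torch.nn as nn
  import torchvision.models as models

  class Encoder(nn.Module):
      """CNN encoder: a pretrained backbone whose pooled feature vector is
      projected to the word-embedding size (512 in the commands above)."""
      def __init__(self, embedding_length=512):
          super().__init__()
          resnet = models.resnet18(pretrained=True)  # backbone choice is an assumption
          self.backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop the classifier head
          self.project = nn.Linear(resnet.fc.in_features, embedding_length)

      def forward(self, images):
          feats = self.backbone(images).flatten(1)   # (B, 512)
          return self.project(feats)                 # (B, embedding_length)

  class Decoder(nn.Module):
      """RNN decoder: the image embedding is fed as the first input step,
      then caption tokens are predicted one by one; swap nn.GRU for nn.LSTM
      to get the LSTM variant."""
      def __init__(self, vocab_size, embedding_length=512, hidden_size=512):
          super().__init__()
          self.embed = nn.Embedding(vocab_size, embedding_length)
          self.rnn = nn.GRU(embedding_length, hidden_size, batch_first=True)
          self.fc = nn.Linear(hidden_size, vocab_size)

      def forward(self, image_emb, captions):
          # Prepend the image embedding to the embedded caption tokens.
          inputs = torch.cat([image_emb.unsqueeze(1), self.embed(captions)], dim=1)
          out, _ = self.rnn(inputs)                  # (B, T+1, hidden_size)
          return self.fc(out)                        # (B, T+1, vocab_size)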
