Image Caption Generator

Overview

This project is an Image Caption Generator that uses a pretrained ResNet-50 model to extract features from images and an LSTM (Long Short-Term Memory) network to generate captions for them. The model is trained on the COCO 2017 dataset, which contains a diverse collection of images with corresponding captions. The entire pipeline is deployed as a Streamlit app, letting users upload an image and receive a generated caption.
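For orientation, the encoder-decoder idea can be sketched in PyTorch as below. The class and parameter names (EncoderCNN, DecoderRNN, embed_size, hidden_size) are illustrative assumptions, not the exact definitions in imgcaptioning/model.py:

# Minimal sketch of the encoder-decoder setup (names are assumptions,
# not the actual classes in imgcaptioning/model.py).
import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """Pretrained ResNet-50 with the classifier head replaced by a linear embedding."""
    def __init__(self, embed_size):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.resnet = nn.Sequential(*list(resnet.children())[:-1])  # drop the final FC layer
        for p in self.resnet.parameters():
            p.requires_grad = False  # keep the convolutional trunk frozen
        self.embed = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        features = self.resnet(images).flatten(1)  # (batch, 2048)
        return self.embed(features)                # (batch, embed_size)

class DecoderRNN(nn.Module):
    """LSTM that consumes the image embedding as its first input step."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Teacher forcing: the image feature is step 0, followed by the
        # embedded caption tokens (all but the last).
        embeddings = self.embed(captions[:, :-1])
        inputs = torch.cat([features.unsqueeze(1), embeddings], dim=1)
        hiddens, _ = self.lstm(inputs)
        return self.fc(hiddens)  # (batch, seq_len, vocab_size)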

Table of Contents

  1. Getting Started
  2. Training
  3. Running Inference
  4. Project Structure

Getting Started

Prerequisites

Create a virtual environment in Python using the venv module.

  1. Open a terminal or command prompt.

  2. Navigate to the directory where you want to create the virtual environment. You can use the cd command to change your directory. For example:

    cd path/to/your/desired/directory
    
  3. Once you are in the desired directory, run the following command to create a virtual environment:

    On macOS and Linux:

    python3 -m venv venv_name
    

    On Windows (using Command Prompt):

    python -m venv venv_name
    

    Replace venv_name with the name you want to give to your virtual environment. For example:

    python3 -m venv myenv
    
  4. Activate the virtual environment:

    On macOS and Linux:

    source venv_name/bin/activate
    

    On Windows (using Command Prompt):

    venv_name\Scripts\activate
    

    After activation, your command prompt or terminal will show the virtual environment name, indicating that you are now working within the virtual environment.

Clone the Repository

Clone this repository to your local machine:

git clone https://github.com/KBVijayVarma/image-captioning.git
cd image-captioning

Installing Dependencies

Install the required packages using pip inside the virtual environment:

pip install -r requirements.txt

Training

Dataset Preparation

Before training the model, you need to prepare the COCO 2017 dataset.

Download the following archives from the COCO website:

  1. 2017 Train images [118K/18GB]

  2. 2017 Val images [5K/1GB]

  3. 2017 Test images [41K/6GB]

  4. 2017 Train/Val annotations [241MB]

  5. 2017 Testing Image info [1MB]

Unzip the downloaded files into a folder named coco_dataset. Refer to the Project Structure section below for the expected layout.
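On the command line, the archives can be fetched and unpacked roughly as follows (the URLs are the standard download links published on the COCO website; the full download is roughly 25GB):

mkdir coco_dataset && cd coco_dataset
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/image_info_test2017.zip
unzip '*.zip' && rm *.zip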

Training the Model

  • Create a models folder in the working directory.

  • Run training.ipynb to train the image captioning model (the core training step is sketched after this list).

  • Rename the final pickle (.pkl) files in the models folder to encoder.pkl and decoder.pkl.
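Each training step pairs the frozen ResNet-50 features with teacher-forced caption tokens and minimizes cross-entropy over the vocabulary. A minimal sketch, assuming the EncoderCNN/DecoderRNN classes from the Overview and a data loader yielding batches of image tensors and padded caption token IDs:

# Sketch of the core training loop (assumes the classes sketched in the
# Overview; the pad index 0 and the pickle filenames are assumptions).
import pickle
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(ignore_index=0)  # assume 0 = <pad>
params = list(decoder.parameters()) + list(encoder.embed.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

for images, captions in data_loader:
    features = encoder(images)             # (batch, embed_size)
    outputs = decoder(features, captions)  # (batch, seq_len, vocab_size)
    # Flatten predictions and targets so the loss runs over every token.
    loss = criterion(outputs.reshape(-1, outputs.size(-1)), captions.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Persist the trained modules; the app expects models/encoder.pkl
# and models/decoder.pkl.
with open("models/encoder.pkl", "wb") as f:
    pickle.dump(encoder, f)
with open("models/decoder.pkl", "wb") as f:
    pickle.dump(decoder, f)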

Running Inference

Launching the Streamlit App

To use the Image Caption Generator, launch the Streamlit app:

streamlit run app.py

Image Input

In the Streamlit app, provide the input image through any of the following options; a minimal sketch of the wiring follows the list:

  • URL of the Image
  • File Uploader
  • Camera
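The three input paths can be wired with standard Streamlit widgets along these lines (illustrative only; the actual wiring lives in app.py):

# Sketch of the three input paths using standard Streamlit widgets.
import io
import requests
import streamlit as st
from PIL import Image

image = None
url = st.text_input("Image URL")
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
snapshot = st.camera_input("Take a photo")

if url:
    image = Image.open(io.BytesIO(requests.get(url, timeout=10).content))
elif uploaded is not None:
    image = Image.open(uploaded)
elif snapshot is not None:
    image = Image.open(snapshot)

if image is not None:
    st.image(image, caption="Input image")
    # ...run the encoder/decoder here to produce and display the caption...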

Project Structure

image-captioning/
│
├── imgcaptioning/
│   ├── coco_dataset.py
│   ├── data_loader.py
│   ├── model.py
│   ├── inference_pipeline.py
│   ├── tokenizer.py
│   ├── utils.py
│   └── vocabulary.py
│
├── models/
│   ├── encoder.pkl
│   └── decoder.pkl
│
├── coco_dataset/
│   ├── annotations/
│   │   ├── captions_train2017.json
│   │   ├── captions_val2017.json
│   │   ├── image_info_test-dev2017.json
│   │   ├── image_info_test2017.json
│   │   ├── instances_train2017.json
│   │   ├── instances_val2017.json
│   │   ├── person_keypoints_train2017.json
│   │   └── person_keypoints_val2017.json
│   │
│   ├── train2017/
│   ├── test2017/
│   └── val2017/
│
├── .gitignore
├── app.py
├── requirements.txt
├── training.ipynb
└── vocab.json
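To see how the pieces fit at inference time, the sketch below shows a hypothetical greedy decoding loop over the DecoderRNN from the Overview. The generate_caption helper and the vocab.idx2word lookup are assumptions; the repository's actual logic lives in imgcaptioning/inference_pipeline.py and imgcaptioning/vocabulary.py.

# Hypothetical greedy decoding loop; names and interfaces are assumptions.
import torch

def generate_caption(encoder, decoder, image_tensor, vocab, max_len=20):
    """Feed the image embedding into the LSTM, then emit one token at a time."""
    with torch.no_grad():
        inputs = encoder(image_tensor.unsqueeze(0)).unsqueeze(1)  # (1, 1, embed_size)
        states = None
        words = []
        for _ in range(max_len):
            hiddens, states = decoder.lstm(inputs, states)
            logits = decoder.fc(hiddens.squeeze(1))
            token = logits.argmax(dim=-1)        # greedy: most likely next word
            word = vocab.idx2word[token.item()]  # assumed vocabulary interface
            if word == "<end>":
                break
            words.append(word)
            inputs = decoder.embed(token).unsqueeze(1)
        return " ".join(words)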