AI-Image-Captioning

An AI-powered image captioning app built with Streamlit, using ViT-GPT2 for caption generation and YOLOv8 for object detection. The app enhances captions by integrating detected objects into the generated text.

🔥 Features

AI-powered image captioning using ViT-GPT2.
Object detection with YOLOv8 to enhance captions.
Dark-themed UI with Streamlit.
Interactive settings for enabling/disabling object detection.
GPU-accelerated inference when a CUDA device is available.
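
As a rough illustration of the CUDA support mentioned above, a device check along these lines is a common pattern with PyTorch. The helper name `pick_device` is hypothetical and is not taken from app.py; the CPU fallback is an assumption about how the app behaves without a GPU.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running inference on: {device}")
```

The returned string can then be passed to `model.to(device)` when loading the captioning model.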

🚀 Demo

1️⃣ Upload an Image

2️⃣ Enable Object Detection and Generate Captions

3️⃣ View Enhanced Caption and Detected Objects

📂 Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/yourusername/AI-Image-Captioning.git
cd AI-Image-Captioning

2️⃣ Create a Virtual Environment (Optional but Recommended)

python -m venv venv
source venv/bin/activate   # On macOS/Linux
venv\Scripts\activate     # On Windows

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Run the Application

streamlit run app.py

🧠 Models Used

1️⃣ ViT-GPT2 (Image Captioning)

Pretrained Model: nlpconnect/vit-gpt2-image-captioning
Task: Generates textual descriptions for input images.
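
A minimal sketch of how this checkpoint can be loaded and run with the Hugging Face transformers library. The function names (`load_captioner`, `tidy_caption`, `generate_caption`) and the generation settings (`max_length=16`, `num_beams=4`) are assumptions for illustration, not necessarily what app.py does.

```python
def load_captioner(model_name: str = "nlpconnect/vit-gpt2-image-captioning",
                   device: str = "cpu"):
    """Load the encoder-decoder model, image processor, and tokenizer."""
    # Heavy imports are kept inside the function so the module imports quickly.
    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

    model = VisionEncoderDecoderModel.from_pretrained(model_name).to(device)
    model.eval()
    processor = ViTImageProcessor.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, processor, tokenizer


def tidy_caption(text: str) -> str:
    """Collapse repeated whitespace and capitalize the first letter."""
    text = " ".join(text.split())
    return text[:1].upper() + text[1:]


def generate_caption(image, model, processor, tokenizer,
                     device: str = "cpu", max_length: int = 16, num_beams: int = 4) -> str:
    """Caption a PIL image; the generation settings are illustrative defaults."""
    import torch

    # The image processor expects RGB input, so convert palette/greyscale images first.
    pixel_values = processor(images=image.convert("RGB"),
                             return_tensors="pt").pixel_values.to(device)
    with torch.no_grad():
        output_ids = model.generate(pixel_values, max_length=max_length, num_beams=num_beams)
    return tidy_caption(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Typical use would be to call `load_captioner()` once at startup and then `generate_caption(Image.open(path), model, processor, tokenizer)` for each upload. The first call downloads the model weights.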

2️⃣ YOLOv8 (Object Detection)

Pretrained Model: yolov8n.pt
Task: Detects objects in the image to enhance captions.
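
The sketch below shows one way detected objects could be folded into a caption. Detection uses the ultralytics package with the `yolov8n.pt` weights named above. The helper names, the 0.5 confidence threshold, and the "Detected objects: ..." sentence format are all assumptions, since the README does not specify how app.py merges the two outputs.

```python
def detect_labels(image, weights: str = "yolov8n.pt", min_conf: float = 0.5) -> list:
    """Run YOLOv8 on an image and return class names above a confidence threshold."""
    from ultralytics import YOLO  # imported lazily; weights download on first use

    model = YOLO(weights)
    result = model(image)[0]  # one Results object per input image
    labels = []
    for box in result.boxes:
        if float(box.conf[0]) >= min_conf:
            labels.append(result.names[int(box.cls[0])])
    return labels


def enhance_caption(caption: str, labels: list) -> str:
    """Append detected objects that the caption does not already mention."""
    lowered = caption.lower()
    extra = []
    for label in labels:
        # Simple substring check; skips duplicates and objects already in the caption.
        if label not in extra and label.lower() not in lowered:
            extra.append(label)
    if not extra:
        return caption
    return f"{caption.rstrip('. ')}. Detected objects: {', '.join(extra)}."


print(enhance_caption("a dog on a beach", ["dog", "person", "person", "frisbee"]))
# prints: a dog on a beach. Detected objects: person, frisbee.
```

The substring check is deliberately naive (for example, "car" would match "carpet"); a word-boundary match would be stricter.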

⚙️ Project Structure

AI-Image-Captioning/
│── app.py                  # Main Streamlit application
│── requirements.txt        # Required dependencies
│── README.md               # Documentation
│── assest/                 # Store images/screenshots

🛠️ Usage Instructions

Upload an image in the app.
Choose whether to enable object detection.
Click 'Analyze Image' to generate a caption.
View enhanced captions and object detection results.
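
The steps above can be sketched as a small Streamlit flow. This is an illustrative skeleton, not the actual app.py: `analyze_image` and `main` are hypothetical names, and the captioning and detection functions are passed in as plain callables so the logic can be exercised without loading any model.

```python
def analyze_image(image, caption_fn, detect_fn=None) -> dict:
    """Caption an image and, if a detector is supplied, list the detected objects."""
    caption = caption_fn(image)
    objects = sorted(set(detect_fn(image))) if detect_fn is not None else []
    return {"caption": caption, "objects": objects}


def main(caption_fn, detect_fn) -> None:
    """Wire the analysis step into a Streamlit page (run via `streamlit run`)."""
    import streamlit as st
    from PIL import Image

    st.title("AI Image Captioning")
    use_detection = st.sidebar.checkbox("Enable object detection", value=True)
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])

    if uploaded is not None:
        image = Image.open(uploaded)
        st.image(image)
        if st.button("Analyze Image"):
            result = analyze_image(image, caption_fn,
                                   detect_fn if use_detection else None)
            st.write(result["caption"])
            if result["objects"]:
                st.write("Detected objects: " + ", ".join(result["objects"]))
```

Passing `detect_fn=None` mirrors the case where object detection is disabled in the settings, so only the caption is produced.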

💡 Future Improvements

Add multilingual captioning support.
Optimize object detection performance.
Implement additional caption refinement techniques.

🤝 Contributing

Contributions are welcome! Feel free to fork this repository and create a pull request with your improvements.

Repository: PrachiPatel15/AI-Image-Captioning