Image captioning generation using Swin transformer and GRU attention mechanism
Updated Oct 8, 2024 - Jupyter Notebook
A versatile app that converts images into short stories and lifelike audio locally. It combines Hugging Face's image captioning, Groq's story generation, and Parler TTS for local text-to-speech synthesis. Ideal for AI-driven projects with fast, reliable on-device TTS.
This repository contains an implementation of an image captioning model that leverages deep learning for generating textual descriptions of images. The model extracts high-level features from images using a pre-trained convolutional neural network (CNN) model, such as VGG16, and stores them for further processing. The extracted features are later u
ImgCap is an image captioning model designed to automatically generate descriptive captions for images. It comes in two versions: a CNN + LSTM model and a CNN + LSTM model with an attention mechanism.
Transform your images into valuable insights and creative content using Google Gemini
This project is an image caption generator that uses a deep learning model to generate captions for images. The model is trained using the Flickr8k dataset and leverages a pre-trained Xception model for feature extraction and an LSTM network for sequence processing.
Generative AI Models is a comprehensive repository dedicated to the implementation of cutting-edge generative AI models using Python. It features various models, including those for image captioning and text-to-image generation, leveraging advanced architectures like Vision Transformers (ViT), GPT-2, and Stable Diffusion.
Repo for Implementing Research Papers & Projects related to Machine Learning
Image captioning model using CNN and LSTM
Image Captioning System using VggNet and LSTM Encoder-Decoder architecture
Code for advanced AI/ML architectures built from scratch using PyTorch.
BLIP-ImageCaption
A PyTorch reimplementation of Novel Object Captioner (NOC)
This repo contains code for an Image Caption Generator using a Flask web app and LSTM neural network.
Pre-Trained CNN Architecture for Indonesian Image Captioning using Transformer.
PyTorch implementation of a deep CNN encoder + LSTM decoder with attention for image-to-LaTeX
Convert Image to audio using ViT, GPT and FastSpeech