The Visual Question Answering (VQA) project features a model with a simple GUI that handles both images and videos. It uses OpenAI's CLIP to encode images and questions and GPT-2 to decode the resulting embeddings into answers. It is based on the VQA Version 2 dataset, which includes 265,016 images, each with multiple questions and answers.
Visual Question Answering (VQA) software powered by Flask. This project takes an image and a question as input and generates a response, making interactive visual understanding easy to explore.
SSG-VQA is a Visual Question Answering (VQA) dataset on laparoscopic videos providing diverse, geometrically grounded, unbiased and surgical action-oriented queries generated using scene graphs.