Securade.ai Sentinel - A monitoring and surveillance application that enables visual Q&A and video captioning for existing CCTV cameras.
Updated Apr 6, 2025 - Python
This repo contains the original implementation of VAuLT, the Vision-and-Augmented-Language Transformer. We provide instructions for downloading some multimodal social-media datasets, along with scripts to experiment with. VAuLT is a stack of Transformers in which a language model such as BERT preprocesses the text input of ViLT.
A CLI and GUI for using the Vision-and-Language Transformer (ViLT) model for visual question answering (answering questions based on an image)
Visual question answering in real time
🔍 An AI tool for image-based Q&A and captioning that lets users upload an image and receive a concise answer to the question they ask.
Image understanding through Visual Question Answering (VQA) systems, a task at the intersection of computer vision and natural language processing (NLP), which aims to jointly understand and interpret visual and textual information in order to answer questions about images.