#3 Winner of Best Use of Zoom API at Stanford TreeHacks 2025! An AI-powered meeting assistant that captures video, audio and textual context from Zoom calls using multimodal RAG.
-
Updated
Feb 16, 2025 - JavaScript
#3 Winner of Best Use of Zoom API at Stanford TreeHacks 2025! An AI-powered meeting assistant that captures video, audio and textual context from Zoom calls using multimodal RAG.
Real-time AI assistant powered by Gemini 2.5 Flash Native Audio — sees your camera, speaks back. Built with FastAPI + React.
Verity is a forensic verification AI that analyzes contradictory evidence across multiple formats (video, audio, documents, images) to reconstruct verified timelines and detect inconsistencies. Built with Gemini 3's advanced reasoning capabilities and transparent thinking mode.
Apsara 2.5: Evolution from Langchain to Google Gemini API with multimodal capabilities, URL context analysis, and integrated tools for chat, voice, and visual interactions.
Sales-Catalyst is an AI-powered assistant that automates meeting prep, product recommendations, and sales follow-ups. Using LLMs, vision, and Retrieval-Augmented Generation (RAG), it delivers real-time insights, competitor comparisons, and workflow automation in a conversational interface
Privacy-first multimodal AI assistant powered by Chrome’s on-device APIs. Supports text, voice, images, translation, summarization, and smart proofreading with zero data leaving your device.
Aura is a multimodal AI assistant for effortless scheduling. Add tasks via voice or image, with local or cloud AI (Firebase AI Logic) ensuring consistency. Tasks are stored safely on-device, and conflicts are auto-resolved for smooth, productive days.
Add a description, image, and links to the multimodal-ai topic page so that developers can more easily learn about it.
To associate your repository with the multimodal-ai topic, visit your repo's landing page and select "manage topics."