Visual Query Bot: an interactive chat application that lets users upload images, draw bounding boxes on specific regions, and ask targeted questions about those areas. It uses LangChain and LangGraph to build information retrieval agents for context-aware visual querying.

manu042k/VisionXAI

VisionXAI

VisionXAI combines a vision-capable language model with a web frontend and backend services. It lets users deploy AI models and interact with them through a chat interface.

Project Structure

  • LLM/: Contains the logic for the language model, including scripts and Jupyter notebooks for testing.
  • Frontend/: Houses the web application built with modern web technologies. It includes configuration files for TypeScript, Tailwind CSS, and Vercel deployment.
  • Backend/: Contains the server-side logic, including API services, model configurations, and memory management.

Getting Started

Prerequisites

  • Node.js and npm for the frontend
  • Python 3.x for the backend
  • Vercel CLI for deployment

Installation

  1. Clone the repository:

    git clone https://github.com/manu042k/VisionXAI.git
    cd VisionXAI
  2. Install Frontend Dependencies:

    cd Frontend
    npm install
  3. Install Backend Dependencies:

    cd ../Backend
    pip install -r requirements.txt

Running the Project

  • Frontend: Navigate to the Frontend directory and run:

    npm start
  • Backend: Navigate to the Backend directory and run:

    python app/main.py

Testing

  • Frontend: Use Jest for running tests.

    npm test
  • Backend: Use the provided Jupyter notebooks in the LLM directory for testing models.

ImageChatBot Functionality

The ImageChatBot class, located in Backend/app/memory.py, analyzes images and answers questions about their content. It can optionally draw on external search results when the image alone does not provide enough context.

Key Features

  • Image Analysis: Utilizes the ChatGoogleGenerativeAI model to understand and describe visual elements within an image.
  • Search Integration: Employs the TavilySearchResults tool to fetch additional context from the web, enhancing the chatbot's ability to answer questions accurately.
  • Conditional Search: Determines whether a search is necessary based on the user's query and the image content.
  • Response Formatting: Provides responses in markdown format, including citations for any external sources used.
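
The conditional search and citation features above can be sketched as follows. This is a minimal illustration, not the code in Backend/app/memory.py: `SearchHit`, `format_citations`, `answer`, and the three injected callables are hypothetical names standing in for the model and the search tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SearchHit:
    """One web search result (field names are illustrative assumptions)."""
    title: str
    url: str
    snippet: str


def format_citations(hits: List[SearchHit]) -> str:
    """Render search hits as a markdown list of linked sources."""
    return "\n".join(f"- [{h.title}]({h.url})" for h in hits)


def answer(
    question: str,
    image_b64: str,
    needs_search: Callable[[str, str], bool],
    search: Callable[[str], List[SearchHit]],
    generate: Callable[[str, str, Optional[List[SearchHit]]], str],
) -> str:
    """Answer a question about an image, searching the web only when needed."""
    # Decide first; the search tool is called only if the decision is True.
    hits = search(question) if needs_search(question, image_b64) else None
    reply = generate(question, image_b64, hits)
    # Append citations only when external sources were actually used.
    if hits:
        reply += "\n\n**Sources**\n" + format_citations(hits)
    return reply
```

Passing the decision, search, and generation steps in as callables keeps the control flow testable without network access. In the real project these roles would presumably be filled by the ChatGoogleGenerativeAI model and the TavilySearchResults tool.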

Usage Example

The chatbot can encode images to base64, decide if a search is needed, and generate responses with or without search results. It supports both synchronous and asynchronous response streaming.
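
As a rough sketch of the base64 encoding step, the snippet below turns raw image bytes into a data URL and pairs it with a text question. The function names are hypothetical. The list-of-parts layout with `text` and `image_url` entries follows a shape that LangChain chat models commonly accept for multimodal input, but whether ImageChatBot builds its messages this way is an assumption.

```python
import base64
from typing import Any, Dict, List


def encode_image(data: bytes, mime_type: str = "image/png") -> str:
    """Return a base64 data URL for raw image bytes."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_message_content(question: str, image_bytes: bytes) -> List[Dict[str, Any]]:
    """Combine a text question and an image into one multimodal content list."""
    return [
        {"type": "text", "text": question},
        {"type": "image_url", "image_url": {"url": encode_image(image_bytes)}},
    ]
```

The resulting list could then be passed as the content of a human message to a multimodal chat model.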

Visual Demonstrations

Application Interface

[Image: Application Interface]

The above image showcases the user interface of VisionXAI, highlighting its intuitive design and seamless integration of AI functionalities.

Workflow in Action

[GIF: Workflow Demonstration]

This GIF demonstrates the end-to-end workflow of VisionXAI, from uploading an image to receiving AI-powered insights and responses. It provides a glimpse into the real-time capabilities of the platform.
