Vision-Question-Answering

This project implements a Visual Question Answering (VQA) system using a pre-trained model from Hugging Face or a locally stored model.

Table of Contents

  - Installation
  - Usage
  - Running the Streamlit App
  - Running Tests Locally
  - Contributing
  - License

Installation

  1. Clone the repository:

    git clone https://github.com/tedoaba/Vision-Question-Answering.git
    cd Vision-Question-Answering
  2. Create and activate a virtual environment:

    python -m venv venv
    venv\Scripts\activate         # On Windows
    source venv/bin/activate      # On macOS/Linux

  3. Install the dependencies:

    pip install -r requirements.txt

Usage

Using the Default Hugging Face Model (Online Mode)

Run the VQA system with an image and a question:

python scripts/run_vqa.py --image "path_to_image.jpg" --question "What is the person doing?"
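
Under the hood, online mode typically amounts to a single Hugging Face transformers pipeline call that downloads and caches the model on first use. A minimal sketch of the equivalent Python (the checkpoint name dandelin/vilt-b32-finetuned-vqa is an assumption; the repository's default model may differ):

    from PIL import Image
    from transformers import pipeline

    # Assumed default checkpoint; the project may ship a different one.
    vqa = pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")

    image = Image.open("path_to_image.jpg")
    answers = vqa(image=image, question="What is the person doing?", top_k=1)
    print(answers[0]["answer"], answers[0]["score"])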

Using a Local Model (Offline Mode)

Run the VQA system with a local model:

python scripts/run_vqa.py --image "path_to_image.jpg" --question "What is the color of the car?" --model_path "models/vqa/"
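
Offline mode expects the model files to already exist under models/vqa/. One way to populate that directory, assuming a ViLT-style VQA checkpoint (the checkpoint name is an assumption, not taken from this repository):

    from transformers import ViltProcessor, ViltForQuestionAnswering

    # Download once while online, then save both processor and model locally.
    processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
    model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

    processor.save_pretrained("models/vqa/")
    model.save_pretrained("models/vqa/")

After this, --model_path can point at models/vqa/ without a network connection.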

Using an Image from a URL

Run the VQA system with an image URL:

python scripts/run_vqa.py --image "https://example.com/image.jpg" --question "What is happening in the image?" --url
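
With the --url flag, the image has to be fetched over HTTP before it reaches the model. A common way to do that with requests and Pillow (a sketch of the general approach, not necessarily how run_vqa.py implements it):

    from io import BytesIO

    import requests
    from PIL import Image

    def load_image(source: str, from_url: bool = False) -> Image.Image:
        """Load an image from a local path or an HTTP(S) URL."""
        if from_url:
            response = requests.get(source, timeout=30)
            response.raise_for_status()
            return Image.open(BytesIO(response.content)).convert("RGB")
        return Image.open(source).convert("RGB")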

Running the Streamlit App

To run the Streamlit app, use the following command:

streamlit run app.py
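
A Streamlit front end for VQA usually reduces to an image uploader, a question box, and a call into the same pipeline. A hedged sketch of what app.py might look like (the widget layout and checkpoint name are assumptions):

    import streamlit as st
    from PIL import Image
    from transformers import pipeline

    @st.cache_resource
    def load_vqa_pipeline():
        # Assumed checkpoint; point this at models/vqa/ to run fully offline.
        return pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")

    st.title("Visual Question Answering")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    question = st.text_input("Ask a question about the image")

    if uploaded and question:
        image = Image.open(uploaded).convert("RGB")
        st.image(image)
        result = load_vqa_pipeline()(image=image, question=question, top_k=1)[0]
        st.write(f"Answer: {result['answer']} (score {result['score']:.2f})")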

Running Tests Locally

To run the tests locally, follow these steps:

  1. Install the dependencies:

    pip install -r requirements.txt
  2. Run the tests:

    python -m unittest discover -s tests
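
A test under tests/ would typically exercise the answering code directly rather than the CLI. A minimal unittest sketch (the module path scripts.run_vqa and the function name answer_question are assumptions about this repository's layout):

    import unittest

    from PIL import Image

    class TestVQA(unittest.TestCase):
        def test_answer_is_nonempty_string(self):
            # Hypothetical import; adjust to the actual module and function names.
            from scripts.run_vqa import answer_question

            image = Image.new("RGB", (224, 224), color="white")
            answer = answer_question(image, "What color is the image?")
            self.assertIsInstance(answer, str)
            self.assertTrue(answer)

    if __name__ == "__main__":
        unittest.main()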

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

License

This project is licensed under the MIT License. See the LICENSE file for details.