dheerajkallakuri/Sparky-HardwareGuru

Sparky - The Hardware Guru

Sparky is an interactive assistant designed to help users identify and understand various hardware and electronic components. Using voice recognition and image processing, Sparky answers questions about tools and electronics with concise, layman-friendly explanations. Users can ask about a specific tool aloud or show a component to the camera.

Demo Video

[Image: Sparky Demo Video. The image and the accompanying Video Link both open the demo video.]

Features

  • Voice Recognition: Users can ask questions about hardware tools.
  • Image Recognition: Users can show images of components for identification.
  • Interactive Learning: Sparky provides explanations and helps users understand different tools and electronics.

Architecture

Sparky is built using a Raspberry Pi as the central hub. It integrates:

  • A microphone for voice input
  • A speaker for audio output
  • A camera for image recognition

Speech recognition software processes voice commands, and a computer vision model analyzes images of hardware items. All components are housed in a custom 3D-printed model, giving Sparky a unique and engaging design.
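
As a rough illustration of how the microphone, camera, and speaker could fit together in one question/answer turn, here is a minimal sketch. It is not taken from hardwareguru.py: the function name `run_turn` and its four parameters are hypothetical, and each stage is passed in as a plain callable so the flow can be tried without any hardware attached.

```python
from typing import Callable, Optional


def run_turn(
    listen: Callable[[], str],
    detect: Callable[[], Optional[str]],
    answer: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one question/answer turn and return the reply that was spoken.

    All four stages are hypothetical stand-ins:
    listen -> microphone to text, detect -> camera to tool label (or None),
    answer -> text to explanation, speak -> explanation to speaker.
    """
    question = listen()
    tool = detect()
    if tool is not None:
        # Add what the camera saw so the answer can refer to it.
        question = f"{question} (The user is showing a {tool}.)"
    reply = answer(question)
    speak(reply)
    return reply


# Stubbed usage: no microphone, camera, or API needed.
spoken = []
reply = run_turn(
    listen=lambda: "What is this used for?",
    detect=lambda: "wrench",
    answer=lambda q: f"ANSWER TO: {q}",
    speak=spoken.append,
)
```

Keeping the stages as injected callables is one possible design choice; it lets each piece (speech, vision, chatbot) be swapped or debugged separately on the Raspberry Pi.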

Main Files

  • hardwareguru.py: The main assistant that answers user questions.
  • toolidentify.py: Contains computer vision code to detect tools in the scene.
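
The hand-off from the vision code to the assistant might look something like the sketch below. It assumes, hypothetically, that the detector yields `(class_id, confidence)` pairs; the `CLASS_NAMES` order, the `best_tool` helper, and the 0.5 threshold are illustrative guesses rather than details of toolidentify.py.

```python
# The six classes named in this README; the index order is an assumption.
CLASS_NAMES = ["hammer", "pliers", "screwdriver", "wrench", "ESP_32", "Raspberry Pi"]


def best_tool(detections, min_conf=0.5):
    """Return the label of the most confident detection, or None.

    detections: iterable of (class_id, confidence) pairs (assumed format).
    min_conf:   detections below this confidence are ignored (assumed value).
    """
    best = None
    for class_id, conf in detections:
        if conf < min_conf:
            continue
        if best is None or conf > best[1]:
            best = (class_id, conf)
    return None if best is None else CLASS_NAMES[best[0]]


label = best_tool([(0, 0.40), (3, 0.90), (1, 0.70)])
```

In this example the hammer detection is dropped for low confidence, and the wrench wins over the pliers because its confidence is higher.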

Component Files

  • Six additional files: Reference files for checking individual hardware components and assisting with debugging.

Technology Used

  • Computer Vision: The bot uses a custom-trained model to detect six classes of tools: hammer, pliers, screwdriver, wrench, ESP_32, and Raspberry Pi.
  • Model Training: The model is trained with YOLOv8 and reaches approximately 82% mAP50 and 67% mAP50-95, which lets the bot recognize tools effectively.
  • Gemini API: The chatbot's interaction sessions use the Gemini API, so a Gemini API key must be generated before use.
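
For readers unfamiliar with the two metrics: mAP50 counts a predicted box as correct when its intersection-over-union (IoU) with the ground-truth box is at least 0.50, while mAP50-95 averages the result over IoU thresholds from 0.50 to 0.95 in steps of 0.05, which is why it is the lower number. The sketch below only illustrates IoU and the threshold ladder; it is not the evaluation code used for Sparky, and the helper names are made up for this example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds used by mAP50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def thresholds_passed(pred, truth):
    """Count how many of the ten IoU thresholds a prediction satisfies."""
    score = iou(pred, truth)
    return sum(score >= t for t in IOU_THRESHOLDS)


truth = (0, 0, 10, 10)
loose = (0, 0, 10, 5)   # IoU 0.5: counts for mAP50 only
tight = (0, 0, 10, 8)   # IoU 0.8: counts at seven of the ten thresholds
```

A box like `loose` helps mAP50 but contributes at only one of the ten thresholds averaged in mAP50-95, so a gap between the two scores suggests that some boxes are correct but not tightly localized.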

Challenges

While developing Sparky, we encountered several challenges:

  • Dataset Collection: Gathering a suitable dataset for the custom-trained YOLO model was difficult. Achieving satisfactory accuracy required multiple training sessions.
  • Interactivity: Ensuring the system communicated effectively within a domain-specific context was complex.
  • Integration: Combining various components and deploying the system on hardware added further complications, especially given our limited timeframe.

Accomplishments

We are proud to have assembled an assistant that responds effectively to user inquiries and provides live tool detection along with accurate summaries. Reaching a detection score of around 80% (the mAP50 of approximately 82% reported above) across six classes is a significant milestone for us. This accomplishment showcases our ability to blend technology and user experience, allowing Sparky to serve as a reliable assistant in the realm of hardware and electronics.

Results of Computer Vision Model:

[Figures: Confusion Matrix, F1-Curve]

What We Learned

We discovered creative ways to generate domain-specific content with generative AI and learned how to trigger computer vision models directly from within the chatbot.

What's Next for Sparky - The Hardware Guru

Moving forward, we plan to enhance Sparky's capabilities by:

  • Improving accuracy through the addition of more images and expanding the dataset.
  • Enabling Sparky to recognize unknown tools by letting users provide images for on-the-spot training.
  • Refining the user interface on the website and integrating the computer vision model.
  • Incorporating motors or servos into the hardware to enhance Sparky's aesthetic and functionality.

Getting Started

To get started with Sparky, ensure you have the following:

  1. Raspberry Pi: Set up with a microphone, speaker, and camera.
  2. Dependencies: Install the necessary libraries for speech recognition and computer vision.
  3. API Key: Generate a Gemini API key to initiate interaction sessions.
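
Before the first run, a small preflight check along these lines may help confirm the setup. This README does not list the exact libraries or say how the key is supplied, so the module names in `ASSUMED_MODULES` and the environment variable `GEMINI_API_KEY` are assumptions; adjust them to match the actual project.

```python
import importlib.util
import os

# Assumed import names for speech recognition and computer vision libraries.
ASSUMED_MODULES = ["speech_recognition", "cv2", "ultralytics"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


def has_api_key(env=os.environ, var="GEMINI_API_KEY"):
    """True if the (assumed) environment variable holds a non-blank key."""
    return bool(env.get(var, "").strip())


if __name__ == "__main__":
    for name in missing_modules(ASSUMED_MODULES):
        print(f"Missing library: {name}")
    if not has_api_key():
        print("No Gemini API key found in GEMINI_API_KEY")
```

Running the script on the Raspberry Pi prints one line per missing piece and nothing when everything it checks is in place.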
