Sparky is an interactive assistant designed to help users identify and understand hardware and electronic components. By combining voice recognition and image processing, Sparky answers questions about tools and electronics with concise, layman-friendly explanations. Users can ask about specific tools or show Sparky a component to identify.
Click the image above or the Video Link to watch the demo video.
- Voice Recognition: Users can ask questions about hardware tools.
- Image Recognition: Users can show images of components for identification.
- Interactive Learning: Sparky provides explanations and helps users understand different tools and electronics.
Sparky is built using a Raspberry Pi as the central hub. It integrates:
- A microphone for voice input
- A speaker for audio output
- A camera for image recognition
The system combines speech recognition for processing voice commands with image analysis for identifying hardware items (a minimal sketch of the listen-and-respond loop follows). All components are housed in a custom 3D-printed enclosure, giving Sparky a unique and engaging design.
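As a rough illustration, the loop below shows how such a pipeline can be wired together in Python. It assumes the `SpeechRecognition` and `pyttsx3` libraries and echoes the question back as a placeholder; the actual `hardwareguru.py` may be structured differently.

```python
# Hypothetical sketch of Sparky's listen-and-respond loop (assumes the
# SpeechRecognition and pyttsx3 libraries; the real hardwareguru.py may
# be structured differently).
import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
tts = pyttsx3.init()

def listen() -> str:
    """Capture one utterance from the microphone and transcribe it."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)

def speak(text: str) -> None:
    """Read a response aloud through the speaker."""
    tts.say(text)
    tts.runAndWait()

while True:
    try:
        question = listen()
        # In the real system the question would be routed to the
        # Gemini-backed chatbot; here we simply echo it back.
        speak(f"You asked: {question}")
    except sr.UnknownValueError:
        speak("Sorry, I didn't catch that.")
```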
- `hardwareguru.py`: The main assistant that answers user questions.
- `toolidentify.py`: Contains the computer vision code that detects tools in the scene (see the detection sketch after this list).
- 6 Additional Files: Reference files for checking different hardware and assisting in the debugging process.
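As a rough illustration of what the detection code could look like, the sketch below runs a trained model over the live camera feed. It assumes the `ultralytics` package and a weights file named `best.pt`; the file name and the structure of the real `toolidentify.py` may differ.

```python
# Hypothetical sketch of the detection logic in toolidentify.py,
# assuming the ultralytics YOLO package and a trained weights file
# named best.pt (the real names may differ).
import cv2
from ultralytics import YOLO

model = YOLO("best.pt")  # custom weights trained on the six tool classes

cap = cv2.VideoCapture(0)  # Raspberry Pi camera exposed as /dev/video0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)  # run detection on the current frame
    for box in results[0].boxes:
        label = model.names[int(box.cls)]
        conf = float(box.conf)
        print(f"Detected {label} ({conf:.2f})")
    # Show the annotated frame for live feedback
    cv2.imshow("Sparky", results[0].plot())
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```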
- Computer Vision: The bot uses a custom-trained model to detect six classes of tools: `hammer`, `pliers`, `screwdriver`, `wrench`, `ESP_32`, and `Raspberry Pi`.
- Model Training: The model is trained using YOLOv8, reaching approximately 82% mAP50 and 67% mAP50-95, which lets the bot recognize tools reliably (a training sketch follows this list).
- Gemini API: Ensure that a Gemini API key is generated for interaction sessions with the chatbot.
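For reference, a YOLOv8 training run like the one described above can be reproduced roughly as follows. This assumes the `ultralytics` package and a `data.yaml` file describing the six-class dataset; the exact checkpoint and hyperparameters we used may have differed.

```python
# Hypothetical training script, assuming the ultralytics package and a
# data.yaml file listing the six tool classes; the exact arguments are
# illustrative, not the settings actually used for Sparky.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # start from a pretrained nano checkpoint
model.train(data="data.yaml", epochs=100, imgsz=640)

# Validate to compute the mAP50 / mAP50-95 metrics reported above.
metrics = model.val()
print(metrics.box.map50, metrics.box.map)  # mAP50 and mAP50-95
```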
While developing Sparky, we encountered several challenges:
- Dataset Collection: Gathering an appropriate dataset for the custom training model using YOLO was difficult. Achieving satisfactory accuracy required multiple training sessions.
- Interactivity: Ensuring the system communicated effectively within a domain-specific context was complex.
- Integration: Combining various components and deploying the system on hardware added further complications, especially given our limited timeframe.
We are proud to have assembled an assistant that responds effectively to user inquiries and provides live tool detection along with accurate summaries. Achieving roughly 80% detection accuracy (mAP50) across six classes is a significant milestone for us. This accomplishment showcases our ability to blend technology and user experience, allowing Sparky to serve as a reliable assistant in the realm of hardware and electronics.
Evaluation plots: Confusion Matrix and F1-Curve.
We discovered creative ways to generate domain-specific content using generative AI and learned how to invoke computer vision models directly from within the chatbot.
Moving forward, we plan to enhance Sparky's capabilities by:
- Improving accuracy by expanding the dataset with more images.
- Enabling Sparky to recognize unknown tools by providing images for on-the-spot training.
- Refining the user interface on the website and integrating the computer vision model.
- Incorporating motors or servos into the hardware to enhance Sparky's aesthetic and functionality.
To get started with Sparky, ensure you have the following:
- Raspberry Pi: Set up with a microphone, speaker, and camera.
- Dependencies: Install necessary libraries for speech recognition and computer vision.
- API Key: Generate a Gemini API key to initiate interaction sessions (a minimal session sketch follows).
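As a minimal sketch of starting a session, assuming the `google-generativeai` package with the key stored in a `GEMINI_API_KEY` environment variable (the model name below is illustrative, not necessarily the one Sparky uses):

```python
# Hypothetical sketch of a Gemini-backed Q&A session, assuming the
# google-generativeai package (pip install google-generativeai) and an
# API key in the GEMINI_API_KEY environment variable.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name

# A system-style prompt keeps answers domain-specific and layman-friendly.
chat = model.start_chat(history=[])
reply = chat.send_message(
    "You are Sparky, a hardware assistant. Explain in simple terms: "
    "what is a wrench used for?"
)
print(reply.text)
```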