A multi-modal AI-powered assistant built for wearable smart glasses — empowering real-time perception, interaction, and accessibility using vision, language, gestures, and sound.
AI-VIEW is an AI-driven vision assistant designed for smart glasses. It enables users — especially the visually impaired — to understand and interact with their surroundings in real time through:
- Object detection
- Emotion recognition
- Text reading (English & Arabic)
- AI-powered voice conversation
- Gesture-based control
Built with modularity and real-time performance in mind, AI-VIEW fuses advanced computer vision, speech, and NLP models into an edge-ready assistant.
### Face Recognition
- Detects and identifies known faces in real time
- Saves new faces to a local face database
- Useful for remembering familiar people and providing personalized context (e.g., “Ahmed is nearby”)
- Based on DeepFace and face-embedding comparison (sketched below)
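A minimal sketch of how that lookup could work, assuming known faces are stored as image files in a local `face_db/` folder (the folder name, camera index, and console output are illustrative, not the project's actual implementation):

```python
import os

import cv2
from deepface import DeepFace

FACE_DB = "face_db"  # hypothetical layout: one image per known person, e.g. face_db/Ahmed.jpg

def identify_faces(frame):
    """Return the names of known people found in a BGR camera frame."""
    # DeepFace.find embeds the frame, compares it against cached embeddings of
    # every image under FACE_DB, and returns one DataFrame per detected face.
    matches = DeepFace.find(img_path=frame, db_path=FACE_DB, enforce_detection=False)
    names = []
    for df in matches:
        if len(df) > 0:
            best = df.iloc[0]["identity"]  # file path of the closest match
            names.append(os.path.splitext(os.path.basename(best))[0])
    return names

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if ok:
    for name in identify_faces(frame):
        print(f"{name} is nearby")
```

Under this layout, enrolling a new face simply means saving a cropped frame into `face_db/`, so the next lookup picks it up.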
### Object Detection
- Detects surrounding objects using YOLOv8 (up to a 15 m range)
- Measures their distance and gives color-coded warnings (sketched below)
- Ideal for navigation and obstacle avoidance
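A minimal sketch of the detection and warning loop, assuming Ultralytics YOLOv8 weights and a simple pinhole-camera distance estimate from rough known object heights; the focal length, height table, and warning thresholds below are illustrative placeholders, not the project's actual calibration:

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                       # any YOLOv8 checkpoint works here
FOCAL_PX = 700                                   # assumed focal length in pixels
KNOWN_HEIGHT_M = {"person": 1.7, "chair": 0.9}   # rough real-world heights in meters

def warn_color(distance_m):
    """Color-coded warning: red under 2 m, yellow under 5 m, green otherwise."""
    if distance_m < 2:
        return (0, 0, 255)
    if distance_m < 5:
        return (0, 255, 255)
    return (0, 255, 0)

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if ok:
    for box in model(frame, verbose=False)[0].boxes:
        label = model.names[int(box.cls)]
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        # Pinhole approximation: distance = real height * focal length / pixel height
        dist = KNOWN_HEIGHT_M.get(label, 1.0) * FOCAL_PX / max(y2 - y1, 1)
        color = warn_color(dist)
        cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
        cv2.putText(frame, f"{label} {dist:.1f} m", (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
    cv2.imwrite("annotated.jpg", frame)
```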
### AI-Powered Voice Conversation
- Voice chat with an AI agent powered by OpenAI
- Natural text-to-speech responses via ElevenLabs (sketched below)
- English & Arabic conversation support
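One conversational turn might look like the sketch below, assuming the OpenAI Python SDK for the reply and the ElevenLabs REST endpoint for speech; the model name, system prompt, and voice ID are placeholders rather than the project's real settings:

```python
import os

import requests
from openai import OpenAI

openai_client = OpenAI()                        # reads OPENAI_API_KEY from the environment
ELEVEN_KEY = os.environ["ELEVENLABS_API_KEY"]
VOICE_ID = "your-voice-id"                      # placeholder ElevenLabs voice ID

def ask_assistant(user_text: str) -> str:
    """Send one user turn to the chat model and return the reply text."""
    resp = openai_client.chat.completions.create(
        model="gpt-4o-mini",                    # assumed model choice
        messages=[
            {"role": "system",
             "content": "You are a smart-glasses assistant. Reply briefly in the "
                        "user's language (English or Arabic)."},
            {"role": "user", "content": user_text},
        ],
    )
    return resp.choices[0].message.content

def speak(text: str, out_path: str = "reply.mp3") -> str:
    """Convert the reply to speech with ElevenLabs and save it as an MP3 file."""
    r = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": ELEVEN_KEY},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=30,
    )
    r.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(r.content)
    return out_path

print(speak(ask_assistant("What is in front of me right now?")))
```

The `eleven_multilingual_v2` model covers both English and Arabic, which is why a single TTS path can serve both conversation languages in this sketch.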
### Emotion Detection
- Uses DeepFace to analyze the expressions of nearby people (sketched below)
- Identifies emotions such as happy, sad, and neutral
- Adds emotional context to conversations or alerts
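A minimal sketch of that analysis step on a single camera frame, assuming DeepFace's `analyze` call; how the result feeds into conversations or alerts is omitted:

```python
import cv2
from deepface import DeepFace

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()

if ok:
    results = DeepFace.analyze(
        img_path=frame,
        actions=["emotion"],
        enforce_detection=False,    # don't raise if nobody is in view
    )
    if isinstance(results, dict):   # older DeepFace versions return a single dict
        results = [results]
    for face in results:
        # e.g. "The person nearby looks happy"
        print(f"The person nearby looks {face['dominant_emotion']}")
```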
### Text Reading (English & Arabic)
- Optical Character Recognition (OCR) in English and Arabic
- Reads signs, documents, and labels aloud in real time
- Based on EasyOCR with custom Arabic tuning (sketched below)
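A minimal sketch of the reading step, assuming EasyOCR's stock Arabic and English models (the custom Arabic tuning is not reproduced here); in the full assistant the recognized text would be handed to the same text-to-speech path used for conversation:

```python
import cv2
import easyocr

reader = easyocr.Reader(["ar", "en"])   # downloads the recognition models on first run

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()

if ok:
    # readtext returns (bounding box, text, confidence) for each detected line
    for _bbox, text, conf in reader.readtext(frame):
        if conf > 0.4:                  # illustrative confidence threshold
            print(text)                 # would be spoken aloud on the glasses
```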
### Gesture-Based Control
- Recognizes intuitive hand gestures (e.g., stop, thumbs-up)
- Enables non-verbal commands without buttons or voice input
- Built on MediaPipe hand landmarks (sketched below)
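The sketch below classifies a single hand from MediaPipe hand landmarks using simple geometric rules for "stop" (open palm) and "thumbs-up"; the heuristics are illustrative, not the project's exact gesture logic:

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def classify(hand):
    """Map one hand's 21 landmarks to a gesture name using simple geometry."""
    lm = hand.landmark
    # A finger counts as extended when its tip sits above its PIP joint
    # (smaller y value, since image y grows downward).
    fingers_up = [lm[tip].y < lm[tip - 2].y for tip in (8, 12, 16, 20)]
    thumb_up = lm[4].y < lm[3].y < lm[2].y
    if all(fingers_up):
        return "stop"
    if thumb_up and not any(fingers_up):
        return "thumbs-up"
    return "unknown"

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()

if ok:
    with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        for hand in result.multi_hand_landmarks or []:
            print(classify(hand))
```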
### Screenshots

| Emotion Detection |
|---|
| *(screenshot)* |

| Gesture Mode | Face Recognition |
|---|---|
| *(screenshot)* | *(screenshot)* |
### 🚧 Source Code: Currently Private

I am actively fixing bugs and enhancing the code for improved performance and stability.
### 💡 Got a Feature in Mind?

Feel free to share any desired features or suggestions; I am listening and improving continuously!


