EchoGuide is a wearable assistive device that gives visually impaired users real-time audio feedback about their surroundings. Using an ESP32-CAM module and a Python Flask server running a lightweight computer vision model (YOLOv8n), EchoGuide identifies objects, estimates distances, and delivers simple voice cues. It can also save and recognize faces, making it a valuable companion for those with visual impairments.
- EchoGuide: AI-Powered Navigation for the Visually Impaired
- Table of Contents
- Features
- Technologies Used
- Hardware Requirements
- Software Requirements
- System Architecture
- Installation
- Running the Application
- Special Notes
- PPT
- Real-time object detection and distance estimation
- Voice feedback for objects and faces
- Face recognition for saved faces
- Easy-to-use Python API for customization and integration
- Compatible with ESP32-CAM modules
- ESP32-CAM
- ESP32
- Python
- Flask
- OpenCV
- YOLO
- DeepGram
- face_recognition
- ESP32-CAM module
- FTDI
- ESP32-CAM module Development Board (Optional)
- ESP32
- PAM8403 Amplifier
- Any small speaker
- Jumper Wires
- Breadboard or PCB
- Power Supply
- Micro USB Cable
- Arduino Cable
- Capacitors and Resistors for noise reduction (Optional)
- ESP32-CAM captures frames and streams via HTTP POST.
- Flask Server receives images, runs object detection and distance estimation.
- System Microphone listens for voice commands; server responds with targeted frame analysis.
- DeepGram processes audio and sends text to ESP32 for TTS and hence audio feedback to the user via speaker.
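The detection and distance-estimation step described above could turn one bounding box into a spoken cue roughly as follows. This is a minimal sketch, not the project's actual code: it assumes a pinhole-camera estimate (distance = real width × focal length in pixels / box width in pixels), and the names `estimate_distance_m`, `horizontal_position`, `describe_detection`, `FOCAL_LENGTH_PX`, and `KNOWN_WIDTHS_M`, along with every numeric value, are hypothetical placeholders that would need calibration for a real ESP32-CAM.

```python
from typing import Optional

# Hypothetical calibration values; measure these for your own camera and objects.
FOCAL_LENGTH_PX = 600.0                           # assumed focal length in pixels
KNOWN_WIDTHS_M = {"person": 0.45, "chair": 0.50}  # assumed real-world widths in meters


def estimate_distance_m(label: str, box_width_px: float) -> Optional[float]:
    """Pinhole-camera estimate: real_width * focal_length / pixel_width."""
    real_width = KNOWN_WIDTHS_M.get(label)
    if real_width is None or box_width_px <= 0:
        return None  # unknown object size or degenerate box
    return real_width * FOCAL_LENGTH_PX / box_width_px


def horizontal_position(box_center_x: float, frame_width: int) -> str:
    """Split the frame into equal thirds and name the third holding the box center."""
    if box_center_x < frame_width / 3:
        return "left"
    if box_center_x > 2 * frame_width / 3:
        return "right"
    return "center"


def describe_detection(label: str, x1: float, x2: float, frame_width: int) -> str:
    """Build a short cue such as 'person, 1.8 meters, left' from a box's x-extent."""
    position = horizontal_position((x1 + x2) / 2, frame_width)
    distance = estimate_distance_m(label, x2 - x1)
    if distance is None:
        return f"{label}, {position}"
    return f"{label}, {distance:.1f} meters, {position}"


if __name__ == "__main__":
    print(describe_detection("person", 20, 170, 640))
```

The accuracy of such an estimate depends entirely on the calibrated focal length and on how closely the real object matches the assumed width, so treat the result as a rough cue rather than a measurement.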
- Navigate to Root Directory.
- Create a virtual environment named `.venv` with Python 3.11.4.
- Activate the virtual environment using the command `.\.venv\Scripts\activate`.
- Run the following commands one by one:
pip install --upgrade pip
pip install wheel setuptools
pip install cmake==3.27.7
pip install dlib==19.24.2
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
- While installing `face_recognition`, if you encounter an error regarding `cmake` or `dlib`, you will have to install `cmake` manually from here.
- If a package compatibility issue comes up during installation, you will have to resolve it yourself.
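After the install commands finish, a quick check like the one below can show which packages are still missing before you move on to the hardware steps. It is only a sketch: `find_missing` and `REQUIRED_MODULES` are hypothetical names, and the import names `cv2` and `flask` are assumed to be provided by `requirements.txt`.

```python
import importlib.util
import sys

# Import names assumed from the install commands; `cv2` and `flask` are
# assumed to be pulled in by requirements.txt.
REQUIRED_MODULES = [
    "dlib", "torch", "torchvision", "torchaudio",
    "face_recognition", "cv2", "flask",
]


def find_missing(modules):
    """Return the module names that cannot be located in the active environment."""
    # find_spec only locates a top-level module; it does not import it.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated virtual environment; an empty "Missing" list only confirms that the packages can be found, not that their versions are compatible.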
- Navigate to the Speaker directory.
- Follow the tutorial to install the ESP32 development board.
- Upload the code to your ESP32.
- Connect the speaker and the PAM8403 amplifier to the ESP32 as follows:

| ESP32 | PAM8403 Amplifier | Speaker |
| --- | --- | --- |
| 3.3 V (or 5 V, choose accordingly) | 5 V + | |
| GND | GND | |
| GPIO 25 | L IN | |
| | L OUT + | + (Positive Terminal) |
| | L OUT - | - (Negative Terminal) |

- Navigate to the ESP32_Camera directory.
- Connect the ESP32-CAM module to the FTDI adapter as follows:

| ESP32-CAM Module | FTDI |
| --- | --- |
| 5 V | VCC |
| GND | GND |
| U0T | RX |
| U0R | TX |

- Connect IO0 to GND on the ESP32-CAM module. (This is needed only while uploading the code.)
- Upload the code to your ESP32-CAM module.
- Disconnect IO0 from GND on the ESP32-CAM module.
- Restart the ESP32-CAM module.
- Connect the ESP32, the ESP32-CAM module, and the laptop (or PC) to the same network (a shared WiFi network or a mobile hotspot).
- Get a DeepGram API Key from here.
- Paste the API key into `DEEPGRAM_API_KEY` (line 35) of `EchoGuide.py`.
- Open a terminal and run `ipconfig`. Note the `IPv4 Address` and paste it into `pythonServerIP` (line 13) of `Speaker.ino`. To check whether the Python server is working, run the code and open `YOUR_IPv4_ADDRESS:5000` in a browser. If it does not work, try a different IP address.
- Enter your WiFi name and password in the code for both the ESP32 and the ESP32-CAM module before uploading.
- While uploading the code to the ESP32-CAM module, select the board `ESP32 Dev Board` and, under Tools > Partition Scheme, choose `Default 4 MB with spiffs (1.2 MB APP / 1.5 MB SPIFFS)`. If this does not work, use `Huge APP (3 MB No OTA / 1 MB SPIFFS)`. If this also does not work, search online for a solution and try again.
- After uploading the code to the respective boards, open the serial monitor and note the IP address each board is running on.
- Paste the IP address of the ESP32-CAM module into `esp32_ip` (line 32) and the IP address of the ESP32 into `esp32_audio_ip` (line 36) of `EchoGuide.py`.
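If `ipconfig` lists several adapters and it is unclear which address to use, a short Python check can report the address of the interface your machine uses for outbound traffic. This is a sketch under assumptions: `local_ipv4` is a hypothetical helper, and the probe address `8.8.8.8` is an arbitrary public IP chosen only to select a route.

```python
import socket


def local_ipv4(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the IPv4 address of the interface used to reach probe_host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only picks a route,
        # which fixes the local address reported by getsockname().
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, no network connection).
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print("Candidate address for the Python server:", local_ipv4())
```

If the printed address is `127.0.0.1`, the machine has no active network route, and the boards will not be able to reach the server.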
- After completing all the steps above, connect the ESP32 and the ESP32-CAM module to a power source or to the laptop (or PC).
- Run the Python file. It should start the Flask server.
- It will automatically download `yolov8n.pt`.
- The system detects only a limited set of everyday items. You can add more by consulting the YOLO documentation or other online resources.
- Visit `http://<YOUR_IP_ADDRESS>:5000` to see the basic website. Clicking `View Live Stream` there shows the live object detection stream in your browser. This page is for viewing only, so close the tab afterwards to let the application run smoothly.
- Say, for example, `Find <Object>` and it will find that object for you.
- If a face is in view and you say `Capture <Name>`, it will save that face under the given name in a JSON file.
- If you say `Detect <Name>`, it will find that face.
- The system reports the object name, the distance (in meters), and the position (left, right, or center).
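The three voice commands above could be recognized from a speech-to-text transcript along these lines. This is an illustrative sketch, not the parsing used in `EchoGuide.py`: the regular expression and the name `parse_command` are assumptions, and only the verbs `Find`, `Capture`, and `Detect` come from the list above.

```python
import re
from typing import Optional, Tuple

# Verbs taken from the documented commands; the pattern itself is illustrative.
# Trailing punctuation that a transcript may contain is ignored.
COMMAND_PATTERN = re.compile(
    r"^\s*(find|capture|detect)\s+(.+?)\s*[.!?]*\s*$",
    re.IGNORECASE,
)


def parse_command(transcript: str) -> Optional[Tuple[str, str]]:
    """Return (action, target) for a recognized command, or None otherwise."""
    match = COMMAND_PATTERN.match(transcript)
    if match is None:
        return None
    action, target = match.groups()
    return action.lower(), target


if __name__ == "__main__":
    for phrase in ("Find bottle.", "Capture Alice", "what time is it"):
        print(repr(phrase), "->", parse_command(phrase))
```

Returning `None` for anything that is not a known command lets the caller ignore background speech instead of triggering a detection.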
- By default, the application uses your system's microphone. You can also use a separate microphone (or microphone module), but this requires some code changes, possibly including a separate function for audio capture and speech-to-text.
- This code uses PWM hardware (PAM8403). An I2S amplifier (e.g., MAX98357) and microphone (or microphone module) are recommended instead because they are more compatible with ESP boards, but switching requires some code changes.
- Two boards are used here because the ESP32-CAM module is not very powerful. With a more powerful (and more expensive) board, a single board should suffice after some code changes.
- The ESP32-CAM module can be laggy and slow, and may even crash at times because it is not very powerful. In that case, disconnect the power supply and try again.
- You can also add capacitors and resistors to the circuit to reduce noise.







