This repository includes a step-by-step guide for building and understanding the system. The diagram below shows how its core components fit together:
```mermaid
flowchart TD
    A0["MediaPipe Library"]
    A1["Keypoint Extraction"]
    A2["Sequence Handling"]
    A3["LSTM Model"]
    A4["Real-Time Inference Loop"]
    A5["Label Mapping and Encoding"]
    A6["Data Splitting"]
    A0 -- "Provides Landmarks" --> A1
    A1 -- "Feeds Frame Data" --> A2
    A2 -- "Provides Sequence Data" --> A6
    A5 -- "Provides Labels" --> A6
    A6 -- "Supplies Training Data" --> A3
    A3 -- "Provides Predictions" --> A4
    A4 -- "Uses for Processing" --> A0
    A4 -- "Maps Predictions" --> A5
```
- Real-time detection using webcam feed
- MediaPipe for precise hand landmark tracking
- LSTM-based classification on sequential landmark data (see the model sketch after this list)
- Supports typical sign language gestures like “hello,” “thank you,” etc.
- Easy to extend: train new gestures by adding labeled sequences
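A minimal sketch of what the LSTM classifier could look like in Keras is shown below. The 30-frame sequence length, the 126 features per frame (matching the extraction sketch above), and the specific layer sizes are illustrative assumptions rather than the project's exact configuration.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

SEQUENCE_LENGTH = 30   # frames per gesture clip (assumed)
NUM_FEATURES = 126     # keypoints per frame (assumed, matches the sketch above)
NUM_CLASSES = 3        # hello, thanks, iloveyou

# Stacked LSTM layers read the landmark sequence; dense layers classify it
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(SEQUENCE_LENGTH, NUM_FEATURES)),
    LSTM(128, return_sequences=True, activation='relu'),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])
model.summary()
```

Because the model only sees fixed-length sequences of keypoint vectors, adding a new gesture is mostly a matter of recording more labeled clips and retraining.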
- hello → ✅
- thanks → ✅
- iloveyou → ✅
- Additional gestures: you, yes, no, please, etc.
Accuracy on test set: ~98%
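To connect the gesture names above to training targets, the label mapping, one-hot encoding, and data splitting steps can be sketched as follows. The 30×126 sequence shape and the 5% test split are assumptions for illustration, and placeholder arrays stand in for the recorded clips.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

actions = np.array(['hello', 'thanks', 'iloveyou'])          # gesture labels
label_map = {action: idx for idx, action in enumerate(actions)}

# sequences: one (SEQUENCE_LENGTH, NUM_FEATURES) array per recorded clip,
# labels: the matching action name for each clip (placeholder data here)
sequences = [np.zeros((30, 126)) for _ in range(90)]
labels = [actions[i % len(actions)] for i in range(90)]

X = np.array(sequences)
y = to_categorical([label_map[name] for name in labels]).astype(int)

# Hold out a small test set, which is where an accuracy figure like the one
# above would be measured
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05)
print(X_train.shape, y_train.shape)
```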
- Chapter 1: MediaPipe Library
- Chapter 2: Keypoint Extraction
- Chapter 3: Sequence Handling
- Chapter 4: Label Mapping, Encoding, and Splitting
- Chapter 5: LSTM Model
This project was inspired and guided by the excellent tutorial on YouTube by Nicholas Renotte. Huge thanks to him for breaking down complex concepts and making machine learning accessible for real-time gesture recognition. 🎥🧠
If you're looking to dive deeper or see the original video that shaped this project, you can find it here: Real-Time Sign Language Detection with Python & LSTM