This project is an intelligent presentation controller built using Python, MediaPipe, Scikit-learn, and SpeechRecognition. It allows users to control slide presentations using either hand gestures or voice commands, offering a touch-free and smooth experience during talks or lectures.
- Dual Control Modes: Use either hand gestures or voice commands to navigate slides.
- Hand Gesture Control: A Random Forest classifier trained on a custom hand gesture dataset built from MediaPipe landmarks.
- Voice Command Control: Uses Google's Speech Recognition API to interpret "next" and "previous" slide commands.
- Real-time Feedback: Displays live webcam feed with hand detection, gesture prediction, and confidence levels.
- Model.ipynb: Jupyter notebook used to extract MediaPipe hand landmarks and train a gesture recognition model using Random Forest.
- gesture_model.pkl: Saved trained model (exported via Joblib).
- presentation_control.py: Main script to run either hand gesture or voice command mode for controlling slides.
- Detects hand landmarks using MediaPipe.
- Extracts (x, y) positions of 21 landmarks and classifies the gesture using a Random Forest model.
- If the "next" gesture is detected with high confidence for a sustained duration, it sends a key press to advance to the next slide.
- If the "previous" gesture is detected, it sends a key press to return to the previous slide (see the sketch below).
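A minimal sketch of this loop, assuming "next"/"previous" label names, a confidence threshold, a hold duration, and a right/left arrow-key mapping (these specifics are assumptions, not necessarily the exact values used in the project script):

```python
# Sketch of the gesture mode: MediaPipe landmarks -> Random Forest -> key press.
# Label names, threshold, hold duration, and arrow-key mapping are assumptions.
import time
import cv2
import joblib
import mediapipe as mp
import numpy as np
import pyautogui

model = joblib.load("gesture_model.pkl")
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)

CONF_THRESHOLD = 0.8   # assumed minimum prediction confidence
HOLD_SECONDS = 1.0     # assumed time a gesture must persist before firing
held_label, held_since = None, None

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0]
        # Flatten (x, y) of the 21 landmarks into a 42-value feature vector
        features = np.array([[c for p in lm.landmark for c in (p.x, p.y)]])
        probs = model.predict_proba(features)[0]
        label = model.classes_[np.argmax(probs)]
        conf = probs.max()
        if conf >= CONF_THRESHOLD:
            if label != held_label:
                held_label, held_since = label, time.time()
            elif time.time() - held_since >= HOLD_SECONDS:
                # Assumed arrow-key mapping for slide navigation
                pyautogui.press("right" if label == "next" else "left")
                held_label, held_since = None, None
        else:
            held_label, held_since = None, None
        # Real-time feedback: show the predicted gesture and its confidence
        cv2.putText(frame, f"{label} ({conf:.2f})", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("Gesture Control", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```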
- Listens to your microphone using the SpeechRecognition library.
- Recognizes voice commands like "next" or "previous".
- Triggers the corresponding key press to control the slides (see the sketch below).
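A minimal sketch of this mode, assuming keyword matching on the transcribed text and the same arrow-key mapping as above:

```python
# Sketch of the voice-command mode using the SpeechRecognition library with the
# Google Web Speech API; keyword matching on "next"/"previous" is an assumption.
import pyautogui
import speech_recognition as sr

recognizer = sr.Recognizer()
mic = sr.Microphone()

with mic as source:
    recognizer.adjust_for_ambient_noise(source)

while True:
    with mic as source:
        print("Listening...")
        audio = recognizer.listen(source, phrase_time_limit=3)
    try:
        command = recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        continue  # could not understand the audio or the API was unreachable
    if "next" in command:
        pyautogui.press("right")   # advance to the next slide
    elif "previous" in command:
        pyautogui.press("left")    # go back to the previous slide
```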
python Final.py

Then, input either 0 (hand gesture mode) or 1 (voice command mode) when prompted.
- Python 3.x
- Webcam and/or microphone
- Python libraries:
- opencv-python
- mediapipe
- numpy
- scikit-learn
- joblib
- pyautogui
- SpeechRecognition
pip install opencv-python mediapipe numpy scikit-learn joblib pyautogui SpeechRecognition
The dataset for training gestures was created manually using MediaPipe landmarks extracted from the webcam feed. The trained model is included as gesture_model.pkl.
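If you want to collect a similar dataset, the sketch below shows one possible way to record labelled landmark rows from the webcam; the key bindings and the gestures.csv file name are assumptions, and the actual collection code lives in Model.ipynb.

```python
# Possible data-collection helper: press 'n' or 'p' to save the current frame's
# landmarks as a "next" or "previous" sample, 'q' to quit. gestures.csv is an
# assumed output file name.
import csv
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1)
cap = cv2.VideoCapture(0)

with open("gestures.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        cv2.imshow("Collect gestures", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord("q"):
            break
        if results.multi_hand_landmarks and key in (ord("n"), ord("p")):
            label = "next" if key == ord("n") else "previous"
            lm = results.multi_hand_landmarks[0]
            row = [c for p in lm.landmark for c in (p.x, p.y)] + [label]
            writer.writerow(row)  # 42 landmark features + label

cap.release()
cv2.destroyAllWindows()
```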
- Presentations in classrooms or meetings
- Touch-free navigation during webinars or live demos
- Assistive technology for hands-free interaction
- Support for more gestures (e.g., start/pause slideshow)
- Improved voice intent recognition (e.g., "go back one slide")
- GUI interface for better usability
You can use your own hand gesture dataset to personalize this system. Make sure to capture two clear hand gestures: one for "next" and one for "previous". Once your data is ready, run the Model.ipynb notebook with your dataset to train and export a new model.
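As a rough guide, the training and export step could look like the following sketch, assuming the landmark rows are stored in a gestures.csv file (42 (x, y) values plus a label per row); the actual notebook may differ in its details.

```python
# Sketch of training a Random Forest on collected landmark rows and exporting it
# with Joblib. The gestures.csv file name and CSV layout are assumptions.
import csv
import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the dataset: each row is 42 (x, y) coordinates followed by a label.
features, labels = [], []
with open("gestures.csv", newline="") as f:
    for row in csv.reader(f):
        features.append([float(v) for v in row[:-1]])
        labels.append(row[-1])

X = np.array(features)
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("Validation accuracy:", model.score(X_test, y_test))

joblib.dump(model, "gesture_model.pkl")  # loaded by the main script at runtime
```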