# Workplace Activity Detection Using Computer Vision (YOLO)

Real-time workplace activity monitoring system using dual YOLOv8 models to detect and track employee activities including smoking, eating, sleeping, and phone usage, with automatic time tracking and alerts.
## Table of Contents

- Overview
- Features
- System Architecture
- Installation
- Usage
- Configuration
- Model Information
- How It Works
- Screenshots
- Performance
- Future Enhancements
- Contributing
- License
## Overview

This project implements an intelligent workplace activity monitoring system that uses computer vision and deep learning to automatically detect and track various employee activities in real-time. The system leverages two YOLOv8 models working in tandem:
- Custom YOLOv8 Model: Trained specifically for detecting smoking, eating, sleeping, and phone usage
- COCO Pre-trained YOLOv8 Model: Used to detect objects associated with activities (food items, phones, furniture, etc.)
The system provides real-time tracking with automatic time logging, configurable alerts, and a dual-screen display showing both the full view and zoomed detection regions.
## Features

- Dual Model Architecture: Combines custom and COCO models for improved accuracy
- Multi-Activity Tracking: Simultaneously monitors smoking, eating, sleeping, and phone usage
- Person Tracking: Uses IoU (Intersection over Union) based tracking to maintain consistent person IDs across frames
- Automatic Time Logging: Tracks total time spent on each activity per person
- Configurable Time Limits: Set custom time limits for specific activities
- Visual Warnings: Color-coded alerts when approaching or exceeding time limits
- Session Tracking: Separate tracking for current session and total accumulated time
- Activity Start Notifications: Instant alerts when a new activity is detected
- Time Limit Warnings: Automatic alerts when time limits are exceeded
- Sound Notifications: Optional audio alerts for critical events
- Console Logging: Detailed activity logs with timestamps
- Dual Screen Display:
  - Full camera view with all detections
  - Zoomed-in view of detection regions (2.5x magnification)
- Color-Coded Activities:
  - 🔴 Red: Smoking
  - 🟢 Green: Eating
  - 🟣 Purple: Sleeping
  - 🟠 Orange: Phone usage
- Real-Time Statistics: Live activity logs and time counters overlay
- Model Source Indicators: Visual indicators showing which model detected each activity
- Multi-threaded Architecture: Separate threads for detection and display
- Frame Queue Management: Efficient frame processing with queue-based system
- Configurable Confidence Thresholds: Adjustable per-activity confidence levels
- CPU/GPU Support: Configurable CUDA usage
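The multi-threaded, queue-based design above can be sketched as follows. This is an illustrative sketch, not the exact code in `main.py`: the function names, queue sizes, and timeouts are assumptions. A small bounded queue lets the capture side drop stale frames instead of backing up behind a slow detector.

```python
import queue
import threading

# Bounded queues: small buffers keep latency low by dropping stale frames.
frame_queue = queue.Queue(maxsize=2)
result_queue = queue.Queue(maxsize=2)

def capture_loop(get_frame, stop_event):
    """Read frames and enqueue them, discarding the oldest when full."""
    while not stop_event.is_set():
        frame = get_frame()
        if frame is None:
            break
        if frame_queue.full():
            try:
                frame_queue.get_nowait()  # drop the stale frame
            except queue.Empty:
                pass
        frame_queue.put(frame)

def detection_loop(detect, stop_event):
    """Run the (expensive) detection function on each queued frame."""
    while not stop_event.is_set():
        try:
            frame = frame_queue.get(timeout=0.5)
        except queue.Empty:
            continue
        result_queue.put(detect(frame))
```

In the real application, `get_frame` would wrap `cv2.VideoCapture.read()` and `detect` would run both YOLO models; the display loop drains `result_queue` so rendering never blocks detection.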
## System Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                    Camera Input (Webcam)                    │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     Frame Queue (Buffer)                    │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│           Detection Thread (Parallel Processing)            │
│  ┌────────────────────────┐     ┌────────────────────────┐  │
│  │ Custom YOLOv8 Model    │     │  COCO YOLOv8 Model     │  │
│  │ (Smoking, Eating,      │     │ (Objects: Phone, Food, │  │
│  │  Sleeping, Phone)      │     │  Bed, Couch, etc.)     │  │
│  └───────────┬────────────┘     └───────────┬────────────┘  │
│              └───────────────┬──────────────┘               │
│                              ▼                              │
│                 ┌─────────────────────────┐                 │
│                 │   Detection Fusion &    │                 │
│                 │    Activity Mapping     │                 │
│                 └────────────┬────────────┘                 │
└──────────────────────────────┼──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                   Person Tracking System                    │
│   • IoU-based tracking                                      │
│   • Unique ID assignment                                    │
│   • Activity transition detection                           │
│   • Time accumulation                                       │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                    Alert & Logging System                   │
│   • Time limit checking                                     │
│   • Sound notifications                                     │
│   • Console logging                                         │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        Result Queue                         │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                    Visualization Engine                     │
│   • Full view display                                       │
│   • Zoomed detection view                                   │
│   • Activity statistics overlay                             │
│   • Color-coded bounding boxes                              │
└─────────────────────────────────────────────────────────────┘
```
## Installation

Prerequisites:

- Python 3.8 or higher
- Webcam or camera device
- GPU (optional, for better performance)
Clone the repository and set up a virtual environment:

```bash
git clone https://github.com/alihassanml/Workplace-Activity-Detection-Using-Computer-Vision-Yolo.git
cd Workplace-Activity-Detection-Using-Computer-Vision-Yolo

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/Mac:
source venv/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

`requirements.txt`:
```
ultralytics>=8.0.0
opencv-python>=4.8.0
numpy>=1.24.0
playsound>=1.3.0
torch>=2.0.0
```

Model setup:

- Custom YOLOv8 Model: place your trained model at `models/yolov8s.pt`
- COCO YOLOv8 Model: place the pre-trained model at `models/coco_yolov8n.pt`
Create the required directories and download the COCO model:

```bash
# Create models directory
mkdir models

# Download COCO model (if needed)
wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt -O models/coco_yolov8n.pt

# Create the sound directory and place your alert sound file at sound/drop.mp3
mkdir sound
```

## Usage

Run the application:

```bash
python main.py
```

To use a different camera:

```python
# In main.py, modify:
cap = cv2.VideoCapture(0)  # Change 0 to 1, 2, etc.
```

Controls:

- ESC: Quit the application
- The application will display final statistics on exit
Example console output:

```
=== Dual Model Activity Tracker Started ===
Green boxes = Custom model detections
Blue boxes  = COCO model detections (mapped to activities)
Press ESC to quit

⚠️ Person ID 1 started: PHONE (detected as 'cell phone' by coco model)
⚠️ Person ID 2 started: EATING (detected as 'eating' by custom model)
🚨 WARNING: Person ID 1 has been using PHONE for more than 15 seconds!
🔊 Playing time limit sound for phone

==================================================
Final Activity Statistics:
==================================================
Person ID 1:
  Phone: 00:00:23 (Limit: 15s) ⚠️ EXCEEDED
Person ID 2:
  Eating: 00:01:45
```
## Configuration

Edit the `TIME_LIMITS` dictionary in `main.py`:

```python
TIME_LIMITS = {
    'phone': 15,      # 15 seconds
    'smoking': 30,    # 30 seconds
    'eating': None,   # No limit
    'sleeping': 300,  # 5 minutes
}
```

Adjust detection sensitivity:
```python
CONFIDENCE_THRESHOLDS = {
    'smoking': 0.50,   # 50% confidence
    'eating': 0.25,    # 25% confidence
    'sleeping': 0.25,
    'phone': 0.25,
}
```

Configure when to play sounds:

```python
PLAY_SOUND_ON_START = ['smoking']                # Play sound when smoking starts
PLAY_SOUND_ON_TIME_LIMIT = ['phone', 'smoking']  # Play when a limit is exceeded
```

Customize visualization colors (BGR format):
```python
ACTIVITY_COLORS = {
    'smoking': (0, 0, 255),     # Red
    'eating': (0, 255, 0),      # Green
    'sleeping': (255, 0, 255),  # Purple
    'phone': (0, 165, 255),     # Orange
}
```

Adjust the detection zoom level:

```python
zoom_factor = 2.5  # Change to 3.0, 4.0 for more zoom
```

Select CPU or GPU:

```python
# Disable CUDA (use CPU only)
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Enable GPU (comment out the line above or set to a GPU ID)
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```

## Model Information

Custom model classes:

- 0: Smoking
- 1: Eating
- 2: Sleeping
- 3: Phone
Training Details:
- Base Model: YOLOv8s
- Custom dataset with labeled workplace activities
- Optimized for real-time detection
Mapped Classes:
| COCO Class | Mapped Activity | Class ID |
|---|---|---|
| Cell Phone | Phone | 67 |
| Bottle, Cup, Fork, etc. | Eating | 39-55 |
| Bed | Sleeping | 59 |
| Couch | Sleeping | 57 |
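The table above amounts to a simple lookup from COCO class IDs to activities. The sketch below illustrates it; the names (`COCO_TO_ACTIVITY`, `map_coco_detection`) are assumptions for illustration and may differ from the actual dictionary in `main.py`.

```python
# COCO class IDs mapped to activities, following the table above.
COCO_TO_ACTIVITY = {67: 'phone', 59: 'sleeping', 57: 'sleeping'}
# IDs 39-55 cover bottle, cup, fork, and other food-related objects.
COCO_TO_ACTIVITY.update({i: 'eating' for i in range(39, 56)})

def map_coco_detection(class_id):
    """Return the activity mapped to a COCO class, or None if unmapped."""
    return COCO_TO_ACTIVITY.get(class_id)
```

For example, a `cell phone` detection (class 67) is treated as evidence of phone usage, while unmapped classes are ignored.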
## How It Works

The system runs two YOLO models simultaneously:

- The custom model detects activities directly
- The COCO model detects associated objects
- Results are merged for improved accuracy
For each detection:
1. Calculate IoU with existing tracked persons
2. If IoU > 0.3: Match to existing person
3. Else: Create new person with unique ID
4. Update activity and timing information
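The matching steps above can be sketched as follows. This is a minimal illustration of IoU-based matching; the function names and data structures are assumptions, not the exact code in `main.py`.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def match_person(detection_box, tracked, next_id, threshold=0.3):
    """Match a detection to the best-overlapping tracked person,
    or assign a new ID when no IoU exceeds the threshold."""
    best_id, best_iou = None, threshold
    for person_id, box in tracked.items():
        score = iou(detection_box, box)
        if score > best_iou:
            best_id, best_iou = person_id, score
    if best_id is None:
        best_id = next_id  # no match above threshold: new person
    tracked[best_id] = detection_box
    return best_id
```

Taking the best match (rather than the first box over the threshold) keeps IDs stable when two people stand close together.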
Time tracking, for each person:
- Track start_time when activity begins
- Accumulate total_time for each activity
- Calculate current_session_time
- Check against time limits
- Trigger alerts if exceeded
If person changes activity:
1. Stop timer for old activity
2. Add elapsed time to old activity's total
3. Start timer for new activity
4. Reset alert flags
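The time-tracking and transition steps above can be sketched as a small per-person timer. The class and attribute names here are illustrative assumptions; the real structures in `main.py` may differ.

```python
import time

class ActivityTimer:
    """Per-person activity timing: accumulates totals and tracks
    the current session, resetting alert flags on transitions."""

    def __init__(self):
        self.current = None      # activity currently in progress
        self.start_time = None   # when the current activity began
        self.totals = {}         # accumulated seconds per activity
        self.alerted = set()     # activities already alerted this session

    def update(self, activity, now=None):
        now = time.time() if now is None else now
        if activity != self.current:
            # Stop the old timer and fold elapsed time into its total.
            if self.current is not None:
                elapsed = now - self.start_time
                self.totals[self.current] = self.totals.get(self.current, 0.0) + elapsed
            # Start the timer for the new activity and reset its alert flag.
            self.current = activity
            self.start_time = now
            self.alerted.discard(activity)

    def session_time(self, now=None):
        now = time.time() if now is None else now
        return now - self.start_time if self.current else 0.0
```

A limit check then compares `session_time()` against `TIME_LIMITS[activity]` and fires an alert once per session, recording it in `alerted`.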
## Screenshots

```
┌────────────────────────────────┬────────────────────────────────┐
│           FULL VIEW            │        ZOOMED DETECTION        │
│                                │                                │
│  Person 1 Activity Log:        │   [Enlarged detection area]    │
│  - Phone: 00:00:12 [Limit:15s] │                                │
│  - Eating: 00:01:30            │   ID: 1 🟢                     │
│                                │   PHONE                        │
│  [Full camera view with boxes] │   Time: 00:00:12               │
│                                │   ⏰ 3s left                   │
└────────────────────────────────┴────────────────────────────────┘
```
## Performance

- FPS: 15-30 (depending on hardware)
- Latency: < 100ms per frame
- CPU Usage: 40-60% (quad-core)
- GPU Usage: 30-50% (if enabled)
- Custom Model: ~85% mAP on test set
- COCO Model: ~45% mAP (official YOLOv8n)
- Tracking Accuracy: ~90% ID consistency
Hardware requirements:

| Component | Minimum | Recommended |
|---|---|---|
| CPU | Intel i5 / AMD Ryzen 5 | Intel i7 / AMD Ryzen 7 |
| RAM | 8 GB | 16 GB |
| GPU | None (CPU only) | NVIDIA GTX 1060+ |
| Storage | 2 GB | 5 GB |
| Camera | 720p @ 15fps | 1080p @ 30fps |
## Future Enhancements

- Multi-camera Support: Monitor multiple areas simultaneously
- Database Integration: Store activity logs in SQL/NoSQL database
- Web Dashboard: Real-time monitoring through web interface
- Advanced Analytics: Generate daily/weekly/monthly reports
- Cloud Integration: Upload data to cloud storage
- Mobile App: Remote monitoring via smartphone
- Face Recognition: Identify specific employees
- Action Recognition: Detect more complex activities
- Anomaly Detection: Identify unusual behavior patterns
- API Endpoints: RESTful API for integration with other systems
## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create your feature branch:
   ```bash
   git checkout -b feature/AmazingFeature
   ```
3. Commit your changes:
   ```bash
   git commit -m 'Add some AmazingFeature'
   ```
4. Push to the branch:
   ```bash
   git push origin feature/AmazingFeature
   ```
5. Open a Pull Request
Development guidelines:

- Follow the PEP 8 style guide for Python code
- Add comments for complex logic
- Update README if adding new features
- Test thoroughly before submitting PR
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Author

Ali Hassan
- GitHub: @alihassanml
- LinkedIn: Connect with me
- Email: your.email@example.com
## Acknowledgments

- Ultralytics YOLOv8 for the amazing YOLO implementation
- OpenCV for computer vision tools
- COCO dataset for pre-trained models
- The open-source community for inspiration and support
## Support

If you encounter any issues or have questions:
- Check existing issues: GitHub Issues
- Create new issue: Provide detailed description with error logs
- Discussions: Join our GitHub Discussions
If you find this project useful, please consider giving it a star! ⭐
Made with ❤️ by Ali Hassan