# Workplace Activity Detection Using Computer Vision (YOLO)

Real-time workplace activity monitoring system using dual YOLOv8 models to detect and track employee activities including smoking, eating, sleeping, and phone usage, with automatic time tracking and alerts.
## Table of Contents

- Overview
- Features
- System Architecture
- Installation
- Usage
- Configuration
- Model Information
- How It Works
- Screenshots
- Performance
- Future Enhancements
- Contributing
- License
## Overview

This project implements an intelligent workplace activity monitoring system that uses computer vision and deep learning to automatically detect and track various employee activities in real-time. The system leverages two YOLOv8 models working in tandem:
- Custom YOLOv8 Model: Trained specifically for detecting smoking, eating, sleeping, and phone usage
- COCO Pre-trained YOLOv8 Model: Used to detect objects associated with activities (food items, phones, furniture, etc.)
The system provides real-time tracking with automatic time logging, configurable alerts, and a dual-screen display showing both the full view and zoomed detection regions.
## Features

- Dual Model Architecture: Combines custom and COCO models for improved accuracy
- Multi-Activity Tracking: Simultaneously monitors smoking, eating, sleeping, and phone usage
- Person Tracking: Uses IoU (Intersection over Union) based tracking to maintain consistent person IDs across frames
- Automatic Time Logging: Tracks total time spent on each activity per person
- Configurable Time Limits: Set custom time limits for specific activities
- Visual Warnings: Color-coded alerts when approaching or exceeding time limits
- Session Tracking: Separate tracking for current session and total accumulated time
- Activity Start Notifications: Instant alerts when a new activity is detected
- Time Limit Warnings: Automatic alerts when time limits are exceeded
- Sound Notifications: Optional audio alerts for critical events
- Console Logging: Detailed activity logs with timestamps
- Dual Screen Display:
  - Full camera view with all detections
  - Zoomed-in view of detection regions (2.5x magnification)
- Color-Coded Activities:
  - 🔴 Red: Smoking
  - 🟢 Green: Eating
  - 🟣 Purple: Sleeping
  - 🟠 Orange: Phone usage
- Real-Time Statistics: Live activity logs and time counters overlay
- Model Source Indicators: Visual indicators showing which model detected each activity
- Multi-threaded Architecture: Separate threads for detection and display
- Frame Queue Management: Efficient frame processing with queue-based system
- Configurable Confidence Thresholds: Adjustable per-activity confidence levels
- CPU/GPU Support: Configurable CUDA usage
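The multi-threaded, queue-based design above can be sketched as follows. This is an illustrative sketch, not the exact code in `main.py`: the function names, queue sizes, and timeouts are assumptions. A small bounded queue lets the capture side drop stale frames instead of backing up behind a slow detector.

```python
import queue
import threading

# Bounded queues: small buffers keep latency low by dropping stale frames.
frame_queue = queue.Queue(maxsize=2)
result_queue = queue.Queue(maxsize=2)

def capture_loop(get_frame, stop_event):
    """Read frames and enqueue them, discarding the oldest when full."""
    while not stop_event.is_set():
        frame = get_frame()
        if frame is None:
            break
        if frame_queue.full():
            try:
                frame_queue.get_nowait()  # drop the stale frame
            except queue.Empty:
                pass
        frame_queue.put(frame)

def detection_loop(detect, stop_event):
    """Run the (expensive) detection function on each queued frame."""
    while not stop_event.is_set():
        try:
            frame = frame_queue.get(timeout=0.5)
        except queue.Empty:
            continue
        result_queue.put(detect(frame))
```

In the real application, `get_frame` would wrap `cv2.VideoCapture.read()` and `detect` would run both YOLO models; the display loop drains `result_queue` so rendering never blocks detection.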
## System Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                    Camera Input (Webcam)                    │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     Frame Queue (Buffer)                    │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│           Detection Thread (Parallel Processing)            │
│  ┌────────────────────────┐     ┌────────────────────────┐  │
│  │ Custom YOLOv8 Model    │     │  COCO YOLOv8 Model     │  │
│  │ (Smoking, Eating,      │     │ (Objects: Phone, Food, │  │
│  │  Sleeping, Phone)      │     │  Bed, Couch, etc.)     │  │
│  └───────────┬────────────┘     └───────────┬────────────┘  │
│              └───────────────┬──────────────┘               │
│                              ▼                              │
│                 ┌─────────────────────────┐                 │
│                 │   Detection Fusion &    │                 │
│                 │    Activity Mapping     │                 │
│                 └────────────┬────────────┘                 │
└──────────────────────────────┼──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                   Person Tracking System                    │
│   • IoU-based tracking                                      │
│   • Unique ID assignment                                    │
│   • Activity transition detection                           │
│   • Time accumulation                                       │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                    Alert & Logging System                   │
│   • Time limit checking                                     │
│   • Sound notifications                                     │
│   • Console logging                                         │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        Result Queue                         │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                    Visualization Engine                     │
│   • Full view display                                       │
│   • Zoomed detection view                                   │
│   • Activity statistics overlay                             │
│   • Color-coded bounding boxes                              │
└─────────────────────────────────────────────────────────────┘
```
## Installation

Prerequisites:

- Python 3.8 or higher
- Webcam or camera device
- GPU (optional, for better performance)
Clone the repository and set up a virtual environment:

```bash
git clone https://github.com/alihassanml/Workplace-Activity-Detection-Using-Computer-Vision-Yolo.git
cd Workplace-Activity-Detection-Using-Computer-Vision-Yolo

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/Mac:
source venv/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

`requirements.txt`:
```
ultralytics>=8.0.0
opencv-python>=4.8.0
numpy>=1.24.0
playsound>=1.3.0
torch>=2.0.0
```

Model setup:

- Custom YOLOv8 Model: place your trained model at `models/yolov8s.pt`
- COCO YOLOv8 Model: place the pre-trained model at `models/coco_yolov8n.pt`
Create the required directories and download the COCO model:

```bash
# Create models directory
mkdir models

# Download COCO model (if needed)
wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt -O models/coco_yolov8n.pt

# Create the sound directory and place your alert sound file at sound/drop.mp3
mkdir sound
```

## Usage

Run the application:

```bash
python main.py
```

To use a different camera:

```python
# In main.py, modify:
cap = cv2.VideoCapture(0)  # Change 0 to 1, 2, etc.
```

Controls:

- ESC: Quit the application
- The application will display final statistics on exit
Example console output:

```
=== Dual Model Activity Tracker Started ===
Green boxes = Custom model detections
Blue boxes  = COCO model detections (mapped to activities)
Press ESC to quit

⚠️ Person ID 1 started: PHONE (detected as 'cell phone' by coco model)
⚠️ Person ID 2 started: EATING (detected as 'eating' by custom model)
🚨 WARNING: Person ID 1 has been using PHONE for more than 15 seconds!
🔊 Playing time limit sound for phone

==================================================
Final Activity Statistics:
==================================================
Person ID 1:
  Phone: 00:00:23 (Limit: 15s) ⚠️ EXCEEDED
Person ID 2:
  Eating: 00:01:45
```
## Configuration

Edit the `TIME_LIMITS` dictionary in `main.py`:

```python
TIME_LIMITS = {
    'phone': 15,      # 15 seconds
    'smoking': 30,    # 30 seconds
    'eating': None,   # No limit
    'sleeping': 300,  # 5 minutes
}
```

Adjust detection sensitivity:
```python
CONFIDENCE_THRESHOLDS = {
    'smoking': 0.50,   # 50% confidence
    'eating': 0.25,    # 25% confidence
    'sleeping': 0.25,
    'phone': 0.25,
}
```

Configure when to play sounds:

```python
PLAY_SOUND_ON_START = ['smoking']                # Play sound when smoking starts
PLAY_SOUND_ON_TIME_LIMIT = ['phone', 'smoking']  # Play when a limit is exceeded
```

Customize visualization colors (BGR format):
```python
ACTIVITY_COLORS = {
    'smoking': (0, 0, 255),     # Red
    'eating': (0, 255, 0),      # Green
    'sleeping': (255, 0, 255),  # Purple
    'phone': (0, 165, 255),     # Orange
}
```

Adjust the detection zoom level:

```python
zoom_factor = 2.5  # Change to 3.0, 4.0 for more zoom
```

Select CPU or GPU:

```python
# Disable CUDA (use CPU only)
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Enable GPU (comment out the line above or set to a GPU ID)
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```

## Model Information

Custom model classes:

- 0: Smoking
- 1: Eating
- 2: Sleeping
- 3: Phone
Training Details:
- Base Model: YOLOv8s
- Custom dataset with labeled workplace activities
- Optimized for real-time detection
Mapped Classes:
| COCO Class | Mapped Activity | Class ID |
|---|---|---|
| Cell Phone | Phone | 67 |
| Bottle, Cup, Fork, etc. | Eating | 39-55 |
| Bed | Sleeping | 59 |
| Couch | Sleeping | 57 |
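The table above amounts to a simple lookup from COCO class IDs to activities. The sketch below illustrates it; the names (`COCO_TO_ACTIVITY`, `map_coco_detection`) are assumptions for illustration and may differ from the actual dictionary in `main.py`.

```python
# COCO class IDs mapped to activities, following the table above.
COCO_TO_ACTIVITY = {67: 'phone', 59: 'sleeping', 57: 'sleeping'}
# IDs 39-55 cover bottle, cup, fork, and other food-related objects.
COCO_TO_ACTIVITY.update({i: 'eating' for i in range(39, 56)})

def map_coco_detection(class_id):
    """Return the activity mapped to a COCO class, or None if unmapped."""
    return COCO_TO_ACTIVITY.get(class_id)
```

For example, a `cell phone` detection (class 67) is treated as evidence of phone usage, while unmapped classes are ignored.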
## How It Works

The system runs two YOLO models simultaneously:

- The custom model detects activities directly
- The COCO model detects associated objects
- Results are merged for improved accuracy
For each detection:
1. Calculate IoU with existing tracked persons
2. If IoU > 0.3: Match to existing person
3. Else: Create new person with unique ID
4. Update activity and timing information
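The matching steps above can be sketched as follows. This is a minimal illustration of IoU-based matching; the function names and data structures are assumptions, not the exact code in `main.py`.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def match_person(detection_box, tracked, next_id, threshold=0.3):
    """Match a detection to the best-overlapping tracked person,
    or assign a new ID when no IoU exceeds the threshold."""
    best_id, best_iou = None, threshold
    for person_id, box in tracked.items():
        score = iou(detection_box, box)
        if score > best_iou:
            best_id, best_iou = person_id, score
    if best_id is None:
        best_id = next_id  # no match above threshold: new person
    tracked[best_id] = detection_box
    return best_id
```

Taking the best match (rather than the first box over the threshold) keeps IDs stable when two people stand close together.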
Time tracking, for each person:
- Track start_time when activity begins
- Accumulate total_time for each activity
- Calculate current_session_time
- Check against time limits
- Trigger alerts if exceeded
If person changes activity:
1. Stop timer for old activity
2. Add elapsed time to old activity's total
3. Start timer for new activity
4. Reset alert flags
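The time-tracking and transition steps above can be sketched as a small per-person timer. The class and attribute names here are illustrative assumptions; the real structures in `main.py` may differ.

```python
import time

class ActivityTimer:
    """Per-person activity timing: accumulates totals and tracks
    the current session, resetting alert flags on transitions."""

    def __init__(self):
        self.current = None      # activity currently in progress
        self.start_time = None   # when the current activity began
        self.totals = {}         # accumulated seconds per activity
        self.alerted = set()     # activities already alerted this session

    def update(self, activity, now=None):
        now = time.time() if now is None else now
        if activity != self.current:
            # Stop the old timer and fold elapsed time into its total.
            if self.current is not None:
                elapsed = now - self.start_time
                self.totals[self.current] = self.totals.get(self.current, 0.0) + elapsed
            # Start the timer for the new activity and reset its alert flag.
            self.current = activity
            self.start_time = now
            self.alerted.discard(activity)

    def session_time(self, now=None):
        now = time.time() if now is None else now
        return now - self.start_time if self.current else 0.0
```

A limit check then compares `session_time()` against `TIME_LIMITS[activity]` and fires an alert once per session, recording it in `alerted`.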
## Screenshots

```
┌────────────────────────────────┬────────────────────────────────┐
│           FULL VIEW            │        ZOOMED DETECTION        │
│                                │                                │
│  Person 1 Activity Log:        │   [Enlarged detection area]    │
│  - Phone: 00:00:12 [Limit:15s] │                                │
│  - Eating: 00:01:30            │   ID: 1 🟢                     │
│                                │   PHONE                        │
│  [Full camera view with boxes] │   Time: 00:00:12               │
│                                │   ⏰ 3s left                   │
└────────────────────────────────┴────────────────────────────────┘
```
## Performance

- FPS: 15-30 (depending on hardware)
- Latency: < 100ms per frame
- CPU Usage: 40-60% (quad-core)
- GPU Usage: 30-50% (if enabled)
- Custom Model: ~85% mAP on test set
- COCO Model: ~45% mAP (official YOLOv8n)
- Tracking Accuracy: ~90% ID consistency
Hardware requirements:

| Component | Minimum | Recommended |
|---|---|---|
| CPU | Intel i5 / AMD Ryzen 5 | Intel i7 / AMD Ryzen 7 |
| RAM | 8 GB | 16 GB |
| GPU | None (CPU only) | NVIDIA GTX 1060+ |
| Storage | 2 GB | 5 GB |
| Camera | 720p @ 15fps | 1080p @ 30fps |
## Future Enhancements

- Multi-camera Support: Monitor multiple areas simultaneously
- Database Integration: Store activity logs in SQL/NoSQL database
- Web Dashboard: Real-time monitoring through web interface
- Advanced Analytics: Generate daily/weekly/monthly reports
- Cloud Integration: Upload data to cloud storage
- Mobile App: Remote monitoring via smartphone
- Face Recognition: Identify specific employees
- Action Recognition: Detect more complex activities
- Anomaly Detection: Identify unusual behavior patterns
- API Endpoints: RESTful API for integration with other systems
## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create your feature branch:
   ```bash
   git checkout -b feature/AmazingFeature
   ```
3. Commit your changes:
   ```bash
   git commit -m 'Add some AmazingFeature'
   ```
4. Push to the branch:
   ```bash
   git push origin feature/AmazingFeature
   ```
5. Open a Pull Request
Development guidelines:

- Follow the PEP 8 style guide for Python code
- Add comments for complex logic
- Update README if adding new features
- Test thoroughly before submitting PR
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Author

Ali Hassan
- GitHub: @alihassanml
- LinkedIn: Connect with me
- Email: your.email@example.com
## Acknowledgments

- Ultralytics YOLOv8 for the amazing YOLO implementation
- OpenCV for computer vision tools
- COCO dataset for pre-trained models
- The open-source community for inspiration and support
## Support

If you encounter any issues or have questions:
- Check existing issues: GitHub Issues
- Create new issue: Provide detailed description with error logs
- Discussions: Join our GitHub Discussions
If you find this project useful, please consider giving it a star! ⭐
Made with ❤️ by Ali Hassan