56 changes: 56 additions & 0 deletions docs/HLD.md
@@ -0,0 +1,56 @@
# High-Level Design (HLD) - Face Detection Attendance System

## 1. Introduction

This document outlines the high-level architecture of the Face Detection Attendance System. The system is a web-based application that registers users via face capture and marks attendance in real time using facial recognition.

## 2. System Architecture

The application is composed of five primary components:

![System Architecture Diagram](https://i.imgur.com/3aG3d2J.png)

* **Frontend (UI):**
    * **Technology:** Flask-rendered HTML templates with Bootstrap CSS.
    * **Responsibility:** Provides the user interface for registering new users and initiating the attendance process. It communicates with the backend via standard HTTP requests.

* **Backend (API Server):**
    * **Technology:** Python with Flask.
    * **Responsibility:** Exposes API endpoints to handle requests from the frontend. It orchestrates all business logic, from handling user data to managing the face recognition process.

* **Business Logic:**
    * **User Management:** Handles the logic for creating, storing, and retrieving user information.
    * **Attendance Tracking:** Manages the process of marking and recording attendance.
    * **Model Training:** Retrains the face recognition model whenever a new user is added to ensure the system can identify them.

* **ML/CV Module:**
    * **Technology:** OpenCV and Scikit-learn.
    * **Responsibility:**
        * **Face Detection:** Uses a pre-trained Haar Cascade model (`haarcascade_frontalface_default.xml`) to locate faces in a video stream.
        * **Face Recognition:** Uses a K-Nearest Neighbors (KNN) classifier trained on the images of registered users to identify faces.

* **Data Storage:**
    * **Technology:** Filesystem for face images and a proposed **SQLite database** for metadata.
    * **Responsibility:**
        * **Face Data:** Stores captured face images, organized in folders.
        * **Metadata:** Stores user information (Name, ID) and attendance logs. Migrating this to a database is a key part of the proposed improvements.
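
To make the recognition step concrete, here is a dependency-free sketch of the nearest-neighbour vote that a KNN classifier performs. The application itself uses Scikit-learn for this; the function name `knn_predict`, the toy three-value vectors, and the labels are hypothetical and only illustrate the idea. Real input would presumably be flattened face crops.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    # Pair every training vector's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    # Vote among the k nearest neighbours.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data (assumption): two registered users with two samples each.
train_vectors = [(0.0, 0.1, 0.0), (0.1, 0.0, 0.1), (0.9, 1.0, 0.9), (1.0, 0.9, 1.0)]
train_labels = ["alice_0001", "alice_0001", "bob_0002", "bob_0002"]
```

Because every registered user contributes training vectors, adding a user changes the neighbour set, which is why the design retrains the model after each registration.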

## 3. Data Flow

### 3.1. New User Registration

1. **UI:** A user submits their name on the registration form.
2. **Backend:** The backend receives the request and triggers the **User Management** logic.
3. **Business Logic:** An automatic, unique User ID is generated.
4. **ML/CV Module:** The application accesses the webcam, captures a series of face images, and processes them.
5. **Data Storage:** The images are saved to the filesystem, and the user's name and ID are stored in the database.
6. **Model Training:** The `train_model()` function is called to retrain the KNN model with the new user's face data.

### 3.2. Take Attendance

1. **UI:** A user clicks the "Take Attendance" button.
2. **Backend:** The backend receives the request and starts the attendance process.
3. **ML/CV Module:** The application accesses the webcam. For each frame, it detects faces and uses the trained KNN model to identify them.
4. **Business Logic:** If a recognized face belongs to a registered user who has not yet been marked present today, the **Attendance Tracking** logic is invoked.
5. **Data Storage:** The user's attendance (Name, ID, Timestamp) is recorded in the database.
6. **UI:** The frontend dynamically updates to show the list of present users for the day.
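
Steps 4 and 5 could be enforced at the storage layer rather than in application code. The sketch below assumes the proposed SQLite database and a hypothetical `attendance` table with a uniqueness constraint on user and day; the table layout and the function names are illustrative, not taken from the codebase.

```python
import sqlite3
from datetime import datetime


def init_attendance(conn):
    """Create a hypothetical attendance table allowing one row per user per day."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS attendance (
               user_id   INTEGER NOT NULL,
               name      TEXT NOT NULL,
               day       TEXT NOT NULL,
               marked_at TEXT NOT NULL,
               UNIQUE (user_id, day)
           )"""
    )


def mark_present(conn, user_id, name, now=None):
    """Record attendance once per user per day; return True if a row was added."""
    now = now or datetime.now()
    cursor = conn.execute(
        "INSERT OR IGNORE INTO attendance (user_id, name, day, marked_at) "
        "VALUES (?, ?, ?, ?)",
        (user_id, name, now.date().isoformat(), now.isoformat(timespec="seconds")),
    )
    conn.commit()
    # rowcount is 0 when the UNIQUE constraint caused the insert to be ignored.
    return cursor.rowcount == 1
```

With this shape, the recognition loop can call `mark_present` on every frame in which a face is identified, and repeat detections on the same day leave the log unchanged.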
77 changes: 77 additions & 0 deletions docs/PROJECT_STATUS.md
@@ -0,0 +1,77 @@
# Project Status & Roadmap

This document details the current status of the project, outlines necessary modifications and new features, and presents a target codebase structure.

---

### 1. Work Already Done

- **Web Interface:** A functional UI exists using Flask and a Bootstrap HTML template.
- **Face Capture:** The system can access a webcam and capture images for new users.
- **Model Training:** A KNN model is trained on existing user faces.
- **Face Recognition:** The system can identify registered users in a live video stream.
- **Attendance Logging:** Basic attendance is recorded in daily CSV files.

---

### 2. Modifications Required

- [ ] **Fix Critical Security Vulnerability (Path Traversal)**
    * **Problem:** User-provided input (`newusername`) is used directly to create a directory path, allowing an attacker to write files to unintended locations.
    * **Solution:** Sanitize the username before using it in any filesystem operations.

- [ ] **Prevent Duplicate Attendance Entries**
    * **Problem:** A user's attendance is marked every time their face is detected, leading to multiple entries for the same person in a single session.
    * **Solution:** Before writing to the attendance log, check if the user has already been marked present for the day.

- [ ] **Refactor Inefficient Model Training**
    * **Problem:** The entire model is retrained every time a single user is added, which is slow and will not scale.
    * **Solution:** A full refactor can wait for a later phase; an initial improvement is to retrain only when necessary.

- [ ] **Improve General Error Handling**
    * **Problem:** The application crashes if a webcam is not found or if critical files like `haarcascade_frontalface_default.xml` are missing.
    * **Solution:** Add `try-except` blocks and conditional checks to handle these scenarios gracefully and provide feedback to the user.
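
As an illustration of the path traversal fix, the sketch below strips everything except letters, digits, and hyphens from the submitted name and then confirms that the resulting folder still lies inside the faces directory. The helper names `sanitize_username` and `user_folder` are hypothetical, and the `USERNAME_0001` folder pattern is taken from the proposed structure in this document. Underscores are removed from names on the assumption that the underscore separates the name from the ID.

```python
import re
from pathlib import Path

FACES_DIR = Path("static/faces")  # faces directory from the proposed structure


def sanitize_username(raw):
    """Keep only letters, digits, and hyphens; reject names with nothing left."""
    cleaned = re.sub(r"[^A-Za-z0-9-]", "", raw)
    if not cleaned:
        raise ValueError("username contains no usable characters")
    return cleaned


def user_folder(raw_username, user_id):
    """Build the per-user folder path and verify it stays inside FACES_DIR."""
    folder = (FACES_DIR / f"{sanitize_username(raw_username)}_{user_id}").resolve()
    # Defence in depth: the resolved path must still be below the faces directory.
    if FACES_DIR.resolve() not in folder.parents:
        raise ValueError("resolved path escapes the faces directory")
    return folder
```

An input such as `../../evil` is reduced to `evil`, so the folder name becomes `evil_0001` under `static/faces`. The sketch assumes `user_id` is generated by the backend and never supplied by the user.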

---

### 3. New Implementations

- [ ] **Implement Automatic User ID Generation**
    * **Problem Solved:** Removes the need for manual ID entry, preventing errors and duplicate IDs. Simplifies the registration process.
    * **Implementation:** The backend will automatically generate a new 4-digit, zero-padded ID by finding the highest existing ID in the database and incrementing it by one.

- [ ] **Migrate Data Storage to SQLite**
    * **Problem Solved:** Replaces fragile and insecure CSV files and filename parsing with a robust database. This fixes the CSV injection vulnerability, improves data integrity, and makes the application more scalable and efficient.
    * **Implementation:** Create a simple SQLite database with `users` and `attendance` tables. All functions that currently read/write to CSVs or parse filenames will be updated to use the database.
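
A minimal sketch of what the two tables and the ID generation could look like, using Python's built-in `sqlite3` module. The column names, the `init_db` and `add_user` helpers, and the choice to store the ID as an integer and pad it only for display are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema for the proposed `users` and `attendance` tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS attendance (
    user_id   INTEGER NOT NULL REFERENCES users(id),
    day       TEXT NOT NULL,
    marked_at TEXT NOT NULL,
    UNIQUE (user_id, day)
);
"""


def init_db(conn):
    """Create both tables if they do not exist yet."""
    conn.executescript(SCHEMA)


def add_user(conn, name):
    """Insert a user with the next ID and return it as a 4-digit string."""
    # Highest existing ID, or 0 when the table is empty.
    (highest,) = conn.execute("SELECT COALESCE(MAX(id), 0) FROM users").fetchone()
    new_id = highest + 1
    # Parameterized query: the name is passed as data, never spliced into SQL.
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (new_id, name))
    conn.commit()
    return f"{new_id:04d}"
```

The first registered user receives `0001`. IDs above 9999 would no longer fit four digits, so the padding width is worth revisiting if the user base is expected to grow that far.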

---

### 4. Proposed Codebase Structure (Post-Implementation)

```
.FaceDetection_Prototype3/
├── app.py                               # Main Flask application logic
├── database.py                          # (New) All SQLite database functions (CRUD operations)
├── attendance.db                        # (New) The SQLite database file
├── docs/
│   ├── HLD.md
│   └── PROJECT_STATUS.md
├── static/
│   └── faces/
│       └── USERNAME_0001/
├── templates/
│   └── home.html
├── haarcascade_frontalface_default.xml
├── requirements.txt
└── ...
```

---

### 5. Problems Solved by These Implementations

* **Security:** Eliminates critical vulnerabilities like Path Traversal and CSV Injection.
* **Reliability:** Prevents data corruption from duplicate attendance entries and improves overall stability with better error handling.
* **Scalability:** Migrating to a database and optimizing training logic ensures the application can handle a growing number of users without significant performance degradation.
* **Maintainability:** Centralizing data logic into a database and a dedicated Python module makes the code cleaner, more organized, and easier to manage.
* **User Experience:** Automating ID generation simplifies the registration process for the end-user.
42 changes: 42 additions & 0 deletions docs/task.txt
@@ -0,0 +1,42 @@
Project Roadmap: Face Detection Attendance System

This document outlines the development plan based on the project's high-level design. The goal is to build a robust and secure prototype by organizing contributions into clear phases.

---
Phase 1: Foundational Fixes (Priority Tasks)
---

Our immediate focus is on making the prototype stable and secure. The following tasks will be addressed first:

1. **Implement Automatic User ID Generation:**
    * **Action:** Remove the "User ID" input field from the registration form.
    * **Backend Logic:** When a new user is added, the system will automatically generate a new 4-digit ID. It will do this by scanning the 'static/faces' directory, finding the highest existing ID, and incrementing it by one. The first user will be '0001'.
    * **Status:** To be implemented.

2. **Fix Critical Security Vulnerability (Path Traversal):**
    * **Action:** Sanitize the 'newusername' input in the '/add' route to prevent malicious inputs like '../../'.
    * **Status:** To be implemented.

3. **Prevent Duplicate Attendance Entries:**
    * **Action:** Modify the attendance marking logic. Before adding a new entry to the daily attendance file, check if the user's ID is already present.
    * **Status:** To be implemented.

4. **Improve Error Handling:**
    * **Action:** Add checks to ensure a webcam is available before starting capture. Add a check to ensure the 'haarcascade_frontalface_default.xml' file exists at startup.
    * **Status:** To be implemented.
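
The directory-scanning logic described in task 1 could be sketched as follows. The sketch assumes that each folder under 'static/faces' ends in an underscore followed by a 4-digit ID (for example 'USERNAME_0001'); the function name `next_user_id` is hypothetical.

```python
import os
import re

# Assumed folder naming: anything ending in "_" plus exactly four digits.
ID_SUFFIX = re.compile(r"_(\d{4})$")


def next_user_id(faces_dir="static/faces"):
    """Return the next zero-padded 4-digit ID based on existing folder names."""
    highest = 0
    if os.path.isdir(faces_dir):
        for entry in os.listdir(faces_dir):
            match = ID_SUFFIX.search(entry)
            if match:
                highest = max(highest, int(match.group(1)))
    # An empty or missing directory yields "0001" for the first user.
    return f"{highest + 1:04d}"
```

Entries that do not match the pattern are ignored, so stray files in the directory do not affect the result.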

---
Phase 2: Core Improvements (Future Work)
---

Once the foundation is stable, we will focus on improving the architecture.

* **Migrate to SQLite Database:** Replace CSV files and directory name parsing with a proper database for storing user info and attendance logs.
* **Refactor Inefficient Code:** Optimize loops and remove redundant operations.

---
Phase 3: New Features (Future Work)
---

* **Administrator Dashboard:** Create a separate interface for managing users.
* **Asynchronous Tasks:** Move long-running processes like model training to the background.
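
One possible shape for the asynchronous training task is a single-worker thread pool from the standard library, sketched below. The `train_model` body here is only a placeholder for the project's real training routine, and `schedule_training` is a hypothetical helper; a dedicated task queue would be another option.

```python
from concurrent.futures import ThreadPoolExecutor

# A single worker so that two training runs never overlap.
_executor = ThreadPoolExecutor(max_workers=1)


def train_model():
    """Placeholder standing in for the project's real training routine."""
    return "trained"


def schedule_training():
    """Queue a training run and return immediately with a Future."""
    return _executor.submit(train_model)
```

A request handler could call `schedule_training()` after saving a new user's images and respond right away, while the returned `Future` can be checked later to learn whether training finished or raised an error.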