skp3214/Doc-Scanner-And-Matching-System

Document Scanning & Matching System

A self-contained document scanning and matching system with a built-in credit system. Users can upload documents, scan them for matches, and manage their credits, while admins can view analytics and approve credit requests.


Project Video

docscanner.mp4

✨ Features

πŸ”‘ User Management

βœ… User registration & login
βœ… User roles: Regular Users & Admin
βœ… Profile section with credits, past scans, and credit requests

Credit System

  • Users get 20 free scans per day (auto-reset at midnight)
  • Users can request additional credits, which admins can approve/deny
  • Each document scan deducts 1 credit
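
The deduction rule above can be sketched in plain Python. This is a minimal illustration, not the project's code: `CreditAccount`, `InsufficientCredits`, `charge_scan`, and `grant` are hypothetical names, and only the numbers (20 credits per day, 1 credit per scan) come from this README.

```python
from dataclasses import dataclass

DAILY_FREE_CREDITS = 20  # from the README: 20 free scans per day
SCAN_COST = 1            # from the README: each scan deducts 1 credit


class InsufficientCredits(Exception):
    """Raised when a scan is attempted without enough credits (hypothetical name)."""


@dataclass
class CreditAccount:
    """Hypothetical stand-in for a user's credit balance."""
    credits: int = DAILY_FREE_CREDITS

    def charge_scan(self) -> int:
        """Deduct the cost of one scan and return the remaining balance."""
        if self.credits < SCAN_COST:
            raise InsufficientCredits("No credits left; request more or wait for the daily reset.")
        self.credits -= SCAN_COST
        return self.credits

    def grant(self, amount: int) -> int:
        """Add credits, for example after an admin approves a request."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.credits += amount
        return self.credits
```

Checking the balance before deducting keeps it from going negative; a view built on this idea would typically show an error message instead of starting the scan.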

Document Scanning & Matching

  • Upload plain text, PDF, or Word documents
  • Scanning methods:
    • Basic Text Matching (Levenshtein distance)
    • AI-powered Matching (spaCy for semantic similarity)
  • Displays matching documents with percentage similarity

Smart Analytics Dashboard (Admin Only)

  • Track the number of scans per user
  • Identify the most commonly scanned document topics
  • View top users by scans & credit usage
  • Generate credit usage statistics

Security

  • Secure user authentication with hashed passwords
  • Admin-only access to sensitive features (e.g., analytics, credit approval)


Tech Stack

  • Frontend: HTML, CSS, JavaScript
  • Backend: Django (Python)
  • Database: SQLite (for development)
  • File Storage: Local storage for uploaded documents
  • AI Matching: spaCy for semantic similarity

Setup Instructions

Prerequisites

  1. Python 3.8+ installed on your system
  2. pip (Python package manager)

Installation

Clone the Repository

git clone https://github.com/skp3214/document-scanning-and-matching-system.git
cd document-scanning-and-matching-system

Create a Virtual Environment

python -m venv venv
venv\Scripts\activate

On macOS or Linux, activate the environment with source venv/bin/activate instead.

Move to Project Directory

cd Doc_Scanner_Matcher

Install Dependencies

pip install -r requirements.txt

Set Up the Database

python manage.py migrate

Create an Admin User

python manage.py createsuperuser

Follow the prompts to set a username, email, and password for the admin user.

Run the Development Server

python manage.py runserver

Access the App: Open http://127.0.0.1:8000/ in your browser

Important Note:

  • To allow the admin user to approve/deny credit requests, follow these steps:
    • Log in to the Django admin panel at http://127.0.0.1:8000/admin/ using the superuser credentials.
    • Navigate to the UserProfiles table.
    • Add the admin user to this table to grant them the required privileges.

Now, the admin user can approve/deny credit requests as shown in the video.


🎯 Usage

πŸ‘€ User Features

1️⃣ Register & Log In πŸ“
2️⃣ Upload & Scan Documents πŸ“‚
3️⃣ Check Profile (Credits, Past Scans, Requests) πŸ‘€
4️⃣ Request More Credits βž•

πŸ› οΈ Admin Features

1️⃣ Approve/Deny Credit Requests βœ”οΈβŒ
2️⃣ View Analytics Dashboard πŸ“Š


Contact

For questions or feedback, reach out:

  • Email: spsm1818@gmail.com
  • GitHub: skp3214


Happy Coding!


| Endpoint | URL Path | View Function | Description |
|---|---|---|---|
| User Registration | `auth/register/` | `register` | Registers a new user. |
| User Login | `auth/login/` | `user_login` | Logs in a user. |
| Home Page | `/` | `home` | Displays the homepage. |
| User Profile | `user/profile/` | `profile` | Shows user profile. |
| Request Credits | `credits/request/` | `request_credits` | Allows users to request additional credits. |
| Admin Credit Management | `credits/admins/` | `admin_credits` | Admins manage credit requests. |
| Scan Document | `scan/` | `scan_document` | Upload and scan a document for matches. |
| View Matches | `matches/<int:doc_id>/` | `matches` | Displays matching documents. |
| Document Details | `document/<int:doc_id>/` | `document_detail` | Shows details of a scanned document. |
| Download Scan History | `download-scan-history/` | `download_scan_history` | Downloads user's scan history. |
| Admin Analytics | `admins/analytics/` | `analytics` | Admin dashboard with scan and credit statistics. |
| User Logout | `logout/` | `user_logout` | Logs out the user. |

This documentation provides an overview of the Django views used in the Document Scanning and Matching System. The views handle user authentication, document scanning, credit management, and analytics.


register

  • Purpose: Handles user registration.
  • Methods:
    • GET: Displays the registration form.
    • POST: Processes the form data, creates a new user, and logs them in.
  • Redirects: To the home page after successful registration.
  • Template: register.html

user_login

  • Purpose: Handles user login.
  • Methods:
    • GET: Displays the login form.
    • POST: Authenticates the user and logs them in.
  • Redirects: To the home page (or the next URL if provided) after successful login.
  • Template: login.html

user_logout

  • Purpose: Handles user logout.
  • Methods:
    • GET: Logs out the user.
  • Redirects: To the home page after logout.

home

  • Purpose: Displays the home page.
  • Methods:
    • GET: Renders the home page with the logged-in user's username.
  • Template: home.html

profile

  • Purpose: Displays the user's profile, including their credits, past scans, and credit requests.
  • Methods:
    • GET: Renders the profile page with the user's data.
  • Template: profile.html

request_credits

  • Purpose: Handles credit requests from users.
  • Methods:
    • GET: Displays the credit request form.
    • POST: Processes the form data and creates a new CreditRequest.
  • Redirects: To the profile page after submitting the request.
  • Template: creditsrequest.html

admin_credits

  • Purpose: Allows admins to approve or deny credit requests.
  • Methods:
    • GET: Displays a list of pending credit requests.
    • POST: Processes the admin's action (approve/deny) and updates the request status.
  • Redirects: To the profile page if the user is not an admin.
  • Template: admincredits.html
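
As an illustration of the approve/deny step, here is a small sketch in plain Python. It assumes that approving a request adds the requested amount to the user's balance and that the resolved statuses are named "approved" and "denied"; the README only confirms the "pending" default. `Profile`, `Request`, and `resolve_request` are hypothetical names.

```python
from dataclasses import dataclass

PENDING, APPROVED, DENIED = "pending", "approved", "denied"  # last two are assumed names


@dataclass
class Profile:
    """Hypothetical stand-in for the user's credit balance."""
    credits: int = 20


@dataclass
class Request:
    """Hypothetical stand-in for a credit request."""
    profile: Profile
    requested_credits: int
    status: str = PENDING


def resolve_request(request: Request, action: str) -> Request:
    """Apply an admin decision ("approve" or "deny") to a pending request."""
    if request.status != PENDING:
        raise ValueError("request has already been resolved")
    if action == "approve":
        # Assumption: approval credits the requested amount to the user.
        request.profile.credits += request.requested_credits
        request.status = APPROVED
    elif action == "deny":
        request.status = DENIED
    else:
        raise ValueError(f"unknown action: {action!r}")
    return request
```

Rejecting already-resolved requests guards against crediting the same request twice if the form is submitted again.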

scan_document

  • Purpose: Handles document uploads and scans.
  • Methods:
    • GET: Displays the document upload form.
    • POST: Processes the uploaded file, extracts its content, and creates a new Document.
  • Redirects: To the matches page after successful upload.
  • Template: scan.html

matches

  • Purpose: Displays documents similar to the uploaded document.
  • Methods:
    • GET: Renders the matches page with the uploaded document and its matches.
  • Template: matches.html

document_detail

  • Purpose: Displays the content of a specific document.
  • Methods:
    • GET: Renders the document detail page.
  • Template: document_detail.html

download_scan_history

  • Purpose: Allows users to download their scan history as a text file.
  • Methods:
    • GET: Generates and serves a text file with the user's scan history.
  • Response: A plain text file with the scan history.
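
One way to build the text body of such a download is sketched below. The layout and the `format_scan_history` helper are illustrative assumptions; the project's actual file format is not documented here.

```python
from datetime import datetime
from typing import Iterable, Tuple


def format_scan_history(username: str, scans: Iterable[Tuple[str, datetime]]) -> str:
    """Build a plain-text scan history from (file name, upload time) pairs."""
    lines = [f"Scan history for {username}", ""]
    for index, (file_name, uploaded_at) in enumerate(scans, start=1):
        lines.append(f"{index}. {file_name} (uploaded {uploaded_at:%Y-%m-%d %H:%M})")
    if len(lines) == 2:
        # Nothing was appended, so the user has no scans.
        lines.append("No scans yet.")
    return "\n".join(lines) + "\n"
```

In a Django view, a string like this could be returned with HttpResponse(text, content_type="text/plain") plus a Content-Disposition: attachment header so the browser saves it as a file.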

analytics

  • Purpose: Displays analytics for admins, including scans per user, credit usage, and common topics.
  • Methods:
    • GET: Renders the analytics dashboard.
  • Redirects: To the profile page if the user is not an admin.
  • Template: analytics.html
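
The statistics listed for this dashboard can be approximated with collections.Counter. The sketch below is an assumption about how such numbers might be computed, not the project's implementation: the function names are hypothetical, and "topics" are approximated as the most frequent non-stopword terms.

```python
from collections import Counter
import re

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def scans_per_user(scan_log):
    """Count scans per user from a list of (username, document_text) pairs."""
    return Counter(username for username, _ in scan_log)


def top_users(scan_log, n=3):
    """Return the n users with the most scans as (username, count) pairs."""
    return scans_per_user(scan_log).most_common(n)


def common_topics(scan_log, n=5):
    """Return the n most frequent non-stopword terms across all scanned texts."""
    words = Counter()
    for _, text in scan_log:
        words.update(w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS)
    return words.most_common(n)
```

Because each scan costs one credit, the per-user scan counts also serve as a rough measure of credit usage.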

reset_credits

  • Purpose: Resets the user's credits at midnight.
  • Usage: Called in the profile view to ensure credits are reset daily.
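
A lazy reset of this kind (checked when the profile is viewed rather than by a scheduled job) can be sketched as follows. `reset_if_new_day` is a hypothetical helper, and resetting the balance to exactly 20 is an assumption; the project may treat admin-granted credits differently.

```python
from datetime import datetime

DAILY_FREE_CREDITS = 20  # from the README


def reset_if_new_day(credits: int, last_reset: datetime, now: datetime):
    """Return (credits, last_reset) after applying the daily reset rule.

    If `now` falls on a later calendar date than `last_reset`, at least one
    midnight has passed, so the balance is restored and the timestamp updated.
    """
    if now.date() > last_reset.date():
        return DAILY_FREE_CREDITS, now
    return credits, last_reset
```

This sketch uses naive datetimes; in a Django project with time zone support enabled, django.utils.timezone.now() would be the usual source for `now`.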

find_matches

  • Purpose: Finds documents similar to the uploaded document using basic text matching (e.g., Levenshtein distance).
  • Usage: Called in the matches view.
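
For reference, a Levenshtein-based percentage score can be computed as in the sketch below. The normalisation (dividing by the longer string's length), the 50% threshold, and the `find_matches_sketch` name are assumptions made for illustration; the project's own scoring may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion, and substitution each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Map edit distance to a 0-100 score, normalised by the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return round(100.0 * (1 - levenshtein(a, b) / longest), 2)


def find_matches_sketch(query: str, corpus: dict, threshold: float = 50.0):
    """Return (doc_id, score) pairs at or above the threshold, best first."""
    scored = ((doc_id, similarity_percent(query, text)) for doc_id, text in corpus.items())
    return sorted((pair for pair in scored if pair[1] >= threshold),
                  key=lambda pair: pair[1], reverse=True)
```

The two-row dynamic programme runs in time proportional to the product of the two text lengths, so this approach suits short documents better than long ones.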

ai_find_matches

  • Purpose: Finds documents similar to the uploaded document using AI-powered matching (spaCy).
  • Usage: Called in the matches view.
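
Semantic matching of this kind usually comes down to cosine similarity between document vectors. The sketch below shows that comparison in plain Python on already-computed vectors; `cosine_similarity` and `rank_by_similarity` are hypothetical helpers. With spaCy, a vector would typically come from spacy.load("en_core_web_sm")(text).vector; note that the small English model ships without static word vectors, so its similarity scores tend to be less reliable than those of the larger models.

```python
import math


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vector, stored_vectors: dict):
    """Return (doc_id, percent) pairs sorted from most to least similar."""
    scores = [(doc_id, round(100 * cosine_similarity(query_vector, vector), 2))
              for doc_id, vector in stored_vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Unlike edit distance, this score depends on the direction of the vectors rather than on character-level overlap, which is why differently worded documents on the same topic can still rank highly.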

Workflow

  1. User Registration:

    • A new user registers using the register view.
    • They are redirected to the home page after registration.
  2. User Login:

    • The user logs in using the user_login view.
    • They are redirected to the home page (or the next URL) after login.
  3. Document Upload:

    • The user uploads a document using the scan_document view.
    • They are redirected to the matches page to view similar documents.
  4. Credit Request:

    • The user requests additional credits using the request_credits view.
    • An admin approves or denies the request using the admin_credits view.
  5. Analytics:

    • An admin views analytics using the analytics view.

Templates

  • register.html: Registration form.
  • login.html: Login form.
  • home.html: Home page.
  • profile.html: User profile page.
  • creditsrequest.html: Credit request form.
  • admincredits.html: Admin credit approval page.
  • scan.html: Document upload form.
  • matches.html: Matches page.
  • document_detail.html: Document detail page.
  • analytics.html: Analytics dashboard.

Dependencies

  • spaCy: Used for AI-powered document matching.
    • Install with: pip install spacy
    • Download the English model: python -m spacy download en_core_web_sm

UserProfile

Stores additional user-related data, such as credits and the last reset time.

Fields:

  • user: One-to-One relationship with Django's built-in User model.
  • credits: Integer field to track the user's available credits (default: 20).
  • last_reset: DateTime field to store the last time credits were reset (auto-generated on creation).

Document

Stores user-uploaded documents and their processed data.

Fields:

  • user: ForeignKey relationship with the User model (one user can have multiple documents).
  • file: FileField to store the uploaded document in the documents/ directory.
  • uploaded_at: DateTime field to record when the document was uploaded (auto-generated).
  • content: TextField to store extracted text content from the document.
  • vector: BinaryField to store the document's processed vector representation (nullable and optional).

Methods:

  • save(self, *args, **kwargs): Overrides the default save method to:
    • Process document content using spaCy (en_core_web_sm).
    • Convert the processed content into a vector and store it as a binary field.
    • Call the parent class's save() method to persist data.
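
To show what storing a vector in a binary field involves, here is a round-trip sketch using only the standard library. The helper names are hypothetical and the 32-bit float layout is an assumption; with NumPy arrays such as spaCy's Doc.vector, the equivalent calls would be vector.tobytes() and numpy.frombuffer(blob, dtype="float32").

```python
from array import array


def vector_to_bytes(vector) -> bytes:
    """Pack an iterable of floats into raw 32-bit float bytes."""
    return array("f", vector).tobytes()


def bytes_to_vector(blob: bytes) -> list:
    """Unpack bytes produced by vector_to_bytes back into a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return values.tolist()
```

Storing the vector once at save time means later similarity searches can compare stored vectors directly instead of re-running the language model on every document.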

CreditRequest

Handles credit requests from users.

Fields:

  • user: ForeignKey relationship with the User model (one user can make multiple credit requests).
  • requested_credits: Integer field to store the number of credits requested.
  • status: CharField to store the request status (pending by default, can be updated later).
  • requested_at: DateTime field to record when the request was made (auto-generated).

Notes

  • spaCy is used for natural language processing; it extracts a vector from each document's content.
  • Each vector is serialized to binary and stored in the vector field for later similarity searches.
  • Credits help manage document processing, ensuring fair usage.
