📄 Resume Scanner Pro API: Dual-Engine ATS Optimizer

📌 Overview

Resume Scanner Pro API is a high-performance REST API designed to help candidates optimize their resumes for Applicant Tracking Systems (ATS).

Unlike simple keyword counters, this tool uses a Dual-Engine Architecture. It combines statistical analysis (TF-IDF), which targets strict automated filters, with semantic AI (SBERT), which checks contextual relevance for human recruiters. It is stateless, containerized, and ready for integration into career platforms or personal portfolios.

✨ Key Features

🤖 Dual-Mode Analysis

Strict Mode (ATS Logic): Uses TF-IDF Vectorization to check for exact keyword matches. This simulates legacy ATS software that rejects resumes missing specific hard skills.
Flexible Mode (AI Logic): Uses Sentence-BERT (SBERT) embeddings to understand context. It recognizes that "Python" ≈ "Coding", rewarding candidates for semantic relevance even if the exact wording differs.

📂 Intelligent PDF Extraction

Raw Text Extraction: Powered by PyPDF2 to strip formatting and clean artifacts.
Page Count Logic: Automatically detects resume length and provides strategic advice (e.g., warning users if their resume exceeds one page, since longer resumes risk being overlooked).
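
The page-count advice described above could look roughly like the sketch below. The function name `length_advice` and the message wording are hypothetical; only the one-page threshold comes from this README. In practice `total_pages` would come from the PDF reader (with PyPDF2, for instance, `len(PdfReader(path).pages)`).

```python
def length_advice(total_pages: int) -> str:
    """Return a short advisory message for a resume of the given length.

    Hypothetical helper: the real service may word its messages differently.
    """
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    if total_pages == 1:
        return "Optimal length: single-page resume detected."
    # Anything longer than one page triggers a warning.
    return (
        f"Length warning: {total_pages} pages detected. "
        "Consider trimming to one page so key details are not overlooked."
    )
```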

🎯 Critical Skills Verification

Auto-Detection: Automatically identifies the top 5 most frequent/important keywords in a Job Description.
Gap Analysis: Compares these critical terms against the CV and flags "High Risk" applications if core skills are missing.
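
As a rough illustration of auto-detection plus gap analysis, the sketch below picks the most frequent Job Description terms and checks them against the CV. It is a standard-library approximation, not the project's code: the stop-word list, the tokenizer, and the "HIGH RISK" status string are assumptions (only "SAFE" appears in the sample response in this README).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; the project may rely on NLTK).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "with", "for", "is", "are", "we", "on", "or"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep word-like tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z0-9+#]+", text.lower()) if t not in STOP_WORDS]


def top_keywords(jd_text: str, n: int = 5) -> list[str]:
    """Return the n most frequent terms in the Job Description."""
    counts = Counter(tokenize(jd_text))
    return [word for word, _ in counts.most_common(n)]


def critical_check(cv_text: str, keywords: list[str]) -> dict:
    """Flag the application when any critical keyword is absent from the CV."""
    cv_tokens = set(tokenize(cv_text))
    missing = [k for k in keywords if k not in cv_tokens]
    return {
        "keywords_checked": keywords,
        "missing_critical": missing,
        "status": "HIGH RISK" if missing else "SAFE",
    }
```

Note that this sketch only handles single-word keywords; multi-word skills such as "machine learning" would need n-gram handling.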

🛡️ Robust Backend

Stateless Architecture: No database required; data is processed in memory for maximum privacy and speed.
Model Caching: The SBERT model loads once during the application lifespan (startup), ensuring low latency for subsequent requests.
Dockerized: Fully containerized for consistent deployment across any cloud environment.
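
The load-once behaviour described under Model Caching can be sketched with a lifespan context manager. The example below uses only the standard library so it runs anywhere; `load_model`, `models`, and `handle_request` are hypothetical stand-ins. With FastAPI, the same `lifespan` function would be passed as `FastAPI(lifespan=lifespan)`, and `load_model` would construct the SBERT model.

```python
import asyncio
from contextlib import asynccontextmanager

load_calls = 0      # counts how often the expensive load runs
models: dict = {}   # process-wide cache shared by all request handlers


def load_model() -> object:
    """Stand-in for an expensive load (e.g. building a SentenceTransformer)."""
    global load_calls
    load_calls += 1
    return object()


@asynccontextmanager
async def lifespan(app):
    # Startup: load the model once, before any request is served.
    models["sbert"] = load_model()
    yield
    # Shutdown: drop the reference so memory can be reclaimed.
    models.clear()


async def handle_request() -> bool:
    # Handlers reuse the cached model instead of reloading it per request.
    return models["sbert"] is not None


async def demo() -> int:
    """Serve three simulated requests and report how many loads occurred."""
    async with lifespan(app=None):
        for _ in range(3):
            await handle_request()
    return load_calls
```

Running `asyncio.run(demo())` serves three simulated requests against a single model load.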

🛠️ Tech Stack

Framework: FastAPI (Asynchronous)
NLP (Statistical): Scikit-Learn (TF-IDF)
NLP (AI): Sentence-Transformers (paraphrase-multilingual-MiniLM-L12-v2)
Text Processing: PyPDF2, Pandas, NLTK
Deployment: Hugging Face Spaces (Docker)

🚀 The Processing Pipeline

Extraction: User uploads a PDF. System extracts text and validates length.
Preprocessing: Text is cleaned (newlines removed, whitespace trimmed) to normalize input.
Vectorization:
- Strict: Maps text to a frequency matrix based on the Job Description's vocabulary.
- Flexible: Encodes text into 384-dimensional dense vectors.
Similarity Calculation: Computes Cosine Similarity (0-100%) between the Resume vector and JD vector.
Response: Returns the score, missing keywords list, and critical skills safety check.
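
The strict path of the pipeline above can be approximated in a few lines. This is a simplified sketch under stated assumptions: plain term counts stand in for scikit-learn's TF-IDF weights, the tokenizer is hypothetical, and the function names are not taken from the repository. It does follow the steps listed: normalize whitespace, build vectors over the Job Description's vocabulary, and scale cosine similarity to 0-100.

```python
import math
import re
from collections import Counter


def preprocess(text: str) -> str:
    """Collapse newlines and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def strict_score(cv_text: str, jd_text: str) -> float:
    """Cosine similarity (0-100) of term counts over the JD's vocabulary."""
    jd_counts = Counter(tokenize(preprocess(jd_text)))
    cv_counts = Counter(tokenize(preprocess(cv_text)))
    vocab = sorted(jd_counts)                # vocabulary comes from the JD only
    jd_vec = [jd_counts[t] for t in vocab]
    cv_vec = [cv_counts[t] for t in vocab]   # CV terms outside the JD are ignored
    dot = sum(a * b for a, b in zip(jd_vec, cv_vec))
    norm = math.sqrt(sum(a * a for a in jd_vec)) * math.sqrt(sum(b * b for b in cv_vec))
    return round(100 * dot / norm, 1) if norm else 0.0


def missing_keywords(cv_text: str, jd_text: str) -> list[str]:
    """JD terms that never appear in the CV, sorted alphabetically."""
    cv_terms = set(tokenize(preprocess(cv_text)))
    return sorted(set(tokenize(preprocess(jd_text))) - cv_terms)
```

The flexible path would replace the count vectors with 384-dimensional sentence embeddings while keeping the same cosine step.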

🔌 Integration Guide (API Contract)

Live Base URL

https://silvio0-resume-scanner.hf.space

1. Extract Text (PDF Upload)

Parses a PDF file and returns its raw text content ready for analysis.

Endpoint: /extract
Method: POST
Content-Type: multipart/form-data
Body:
- cv_file: Binary File (.pdf)
Response (JSON):

{
  "total_pages": 1,
  "Info": "✅ **Optimal Length:** Single-page resume detected. This concise format is highly preferred by recruiters for rapid screening and parsing.",
  "cv_text": "Silvio Christian Joe\nData Scientist\n..."
}
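
A client call to this endpoint might look like the following standard-library sketch. The helper names `build_multipart` and `extract_text` are hypothetical, as are the upload filename and the timeout; the URL, the field name `cv_file`, and the content type come from the contract above. A library such as `requests` would shorten this considerably.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://silvio0-resume-scanner.hf.space"


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def extract_text(pdf_path: str) -> dict:
    """POST a PDF to /extract and return the parsed JSON response."""
    with open(pdf_path, "rb") as fh:
        body, content_type = build_multipart("cv_file", "resume.pdf", fh.read())
    request = urllib.request.Request(
        f"{BASE_URL}/extract",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The returned dictionary should carry the `total_pages`, `Info`, and `cv_text` keys shown in the sample response.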

2. Analyze Resume (Scoring)

The core engine that compares the CV against a Job Description.

Endpoint: /analyze
Method: POST
Content-Type: application/x-www-form-urlencoded
Body:
- cv_text: (String) Raw text from the extraction step.
- jd_text: (String) The Job Description text.
- mode: (String) "strict" or "flexible".
- manual_keywords: (List, Optional) Specific skills to check (e.g., "Python, SQL").
Response (JSON):

{
  "score": 85.5,
  "mode": "strict",
  "missing_keywords": [
    "kubernetes",
    "docker"
  ],
  "available_keywords": [
    "python",
    "sql",
    "aws",
    "machine learning",
    "data",
    "kubernetes",
    "docker"
  ],
  "default_critical_keywords": [
    "python",
    "sql",
    "aws",
    "machine learning",
    "data"
  ],
  "critical_check": {
    "keywords_checked": [
      "python",
      "sql",
      "aws",
      "machine learning",
      "data"
    ],
    "missing_critical": [],
    "status": "SAFE"
  }
}
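
Calling the scoring endpoint from Python could be sketched as below, again with the standard library only. The function names are hypothetical, and sending `manual_keywords` as a single comma-separated string is an assumption based on the "Python, SQL" example above; the service may expect a different encoding for lists.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://silvio0-resume-scanner.hf.space"


def build_analyze_request(cv_text, jd_text, mode="strict", manual_keywords=None):
    """Build (but do not send) the form-encoded POST request for /analyze."""
    if mode not in {"strict", "flexible"}:
        raise ValueError('mode must be "strict" or "flexible"')
    fields = {"cv_text": cv_text, "jd_text": jd_text, "mode": mode}
    if manual_keywords:
        # Assumption: keywords travel as one comma-separated string.
        fields["manual_keywords"] = ", ".join(manual_keywords)
    return urllib.request.Request(
        f"{BASE_URL}/analyze",
        data=urllib.parse.urlencode(fields).encode(),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def analyze(cv_text, jd_text, mode="strict", manual_keywords=None) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_analyze_request(cv_text, jd_text, mode, manual_keywords)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A caller would typically read `score`, `missing_keywords`, and `critical_check["status"]` from the result.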

📚 Interactive Documentation (Swagger UI)

Test the API workflow directly in your browser:

Access Docs: https://silvio0-resume-scanner.hf.space/docs
Step 1: Extract Text (Get your CV data)
- Click on POST /extract -> Try it out.
- Upload your PDF Resume in the cv_file field.
- Click Execute and copy the content inside "cv_text" from the Response Body.
Step 2: Analyze Match (Check your score)
- Click on POST /analyze -> Try it out.
- Paste your copied text into the cv_text field.
- Paste a sample Job Description into the jd_text field.
- Click Execute to see your score and missing keywords.

📦 Local Installation

Clone the Repository

git clone https://github.com/viochris/resume-scanner-api.git
cd resume-scanner-api

Install Dependencies

pip install -r requirements.txt

Run the Server

uvicorn api:app --reload

Output: Uvicorn running on http://127.0.0.1:8000

Author: Silvio Christian, Joe

"Optimize for the robot, write for the human."
