Resume Scanner Pro API is a high-performance REST API designed to help candidates optimize their resumes for Applicant Tracking Systems (ATS).
Unlike simple keyword counters, this tool utilizes a Dual-Engine Architecture. It combines statistical analysis (TF-IDF) to pass strict robotic filters with semantic AI (SBERT) to ensure context relevance for human recruiters. It is stateless, containerized, and ready for integration into career platforms or personal portfolios.
- Strict Mode (ATS Logic): Uses TF-IDF Vectorization to check for exact keyword matches. This simulates legacy ATS software that rejects resumes missing specific hard skills.
- Flexible Mode (AI Logic): Uses Sentence-BERT (SBERT) embeddings to understand context. It recognizes that "Python" $\approx$ "Coding", rewarding candidates for semantic relevance even if exact wording differs.
- Raw Text Extraction: Powered by PyPDF2 to strip formatting and clean artifacts.
- Page Count Logic: Automatically detects resume length and provides strategic advice (e.g., warning users if their resume exceeds 1 page, which risks being overlooked).
- Auto-Detection: Automatically identifies the top 5 most frequent/important keywords in a Job Description.
- Gap Analysis: Compares these critical terms against the CV and flags "High Risk" applications if core skills are missing.
- Stateless Architecture: No databases required; processes data in-memory for maximum privacy and speed.
- Model Caching: The SBERT model loads once during the application lifespan (startup), ensuring low latency for subsequent requests.
- Dockerized: Fully containerized for consistent deployment across any cloud environment.
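The Auto-Detection and Gap Analysis features can be pictured with a small standard-library sketch. This is not the project's code: the function names, the tiny stopword list, and the "HIGH RISK" status string are assumptions made for illustration (only "SAFE" appears in the documented response), and simple word counts stand in for whatever frequency/importance weighting the API actually applies.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "and", "the", "with", "in", "of", "for", "to", "is", "are", "we", "on", "or"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens (keeps c++ and c#)."""
    return re.findall(r"[a-z][a-z0-9+#]*", text.lower())


def top_keywords(jd_text: str, k: int = 5) -> list[str]:
    """Return the k most frequent non-stopword tokens in the job description."""
    counts = Counter(t for t in tokenize(jd_text) if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


def critical_check(cv_text: str, keywords: list[str]) -> dict:
    """Flag the application when any critical keyword is absent from the CV."""
    cv_tokens = set(tokenize(cv_text))
    missing = [kw for kw in keywords if kw not in cv_tokens]
    return {
        "keywords_checked": keywords,
        "missing_critical": missing,
        # "HIGH RISK" is a hypothetical label; the real status value may differ.
        "status": "SAFE" if not missing else "HIGH RISK",
    }


if __name__ == "__main__":
    jd = "Python developer with SQL and AWS. Python, SQL, and more Python."
    keywords = top_keywords(jd, k=2)
    print(keywords)
    print(critical_check("I write Python daily", keywords))
```

The sketch only handles single-word keywords; multi-word terms such as "machine learning" would need n-gram handling.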
- Framework: FastAPI (Asynchronous)
- NLP (Statistical): Scikit-Learn (TF-IDF)
- NLP (AI): Sentence-Transformers (paraphrase-multilingual-MiniLM-L12-v2)
- Text Processing: PyPDF2, Pandas, NLTK
- Deployment: Hugging Face Spaces (Docker)
- Extraction: User uploads a PDF. System extracts text and validates length.
- Preprocessing: Text is cleaned (newlines removed, whitespace trimmed) to normalize input.
- Vectorization:
  - Strict: Maps text to a frequency matrix based on the Job Description's vocabulary.
  - Flexible: Encodes text into 384-dimensional dense vectors.
- Similarity Calculation: Computes Cosine Similarity (0-100%) between the Resume vector and JD vector.
- Response: Returns the score, missing keywords list, and critical skills safety check.
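The Strict-mode vectorization and similarity steps can be sketched with the standard library alone. This is a simplified stand-in, not the project's implementation: it uses raw term counts over the Job Description's vocabulary instead of Scikit-Learn's TF-IDF weighting, and the function names are hypothetical. The same cosine formula would apply to the 384-dimensional SBERT vectors in Flexible mode.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def strict_score(cv_text: str, jd_text: str) -> float:
    """Score the CV against the JD on a 0-100 scale using JD-vocabulary term counts."""
    jd_counts = Counter(tokenize(jd_text))
    cv_counts = Counter(tokenize(cv_text))
    vocabulary = sorted(jd_counts)  # only terms that appear in the JD are counted
    jd_vector = [jd_counts[term] for term in vocabulary]
    cv_vector = [cv_counts[term] for term in vocabulary]
    return round(100 * cosine_similarity(cv_vector, jd_vector), 1)


if __name__ == "__main__":
    print(strict_score("Python and SQL engineer", "Python SQL Docker"))
```

Because the vocabulary comes from the Job Description only, extra words in the CV neither help nor hurt the score in this sketch; only coverage of JD terms matters.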
https://silvio0-resume-scanner.hf.space
Parses a PDF file and returns its raw text content ready for analysis.
- Endpoint: /extract
- Method: POST
- Content-Type: multipart/form-data
- Body:
  - cv_file: Binary File (.pdf)
- Response (JSON):
{
  "total_pages": 1,
  "Info": "✅ **Optimal Length:** Single-page resume detected. This concise format is highly preferred by recruiters for rapid screening and parsing.",
  "cv_text": "Silvio Christian Joe\nData Scientist\n..."
}
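A call to /extract from a script might look like the sketch below, which uses only the Python standard library. The helper names are hypothetical, and the part's filename and application/pdf content type are assumptions; only the endpoint, method, multipart/form-data encoding, and cv_file field name come from the description above.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://silvio0-resume-scanner.hf.space"


def build_extract_request(pdf_bytes: bytes, filename: str = "resume.pdf") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the PDF in the cv_file field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="cv_file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"  # assumed part content type
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/extract",
        data=head + pdf_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def extract_text(pdf_path: str) -> dict:
    """Upload a local PDF and return the parsed JSON response."""
    with open(pdf_path, "rb") as fh:
        request = build_extract_request(fh.read())
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling extract_text("resume.pdf") should return a dictionary whose "cv_text" value can be passed to the analysis step.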
The core engine that compares the CV against a Job Description.
- Endpoint: /analyze
- Method: POST
- Content-Type: application/x-www-form-urlencoded
- Body:
  - cv_text: (String) Raw text from the extraction step.
  - jd_text: (String) The Job Description text.
  - mode: (String) "strict" or "flexible".
  - manual_keywords: (List, Optional) Specific skills to check (e.g., "Python, SQL").
- Response (JSON):
{
  "score": 85.5,
  "mode": "strict",
  "missing_keywords": [
    "kubernetes",
    "docker"
  ],
  "available_keywords": [
    "python",
    "sql",
    "aws",
    "machine learning",
    "data",
    "kubernetes",
    "docker"
  ],
  "default_critical_keywords": [
    "python",
    "sql",
    "aws",
    "machine learning",
    "data"
  ],
  "critical_check": {
    "keywords_checked": [
      "python",
      "sql",
      "aws",
      "machine learning",
      "data"
    ],
    "missing_critical": [],
    "status": "SAFE"
  }
}
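A matching client sketch for /analyze, again standard library only. The helper names are hypothetical. How the server expects the manual_keywords list to be encoded is not stated here, so sending it as one comma-separated string (mirroring the "Python, SQL" example) is an assumption.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://silvio0-resume-scanner.hf.space"


def build_analyze_request(cv_text, jd_text, mode="strict", manual_keywords=None):
    """Build a form-urlencoded POST for the /analyze endpoint."""
    if mode not in ("strict", "flexible"):
        raise ValueError('mode must be "strict" or "flexible"')
    fields = {"cv_text": cv_text, "jd_text": jd_text, "mode": mode}
    if manual_keywords:
        # Assumption: the list is sent as a single comma-separated string.
        fields["manual_keywords"] = ", ".join(manual_keywords)
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/analyze",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def analyze(cv_text, jd_text, mode="strict", manual_keywords=None) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_analyze_request(cv_text, jd_text, mode, manual_keywords)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Validating mode on the client side avoids a round trip for an obviously invalid value; the server may apply its own validation as well.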
Test the API workflow directly in your browser:
- Access Docs: https://silvio0-resume-scanner.hf.space/docs
- Step 1: Extract Text (Get your CV data)
  - Click on POST /extract -> Try it out.
  - Upload your PDF Resume in the cv_file field.
  - Click Execute and copy the content inside "cv_text" from the Response Body.
- Step 2: Analyze Match (Check your score)
  - Click on POST /analyze -> Try it out.
  - Paste your copied text into the cv_text field.
  - Paste a sample Job Description into the jd_text field.
  - Click Execute to see your score and missing keywords.
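Once Step 2 returns, the JSON can be condensed into a short report. The helper below is a hypothetical convenience, not part of the API; it relies only on the score, mode, missing_keywords, and critical_check fields shown in the sample /analyze response.

```python
def summarize(result: dict) -> str:
    """Turn an /analyze response dictionary into a three-line text summary."""
    check = result.get("critical_check", {})
    missing = result.get("missing_keywords", [])
    lines = [
        f"Score: {result['score']:.1f}% ({result['mode']} mode)",
        f"Critical check: {check.get('status', 'UNKNOWN')}",
        "Missing keywords: " + (", ".join(missing) if missing else "none"),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "score": 85.5,
        "mode": "strict",
        "missing_keywords": ["kubernetes", "docker"],
        "critical_check": {"missing_critical": [], "status": "SAFE"},
    }
    print(summarize(sample))
```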
- Clone the Repository
git clone https://github.com/viochris/resume-scanner-api.git
cd resume-scanner-api
- Install Dependencies
pip install -r requirements.txt
- Run the Server
uvicorn api:app --reload
Output: Uvicorn running on http://127.0.0.1:8000
Author: Silvio Christian Joe
"Optimize for the robot, write for the human."