Automated Resume Parser

A Python-based application that automatically extracts and analyzes information from resume documents (PDF and DOCX formats) using natural language processing.

Features

Multiple Format Support: Parse resumes in PDF and DOCX formats
Intelligent Information Extraction: Extract key details including:
- Candidate name
- Email address
- Phone number
- Skills
Database Storage: Automatically store parsed information in a PostgreSQL database
RESTful API: Simple API endpoint for resume parsing
Scalable Architecture: Modular design for easy extensions and modifications
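
As a rough illustration of the extraction step, the sketch below pulls an email address, a phone number, and known skills out of plain text with regular expressions. The patterns, the `KNOWN_SKILLS` vocabulary, and the function names are assumptions made for this example and are not taken from the project's code. Name extraction is left out because it would normally rely on the NLP model.

```python
import re

# Hypothetical patterns; real resumes will need broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Hypothetical skill vocabulary, stored in lowercase.
KNOWN_SKILLS = {"python", "java", "sql"}


def extract_email(text):
    """Return the first email-like string in text, or None."""
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None


def extract_phone(text):
    """Return the first phone-like string in text, or None."""
    match = PHONE_RE.search(text)
    return match.group(0) if match else None


def extract_skills(text, vocabulary=KNOWN_SKILLS):
    """Return the vocabulary entries that occur as whole words, sorted."""
    words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return sorted(vocabulary & words)
```

Matching skills against a fixed vocabulary keeps the result predictable, at the cost of missing anything not on the list.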

Technology Stack

Backend: Python 3.9+, Flask
Database: PostgreSQL
NLP: spaCy
Document Processing: PyPDF2, python-docx
Development Tools: pytest, black, flake8

Installation

Clone the repository:

git clone https://github.com/stephenombuya/Automated-Resume-Parser
cd Automated-Resume-Parser

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Download the spaCy model:

python -m spacy download en_core_web_sm

Set up environment variables:

cp .env.example .env
# Edit .env with your database credentials
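
One way the application might read these settings at startup is sketched below. The variable name `DATABASE_URL`, the `Settings` class, and `load_settings` are assumptions for illustration; check `.env.example` for the names the project actually uses. The sketch also assumes the values from `.env` have already been loaded into the process environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    debug: bool


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    # DATABASE_URL is an assumed name, not confirmed by .env.example.
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set; check your .env file")
    debug = env.get("FLASK_DEBUG", "0").lower() in {"1", "true", "yes"}
    return Settings(database_url=database_url, debug=debug)
```

Failing early on a missing database URL gives a clearer error than a connection failure later in the request cycle.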

Initialize the database:

flask db upgrade

Usage

Start the Flask application:

python app.py

Send a POST request to parse a resume:

curl -X POST -F "file=@/path/to/resume.pdf" http://localhost:5000/parse
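
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl command: a single form field named `file` posted to `/parse`. The helper names `build_multipart` and `parse_resume` are hypothetical.

```python
import uuid
import urllib.request
from pathlib import Path


def build_multipart(field, filename, data, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_resume(path, url="http://localhost:5000/parse"):
    """POST the file at path to the parse endpoint and return the response text."""
    file_path = Path(path)
    body, content_type = build_multipart("file", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `parse_resume("/path/to/resume.pdf")` requires the Flask application to be running on port 5000.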

Example Response

{
    "name": "John Doe",
    "email": "john.doe@email.com",
    "phone": "+1 123-456-7890",
    "skills": ["python", "java", "sql"]
}
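
A client might load this response into a small typed structure. The sketch below assumes the four fields shown above and treats a missing or null field as empty, which is an assumption about the API rather than documented behavior. `ParsedResume` and `parse_response` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ParsedResume:
    name: str = ""
    email: str = ""
    phone: str = ""
    skills: list = field(default_factory=list)


def parse_response(payload):
    """Convert the JSON text returned by the parse endpoint into a ParsedResume."""
    data = json.loads(payload)
    return ParsedResume(
        name=data.get("name") or "",
        email=data.get("email") or "",
        phone=data.get("phone") or "",
        skills=list(data.get("skills") or []),
    )
```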

Project Structure

resume-parser/
├── app/
│   ├── __init__.py
│   ├── config.py
│   ├── models.py
│   ├── parser/
│   │   ├── pdf_parser.py
│   │   ├── docx_parser.py
│   │   └── nlp_processor.py
│   └── utils.py
├── tests/
├── requirements.txt
├── .env.example
└── README.md
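
The split between `pdf_parser.py` and `docx_parser.py` suggests that text extraction is chosen by file type. The sketch below shows one way such a dispatch could look. It is not the project's actual code: the function names and the `EXTRACTORS` table are hypothetical, and the PDF branch assumes a PyPDF2 release that provides `PdfReader`.

```python
from pathlib import Path


def extract_pdf_text(path):
    """Concatenate the text of every page using PyPDF2."""
    from PyPDF2 import PdfReader  # imported lazily so this module loads without it

    reader = PdfReader(str(path))
    # extract_text() can return an empty value for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_docx_text(path):
    """Join the text of every paragraph using python-docx."""
    from docx import Document  # provided by the python-docx package

    document = Document(str(path))
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


# Hypothetical lookup table from file suffix to extractor.
EXTRACTORS = {".pdf": extract_pdf_text, ".docx": extract_docx_text}


def choose_extractor(path):
    """Return the extractor for the file's suffix, or raise ValueError."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported resume format: {suffix or 'no extension'}") from None
```

A lookup table like this means a new format only needs one more extractor function and one more entry.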

Development

Run tests:

pytest

Format code:

black .

Check code style:

flake8

Contributing

Fork the repository
Create your feature branch: git checkout -b feature/new-feature
Commit your changes: git commit -am 'Add new feature'
Push to the branch: git push origin feature/new-feature
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

spaCy for providing excellent NLP capabilities
PyPDF2 and python-docx for document parsing functionality

Future Improvements

Add support for more document formats
Implement machine learning for better information extraction
Add bulk processing capabilities
Create a web interface for file uploads
Enhance skills detection with industry-specific vocabularies
Add export functionality to various formats
