This project leverages the google-generativeai library to extract and infer data from documents.
The primary goal is to sanitize and analyze the content of resumes to provide structured information and insights.
- Data Sanitization: Removes sensitive or inappropriate content from the data before processing.
- Data Inference: Uses a generative AI model to infer specific details from the provided data, such as educational background and work experience.
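As a rough illustration of the sanitization step, the sketch below redacts email addresses and phone numbers before the text is sent to the model. The patterns, the placeholder tokens, and the `sanitize` function name are assumptions made for this example; the project's actual rules may differ.

```python
import re

# Hypothetical patterns for this sketch; real resumes may need broader rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def sanitize(text):
    """Replace email addresses and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(sanitize("Contact: jane.doe@example.com, +1 (555) 123-4567"))
```

Redacting before inference means contact details never leave the local machine, while the education and work history that the model needs stay intact.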
- Install Dependencies: Ensure all required packages are installed. You can install them using pip: `pip install -r requirements.txt`
- Environment Variables: Create a `.env` file in the root directory of the project and add your API key: `GEMINI_API_KEY="your_api_key_here"`
To run the project, execute the query_data.py script. This script will load the environment variables, configure the generative AI model, and process the data.
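The load-and-configure flow described here might look roughly like the following sketch. It is not the project's actual script: `parse_env_text`, `load_api_key`, and `ask_model` are hypothetical helpers, the small stdlib parser stands in for whatever `.env` loader the project uses, and the model name `gemini-1.5-flash` is an assumption.

```python
import os


def parse_env_text(text):
    """Parse KEY=value lines (as found in a .env file) into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(env_path=".env", var_name="GEMINI_API_KEY"):
    """Prefer the process environment, then fall back to the .env file."""
    if var_name in os.environ:
        return os.environ[var_name]
    with open(env_path, encoding="utf-8") as handle:
        return parse_env_text(handle.read())[var_name]


def ask_model(prompt, model_name="gemini-1.5-flash"):
    """Send one prompt to the model and return the text of the reply."""
    # Imported lazily so the helpers above work without the package installed.
    import google.generativeai as genai

    genai.configure(api_key=load_api_key())
    model = genai.GenerativeModel(model_name)  # model name is an assumption
    return model.generate_content(prompt).text
```

A call such as `ask_model("List the degrees mentioned in this resume: ...")` would then return the model's reply as plain text, provided a valid key is available.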
You can add PDF files to the Files folder, change the queries in the query_data.py file, and run the script to see the results:

`python main.py`

For testing purposes, a sample resume is used instead of sensitive client files. This ensures that the functionality can be verified without exposing any confidential information.
This project is licensed under the MIT License. See the LICENSE file for more details.