This Flask-based web application predicts whether a data scientist will stay with a company or leave. It provides tools for training machine learning models, evaluating their performance, and making predictions from user inputs. It is designed for HR analytics, to support decisions about employee retention.
- Train Machine Learning Models: Train Logistic Regression, K-Nearest Neighbors, or SVM models on different datasets (normal, oversampled, or undersampled).
- Evaluate Models: View evaluation metrics, including train/test scores and a detailed classification report.
- Make Predictions: Predict an employee's likelihood of staying or leaving based on their features.
- User-Friendly Interface: Interact with the application through a clean and intuitive web interface.
- Python 3.7+
- Flask
- scikit-learn
- pandas
- numpy
- matplotlib
- joblib
- Clone the repository:
    git clone https://github.com/your-username/hr-analytics-predictor.git
    cd hr-analytics-predictor
- Create and activate a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Place your datasets (normal_data.csv, oversample.csv, undersample_data.csv) in the data folder.
- Run the application:
    python main.py
- Open your browser and navigate to: http://127.0.0.1:5000
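Before starting the server, it can help to confirm that the three dataset files are where the application expects them. The snippet below is a minimal sketch, not code from this repository: the `DATASET_FILES` mapping and the `dataset_path` and `missing_datasets` helpers are hypothetical names, and the pairing of each dataset type with a file name is an assumption inferred from the file names listed above.

```python
from pathlib import Path

# Assumed pairing of dataset types (as named in the UI) with the CSV files above.
DATASET_FILES = {
    "normal": "normal_data.csv",
    "oversampled": "oversample.csv",
    "undersampled": "undersample_data.csv",
}


def dataset_path(dataset_type, data_dir="data"):
    """Return the path of the CSV file for a dataset type (hypothetical helper)."""
    if dataset_type not in DATASET_FILES:
        raise ValueError(f"Unknown dataset type: {dataset_type!r}")
    return Path(data_dir) / DATASET_FILES[dataset_type]


def missing_datasets(data_dir="data"):
    """List the expected CSV file names that are not present in data_dir."""
    return [
        name
        for name in DATASET_FILES.values()
        if not (Path(data_dir) / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```

Run it from the project root so that the relative `data` path resolves correctly.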
hr-analytics-predictor/
├── main.py # Main Flask application
├── train.py # Handles model training and evaluation
├── predict.py # Handles predictions
├── templates/ # HTML templates for the web interface
├── data/ # Folder for datasets
├── static/ # Static files (CSS, JS, etc.)
└── requirements.txt # Python dependencies
- Navigate to the Train Models page.
- Select a dataset type (normal, oversampled, or undersampled).
- Choose a model (Logistic Regression, KNN, or SVM).
- Train the model and view its evaluation metrics.
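The training steps above can be sketched with scikit-learn as follows. This is an illustrative outline, not the contents of train.py: the `MODELS` mapping, the `train_and_evaluate` function, the hyperparameters, and the 80/20 split are all assumptions, and synthetic data stands in for the CSV files so the example runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hypothetical mapping from the model choices in the UI to estimator factories.
MODELS = {
    "logistic_regression": lambda: LogisticRegression(max_iter=1000),
    "knn": lambda: KNeighborsClassifier(n_neighbors=5),
    "svm": lambda: SVC(),
}


def train_and_evaluate(X, y, model_name, test_size=0.2, random_state=42):
    """Fit the chosen model; return train/test accuracy and a classification report."""
    if model_name not in MODELS:
        raise ValueError(f"Unknown model: {model_name!r}")
    # Stratify so both classes appear in the train and test splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    model = MODELS[model_name]()
    model.fit(X_train, y_train)
    return {
        "model": model,
        "train_score": model.score(X_train, y_train),
        "test_score": model.score(X_test, y_test),
        "report": classification_report(
            y_test, model.predict(X_test), target_names=["Stays", "Leaves"]
        ),
    }


# Synthetic stand-in data; the real application reads CSV files from data/.
X_demo, y_demo = make_classification(n_samples=200, n_features=11, random_state=0)
result = train_and_evaluate(X_demo, y_demo, "logistic_regression")
print(f"train={result['train_score']:.3f} test={result['test_score']:.3f}")
print(result["report"])
```

Comparing the train and test scores gives a quick signal of overfitting, while the classification report breaks precision and recall down per class, which matters when the "Leaves" class is the minority.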
- Navigate to the Predict page.
- Enter the employee's details (e.g., city development index, gender, experience).
- Submit the form to get the prediction result.
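The prediction step can be pictured as loading a previously trained model and mapping its output to a readable label. The sketch below assumes that models are persisted with joblib (it is listed among the dependencies, but the source does not say how it is used) and that form inputs have already been encoded as numbers. The `predict_label` helper, the two-feature toy data, and the file name are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Target encoding taken from the dataset description: 0 = Stays, 1 = Leaves.
LABELS = {0: "Stays", 1: "Leaves"}


def predict_label(model, features):
    """Return "Stays" or "Leaves" for one row of already-encoded numeric features."""
    row = np.asarray(features, dtype=float).reshape(1, -1)
    return LABELS[int(model.predict(row)[0])]


# Toy data with two numeric features (city development index, years of experience).
# Purely illustrative; the real datasets have more columns.
X = np.array([[0.90, 10], [0.92, 12], [0.60, 1], [0.55, 2]])
y = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X, y)

# Round-trip the model through joblib, as a saved model might be reloaded later.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.joblib"
    joblib.dump(model, path)
    loaded = joblib.load(path)

print(predict_label(loaded, [0.91, 11]))
```

Categorical inputs such as gender or education level would need the same encoding at prediction time that was applied during training.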
The project expects datasets in CSV format with the following columns:
- city_development_index
- gender
- relevant_experience
- enrolled_university
- education_level
- major_discipline
- experience
- company_size
- company_type
- last_new_job
- training_hours
- target (0: Stays, 1: Leaves)
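A quick header check can catch a malformed dataset before training. The following is a minimal sketch using only the standard library `csv` module so that it runs without the project's dependencies; `EXPECTED_COLUMNS` restates the column list above, and `check_columns` is a hypothetical helper, not part of the repository.

```python
import csv
import io

# Column names expected in every dataset, as listed above.
EXPECTED_COLUMNS = [
    "city_development_index",
    "gender",
    "relevant_experience",
    "enrolled_university",
    "education_level",
    "major_discipline",
    "experience",
    "company_size",
    "company_type",
    "last_new_job",
    "training_hours",
    "target",
]


def check_columns(csv_file):
    """Return (missing, unexpected) column names for an open CSV file object."""
    header = [name.strip() for name in next(csv.reader(csv_file), [])]
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    unexpected = [col for col in header if col not in EXPECTED_COLUMNS]
    return missing, unexpected


# In-memory example with a deliberately incomplete header.
sample = io.StringIO("city_development_index,gender,experience,target\n0.92,Male,5,0\n")
missing, unexpected = check_columns(sample)
print("Missing columns:", missing)
print("Unexpected columns:", unexpected)
```

For a file on disk, pass an open handle instead, for example `with open("data/normal_data.csv", newline="") as f: check_columns(f)`.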
- scikit-learn for machine learning tools
- Flask for the web framework
- Kaggle for providing the HR Analytics dataset