This project aims to build a recommender system for a grocery store that predicts whether a user will reorder a product and provides product recommendations. It includes a machine learning model for reorder prediction, product popularity-based recommendations, and a Flask API for making real-time predictions.
- Predict if a user reorders from the grocery store.
- Predict if the user reorders the same product based on their purchase history.
- Recommend the most popular products to users based on purchase history.
- Deploy a Flask API using streamlit for users to interact with the system.
- Python
- Pandas
- NumPy
- Streamlit
- Scikit-learn
- XGBoost
- Flask
Built a machine learning model for reorder prediction. The following steps were taken:
- Performed feature engineering to create relevant features for reorder prediction.
- Encoded categorical features using
TargetEncoder
- Split the data into training and testing sets.
For reorder prediction and buying the same product again prediction, we used a gradient boosted decision tree model (XGBoost) since it typically performs well for classification problems with stratified kfold with 5 splits cross validation to reduce the imbalance of the target variable in the dataset.
The model was evaluated using ROC AUC as the primary evaluation metric. Here are the ROC AUC results for each fold and the mean ROC AUC:
- Fold 0: ROC AUC = 0.7575
- Fold 1: ROC AUC = 0.7559
- Fold 2: ROC AUC = 0.7574
- Fold 3: ROC AUC = 0.7597
- Fold 4: ROC AUC = 0.7559
- Mean ROC AUC = 0.7573
The results indicate that the model is effective in predicting whether a user will reorder a specific product.
built a model to predict whether a user will reorder the same product. Here are the ROC AUC results for each fold and the mean ROC AUC:
- Fold 0: ROC AUC = 0.7555
- Fold 1: ROC AUC = 0.7576
- Fold 2: ROC AUC = 0.7597
- Fold 3: ROC AUC = 0.7564
- Fold 4: ROC AUC = 0.7569
- Mean ROC AUC = 0.7572
These results suggest that the model is reasonably effective in predicting whether a user will reorder the same product.
roduct recommendations based on popularity:
- Calculate the popularity of each product.
- Filter out products the user has already purchased.
- Sort products by popularity.
- Recommend the top N products.
Install the venv and dependencies:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
To run the Flask API, execute the following command:
python api.py
A Streamlit web app is available for users to explore the recommender system interactively. It can be run using the following command:
streamlit run gui.py
It will open up a web browser where users can input their purchase history and receive reorder predictions and product recommendations.
here is the democode for the API usage: