For a full explanation, please read my Medium article, CRISP-DM Methodology With Python (Model Deployment Using Flask Included) | Classification Case Study Using KNN Model.
The e-commerce platform struggles to predict customer ratings accurately, which can reduce customer satisfaction, lower sales, and damage brand reputation. Although customer rating data is available, the current system cannot analyze or interpret it in a meaningful way, so its predictions are inaccurate. A more capable machine learning model is needed to predict customer ratings from this data with high accuracy, which would support improved customer satisfaction, increased sales, and a positive brand reputation.
The goal is to predict customer ratings from specific order details with an accuracy of at least 80% within a year. This will help businesses make data-driven decisions about their products, marketing strategies, and customer service.
The dataset is the E-Commerce Shipping Data dataset from Kaggle.
To determine whether a customer will give a rating of 1, ratings 2 to 5 are mapped to 0, which represents any rating other than 1. The task then becomes a binary classification problem instead of a multiclass one.
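The mapping described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the helper name `to_binary_target` and the sample `ratings` list are hypothetical, and the sketch assumes ratings are integers from 1 to 5.

```python
def to_binary_target(rating: int) -> int:
    """Return 1 for a rating of 1, and 0 for ratings 2 through 5."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"Unexpected rating: {rating!r}")
    return 1 if rating == 1 else 0


# Hypothetical sample of customer ratings (not taken from the dataset).
ratings = [1, 2, 3, 4, 5, 1]

# Rating 1 stays 1; every other rating collapses to 0.
binary_target = [to_binary_target(r) for r in ratings]
print(binary_target)  # [1, 0, 0, 0, 0, 1]
```

If the ratings live in a pandas column, the same rule can be applied with `Series.map` and a dictionary such as `{1: 1, 2: 0, 3: 0, 4: 0, 5: 0}`.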
The final model is a k-Nearest Neighbors (KNN) classifier. It achieved an accuracy score of about 77.60% on the testing data. I also created a Flask application with a proper frontend and UI that can be run on a local computer (as shown in the video below).
Customer.Rating.Prediction.-.Google.Chrome.2023-04-10.16-57-24.online-video-cutter.com.mp4
- Use other classification machine learning algorithms, such as support vector machine, random forest classifier, and logistic regression.
- Use Randomized Search or Grid Search to tune the hyperparameters and check whether the accuracy score improves.
- Handle the class imbalance using SMOTE, SMOTE-TOMEK, ADASYN, or SMOTE-ENN techniques.
- Try building a multiclass classification model with this dataset, in which case ratings 2 to 5 do not need to be mapped to 0 before training. A multiclass model predicts the specific rating (1, 2, 3, 4, or 5) a customer will give, rather than only whether the customer will give rating 1.
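
As one possible starting point for the hyperparameter tuning suggestion above, the sketch below runs a grid search over a KNN classifier with scikit-learn. It is a hedged illustration under stated assumptions: the data is synthetic (generated with `make_classification` as a stand-in for the real features and binary target), and the scaling step, the parameter grid, and the 80/20 split are illustrative choices, not the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): replace with the real features and target.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# KNN is distance-based, so features are scaled before the classifier.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Illustrative grid (assumption): number of neighbors and weighting scheme.
param_grid = {
    "knn__n_neighbors": [3, 5, 7, 9, 11],
    "knn__weights": ["uniform", "distance"],
}

# 5-fold cross-validated search, scored by accuracy.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

`RandomizedSearchCV` from the same module can be swapped in when the grid becomes too large to search exhaustively.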