-
Introductions - Slide 1
-
Fist to five slide for Ryan - Slide 2
-
Machine Learning and Interactive App - Slides 8-12
-
Data preparation for machine learning (Slide 9)
- Data profiling with pandas_profiling to see how the various features affect the review score.
- Data resampling by undersampling, because positive reviews outnumber negative reviews.
- Converting categorical variables to numerical variables using a label encoder and a lambda function.
- Converting the target variable to binary using a lambda function.
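
The preparation steps above could look roughly like the sketch below. It is only an illustration: the toy data, the column names (payment_type, price, review_score) and the cutoff of 4 stars for a "positive" review are assumptions, not the project's actual schema or threshold.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; column names are assumptions for illustration only.
df = pd.DataFrame({
    "payment_type": ["card", "card", "voucher", "card",
                     "boleto", "card", "voucher", "card"],
    "price": [10.0, 25.5, 7.9, 120.0, 60.0, 15.0, 33.0, 80.0],
    "review_score": [5, 4, 1, 5, 2, 5, 4, 3],
})

# Binary target via a lambda: 1 = positive review, 0 = negative.
# The ">= 4" cutoff is an assumed threshold.
df["positive"] = df["review_score"].apply(lambda s: 1 if s >= 4 else 0)

# Label-encode a categorical column into integer codes.
df["payment_type_code"] = LabelEncoder().fit_transform(df["payment_type"])

# Undersample: keep as many rows from each class as the smaller class has.
n_min = df["positive"].value_counts().min()
balanced = pd.concat(
    [group.sample(n=n_min, random_state=0)
     for _, group in df.groupby("positive")]
)

print(balanced["positive"].value_counts())
```

After undersampling, both classes have the same number of rows, so a classifier cannot reach a high accuracy simply by always predicting "positive".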
-
Selection of Machine Learning Model (Slide 10)
- We selected the Random Forest classification model out of 5 models (Logistic Regression, KNN, Decision Tree, Random Forest, and ANN).
- Random Forest gave us the highest accuracy, 98%.
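
A comparison of this kind can be sketched as follows. This is a minimal illustration on synthetic data from scikit-learn, not the project's dataset or tuning; the hyperparameters are library defaults or arbitrary choices, MLPClassifier stands in for the ANN, and the scores it prints have no relation to the 98% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared review data (assumption).
X, y = make_classification(n_samples=400, n_features=8,
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# The five model families named on the slide.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "ANN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                         random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("Best model:", best)
```

Scoring every model on the same held-out split keeps the comparison fair; cross-validation would give a more stable ranking on a small dataset.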
-
Feature Selection and Interactive App (Slide 11)
- We used the Random Forest feature selection technique to select the features that affect the review score the most.
- We included zip code, price, payment value, freight cost, number of photos, shipping duration, and shipping delay or early arrival in our app.
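
One common way to do Random Forest feature selection is to rank features by the fitted model's feature_importances_ attribute. The sketch below assumes that approach; the column names and the synthetic data are hypothetical, and the target is deliberately built from the shipping delay alone so the ranking is easy to read.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Hypothetical features loosely named after those listed above.
X = pd.DataFrame({
    "price": rng.uniform(5, 500, n),
    "freight_cost": rng.uniform(1, 50, n),
    "num_photos": rng.integers(1, 6, n),
    "shipping_delay_days": rng.normal(0, 5, n),
})

# Synthetic target (assumption): a review is positive when the order
# arrives on time or early, i.e. the delay is not positive.
y = (X["shipping_delay_days"] <= 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means more influential.
importances = pd.Series(forest.feature_importances_,
                        index=X.columns).sort_values(ascending=False)
top_features = importances.head(2).index.tolist()

print(importances)
```

Impurity-based importances can favor high-cardinality numeric features, so permutation importance is a reasonable cross-check before fixing the list of inputs for an app.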
-
Recommendations: (Slide 12)
- Provide accurate estimations of the delivery times.
- Keep the customer updated.
- Reduce the delivery times.
app_video.mp4
-