ML Model to estimate the shipping price of an order, based on one e-commerce dataset + Deploy with FastAPI, Docker and AWS

FelipeMezzarana/ShippingPriceEstimate

Shipping Price Estimate

On an e-commerce site, the shipping price can determine whether a user completes a purchase. Delivery is handled by third parties, and because the price depends on many factors, a tool that calculates it automatically is not always available. In that situation, a model that accurately predicts the freight to be charged can help lower the amounts charged and, in turn, improve purchase conversion, generating value for the business.

The goal of this project is therefore to build a machine learning model that estimates the freight price of an online order. We work with a public e-commerce dataset. It has nine tables with information about the product, seller, customer, and so on, covering more than 100k orders placed from 2016 to 2018 at multiple marketplaces in Brazil.

We split the project into two notebooks so that neither one becomes excessively large.

The first notebook, ‘creating_and_treating_df’, covers:

  • Selecting the relevant information from the database
  • Merging everything into a single DataFrame
  • Treating missing values
  • Initial feature selection
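
The steps above can be sketched with pandas as follows. This is only an illustration on tiny in-memory tables, not the notebook's actual code: the column names (`product_weight_g`, `seller_state`, and so on), the choice of a left join, and the use of median imputation are all assumptions, and `build_base_df` is a hypothetical helper name.

```python
import pandas as pd

# Tiny stand-ins for three of the dataset tables (column names are assumptions).
order_items = pd.DataFrame({
    "order_id": ["o1", "o2", "o3"],
    "product_id": ["p1", "p2", "p3"],
    "seller_id": ["s1", "s1", "s2"],
    "freight_value": [12.5, 8.0, 20.0],
})
products = pd.DataFrame({
    "product_id": ["p1", "p2", "p3"],
    "product_weight_g": [500.0, None, 1200.0],  # one missing weight
})
sellers = pd.DataFrame({
    "seller_id": ["s1", "s2"],
    "seller_state": ["SP", "RJ"],
})


def build_base_df(order_items, products, sellers):
    """Hypothetical helper: merge the tables, impute, and keep a few columns."""
    # Merge everything into a single DataFrame, keeping every order item.
    df = order_items.merge(products, on="product_id", how="left")
    df = df.merge(sellers, on="seller_id", how="left")

    # Treat missing values: median imputation is an assumption made here.
    median_weight = df["product_weight_g"].median()
    df["product_weight_g"] = df["product_weight_g"].fillna(median_weight)

    # Initial feature selection: keep only the columns used downstream.
    return df[["order_id", "product_weight_g", "seller_state", "freight_value"]]


base_df = build_base_df(order_items, products, sellers)
print(base_df)
```

In the real project the inputs would come from the CSV files in `olist_database_files`, and the result would presumably be what is saved as `generated_files/olist_shipping_base_df.csv`.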

The second notebook, ‘shipping_price_estimate’, covers:

  • Feature engineering
  • Feature selection
  • Dealing with outliers
  • Establishing metrics and a baseline
  • Hyperparameter tuning with grid search (XGBoost and RandomForest)
  • Model selection: comparing three models (XGBoost, RandomForest and Linear Regression)
  • Overfitting/underfitting analysis with learning curves
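
To illustrate the "metrics and baseline" step, here is a minimal sketch using only the standard library. It assumes MAE and RMSE as the metrics and a mean-of-training-set predictor as the baseline. The notebook may use different choices, and the freight values below are made up for the example.

```python
from math import sqrt
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


# Made-up freight values, for illustration only.
train_freight = [10.0, 12.0, 14.0, 20.0]
test_freight = [11.0, 17.0]

# Baseline: always predict the mean freight seen in the training data.
baseline_value = mean(train_freight)
baseline_pred = [baseline_value] * len(test_freight)

baseline_mae = mae(test_freight, baseline_pred)
baseline_rmse = rmse(test_freight, baseline_pred)

print(f"baseline prediction: {baseline_value:.2f}")
print(f"baseline MAE:  {baseline_mae:.2f}")
print(f"baseline RMSE: {baseline_rmse:.2f}")
```

Any tuned model (XGBoost, RandomForest or Linear Regression) is only worth keeping if it beats these baseline numbers on the same held-out data.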

Project Structure

│   .gitignore
│   creating_and_treating_df.ipynb
│   README.md
│   shipping_price_estimate.ipynb
│
├───API
│   │   API_test.ipynb
│   │   dockerfile
│   │   requirements.txt
│   │   server.py
│   │   shipping_estimate_model.pkl
│
├───generated_files
│       olist_shipping_base_df.csv
│       shipping_estimate_model.pkl
│
└───olist_database_files
        olist_customers_dataset.csv
        olist_geolocation_dataset.csv
        olist_orders_dataset.csv
        olist_order_items_dataset.csv
        olist_order_payments_dataset.csv
        olist_order_reviews_dataset.csv
        olist_products_dataset.csv
        olist_sellers_dataset.csv
