Guidance by: Pradeep Tripathi
KAGGLE DATASET: https://www.kaggle.com/datasets/mkechinov/ecommerce-behavior-data-from-multi-category-store
File Name: 2019-Nov.csv
File Size: 8 GB
Since data analysis and visualization are my strengths, I volunteered to do the data cleaning and preprocessing in PySpark. Head over to PySpark.py to check my code! Handling such a large dataset was fun and a real learning experience!