The 11.11 sale was running on Daraz, with discounts offered on most products. From each category, we go to its sub-category pages, fetch the top-selling product data from the first 3 pages, and prepare a dataset. After that, we apply data transformation and analyze the data using Tableau.
I collected all the data by web scraping with Selenium. You can find all the scraper files and the scraped data in the "scrapers" folder. First, I collected category-wise product data in separate CSV files and then merged them into one. While scraping, I missed the sub-category names and stored their URLs instead, so I collected the names in separate CSV files in order to replace the URLs with names. The following table gives an overview of the initially collected data.
File Name | Dataset Type | File Extension | Rows | Columns |
---|---|---|---|---|
merged_data.csv | Tabular | csv | 12907 | 14 |
subcategories.csv | Tabular | csv | 119 | 3 |
File Name | Dataset Type | File Extension | Rows | Columns |
---|---|---|---|---|
Top_Selling_Product_Data.csv | Tabular | csv | 12907 | 15 |
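The merge and URL-to-name steps described above could look roughly like the sketch below. It is not the code from the "scrapers" folder: it uses only Python's standard `csv` module, and the column names `subcategory_url` and `subcategory_name` are hypothetical placeholders for whatever the scraped files actually contain. It also assumes every per-category file shares the same header, and it adds the name as a new column rather than overwriting the URL.

```python
import csv


def merge_csv_files(csv_paths, merged_path):
    """Concatenate CSV files that share the same header into one file."""
    if not csv_paths:
        raise ValueError("no input files given")
    rows = []
    fieldnames = None
    for path in csv_paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            if fieldnames is None:
                fieldnames = reader.fieldnames  # header of the first file
            rows.extend(reader)
    with open(merged_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def add_subcategory_names(merged_path, lookup_path, output_path):
    """Add a sub-category name column by looking up each row's URL.

    Column names are assumptions made for this illustration.
    """
    with open(lookup_path, newline="", encoding="utf-8") as f:
        url_to_name = {
            row["subcategory_url"]: row["subcategory_name"]
            for row in csv.DictReader(f)
        }
    with open(merged_path, newline="", encoding="utf-8") as src, \
            open(output_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(
            dst, fieldnames=list(reader.fieldnames) + ["subcategory_name"]
        )
        writer.writeheader()
        for row in reader:
            # Unknown URLs get an empty name so they are easy to spot later.
            row["subcategory_name"] = url_to_name.get(row["subcategory_url"], "")
            writer.writerow(row)
```

Rows whose URL has no match end up with an empty name, which makes gaps in the lookup files easy to find before moving on to the analysis.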
Analysis Requirements Blueprint
- How many sub-categories does each category have?
- Among all the products, how many are offered with standard delivery and how many with free delivery?
- How many Flagship and General Stores are there? Which type is more numerous?
- Which categories are offered with the highest and the lowest discounts?
- Which product has the highest discount? What were its original and discounted prices? How much do the discounted prices scatter from the original prices?
- Considering all types of ratings, which product has the highest number of ratings among all the categories and also sub-categories?
- How many unique products are sold by each seller? Which seller sells the largest number of products by category and sub-category?
- Does one seller sell its products in different categories?
- Do these 3 factors:
- Chat Response Rate(%),
- Positive Seller Ratings(%)
- & Ship On Time(%) affect whether a seller is a flagship store?
- Among the categories, which sub-categories are offered with the highest & the lowest discounts?
- The higher the discount, the more products the seller has to sell to reach a break-even point equal to the original price of 50 products. Is it true?
- The number of sub-categories within a category ranges from 7 to 12.
- Mother & Baby
- TV & Home Appliances
- Watches, Bags & Jewellery
have the most sub-categories.
- Daraz has more non-flagship stores than flagship ones.
- Daraz offered more products with free delivery than with standard delivery.
- By category, the highest discount was offered in Men's & Boy's Fashion and the lowest in Groceries.
- Top 5 highest discounts across all categories (products sharing the same discount are listed together) -
Product Name | Discount(%) |
---|---|
(11 Taka Deal) Dancing cactus talking cactus Stuffed Plush Toy Electronic toy with song plush cactus potted toy Early Education Toy For kids | 98 |
New Style Leather feragamo Belt For Men | 92 |
Black Leather Formal Belt For Men | 92 |
Men's Pu Leather Wallet High Quality Men Long Wallet Male Business Pu Leather Purse | 91 |
Black High quality Leather Long Wallet For Men | 91 |
Je-ep Chocolate Artificial Leather Long Wallet for Men | 90 |
Grey Regular Fit China Cotton Golf Cap For Men - Cap For Men - Cap - Winter Cap | 90 |
Furdani Stylish High quality Artificial Leather wallet for men | 90 |
Canvas Wild Polyester Belts For Men - Belt For Men | 90 |
Superb Indispensable -Upscale Living -Black Color Cotton DJ Cap for Men- Inventive Choice Remarkable - Disclose Styles & Luxe | 89 |
Men Wallets Men Jeep Wallet with Coin Bag Small Money Purses New Design Dollar Slim Purse Money Clip Wallet | 89 |
Jeep Long High quality Artificial Leather wallet For Men | 89 |
Jeep Chocolate High quality High Capacity Artificial Leather Long Wallet for Man | 89 |
Jeep Black Long Artificial Leather Wallet for men | 89 |
High quality Artificial Leather belt for men | 89 |
11 Taka Deal Joya Ultra Comfort Wings - 8 Pads Pack - pad | 89 |
- There were many products where no discount was offered.
- Top 5 products with the highest number of ratings across all categories -
Product Name | Number of Ratings |
---|---|
Diamond potato 1kg ± 25 gm | 11914 |
S8 Ultra Smart Watch Series 8 Ultra Men Women Bluetooth Call Wireless Charging Fitness Bracelet 1.95/1.44 Inch HD Screen | 8908 |
Local Onion 1 kg (± 25 gm) - onion | 6280 |
QKZ DM10 Zinc Alloy HiFi Earphones | 5430 |
Imported Onion 1 kg | 4868 |
- Although the categories contain a wide variety of products, potatoes and onions are among the products with the most ratings.
- Top 5 sellers by number of distinct products sold -
Seller Name | Product Count |
---|---|
Daraz Fresh | 114 |
SWAP | 101 |
Well-being Distribution Ltd. | 66 |
FOGG Bangladesh | 62 |
Unilever | 61 |
- There are many sellers who sell products in different categories and also in various sub-categories.
- A seller's average
- Chat Response Rate(%)
- Positive Seller Ratings(%)
- Ship On Time(%)
do not determine whether it is a flagship seller.
- There is a relationship between the offered discount and the number of products that must be sold to reach the break-even point: the higher the discount, the more products must be sold to reach a break-even point equal to the original price of 50 products.
- Most discounted sub-category within each category -
Category | Sub-category | Discount(%) |
---|---|---|
Men's & Boy's Fashion | Accessories | 67.52 |
Watches, Bags & Jewellery | Shop Men's Bags Online in Bangladesh | 61.55 |
Electronic Device | Mobile Accessories | 51.18 |
Electronic Accessories | Audio | 49.89 |
Mother & Baby | Sports & Outdoor Play | 46.18 |
Sports & Outdoors | Shoes & Clothing | 46.36 |
Automotive & Motorbike | Interior Accessories | 44.84 |
TV & Home Appliances | TV & Video Accessories | 41.11 |
Health & Beauty | Beauty Tools | 38.64 |
Groceries | Cooking Ingredients | 33.64 |
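The break-even finding above can be made concrete with a little arithmetic. The sketch below assumes one particular reading of "break-even": the seller wants revenue equal to selling 50 units at the original price, so at a discount of d percent the number of units needed is 50 / (1 - d/100), rounded up. That reading, and the function name `units_to_break_even`, are assumptions for illustration; the Tableau analysis may define it differently.

```python
import math
from fractions import Fraction


def units_to_break_even(discount_percent, baseline_units=50):
    """Units to sell at a discount to match revenue from baseline_units at full price.

    Assumed definition: units = baseline_units / (1 - discount/100), rounded up.
    Fractions are used so that exact cases such as 90% are not bumped up
    by floating-point rounding.
    """
    d = Fraction(str(discount_percent))
    if not 0 <= d < 100:
        raise ValueError("discount_percent must be in [0, 100)")
    return math.ceil(baseline_units * 100 / (100 - d))


for discount in (0, 50, 90, 98):
    print(discount, units_to_break_even(discount))
# 0 50
# 50 100
# 90 500
# 98 2500
```

Under this assumption the required volume grows slowly at first and then very steeply: a 98% discount, the highest in the dataset, would need 50 times the baseline volume.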
- Clone the repo
git clone https://github.com/Neloy-Barman/Daraz-11.11-Top-Selling-Product-Data-Analysis.git
- Create & activate virtual Environment
virtualenv venv
source venv/bin/activate
- Install dependencies
pip install -r requirements.txt
- Run the product data scraper
python product_data_scraper.py
- Run the subcategories data scraper
python subcategory_scraper.py
- In my case, the scrapers stopped after running for hours. I then had to find the last scraped index and restart from there.
- While the scraping loop runs, some data may simply be missed. On inspection, the affected pages contain the same elements as the others, yet their data may still be skipped.
- Because there are many categories and sub-categories, scraping the data took a long time.
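One way to soften the first two limitations is to checkpoint progress and record misses, so that a restart does not require finding the last scraped index by hand. The sketch below is not part of the repository's scrapers. `run_resumable`, `scrape_one`, `save_record` and the JSON checkpoint file are all hypothetical names, and `scrape_one` stands in for whatever Selenium logic fetches one page and returns a record, or `None` when nothing was found.

```python
import json
from pathlib import Path


def run_resumable(items, scrape_one, save_record, checkpoint_path, max_retries=2):
    """Process items in order, resuming from the index stored in checkpoint_path.

    scrape_one(item) returns a record, or None if the data was missed.
    save_record(record) should persist the record immediately (e.g. append to a CSV).
    Returns the indices that still produced no data after all retries.
    """
    checkpoint = Path(checkpoint_path)
    state = {"next_index": 0, "missed": []}
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text(encoding="utf-8"))

    for index in range(state["next_index"], len(items)):
        record = None
        # Try once, then retry a few times if nothing came back.
        for _ in range(1 + max_retries):
            record = scrape_one(items[index])
            if record:
                break
        if record:
            save_record(record)
        else:
            state["missed"].append(index)  # revisit these later
        # Persist progress after every item so a crash loses at most one item.
        state["next_index"] = index + 1
        checkpoint.write_text(json.dumps(state), encoding="utf-8")

    return state["missed"]
```

If the process dies partway through, calling `run_resumable` again with the same checkpoint file continues from the first unfinished item, and the returned list of missed indices shows which pages to re-scrape.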