This project focuses on data cleaning, exploration, and analysis of product information from the Zepto dataset using SQL. It provides actionable insights into pricing, stock availability, discount strategies, and category-level performance.
📂 Dataset Source: The dataset was obtained from Kaggle, contributed by Palvinder.
- Explore and clean raw product data to ensure accuracy and consistency.
- Analyze discount trends, pricing strategies, and stock status.
- Derive insights on product performance, revenue, and value metrics.
- Created the `zepto` table with detailed product-level fields.
- Checked for null values, duplicates, and anomalies.
- Checked product availability (in-stock vs out-of-stock).
- Removed invalid records where MRP = 0.
- Converted price data from paise to rupees for consistency.
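
The cleaning steps above could look roughly like the following in PostgreSQL. This is a minimal sketch, not the project's actual script. Only `name`, `category`, `mrp`, `discountPercent`, `discountedSellingPrice`, and `availableQuantity` appear in this README; the `sku_id`, `weightInGms`, and `outOfStock` columns and all column types are assumptions made for illustration.

```sql
-- Hypothetical schema: sku_id, weightInGms, outOfStock and every
-- column type below are assumed, not taken from the project.
CREATE TABLE zepto (
    sku_id                 SERIAL PRIMARY KEY,
    category               VARCHAR(120),
    name                   VARCHAR(150) NOT NULL,
    mrp                    NUMERIC(8,2),
    discountPercent        NUMERIC(5,2),
    availableQuantity      INTEGER,
    discountedSellingPrice NUMERIC(8,2),
    weightInGms            INTEGER,
    outOfStock             BOOLEAN
);

-- Rows with a missing value in any of the key columns.
SELECT *
FROM zepto
WHERE name IS NULL
   OR category IS NULL
   OR mrp IS NULL
   OR discountPercent IS NULL
   OR discountedSellingPrice IS NULL
   OR availableQuantity IS NULL;

-- Product names listed more than once (possible duplicates).
SELECT name, COUNT(*) AS listings
FROM zepto
GROUP BY name
HAVING COUNT(*) > 1
ORDER BY listings DESC;

-- In-stock vs out-of-stock counts (outOfStock is an assumed column).
SELECT outOfStock, COUNT(*) AS products
FROM zepto
GROUP BY outOfStock;

-- Remove invalid records where MRP = 0.
DELETE FROM zepto
WHERE mrp = 0;

-- Convert paise to rupees. Run this only once: a second run
-- would divide the already-converted values by 100 again.
UPDATE zepto
SET mrp = mrp / 100.0,
    discountedSellingPrice = discountedSellingPrice / 100.0;
```

PostgreSQL folds unquoted identifiers to lowercase, so `discountPercent` and `discountpercent` refer to the same column here.
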
| Query | Description |
|---|---|
| Q1 | Top 10 best-value products based on discount percentage. |
| Q2 | High-MRP products that are out of stock. |
| Q3 | Estimated revenue generated by each category. |
| Q4 | Premium products (MRP > ₹500) with minimal discounts (<10%). |
| Q5 | Top 5 categories offering the highest average discounts. |
| Q6 | Price-per-gram calculation to determine best-value items. |
| Q7 | Weight-based classification: Low, Medium, Bulk. |
| Q8 | Total inventory weight per category. |
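
Several of the queries in the table could be sketched as follows. These are illustrative versions under stated assumptions, not the project's exact queries: the `outOfStock` and `weightInGms` columns are assumed, and the ₹300 "high MRP" threshold, the 100 g cutoff, and the 1 kg and 5 kg weight bands are placeholder values chosen for the example. Prices are assumed to be in rupees already.

```sql
-- Q2: high-MRP products that are out of stock
-- (outOfStock column and the 300 threshold are assumptions).
SELECT DISTINCT name, mrp
FROM zepto
WHERE outOfStock = TRUE
  AND mrp > 300
ORDER BY mrp DESC;

-- Q4: premium products (MRP > 500) with discounts below 10%.
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
WHERE mrp > 500
  AND discountPercent < 10
ORDER BY mrp DESC, discountPercent DESC;

-- Q5: top 5 categories by average discount.
SELECT category,
       ROUND(AVG(discountPercent)::numeric, 2) AS avg_discount
FROM zepto
GROUP BY category
ORDER BY avg_discount DESC
LIMIT 5;

-- Q6: price per gram (weightInGms is an assumed column; the
-- 100 g cutoff is a placeholder and also avoids division by zero).
SELECT DISTINCT name, weightInGms, discountedSellingPrice,
       ROUND(discountedSellingPrice::numeric / weightInGms, 2) AS price_per_gram
FROM zepto
WHERE weightInGms >= 100
ORDER BY price_per_gram;

-- Q7: weight-based classification (band boundaries are assumptions).
SELECT DISTINCT name, weightInGms,
       CASE
           WHEN weightInGms < 1000 THEN 'Low'
           WHEN weightInGms < 5000 THEN 'Medium'
           ELSE 'Bulk'
       END AS weight_category
FROM zepto
WHERE weightInGms IS NOT NULL;

-- Q8: total inventory weight per category, in grams.
SELECT category,
       SUM(weightInGms * availableQuantity) AS total_weight
FROM zepto
GROUP BY category
ORDER BY total_weight DESC;
```

The `WHERE weightInGms IS NOT NULL` filter in Q7 keeps rows with an unknown weight from falling through to the `ELSE 'Bulk'` branch.
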
- High discounts highlight best-value products that attract customers.
- Premium items (>₹500) typically offer lower discounts, maintaining brand value.
- Bulk and medium-weight items dominate total inventory weight.
- High-MRP products going out of stock indicate strong customer demand.
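
One way to check the premium-pricing observation is to compare average discounts across price bands. The query below is a rough sketch that assumes `mrp` has already been converted to rupees and reuses the ₹500 premium threshold from Q4; the band labels are made up for the example.

```sql
-- Average discount for premium vs. standard products.
SELECT CASE
           WHEN mrp > 500 THEN 'Premium (> 500)'
           ELSE 'Standard (<= 500)'
       END AS price_band,
       COUNT(*) AS products,
       ROUND(AVG(discountPercent)::numeric, 2) AS avg_discount
FROM zepto
GROUP BY 1
ORDER BY avg_discount;
```

If the premium band shows a clearly lower `avg_discount`, that is consistent with the insight above; it does not by itself explain why the discounts differ.
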
- Language: SQL
- Database: PostgreSQL
- Focus Areas:
  - Data Cleaning
  - Aggregation
  - Analytical Querying
  - Business Insight Generation
-- Q1: Top 10 Best-Value Products
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
ORDER BY discountPercent DESC
LIMIT 10;
-- Q3: Estimated Revenue by Category
SELECT category,
SUM(discountedSellingPrice * availableQuantity) AS total_revenue
FROM zepto
GROUP BY category
ORDER BY total_revenue;

Narasimman S 📍 Chennai | Data Analyst | SQL & Data Science Enthusiast