A complete PostgreSQL-based SQL project using a real-world e-commerce inventory dataset from Zepto, one of Indiaβs fastest-growing quick-commerce startups. This project demonstrates end-to-end data analytics skills β from data exploration to business-focused insights β using real analyst techniques like CASE, JOINS, CTEs, and Subqueries.
- β Realistic SQL project with retail use-case
 - π§Ή Data cleaning & transformation (e.g. paise β rupees)
 - π Business insights: revenue, stock status, product value
 - π§  Uses advanced SQL techniques: 
CASE,JOIN,CTEs,Subqueries - πΌ Perfect for interview prep, portfolio building, or SQL learning
 
- Source: Kaggle (scraped from Zeptoβs mobile app)
 - Real-world e-commerce catalog structure
 - Duplicate product names with different packaging/weights
 - Each row = unique SKU (Stock Keeping Unit)
 
| Column | Description | 
|---|---|
| sku_id | Unique product ID | 
| name | Product name | 
| category | Product category (e.g. Fruits, Snacks) | 
| mrp | Max Retail Price (βΉ) | 
| discountPercent | % discount on MRP | 
| discountedSellingPrice | Final discounted price | 
| availableQuantity | Inventory units available | 
| weightInGms | Product weight (grams) | 
| outOfStock | Boolean - Is the product out of stock? | 
| quantity | No. of units per SKU | 
- Removed products with MRP or price = 0
 - Converted 
mrp&discountedSellingPricefrom paise to rupees - Handled null values and inconsistent entries
 
- Counted total rows
 - Identified distinct categories
 - Compared stock vs out-of-stock
 - Detected repeated product names (same product, different SKU)
 
- Top discounted products
 - High MRP but out-of-stock products
 - Revenue by category
 - Expensive products with low discount
 - Avg. discount by category
 - Price-per-gram value ranking
 
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
ORDER BY discountPercent DESC
LIMIT 10;
SELECT DISTINCT name, mrp 
FROM zepto
WHERE outOfStock = TRUE AND mrp > 300
ORDER BY mrp DESC;
SELECT category,
SUM(discountedSellingPrice * availableQuantity) AS total_revenue
FROM zepto
GROUP BY category
ORDER BY total_revenue DESC;
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
WHERE mrp > 500 AND discountPercent < 10
ORDER BY mrp DESC, discountPercent DESC;
SELECT category,
ROUND(AVG(discountPercent), 2) AS avg_discount
FROM zepto
GROUP BY category 
ORDER BY avg_discount DESC
LIMIT 5;
SELECT DISTINCT name, weightInGms, discountedSellingPrice,
ROUND(discountedSellingPrice / weightInGms, 2) AS price_per_gram
FROM zepto
WHERE weightInGms >= 100
ORDER BY price_per_gram;
SELECT DISTINCT name, weightInGms,
  CASE 
    WHEN weightInGms < 1000 THEN 'Low'
    WHEN weightInGms > 5000 THEN 'Medium'
    ELSE 'Bulk'
  END AS weight_category
FROM zepto;
SELECT category,
SUM(weightInGms * availableQuantity) AS total_weight
FROM zepto
GROUP BY category
ORDER BY total_weight DESC;