Skip to content

This project focuses on data cleaning, exploration, and analysis of product information from the Zepto dataset using SQL. It provides actionable insights into pricing, stock availability, discounts, and category-level performance.

License

Notifications You must be signed in to change notification settings

S-Narasimman/Zepto_Inventory_SQL_Data_Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

🛒 Zepto Data Analysis Using SQL

This project focuses on data cleaning, exploration, and analysis of product information from the Zepto dataset using SQL. It provides actionable insights into pricing, stock availability, discount strategies, and category-level performance.

📂 Dataset Source: The dataset was obtained from Kaggle, contributed by Palvinder.


📊 Key Objectives

  • Explore and clean raw product data to ensure accuracy and consistency.
  • Analyze discount trends, pricing strategies, and stock status.
  • Derive insights on product performance, revenue, and value metrics.

🧩 SQL Operations Performed

1️⃣ Table Creation & Data Exploration

  • Created the zepto table with detailed product-level fields.
  • Verified null values, duplicates, and anomalies.
  • Checked product availability (in-stock vs out-of-stock).

2️⃣ Data Cleaning

  • Removed invalid records where MRP = 0.
  • Converted price data from paise to rupees for consistency.

3️⃣ Data Analysis & Insights

Query Description
Q1 Top 10 best-value products based on discount percentage.
Q2 High-MRP products that are out of stock.
Q3 Estimated revenue generated by each category.
Q4 Premium products (MRP > ₹500) with minimal discounts (<10%).
Q5 Top 5 categories offering the highest average discounts.
Q6 Price-per-gram calculation to determine best-value items.
Q7 Weight-based classification: Low, Medium, Bulk.
Q8 Total inventory weight per category.

💡 Key Insights

  • High discounts highlight best-value products that attract customers.
  • Premium items (>₹500) typically offer lower discounts, maintaining brand value.
  • Bulk and medium-weight items dominate total inventory weight.
  • High-MRP products going out of stock indicate strong customer demand.

🧠 Tech Stack

  • Language: SQL

  • Database: PostgreSQL

  • Focus Areas:

    • Data Cleaning
    • Aggregation
    • Analytical Querying
    • Business Insight Generation

🧾 Example Queries

-- Q1: Top 10 Best-Value Products
SELECT DISTINCT name, mrp, discountPercent
FROM zepto
ORDER BY discountPercent DESC
LIMIT 10;

-- Q3: Estimated Revenue by Category
SELECT category,
SUM(discountedSellingPrice * availableQuantity) AS total_revenue
FROM zepto
GROUP BY category
ORDER BY total_revenue;

🧑‍💻 Author

Narasimman S 📍 Chennai | Data Analyst | SQL & Data Science Enthusiast


About

This project focuses on data cleaning, exploration, and analysis of product information from the Zepto dataset using SQL. It provides actionable insights into pricing, stock availability, discounts, and category-level performance.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published