k178412/sql-data-warehouse-project

SQL Data Warehouse Project

A hands-on data warehouse project built on SQL Server, covering ETL processes and data modeling.


🏗️ Data Architecture

This project follows the Medallion Architecture, which splits the data pipeline into three layers (Bronze, Silver, and Gold) for clarity, maintainability, and scalability.

[Diagram: data architecture]

  1. Bronze Layer - Stores raw data exactly as received from source systems.
  2. Silver Layer - Cleans and transforms data for consistency, applying standardization and normalization.
  3. Gold Layer - Contains business-ready data, optimized for reporting and insights.
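The three layers above can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs anywhere; the table names, columns, and sample rows are hypothetical and are not taken from this repository's scripts.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Bronze: raw rows stored exactly as received, including untrimmed text,
# inconsistent casing, and a duplicate (hypothetical table and data).
cur.execute("CREATE TABLE bronze_customers (id TEXT, name TEXT, country TEXT)")
cur.executemany(
    "INSERT INTO bronze_customers VALUES (?, ?, ?)",
    [
        ("1", "  Alice ", "us"),
        ("2", "Bob", "US"),
        ("2", "Bob", "US"),  # duplicate row from the source
        ("3", "Cara", "de"),
    ],
)

# Silver: cleaned and standardized (typed key, trimmed names,
# upper-cased country codes, duplicates removed).
cur.execute(
    """
    CREATE TABLE silver_customers AS
    SELECT DISTINCT
        CAST(id AS INTEGER)  AS customer_id,
        TRIM(name)           AS customer_name,
        UPPER(TRIM(country)) AS country
    FROM bronze_customers
    """
)

# Gold: a business-ready view built on top of the Silver table.
cur.execute(
    """
    CREATE VIEW gold_customers_by_country AS
    SELECT country, COUNT(*) AS customer_count
    FROM silver_customers
    GROUP BY country
    """
)

rows = cur.execute(
    "SELECT country, customer_count FROM gold_customers_by_country ORDER BY country"
).fetchall()
print(rows)  # [('DE', 1), ('US', 2)]
```

Each layer reads only from the one before it, so a cleaning rule can be changed in Silver without touching the raw Bronze data.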

🔍 Project Overview

This project shows the full data warehouse lifecycle, from source data ingestion to business-ready data models.
Key components include:

  1. 🧱 Data Architecture - Designing a structured data warehouse using Medallion Architecture.
  2. 🔄 ETL Pipelines - Extracting, transforming, and loading data using SQL scripts.
  3. 🧮 Data Modeling - Creating fact and dimension tables for optimized querying and analytics.
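As a rough illustration of the fact and dimension modeling mentioned above, the sketch below builds a tiny star schema and runs one analytical query against it. It uses Python's `sqlite3` module in place of SQL Server, and the names `dim_products` and `fact_sales`, their columns, and the sample rows are assumptions made for this example, not the project's actual model.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension table: descriptive attributes, one row per product (hypothetical).
cur.execute(
    """
    CREATE TABLE dim_products (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    )
    """
)

# Fact table: measurable events that reference the dimension by key (hypothetical).
cur.execute(
    """
    CREATE TABLE fact_sales (
        order_id     INTEGER,
        product_key  INTEGER REFERENCES dim_products (product_key),
        quantity     INTEGER,
        sales_amount REAL
    )
    """
)

cur.executemany(
    "INSERT INTO dim_products VALUES (?, ?, ?)",
    [(1, "Road Bike", "Bikes"), (2, "Helmet", "Accessories")],
)
cur.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(100, 1, 1, 1200.0), (101, 2, 2, 80.0), (102, 1, 1, 1150.0)],
)

# Typical analytical query: join the fact to a dimension, then aggregate.
sales_by_category = cur.execute(
    """
    SELECT d.category, SUM(f.sales_amount) AS total_sales
    FROM fact_sales AS f
    JOIN dim_products AS d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY total_sales DESC
    """
).fetchall()
print(sales_by_category)  # [('Bikes', 2350.0), ('Accessories', 80.0)]
```

Keeping measures in the fact table and descriptive attributes in dimensions means most reporting queries reduce to a join on a key followed by a `GROUP BY`.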

📂 Project Files

  1. Datasets/ - Source CRM and ERP data stored as CSV files, used for ingestion into the warehouse.
  2. Docs/ - Diagrams (created in Draw.io) for architecture, data flow, and data modeling.
  3. Scripts/ - SQL scripts for database setup, table creation, ETL processes, and transformations.
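To give a feel for how a CSV file from Datasets/ might land in the warehouse unchanged, here is a minimal sketch of a full-reload ingestion step. It uses Python's `csv` and `sqlite3` modules rather than SQL Server, where a step like this is commonly written with `BULK INSERT`; whether the project's scripts do so is not stated here. The CSV content, the table name `bronze_crm_customers`, and its columns are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for a file under Datasets/.
csv_text = "customer_id,first_name\n1,Alice\n2,Bob\n"

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Raw landing table: every column is kept as text, with no cleaning applied.
cur.execute("CREATE TABLE bronze_crm_customers (customer_id TEXT, first_name TEXT)")

# Full reload: empty the table, then insert every row from the file.
cur.execute("DELETE FROM bronze_crm_customers")
reader = csv.DictReader(io.StringIO(csv_text))
cur.executemany(
    "INSERT INTO bronze_crm_customers VALUES (:customer_id, :first_name)",
    reader,
)
conn.commit()

loaded = cur.execute("SELECT COUNT(*) FROM bronze_crm_customers").fetchone()[0]
print(loaded)  # 2
```

Emptying the table before loading keeps the step repeatable: running it twice leaves the same rows rather than duplicating them.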

📊 Diagrams

  1. Data Architecture - Defines the structural flow of the data warehouse.
  2. Data Flow - Illustrates the journey from raw to refined data.
  3. Data Integration - Highlights how different source systems connect.
  4. Data Model - Represents logical schema for fact and dimension tables.
  5. ETL Pipeline - Shows the extraction, transformation, and loading processes.

🛠️ Tools & Technologies

  1. SQL Server - Core database platform for data storage and transformation.
  2. Notion - For planning and tracking project progress.
  3. Draw.io - Used to design diagrams and workflows.
  4. Git - To manage version control and repository tracking.

📌 Project Tracking

You can view the detailed plan and progress here:

Notion Project Link: Data Warehouse Project


🔒 License

This project is licensed under the MIT License.


🤝 Contributing

Contributions, issues, and feature requests are welcome!


⭐️ If you find this project useful, please consider giving it a star!