Create Data Lake on AWS S3 to store dimensional tables after processing data using Spark on AWS EMR cluster
Enterprise-grade Data Platform for NYC Taxi Analytics. Orchestrated with Airflow (Astro) & dbt, served via FastAPI & Power BI. Features Medallion Architecture, Data Quality Observability (Slack), and Star Schema modeling.
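The Medallion (Bronze/Silver/Gold) layering mentioned here can be sketched in plain Python with pandas; the layer contents and the taxi-trip columns below are illustrative assumptions, not taken from the repository itself:

```python
import pandas as pd

# Bronze: raw records kept verbatim as ingested (illustrative taxi trips).
bronze = pd.DataFrame({
    "fare": ["12.5", "-3.0", "8.0"],
    "zone": ["Queens", "Bronx", "Queens"],
})

# Silver: typed, validated, and cleaned (cast fares, drop bad rows).
silver = bronze.assign(fare=pd.to_numeric(bronze["fare"]))
silver = silver[(silver["fare"] > 0) & silver["zone"].notna()]

# Gold: business-level aggregate ready for BI consumption.
gold = silver.groupby("zone", as_index=False)["fare"].mean()
```

Each layer only ever reads from the one before it, which is the core contract of the pattern: raw data stays replayable in Bronze while Silver and Gold can be rebuilt at any time.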
Open-source Supply Chain analytics on Microsoft Fabric: a scalable Bronze-Silver-Gold pipeline with automated CSV ingestion, Delta Lake transforms, semantic modeling (DAX & RLS) and interactive Power BI reports. Join to enhance pipelines, refine models, and build next-gen supply-chain insights!
Build a data warehouse from scratch, including full load, daily incremental load, design schema, SCD Type 1 and 2.
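SCD Type 2 (versioning dimension rows instead of overwriting them, as Type 1 does) can be sketched with pandas; the customer table, column names, and dates below are illustrative assumptions, not this project's schema:

```python
import pandas as pd

# Current dimension rows with validity metadata (illustrative schema).
dim = pd.DataFrame({
    "customer_id": [1, 2],
    "city": ["Oslo", "Lima"],
    "valid_from": ["2024-01-01", "2024-01-01"],
    "is_current": [True, True],
})

# Incoming snapshot with a changed attribute for customer 1.
incoming = pd.DataFrame({"customer_id": [1], "city": ["Bergen"]})

def scd2_apply(dim, incoming, key, attr, load_date):
    """Expire changed current rows and append new versions (SCD Type 2)."""
    merged = incoming.merge(dim[dim["is_current"]], on=key, suffixes=("_new", ""))
    changed = merged[merged[f"{attr}_new"] != merged[attr]]
    # Close out the old version of each changed row.
    dim.loc[dim[key].isin(changed[key]) & dim["is_current"], "is_current"] = False
    # Append the new version with a fresh validity window.
    new_rows = changed[[key]].assign(
        **{attr: changed[f"{attr}_new"].values},
        valid_from=load_date,
        is_current=True,
    )
    return pd.concat([dim, new_rows], ignore_index=True)

dim = scd2_apply(dim, incoming, "customer_id", "city", "2024-06-01")
```

After the load, customer 1 has two rows: the expired Oslo version and a current Bergen version, preserving history for point-in-time reporting.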
A Flask application that converts an informational model of a decision problem into a snowflaked star schema.
Building Data Warehouse and ETL pipelines using Amazon S3 and Redshift
Data Modeling with Apache Cassandra
Model a star schema from raw normalized Olympic Games data using dbt, Postgres, Airflow, and Docker.
Transformed raw HR data into a star schema using GCP and Cloud SQL, wrote SQL queries for business reporting, and analyzed trends such as age vs. income, performance, and hiring by gender. Visualized insights in Tableau for data-driven HR decisions. Tools: Google Cloud SQL (Postgres), GCP, Tableau.
Batch and streaming data pipelines built with Databricks and PySpark, ingesting Formula 1 racing data from multiple sources and APIs, modeled into a star schema for analysis in Power BI.
Simple scripts for data cleaning, ETL transformations, and data reorganisation.
ETL pipeline that extracts and transforms student athlete academic performance data, then populates a data warehouse using a star schema dimensional model.
University lab exercises on processing big data.
ETL Pipeline that Scrapes, Cleans, and Loads Book Data into PostgreSQL, then builds a Star-Schema Data Warehouse for Optimized Analysis.
End‑to‑end Russell 3000 market data pipeline (Polygon → Snowflake → dbt → Streamlit): a daily U.S. equities ELT stack orchestrated with Airflow, producing signals, marts, and dashboards.
Udacity project: implementing an ETL process on a PostgreSQL DB to create a star schema data model
Dimensional data-modeling for a BI warehouse using SQL and Power BI ETL.
DLH about NASA exoplanets
A Postgres database using a star schema to facilitate the analysis of user behaviour on a music streaming app.
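The star-schema pattern most of these repositories implement (a central fact table holding surrogate keys into dimension tables) can be sketched with pandas before it is loaded into a warehouse; the song-play event columns here are illustrative assumptions, not any repo's actual schema:

```python
import pandas as pd

# Flat event log (illustrative columns for a music-streaming app).
events = pd.DataFrame({
    "user_name": ["ana", "ana", "bo"],
    "song_title": ["Echo", "Drift", "Echo"],
    "played_at": ["2024-01-01", "2024-01-02", "2024-01-02"],
})

# Dimensions: one row per distinct entity, each with a surrogate key.
dim_user = events[["user_name"]].drop_duplicates().reset_index(drop=True)
dim_user["user_key"] = dim_user.index
dim_song = events[["song_title"]].drop_duplicates().reset_index(drop=True)
dim_song["song_key"] = dim_song.index

# Fact table: surrogate keys plus the event timestamp.
fact_plays = (
    events.merge(dim_user, on="user_name")
          .merge(dim_song, on="song_title")
          [["user_key", "song_key", "played_at"]]
)
```

Queries then join the narrow fact table back to the dimensions, which is what makes the shape "star"-like and keeps aggregations over the fact table cheap.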
Streaming + batch sales pipeline: Kafka→S3→Spark/Databricks→Snowflake/dbt, Airflow on AWS; Tableau, DQ tests & alerts.