Data Engineering with Microsoft Azure - Udacity Nanodegree

This repository contains projects completed as part of the Data Engineering with Microsoft Azure Nanodegree on Udacity. This program focused on applying Azure-based data engineering skills to real-world scenarios, covering essential topics like data modeling, NoSQL databases, ETL (Extract, Transform, Load) processes, and data pipeline automation with Azure services.

Table of Contents 🎓 📚

Project Overview 📘
Skills and Tools 🧰
Course Structure
Projects 💾 👷
Usage
Contributing
License

Project Overview 📘

This repository contains several projects developed to demonstrate key data engineering concepts and Azure technologies. The projects are designed to build expertise in:

Creating and optimizing data models
Working with NoSQL databases using Cassandra
Developing ETL pipelines and workflows using Azure Data Factory
Analyzing data with Azure Synapse Analytics
Managing and automating data flows across various Azure services
SQL-based data manipulation and transformation

Each project folder contains:

Project-specific documentation
Source code
Configuration files
Instructions for replicating the environment and workflows

Skills and Tools 🧰

The course and projects helped build skills with the following tools and technologies:

Data Modeling: Database schema design, optimization for analytical or transactional use.
Azure Data Factory (ADF): Orchestrating ETL pipelines, automation, and scheduling.
Azure Synapse Analytics: Data warehousing, big data analytics, and querying large datasets.
NoSQL Database (Cassandra): Working with non-relational databases for scalable data storage.
SQL: Querying, transforming, and managing data.
Python: Writing custom ETL logic and data transformation scripts.
Data Flows: Visual data transformations and data orchestration.

Course Structure 📑

The course was structured into key sections, each with projects designed to reinforce the specific skills taught. Here is a breakdown of each course module:

Data Modeling and Ingestion 🔧

Key Topics: Data modeling fundamentals, database schema design, data ingestion strategies.
Tools: Azure SQL Database, Azure Synapse Analytics, Data Factory.

NoSQL Databases with Cassandra

Key Topics: NoSQL principles, Cassandra data modeling, query optimization.
Tools: Cassandra, Python.

Azure Data Factory for ETL Pipelines

Key Topics: Building ETL pipelines, data transformation, automation.
Tools: Azure Data Factory, Data Flows, JSON-based configuration.
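
To make "JSON-based configuration" concrete, here is a minimal sketch of how a pipeline definition with a single Copy activity might be assembled and serialized in Python. The pipeline and dataset names (`pl_ingest_sales`, `ds_sales_csv`, `ds_sales_sql`) and the helper `build_copy_pipeline` are hypothetical and not taken from this repository; the source and sink types are assumptions that a real pipeline would adjust to its own datasets.

```python
import json


def build_copy_pipeline(name, source_dataset, sink_dataset):
    """Return an ADF-style pipeline definition with one Copy activity.

    All names passed in are illustrative placeholders.
    """
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": f"Copy_{source_dataset}_to_{sink_dataset}",
                    "type": "Copy",
                    # Datasets are referenced by name, not embedded inline.
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    # Assumed source/sink types: delimited text into Azure SQL.
                    "typeProperties": {
                        "source": {"type": "DelimitedTextSource"},
                        "sink": {"type": "AzureSqlSink"},
                    },
                }
            ]
        },
    }


pipeline = build_copy_pipeline("pl_ingest_sales", "ds_sales_csv", "ds_sales_sql")
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

Keeping the definition as a plain dictionary makes it easy to version-control and to generate several similar pipelines from one template.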

Data Warehousing with Azure Synapse Analytics

Key Topics: Data warehousing, query optimization, distributed computing in Synapse.
Tools: Azure Synapse Analytics, SQL, Synapse Studio.

Advanced Data Engineering and Automation

Key Topics: Complex data flows, automation, and monitoring in Azure.
Tools: Azure Data Factory, Azure Logic Apps, PowerShell, SQL.

Projects 💾 👷

Project 1: Data Modeling and ETL Pipeline with Azure SQL Database

Objective: Develop a database schema and ETL pipeline for a retail data use case.
Technologies: Azure SQL Database, Azure Data Factory, SQL, Python.
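
The extract, transform, and load steps of such a pipeline can be sketched in a few lines of Python. This is only an illustration: it uses an in-memory SQLite database as a local stand-in for Azure SQL Database, and the `orders` table, its columns, and the sample rows are invented for the example rather than taken from the project.

```python
import csv
import io
import sqlite3

# Hypothetical raw retail data; the second row is missing its amount.
RAW_CSV = """order_id,customer,amount,order_date
1,alice,19.99,2024-01-05
2,bob,,2024-01-06
3,alice,5.00,2024-01-07
"""


def extract(text):
    """Parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows without an amount, cast types, and normalize customer names."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append(
            (
                int(row["order_id"]),
                row["customer"].title(),
                float(row["amount"]),
                row["order_date"],
            )
        )
    return cleaned


def load(conn, rows):
    """Create the target table if needed and insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, "
        "amount REAL NOT NULL, order_date TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real database
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Separating the three stages into functions keeps each one testable on its own, which mirrors how pipeline activities are usually kept independent.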

Project 2: Building a NoSQL Database with Cassandra

Objective: Design a NoSQL data model for a streaming service and implement it in Cassandra.
Technologies: Cassandra, Python.
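
Cassandra tables are typically designed around the queries they must answer, with the partition key chosen from the query's filter and clustering columns chosen from its sort order. The sketch below builds a CQL `CREATE TABLE` statement from such a specification. The helper `create_table_cql`, the table `plays_by_session`, and its columns are hypothetical examples, not the schema used in this project.

```python
def create_table_cql(table, columns, partition_key, clustering=()):
    """Build a CQL CREATE TABLE statement.

    columns: mapping of column name to CQL type.
    partition_key: columns that decide which node stores a row.
    clustering: columns that order rows inside a partition.
    """
    cols = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    partition = "(" + ", ".join(partition_key) + ")"
    key = ", ".join([partition, *clustering])
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols}, PRIMARY KEY ({key}))"


# Assumed query: "which titles were played in a given session, in order?"
plays_by_session = create_table_cql(
    "plays_by_session",
    {
        "session_id": "int",
        "item_in_session": "int",
        "title": "text",
        "user_id": "int",
    },
    partition_key=["session_id"],
    clustering=["item_in_session"],
)
print(plays_by_session)
```

The resulting string could then be passed to a Cassandra driver session for execution; that step is omitted here so the sketch runs without a cluster.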

Project 3: Data Warehousing and Analytics with Azure Synapse 👷

Objective: Create a data warehouse and perform analytics using Azure Synapse.
Technologies: Azure Synapse Analytics, SQL.
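
A data warehouse of this kind is commonly organized as a star schema, with a fact table joined to dimension tables for analytics. The following is a minimal sketch of that idea, run against an in-memory SQLite database as a local stand-in for Azure Synapse. The tables `dim_product` and `fact_sales` and their rows are invented for illustration and do not come from this project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a Synapse SQL pool

# One dimension table and one fact table that references it.
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
        amount REAL NOT NULL
    );
    INSERT INTO dim_product VALUES (1, 'Keyboard'), (2, 'Monitor');
    INSERT INTO fact_sales VALUES (10, 1, 12.0), (11, 1, 8.0), (12, 2, 30.0);
    """
)

# Analytical query: sales count and average amount per product.
report = conn.execute(
    """
    SELECT d.product_name, COUNT(*) AS sales, AVG(f.amount) AS avg_amount
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.product_name
    ORDER BY d.product_name
    """
).fetchall()
print(report)
```

The same join-and-aggregate pattern applies in Synapse, where table distribution and indexing choices would additionally need to be specified.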

Each project folder includes detailed instructions and example code to help you replicate the project.

Usage 🔧

Each project has its own README file with specific instructions on usage, running pipelines, and testing the outcomes. The general workflow is:

Configure Azure services (e.g., set up Azure Data Factory, Synapse, SQL Database) as described in the project documentation.
Run ETL pipelines or scripts, following instructions in each project folder.
Monitor results via Azure Portal or relevant tools (e.g., Synapse Studio for data warehousing projects).

Contributing 🙌

This repository is intended as a personal portfolio, but feedback and suggestions are welcome! If you'd like to make improvements, please feel free to fork the repository and submit a pull request.

License 📜

This project is licensed under the MIT License.

Folders and files:

01-Data Modeling/Cassandra
02- Cloud DW/DW-demo-Sakila
03-bike-divvy
Arm templates
assets
nycpayroll_project
.gitignore
LICENSE
Pipfile
Pipfile.lock
README.md

Mehranmzn/DataEngineering_Nano_degree
