jmostella/modern-data-warehouse-dataops

page_type: sample
languages: python, C#, TypeScript, bicep
products: Azure, Azure-Data-factory, Azure-Databricks, Azure-Stream-Analytics, Azure-Data-Lake-Gen2, Azure-Functions
description: Code samples showcasing how to apply DevOps concepts to the Modern Data Warehouse Architecture leveraging different Azure Data Technologies.

DataOps for the Modern Data Warehouse

This repository contains numerous code samples and artifacts on how to apply DevOps principles to data pipelines built according to the Modern Data Warehouse (MDW) architectural pattern on Microsoft Azure.

Each sample either focuses on a single Azure service (Single Tech Samples) or showcases an end-to-end data pipeline solution as a reference implementation (End to End Samples). Each sample contains code and artifacts relating to one or more of the following:

  • Infrastructure as Code (IaC)
  • Build and Release Pipelines (CI/CD)
  • Testing
  • Observability / Monitoring

Single Technology Samples

End to End samples

  • Parking Sensor Solution - This demonstrates a batch, end-to-end data pipeline following the MDW architecture, along with a corresponding CI/CD process. See the presentation, which includes a detailed walk-through of the solution.
  • Temperature Events Solution - This demonstrates a high-scale, event-driven data pipeline with a focus on how to implement Observability and Load Testing.
  • Dataset Versioning Solution - This demonstrates how to use DataFactory to Orchestrate DataFlow, to do DeltaLoads into DeltaLake On DataLake(DoDDDoD).
  • MDW Data Governance and PII data detection - This sample demonstrates how to deploy the infrastructure of an end-to-end MDW pipeline using Azure DevOps pipelines, with a focus on Data Governance and PII data detection.
    • Technology stack: Azure DevOps, Azure Data Factory, Azure Databricks, Azure Purview, Presidio
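
As a rough illustration of the "delta load" idea mentioned for the Dataset Versioning Solution, the sketch below upserts only the rows that changed since the last run. It is plain Python and is not taken from the sample, which uses Azure Data Factory and Delta Lake. The function name `delta_load`, the row fields `id` and `modified`, and the watermark approach are all assumptions made for illustration.

```python
from datetime import datetime


def delta_load(source_rows, target, last_watermark):
    """Upsert rows modified after last_watermark into target.

    Hypothetical helper: `target` is a dict keyed by row id, standing in
    for a real table. Returns the new watermark for the next run.
    """
    new_watermark = last_watermark
    for row in source_rows:
        if row["modified"] > last_watermark:
            # Insert a new row or overwrite the existing one with the same key.
            target[row["id"]] = row
            new_watermark = max(new_watermark, row["modified"])
    return new_watermark


# Illustrative data only.
source = [
    {"id": 1, "value": "a", "modified": datetime(2024, 1, 1)},
    {"id": 2, "value": "b", "modified": datetime(2024, 1, 3)},
]
target = {}

# Only rows changed after 2024-01-02 are loaded.
watermark = delta_load(source, target, datetime(2024, 1, 2))

# Re-running with the returned watermark loads nothing new.
watermark_again = delta_load(source, target, watermark)
```

Storing the returned watermark between runs is what keeps each load incremental; a real pipeline would persist it outside the process.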

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
