A ready-to-use, declarative, metadata-driven ETL framework for Azure Databricks. This unified solution supports both batch and streaming data processing, and is designed to reduce data engineering effort by up to 50% through built-in automation, consistency, and reusability.
Key Features:
- Declarative Configuration: Define ETL pipelines with metadata instead of code (a minimal configuration sketch follows this list).
- Dual-Mode Support: Seamlessly handles both batch and streaming workloads.
- Automation Built-In: Manages schema evolution, error handling, upserts, auditing, data validation, data lineage tracking, and more (see the upsert sketch below).
- Reusable Components: Common transformations and validations abstracted for consistency.
- Optimized for Databricks: Leverages Delta Lake, Spark Structured Streaming, and Azure integrations.
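
To make the declarative model concrete, here is a minimal sketch of how a single metadata entry might drive a pipeline run in either mode. The `pipeline_conf` dictionary and `run_pipeline` helper are hypothetical illustrations, not the framework's actual API; the dispatch between `spark.read` and `spark.readStream` shows the general batch/streaming pattern on Delta Lake.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical metadata entry: in practice this could live in a Delta
# table, a YAML file, or a JSON document rather than inline Python.
pipeline_conf = {
    "name": "orders_bronze_to_silver",
    "mode": "streaming",                      # "batch" or "streaming"
    "source": {"format": "delta", "path": "/mnt/bronze/orders"},
    "target": {"path": "/mnt/silver/orders",
               "checkpoint": "/mnt/checkpoints/orders"},
}

def run_pipeline(conf: dict) -> None:
    """Dispatch one pipeline from its metadata (illustrative only)."""
    src, tgt = conf["source"], conf["target"]
    if conf["mode"] == "streaming":
        # Incremental processing via Spark Structured Streaming.
        (spark.readStream.format(src["format"]).load(src["path"])
              .writeStream.format("delta")
              .option("checkpointLocation", tgt["checkpoint"])
              .trigger(availableNow=True)     # drain available data, then stop
              .start(tgt["path"]))
    else:
        # One-shot batch load of the same source.
        (spark.read.format(src["format"]).load(src["path"])
              .write.format("delta").mode("append").save(tgt["path"]))

run_pipeline(pipeline_conf)
```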
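
Automated upserts with schema evolution typically build on Delta Lake's `MERGE` plus its `autoMerge` setting. The sketch below uses the standard `delta` Python package; the table name `silver.orders`, the key column `order_id`, and the source path are placeholder assumptions, not part of the framework's metadata model.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Allow MERGE to add new source columns to the target automatically
# (Delta Lake schema evolution for merge operations).
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

# Placeholder names: 'silver.orders' and 'order_id' are assumptions
# used for illustration only.
updates = spark.read.format("delta").load("/mnt/bronze/orders")
target = DeltaTable.forName(spark, "silver.orders")

(target.alias("t")
       .merge(updates.alias("s"), "t.order_id = s.order_id")
       .whenMatchedUpdateAll()       # update rows whose key already exists
       .whenNotMatchedInsertAll()    # insert rows with new keys
       .execute())
```

In a metadata-driven setup, the table name, key columns, and merge condition would come from the pipeline's metadata rather than being hard-coded as they are here.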