julianwucn/databricks_mdd

A ready-to-use, declarative, metadata-driven ETL framework for Azure Databricks. It supports both batch and streaming data processing in a single solution, and is designed to reduce data engineering effort by up to 50% through built-in automation, consistency, and reusability.

Key Features:

  • Declarative Configuration: Define ETL pipelines using metadata, not code.
  • Dual-Mode Support: Seamlessly handles both batch and streaming workloads.
  • Automation Built-In: Handles schema evolution, error handling, upserts, auditing, data validation, data lineage tracking, and more.
  • Reusable Components: Common transformations and validations abstracted for consistency.
  • Optimized for Databricks: Leverages Delta Lake, Spark Structured Streaming, and Azure integrations.
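
To make the "metadata, not code" idea concrete, here is a minimal sketch in plain Python. It is not this framework's API: the `PIPELINE` dictionary, the `TRANSFORMS` and `VALIDATIONS` registries, and `run_pipeline` are all hypothetical names chosen for illustration, and an in-memory dictionary stands in for a Delta table. It only shows the general pattern of one generic engine interpreting a declarative pipeline definition to apply reusable transformations, validate rows, and upsert on merge keys.

```python
from typing import Any, Callable

Row = dict[str, Any]

# Hypothetical registry of reusable transformations, referenced by name in metadata.
TRANSFORMS: dict[str, Callable[[Any], Any]] = {
    "trim": lambda v: v.strip() if isinstance(v, str) else v,
    "upper": lambda v: v.upper() if isinstance(v, str) else v,
}

# Hypothetical registry of reusable validation rules.
VALIDATIONS: dict[str, Callable[[Any], bool]] = {
    "not_null": lambda v: v is not None,
    "positive": lambda v: isinstance(v, (int, float)) and v > 0,
}

# Hypothetical pipeline definition: the pipeline is described as data, not code.
PIPELINE = {
    "target": "customers",
    "merge_keys": ["id"],
    "columns": {
        "id": {"validate": ["not_null"]},
        "name": {"transform": ["trim", "upper"], "validate": ["not_null"]},
        "balance": {"validate": ["positive"]},
    },
}


def run_pipeline(meta: dict, source_rows: list[Row], target: dict[tuple, Row]) -> list[dict]:
    """Transform and validate each row, upsert valid rows into `target`,
    and return the rejected rows together with the rules they failed."""
    rejected = []
    for raw in source_rows:
        row = dict(raw)
        errors = []
        for col, spec in meta["columns"].items():
            value = row.get(col)
            # Apply the named transformations in the order the metadata lists them.
            for name in spec.get("transform", []):
                value = TRANSFORMS[name](value)
            row[col] = value
            # Record every validation rule that the transformed value fails.
            for name in spec.get("validate", []):
                if not VALIDATIONS[name](value):
                    errors.append(f"{col}:{name}")
        if errors:
            rejected.append({"row": raw, "errors": errors})
            continue
        # Upsert: update the row with a matching key, otherwise insert it.
        key = tuple(row[k] for k in meta["merge_keys"])
        target[key] = {**target.get(key, {}), **row}
    return rejected


if __name__ == "__main__":
    table = {(1,): {"id": 1, "name": "OLD", "balance": 5}}
    bad = run_pipeline(
        PIPELINE,
        [{"id": 1, "name": " ada ", "balance": 10}, {"id": 2, "name": "bob", "balance": -1}],
        table,
    )
    print(table)
    print(bad)
```

In this sketch, adding a new column rule or a new target means editing the metadata only; the engine function stays unchanged, which is the reuse the feature list describes. On Databricks the upsert step would more plausibly be a Delta Lake merge, which is omitted here to keep the example runnable without a cluster.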
