- Data Architecture
- DAG Sample
- Directory Structure
- Prerequisites
- How to Run This Project
- Lessons learned
flowchart LR
public[Prod DB] --> dbt
dbt --clean, raw data--> dbt_src
dbt --non redundant--> dbt_dim
dbt --analytical--> dbt_fact
dbt --scd--> snapshots
flowchart LR
film_category --prod to dwh--> src_film_category --dimension--> dim_film
category --prod to dwh--> src_category --dimension--> dim_film
film --prod to dwh--> src_film --dimension--> dim_film
language --prod to dwh--> src_language --dimension--> dim_film
dim_film --data tests--> movie_release_year_not_less_than_2006
project
│ README.md
│ requirements.txt
| .gitignore
| packages.yml
| dbt_project.yml
│
└───miscellaneous
│ │ dvdrental.tar : example DB for postgres
│
└───models
│ │ schema.yml : for generic test & documentation
| └───source.yml : Data sources
│ └───src : all core sql file
│ └───dim : data generated from src
│ └───fact : data generated from src
│
└───seeds
│ │ *.csv : static file want to export in DWH
│
└───snapshots
│ │ *.sql : history/log of tables
│
└───tests
│ │ *.sql : custom singular test file
│
└───macros
│ │ test_*.sql : generic(parameterized) test file
- Python
- SQL
- PostgreSQL
- dbt
- Linux/mac
-
create python venv
python3 -m venv venv_name
-
activate:
source venv_name/bin/activate -
install dependency
pip install -r requirements.txt
-
DB setup(
postgres)- DBName: dvdrental
- restore the DB:
miscellaneous --> dvdrental.tarinpublicschema
-
clone the repo
-
clone and rename the
profiles_sample.ymltoprofiles.yml -
Update the credentials in
profiles.yml -
check config:
dbt debug --target dev -
install dbt packages:
dbt deps --target dev -
test project:
dbt test --target dev -
run project:
dbt run --target dev -
snapshot/log generate:
dbt snapshot --target dev -
Documentation of project:
dbt docs generate --target dev
dbt docs serve --port 8002 --target dev
- materialization
- table
- view
- incremental
- ephemeral (
not implemented)
- macros
- test
- generic
- singular
- snapshot
- SCD2
- dbt docs
- configurations
dbt_project.ymlsource.ymlschema.yml