This project sets up an ELT pipeline using Snowflake and dbt, following best practices for modular and scalable data transformations.
- Create a Snowflake account.
- Configure a new database, schema, and warehouse.
- Generate user credentials and set up access.
- Set up a Snowflake warehouse and assign compute resources.
- Create roles and grant necessary permissions.
- Configure network policies and security settings.
- Like in the image below:
- Make sure to write your username in this line
grant role dbt_roLe to user <username>
- with
dbt init - Configure your dbt-project
- Configure
profiles.ymlwith Snowflake credentials:models: snowflake_workshop: staging: materialized: view snowflake_warehouse: dbt_wh marts: materialized: table snowflake_warehouse: dbt_wh
- Define
sourcesinmodels/schema.ymlto map raw data. - Create staging models (
models/staging/) for data standardization.
- Define reusable macros in
macros/. - Example macro to handle null values:
{% macro replace_null(column, value) %} coalesce({{ column }}, {{ value }}) {% endmacro %}
- Create fact and dimension tables under
models/marts/. - Build transformation models using SQL and CTEs.
- Define tests in
tests/andschema.yml. - Example:
tests: - unique - not_null
- Run tests:
dbt test
# Initialize dbt project
$ dbt init my_project
# Run transformations
$ dbt run
OR
$ dbr run -s <filename>
# Run tests
$ dbt test- After doing tests & models make sure to check out snowflake or snowflake worksheet to see your models/tables/columns.
