Description
Describe the bug
When creating materialized views in Databricks with dbt, the model's query is executed twice, which significantly increases processing time. The first execution appears to be a validation or schema check; the second is the actual materialized view creation. This behavior started suddenly and roughly doubles our processing time.
Steps To Reproduce
Create a standard dbt model that uses materialized view materialization
Run a command like dbt run --select ft_table
Observe that dbt runs two queries:
First query: SELECT * FROM source_table WHERE conditions
Second query: CREATE MATERIALIZED VIEW target_table (...columns...) AS SELECT * FROM source_table WHERE conditions
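To confirm the pattern described in the steps above, one option is to collect the SQL statements issued during a run (for example from the warehouse query history) and check whether the body of a CREATE MATERIALIZED VIEW statement was also executed on its own. The following is a minimal sketch under that assumption. The function names and the input list are hypothetical, and it does not read dbt logs or call any Databricks API.

```python
import re

def normalize(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Lowercasing also affects string literals, which is acceptable here
    because the result is only used for comparison.
    """
    return re.sub(r"\s+", " ", sql).strip().rstrip(";").strip().lower()

# Matches: CREATE [OR REPLACE] MATERIALIZED VIEW <name> [(columns)] AS <body>
_CREATE_MV = re.compile(
    r"^create\s+(?:or\s+replace\s+)?materialized\s+view\s+\S+\s*"
    r"(?:\(.*?\)\s*)?as\s+(.*)$",
    re.IGNORECASE | re.DOTALL,
)

def find_duplicated_bodies(statements):
    """Return the normalized SELECT bodies that were run both standalone
    and inside a CREATE MATERIALIZED VIEW statement."""
    normalized = [normalize(s) for s in statements]
    standalone = set(normalized)
    duplicates = []
    for stmt in normalized:
        match = _CREATE_MV.match(stmt)
        if match and match.group(1) in standalone:
            duplicates.append(match.group(1))
    return duplicates

# Hypothetical statements shaped like the two queries in this report.
observed = [
    "SELECT * FROM source_table WHERE x = 1",
    "CREATE MATERIALIZED VIEW target_table (a, b) AS "
    "SELECT * FROM source_table WHERE x = 1",
]
print(find_duplicated_bodies(observed))
```

A non-empty result means the same SELECT was executed separately in addition to the view creation, which matches the double execution reported here.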
Sample model code
-- models/marts/xx/ft_table.sql
SELECT
*
FROM {{ source('source', 'ft_table') }}
Model configuration
__models.yml
materialized: materialized_view
other settings...
Expected behavior
dbt should create the materialized view with a single query execution. Previous versions did not run the query twice.
dbt version where the issue occurs
Core:
- installed: 1.10.2
- latest: 1.10.2 - Up to date!
Plugins:
- databricks: 1.10.4 - Up to date!
- spark: 1.9.2 - Up to date!
As a workaround, I had to roll back the databricks plugin to 1.10.1:
Core:
- installed: 1.10.2
- latest: 1.10.2 - Up to date!
Plugins:
- databricks: 1.10.1 - Update available!
- spark: 1.9.2 - Up to date!
Impact
This issue is doubling our processing time for materialized views, which has a significant impact on our development and production refresh cycles. For large datasets, this means a build that previously took 7 minutes now takes 15 minutes or more.