-
Notifications
You must be signed in to change notification settings - Fork 348
Description
When running create_external_sources followed by plan with a new MV, sqlmesh first creates the MV, then drops it and recreates it again unnecessarily. This is a regression compared to 0.197.4 version where this big performance problem does not appear.
How to reproduce
- Create a RW project with a dummy MV
MODEL (
name demo_mv,
kind VIEW (materialized true)
);
SELECT event_id from click_events
- Manually create a table in a RW database:
CREATE TABLE click_events (
event_id INT PRIMARY KEY,
event_type VARCHAR,
event_value INT,
event_timestamp TIMESTAMPTZ
);
- Execute
sqlmesh create_external_models - Execute
sqlmesh plan
Current bug behavior
MV demo_mv is created, then deleted and recreated again.
Log
2025-09-09 13:48:01,229 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage CreateSnapshotRecordsStage (evaluator.py:125)
2025-09-09 13:48:01,484 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage PhysicalLayerSchemaCreationStage (evaluator.py:125)
2025-09-09 13:48:01,485 - ThreadPoolExecutor-1_0 - sqlmesh.core.snapshot.evaluator - INFO - Creating schema 'sqlmesh__default' (evaluator.py:1355)
2025-09-09 13:48:01,485 - ThreadPoolExecutor-1_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /* SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / CREATE SCHEMA IF NOT EXISTS "sqlmesh__default" (base.py:2466)
2025-09-09 13:48:01,540 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage BackfillStage (evaluator.py:125)
2025-09-09 13:48:01,543 - ThreadPoolExecutor-2_0 - sqlmesh.core.snapshot.evaluator - INFO - Listing data objects in schema sqlmesh__default (evaluator.py:1458)
2025-09-09 13:48:01,543 - ThreadPoolExecutor-2_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / SELECT current_catalog (base.py:2466)
2025-09-09 13:48:01,555 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Evaluating snapshot SnapshotId<"demo_mv": 281835372> (evaluator.py:701)
2025-09-09 13:48:01,557 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / SELECT 1 FROM "information_schema"."tables" WHERE "table_name" = 'demo_mv__827871489' AND "table_schema" = 'sqlmesh__default' (base.py:2466)
2025-09-09 13:48:01,561 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Creating view 'sqlmesh__default.demo_mv__827871489' (evaluator.py:2358)
2025-09-09 13:48:01,565 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / CREATE MATERIALIZED VIEW "sqlmesh__default"."demo_mv__827871489" AS SELECT "click_events"."event_id" AS "event_id" FROM "click_events" AS "click_events" (base.py:2466)
2025-09-09 13:48:02,911 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Inserting data for snapshot SnapshotId<"demo_mv": 281835372> (evaluator.py:905)
2025-09-09 13:48:02,916 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Inserting batch (2025-09-08 00:00:00, 2025-09-09 00:00:00) into sqlmesh__default.demo_mv__827871489' (evaluator.py:935)
2025-09-09 13:48:02,917 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / SELECT 1 FROM "information_schema"."tables" WHERE "table_name" = 'demo_mv__827871489' AND "table_schema" = 'sqlmesh__default' (base.py:2466)
2025-09-09 13:48:02,920 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Replacing view 'sqlmesh__default.demo_mv__827871489' (evaluator.py:2322)
2025-09-09 13:48:02,921 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / DROP MATERIALIZED VIEW IF EXISTS "sqlmesh__default"."demo_mv__827871489" CASCADE (base.py:2466)
2025-09-09 13:48:03,112 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / CREATE MATERIALIZED VIEW "sqlmesh__default"."demo_mv__827871489" AS SELECT "click_events"."event_id" AS "event_id" FROM "click_events" AS "click_events" (base.py:2466)
2025-09-09 13:48:04,579 - ThreadPoolExecutor-3_0 - sqlmesh.core.state_sync.db.facade - INFO - Adding interval (2025-09-08 00:00:00, 2025-09-09 00:00:00) for snapshot SnapshotId<"demo_mv": 281835372> (facade.py:622)
2025-09-09 13:48:04,579 - ThreadPoolExecutor-3_0 - sqlmesh.core.state_sync.db.interval - INFO - Pushing intervals for snapshot SnapshotId<"demo_mv": 281835372> (interval.py:214)
2025-09-09 13:48:04,681 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage EnvironmentRecordUpdateStage (evaluator.py:125)
2025-09-09 13:48:04,682 - MainThread - sqlmesh.core.state_sync.db.facade - INFO - Promoting environment 'prod' (facade.py:165)
2025-09-09 13:48:04,852 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage UnpauseStage (evaluator.py:125)
2025-09-09 13:48:05,028 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage VirtualLayerUpdateStage (evaluator.py:125)
2025-09-09 13:48:05,032 - ThreadPoolExecutor-4_0 - sqlmesh.core.snapshot.evaluator - INFO - Creating schema 'default' (evaluator.py:1355)
2025-09-09 13:48:05,033 - ThreadPoolExecutor-4_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / CREATE SCHEMA IF NOT EXISTS "default" (base.py:2466)
2025-09-09 13:48:05,085 - ThreadPoolExecutor-5_0 - sqlmesh.core.snapshot.evaluator - INFO - Updating view 'default.demo_mv' to point at table 'sqlmesh__default.demo_mv__827871489' (evaluator.py:1727)
2025-09-09 13:48:05,086 - ThreadPoolExecutor-5_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 / DROP VIEW IF EXISTS "default"."demo_mv" CASCADE (base.py:2466)
2025-09-09 13:48:05,089 - ThreadPoolExecutor-5_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: / SQLMESH_PLAN: 366b6344f46249c7a942bef1fc7f13b1 */ CREATE VIEW "default"."demo_mv" AS SELECT * FROM "sqlmesh__default"."demo_mv__827871489" (base.py:2466)
2025-09-09 13:48:05,146 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage FinalizeEnvironmentStage (evaluator.py:125)
2025-09-09 13:48:05,146 - MainThread - sqlmesh.core.state_sync.db.environment - INFO - Finalizing environment 'prod' (environment.py:141)
Expected behavior
The MV demo_mv is created only once. For reference, here is the log from version 0.197.4, where the issue does not appear:
Log
2025-09-09 13:37:18,767 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage CreateSnapshotRecordsStage (evaluator.py:115)
2025-09-09 13:37:18,898 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage PhysicalLayerUpdateStage (evaluator.py:115)
2025-09-09 13:37:18,899 - ThreadPoolExecutor-1_0 - sqlmesh.core.snapshot.evaluator - INFO - Listing data objects in schema sqlmesh__default (evaluator.py:350)
2025-09-09 13:37:18,899 - ThreadPoolExecutor-1_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: SELECT current_catalog (base.py:2237)
2025-09-09 13:37:18,912 - MainThread - sqlmesh.core.snapshot.evaluator - INFO - Creating schema 'sqlmesh__default' (evaluator.py:1143)
2025-09-09 13:37:18,913 - MainThread - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: CREATE SCHEMA IF NOT EXISTS "sqlmesh__default" (base.py:2237)
2025-09-09 13:37:18,916 - ThreadPoolExecutor-2_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: SELECT 1 FROM "information_schema"."tables" WHERE "table_name" = 'demo_mv__3545757078' AND "table_schema" = 'sqlmesh__default' (base.py:2237)
2025-09-09 13:37:18,921 - ThreadPoolExecutor-2_0 - sqlmesh.core.snapshot.evaluator - INFO - Creating view 'sqlmesh__default.demo_mv__3545757078' (evaluator.py:1940)
2025-09-09 13:37:18,926 - ThreadPoolExecutor-2_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: CREATE MATERIALIZED VIEW "sqlmesh__default"."demo_mv__3545757078" AS SELECT "click_events"."event_id" AS "event_id" FROM "click_events" AS "click_events" (base.py:2237)
2025-09-09 13:37:20,301 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage BackfillStage (evaluator.py:115)
2025-09-09 13:37:20,304 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Evaluating snapshot SnapshotId<"demo_mv": 4095825826> (evaluator.py:636)
2025-09-09 13:37:20,308 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Inserting batch (2025-09-08 00:00:00, 2025-09-09 00:00:00) into sqlmesh__default.demo_mv__3545757078' (evaluator.py:690)
2025-09-09 13:37:20,312 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: SELECT 1 FROM "information_schema"."tables" WHERE "table_name" = 'demo_mv__3545757078' AND "table_schema" = 'sqlmesh__default' (base.py:2237)
2025-09-09 13:37:20,317 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Skipping creation of the view 'sqlmesh__default.demo_mv__3545757078' (evaluator.py:1891)
2025-09-09 13:37:20,318 - ThreadPoolExecutor-3_0 - sqlmesh.core.state_sync.db.facade - INFO - Adding interval (2025-09-08 00:00:00, 2025-09-09 00:00:00) for snapshot SnapshotId<"demo_mv": 4095825826> (facade.py:621)
2025-09-09 13:37:20,318 - ThreadPoolExecutor-3_0 - sqlmesh.core.state_sync.db.interval - INFO - Pushing intervals for snapshot SnapshotId<"demo_mv": 4095825826> (interval.py:213)
2025-09-09 13:37:20,414 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage EnvironmentRecordUpdateStage (evaluator.py:115)
2025-09-09 13:37:20,414 - MainThread - sqlmesh.core.state_sync.db.facade - INFO - Promoting environment 'prod' (facade.py:169)
2025-09-09 13:37:20,561 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage UnpauseStage (evaluator.py:115)
2025-09-09 13:37:20,569 - MainThread - sqlmesh.core.state_sync.db.snapshot - INFO - Unpausing snapshot SnapshotId<"demo_mv": 4095825826> (snapshot.py:144)
2025-09-09 13:37:20,570 - MainThread - sqlmesh.core.state_sync.db.snapshot - INFO - Unpausing snapshot SnapshotId<"click_events": 886993094> (snapshot.py:144)
2025-09-09 13:37:20,643 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage VirtualLayerUpdateStage (evaluator.py:115)
2025-09-09 13:37:20,646 - MainThread - sqlmesh.core.snapshot.evaluator - INFO - Creating schema 'default' (evaluator.py:1143)
2025-09-09 13:37:20,646 - MainThread - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: CREATE SCHEMA IF NOT EXISTS "default" (base.py:2237)
2025-09-09 13:37:20,782 - ThreadPoolExecutor-4_0 - sqlmesh.core.snapshot.evaluator - INFO - Updating view 'default.demo_mv' to point at table 'sqlmesh__default.demo_mv__3545757078' (evaluator.py:1450)
2025-09-09 13:37:20,783 - ThreadPoolExecutor-4_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: DROP VIEW IF EXISTS "default"."demo_mv" CASCADE (base.py:2237)
2025-09-09 13:37:20,786 - ThreadPoolExecutor-4_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: CREATE VIEW "default"."demo_mv" AS SELECT * FROM "sqlmesh__default"."demo_mv__3545757078" (base.py:2237)
2025-09-09 13:37:20,839 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage FinalizeEnvironmentStage (evaluator.py:115)
2025-09-09 13:37:20,839 - MainThread - sqlmesh.core.state_sync.db.environment - INFO - Finalizing environment 'prod' (environment.py:139)
Affected versions
0.216.0
Notes
- Interestingly, it seems that the issue does not appear when step 3. (execute
sqlmesh create_external_models) is omitted. - Related discussion: Fix(risingwave): Recreate materialized views #5195