feat(planner): Add MaterializedViewScanNode #26492

tdcmeehan · 2025-10-31T12:18:26Z

Description

Implements the planning-oriented materialized view framework from RFC-0016. Adds MaterializedViewScanNode that holds both the materialized data plan and view query plan, letting the optimizer choose based on freshness during planning instead of at analysis time.

New code path is opt-in via legacy_materialized_views=false session property. Legacy SQL stitching path is preserved as default.

Motivation and Context

See RFC-0016.

Impact

No user-facing changes with default settings.

Test Plan

Tests are included in this PR, existing Memory connector end to end tests validate the behavior for unpartitioned tables.

Contributor checklist

Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
Documented new properties (with its default value), SQL syntax, functions, or other functionality.
If release notes are required, they follow the release notes guidelines.
Adequate tests were added if applicable.
CI passed.
If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

== NO RELEASE NOTE ==

sourcery-ai · 2025-10-31T12:18:35Z

Reviewer's Guide

This PR adds support for non-legacy materialized views by capturing view metadata in the analyzer, introducing a new MaterializedViewScanNode, extending the statement and relation planners to produce unified plans over both data table and view query, and adding a rewrite rule that chooses between fresh and materialized data at runtime, with SPI, optimizer, and test updates.

Sequence diagram for planning and rewriting a MaterializedViewScanNode

sequenceDiagram
    participant Analyzer
    participant StatementAnalyzer
    participant RelationPlanner
    participant Optimizer
    participant MaterializedViewRewrite

    Analyzer->>StatementAnalyzer: setMaterializedViewInfo(table, mvInfo)
    StatementAnalyzer->>RelationPlanner: process(table)
    RelationPlanner->>RelationPlanner: planMaterializedView(table, mvInfo, context)
    RelationPlanner->>RelationPlanner: create MaterializedViewScanNode
    Optimizer->>MaterializedViewRewrite: apply(MaterializedViewScanNode, ...)
    MaterializedViewRewrite->>MaterializedViewScanNode: getDataTablePlan()/getViewQueryPlan()
    MaterializedViewRewrite->>Optimizer: return rewritten ProjectNode

Class diagram for the new MaterializedViewScanNode and related changes

classDiagram
    class MaterializedViewScanNode {
        +PlanNode dataTablePlan
        +PlanNode viewQueryPlan
        +QualifiedObjectName materializedViewName
        +Map<VariableReferenceExpression, VariableReferenceExpression> dataTableMappings
        +Map<VariableReferenceExpression, VariableReferenceExpression> viewQueryMappings
        +List<VariableReferenceExpression> outputVariables
        +getDataTablePlan()
        +getViewQueryPlan()
        +getMaterializedViewName()
        +getDataTableMappings()
        +getViewQueryMappings()
        +getOutputVariables()
        +getSources()
        +sourceOutputLayout(int)
        +replaceChildren(List<PlanNode>)
        +accept(PlanVisitor, context)
        +assignStatsEquivalentPlanNode(Optional<PlanNode>)
    }
    MaterializedViewScanNode --|> PlanNode

    class Analysis {
        +Map<NodeRef<Table>, MaterializedViewInfo> materializedViewInfoMap
        +setMaterializedViewInfo(Table, MaterializedViewInfo)
        +getMaterializedViewInfo(Table): Optional<MaterializedViewInfo>
    }
    class Analysis.MaterializedViewInfo {
        +QualifiedObjectName materializedViewName
        +QualifiedName storageTableName
        +Table dataTable
        +Query viewQuery
        +MaterializedViewDefinition materializedViewDefinition
        +getMaterializedViewName()
        +getStorageTableName()
        +getDataTable()
        +getViewQuery()
        +getMaterializedViewDefinition()
    }
    Analysis "1" *-- "*" Analysis.MaterializedViewInfo

    class MaterializedViewRewrite {
        +apply(MaterializedViewScanNode, Captures, Context): Result
    }
    MaterializedViewRewrite ..> MaterializedViewScanNode

    class PlanVisitor {
        +visitMaterializedViewScan(MaterializedViewScanNode, context)
    }
    PlanVisitor ..> MaterializedViewScanNode

File-Level Changes

Change	Details	Files
Capture and propagate materialized view metadata in analysis	Added materializedViewInfoMap map with set/get methods Introduced immutable MaterializedViewInfo class to hold view metadata	`presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/Analysis.java`
Extend StatementAnalyzer to plan non-legacy materialized views	Branched on legacy_materialized_views session property to choose new planning path Parsed original view SQL and storage table into MaterializedViewInfo Processed view query and data table to build scopes and unregister temporary named query	`presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java`
Implement MaterializedViewScanNode planning in RelationPlanner	Checked for MaterializedViewInfo when visiting Table nodes Built data and view query plans, mapped output variables, and created MaterializedViewScanNode	`presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java`
Define new MaterializedViewScanNode in SPI	Added new PlanNode class with JSON annotations and mapping logic Implemented getSources, replaceChildren, visitor accept, and sourceOutputLayout	`presto-spi/src/main/java/com/facebook/presto/spi/plan/MaterializedViewScanNode.java`
Add rewrite rule to choose between data table or view query	Implemented MaterializedViewRewrite rule using MetadataResolver.getMaterializedViewStatus Integrated rule into PlanOptimizers and enabled in ApplyConnectorOptimization	`presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MaterializedViewRewrite.java` `presto-main-base/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java` `presto-main-base/src/main/java/com/facebook/presto/sql/planner/plan/Patterns.java` `presto-spi/src/main/java/com/facebook/presto/spi/plan/PlanVisitor.java` `presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/ApplyConnectorOptimization.java`
Ensure validator and rule infrastructure supports the new scan node	Updated ValidateDependenciesChecker to validate MaterializedViewScanNode mappings Extended test PlanBuilder to construct MaterializedViewScanNode Added default overload in MetadataResolver SPI	`presto-main-base/src/main/java/com/facebook/presto/sql/planner/sanity/ValidateDependenciesChecker.java` `presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/test/PlanBuilder.java` `presto-spi/src/main/java/com/facebook/presto/spi/analyzer/MetadataResolver.java`
Add end-to-end tests for materialized view planning and rewrite	Created TestMaterializedViewRewrite for full/partial materialization scenarios Added TestMemoryMaterializedViewPlanner for connector-level planning verification	`presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestMaterializedViewRewrite.java` `presto-memory/src/test/java/com/facebook/presto/plugin/memory/TestMemoryMaterializedViewPlanner.java`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

tdcmeehan · 2025-10-31T12:32:34Z

@sourcery-ai review

sourcery-ai

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java:187-188` </location>
<code_context>
     @Override
     protected RelationPlan visitTable(Table node, SqlPlannerContext context)
     {
+        Optional<Analysis.MaterializedViewInfo> materializedViewInfo = analysis.getMaterializedViewInfo(node);
+        if (materializedViewInfo.isPresent()) {
+            return planMaterializedView(node, materializedViewInfo.get(), context);
+        }
</code_context>

<issue_to_address>
**issue:** Consider handling the case where the viewQueryPlan and dataTablePlan have mismatched field counts.

Adding a check or assertion for matching field counts and orderings between dataTablePlan and viewQueryPlan would help prevent runtime errors and ensure query correctness.
</issue_to_address>

### Comment 2
<location> `presto-memory/src/test/java/com/facebook/presto/plugin/memory/TestMemoryMaterializedViewPlanner.java:70` </location>
<code_context>
+    }
+
+    @Test
+    public void testMaterializedViewRefreshed()
+    {
+        assertUpdate("CREATE TABLE base_table (id BIGINT, value BIGINT)");
</code_context>

<issue_to_address>
**suggestion (testing):** Suggestion to add error condition tests for querying dropped materialized views.

Please add a test that verifies querying a dropped materialized view results in the expected error and no execution plan is generated.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java

...emory/src/test/java/com/facebook/presto/plugin/memory/TestMemoryMaterializedViewPlanner.java

tdcmeehan · 2025-10-31T15:08:38Z

@sourcery-ai review

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

There’s a lot of duplicated logic in StatementAnalyzer.processMaterializedView between the legacy and new materialized‐view paths – consider extracting the common setup and teardown steps into helper methods to reduce maintenance overhead.
Using NodeRef as the key for materializedViewInfoMap could lead to lookup misses if Table nodes aren’t guaranteed to be canonical; consider keying by QualifiedObjectName or otherwise ensuring stable identity.
The default getMaterializedViewStatus overload discards the baseQueryDomain, which might hide important filtering information from connectors – either make the overload abstract or document its limitations more explicitly.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- There’s a lot of duplicated logic in StatementAnalyzer.processMaterializedView between the legacy and new materialized‐view paths – consider extracting the common setup and teardown steps into helper methods to reduce maintenance overhead.
- Using NodeRef<Table> as the key for materializedViewInfoMap could lead to lookup misses if Table nodes aren’t guaranteed to be canonical; consider keying by QualifiedObjectName or otherwise ensuring stable identity.
- The default getMaterializedViewStatus overload discards the baseQueryDomain, which might hide important filtering information from connectors – either make the overload abstract or document its limitations more explicitly.

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java:338-339` </location>
<code_context>
+        ImmutableMap.Builder<VariableReferenceExpression, VariableReferenceExpression> dataTableMappingsBuilder = ImmutableMap.builder();
+        ImmutableMap.Builder<VariableReferenceExpression, VariableReferenceExpression> viewQueryMappingsBuilder = ImmutableMap.builder();
+
+        for (Field field : dataTableDescriptor.getVisibleFields()) {
+            int fieldIndex = dataTableDescriptor.indexOf(field);
+            VariableReferenceExpression dataTableVar = dataTableVariables.get(fieldIndex);
+            VariableReferenceExpression viewQueryVar = viewQueryVariables.get(fieldIndex);
</code_context>

<issue_to_address>
**issue (bug_risk):** Potential mismatch if visible fields differ from output variable order.

This approach relies on visible fields and output variables having matching order, which may not always be guaranteed. Please ensure consistency or clarify this assumption in the documentation.
</issue_to_address>

### Comment 2
<location> `presto-spi/src/main/java/com/facebook/presto/spi/plan/MaterializedViewScanNode.java:123-132` </location>
<code_context>
+        return unmodifiableList(sources);
+    }
+
+    public List<VariableReferenceExpression> sourceOutputLayout(int sourceIndex)
+    {
+        if (sourceIndex == 0) {
+            // Data table plan - use dataTableMappings
+            return outputVariables.stream()
+                    .map(dataTableMappings::get)
+                    .collect(toList());
+        }
+        else if (sourceIndex == 1) {
+            // View query plan - use viewQueryMappings
+            return outputVariables.stream()
+                    .map(viewQueryMappings::get)
+                    .collect(toList());
+        }
+        else {
+            throw new IllegalArgumentException("MaterializedViewScanNode only has 2 sources (index 0 and 1), but got index " + sourceIndex);
+        }
</code_context>

<issue_to_address>
**issue (bug_risk):** sourceOutputLayout may return nulls if mappings are incomplete.

Currently, missing mappings result in nulls in the returned list, which could lead to errors later. Please validate that all output variables have corresponding mappings and handle missing cases appropriately.
</issue_to_address>

### Comment 3
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestMaterializedViewRewrite.java:47-48` </location>
<code_context>
+public class TestMaterializedViewRewrite
+        extends BaseRuleTest
+{
+    @Test
+    public void testUseFreshDataWhenFullyMaterialized()
+    {
+        QualifiedObjectName materializedViewName = QualifiedObjectName.valueOf("catalog.schema.mv");
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for error conditions and invalid mappings in MaterializedViewScanNode.

Additional tests should cover scenarios with invalid output-to-source variable mappings and cases where exceptions are thrown by underlying plans to ensure robustness against edge cases.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-10-31T15:10:03Z

presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java

+        for (Field field : dataTableDescriptor.getVisibleFields()) {
+            int fieldIndex = dataTableDescriptor.indexOf(field);


issue (bug_risk): Potential mismatch if visible fields differ from output variable order.

This approach relies on visible fields and output variables having matching order, which may not always be guaranteed. Please ensure consistency or clarify this assumption in the documentation.

sourcery-ai · 2025-10-31T15:10:03Z

presto-spi/src/main/java/com/facebook/presto/spi/plan/MaterializedViewScanNode.java

+    public List<VariableReferenceExpression> sourceOutputLayout(int sourceIndex)
+    {
+        if (sourceIndex == 0) {
+            // Data table plan - use dataTableMappings
+            return outputVariables.stream()
+                    .map(dataTableMappings::get)
+                    .collect(toList());
+        }
+        else if (sourceIndex == 1) {
+            // View query plan - use viewQueryMappings


issue (bug_risk): sourceOutputLayout may return nulls if mappings are incomplete.

Currently, missing mappings result in nulls in the returned list, which could lead to errors later. Please validate that all output variables have corresponding mappings and handle missing cases appropriately.

sourcery-ai · 2025-10-31T15:10:03Z

...rc/test/java/com/facebook/presto/sql/planner/iterative/rule/TestMaterializedViewRewrite.java

+    @Test
+    public void testUseFreshDataWhenFullyMaterialized()


suggestion (testing): Consider adding tests for error conditions and invalid mappings in MaterializedViewScanNode.

Additional tests should cover scenarios with invalid output-to-source variable mappings and cases where exceptions are thrown by underlying plans to ensure robustness against edge cases.

hantangwangd

Thanks for this feature. Overall looks good to me! Just one nit and a few points for discussion.

hantangwangd · 2025-11-03T09:02:14Z

...rc/test/java/com/facebook/presto/sql/planner/iterative/rule/TestMaterializedViewRewrite.java

+        }
+
+        @Override
+        public Map<String, com.facebook.presto.spi.ColumnHandle> getColumnHandles(TableHandle tableHandle)


Suggested change

public Map<String, com.facebook.presto.spi.ColumnHandle> getColumnHandles(TableHandle tableHandle)

public Map<String, ColumnHandle> getColumnHandles(TableHandle tableHandle)

nit: use static import

hantangwangd · 2025-11-03T09:13:19Z

presto-main-base/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java

+        builder.add(new IterativeOptimizer(
+                metadata,
+                ruleStats,
+                statsCalculator,
+                estimatedExchangesCostCalculator,
+                ImmutableSet.of(new MaterializedViewRewrite(metadata))));


A little curious — is there a specific reason this rule needs to be wrapped in an IterativeOptimizer? I was wondering if it could simply work as a standalone PlanOptimizer.

In general, although this isn't written down, I believe it's preferable to use the iterative optimizer framework, and at one point in time we were trying to deprecate direct usage of PlanOptimizer. In the future, I'll combine this rule and other rules to simplify expressions into this single IterativeOptimizer, as partition stitching will create potentially complicated expression trees which can be simplified. I also appreciate that the scope of the Rules are confined to the nodes we are interested in (i.e. I only see MaterializedViewScanNode in MaterializedViewRewrite, not the entire plan).

hantangwangd · 2025-11-03T09:21:57Z

...se/src/main/java/com/facebook/presto/sql/planner/iterative/rule/MaterializedViewRewrite.java

+        MaterializedViewStatus status = metadataResolver.getMaterializedViewStatus(node.getMaterializedViewName());
+
+        boolean useDataTable = status.isFullyMaterialized();
+        PlanNode chosenPlan = useDataTable ? node.getDataTablePlan() : node.getViewQueryPlan();
+        Map<VariableReferenceExpression, VariableReferenceExpression> chosenMappings =
+                useDataTable ? node.getDataTableMappings() : node.getViewQueryMappings();


Just to confirm, when the status is PARTIALLY_MATERIALIZED, should the stitching logic be supported here as well?

That's right. For now, if it's anything other than FULLY_MATERIALIZED, then we use the view plan. In subsequent PRs, I'll refine this with the techniques outlined in RFC-0016.

Add MaterializedViewScanNode

755237d

prestodb-ci added the from:IBM PR from IBM label Oct 31, 2025

sourcery-ai bot reviewed Oct 31, 2025

View reviewed changes

presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java Show resolved Hide resolved

...emory/src/test/java/com/facebook/presto/plugin/memory/TestMemoryMaterializedViewPlanner.java Show resolved Hide resolved

Code review comments

781324f

sourcery-ai bot reviewed Oct 31, 2025

View reviewed changes

tdcmeehan changed the title ~~Add MaterializedViewScanNode~~ feat(planner): Add MaterializedViewScanNode Oct 31, 2025

tdcmeehan requested a review from zation99 October 31, 2025 19:54

tdcmeehan marked this pull request as ready for review October 31, 2025 19:54

tdcmeehan requested review from a team, ClarenceThreepwood, feilong-liu and jaystarshot as code owners October 31, 2025 19:54

prestodb-ci requested review from a team, NivinCS and nishithakbhaskaran and removed request for a team October 31, 2025 19:55

tdcmeehan requested a review from hantangwangd October 31, 2025 19:55

hantangwangd reviewed Nov 3, 2025

View reviewed changes

Fix import

ef0a1b5

tdcmeehan requested a review from aaneja November 3, 2025 16:06

		for (Field field : dataTableDescriptor.getVisibleFields()) {
		int fieldIndex = dataTableDescriptor.indexOf(field);

	public Map<String, com.facebook.presto.spi.ColumnHandle> getColumnHandles(TableHandle tableHandle)
	public Map<String, ColumnHandle> getColumnHandles(TableHandle tableHandle)

feat(planner): Add MaterializedViewScanNode #26492

Are you sure you want to change the base?

feat(planner): Add MaterializedViewScanNode #26492

Conversation

tdcmeehan commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

Release Notes

Uh oh!

sourcery-ai bot commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for planning and rewriting a MaterializedViewScanNode

Class diagram for the new MaterializedViewScanNode and related changes

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

tdcmeehan commented Oct 31, 2025

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

tdcmeehan commented Oct 31, 2025

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Oct 31, 2025

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Oct 31, 2025

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Oct 31, 2025

Choose a reason for hiding this comment

Uh oh!

hantangwangd left a comment

Choose a reason for hiding this comment

Uh oh!

hantangwangd Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

tdcmeehan Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

hantangwangd Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

tdcmeehan Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hantangwangd Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

tdcmeehan Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tdcmeehan commented Oct 31, 2025 •

edited

Loading

sourcery-ai bot commented Oct 31, 2025 •

edited

Loading

tdcmeehan Nov 3, 2025 •

edited

Loading