-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support inlining view / dataframes logical plan #3923
Conversation
503b14e
to
420a2f4
Compare
@@ -148,6 +149,7 @@ impl Optimizer { | |||
/// Create a new optimizer using the recommended list of rules | |||
pub fn new(config: &OptimizerConfig) -> Self { | |||
let mut rules: Vec<Arc<dyn OptimizerRule + Sync + Send>> = vec![ | |||
Arc::new(InlineTableScan::new()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb @andygrove this might be a good candidate as well to an Analysis
phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👨🍳 👌 -- this is very cool @Dandandan
// under the License. | ||
|
||
//! Optimizer rule to replace TableScan references | ||
//! such as DataFrames and Views and inlines the LogicalPlan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
// Match only on scans without filter/projection | ||
// As DataFrames / Views don't have those |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why we couldn't also create a projection / limit node as part of this rewrite as well if the table scan had them -- maybe we could file that as a future optimization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe filters/projections/limit etc. won't be on the tablescan directly for view- / dataframes so removed it as it would be mostly dead code (and requires some more tests to cover those cases).
Might be a future possibility for tablescans with those set or there is a good usecase for it for custom tableproviders, not sure...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe the comment could be updated to say "table scan won't have projecton / filters at this stage" (especially if this is run as one of the first optimizer passes)
We could also potentially add a debug!
log if they were ever not None
to hint to someone in the future it could be improved
fetch: None, | ||
projected_schema, | ||
projection: None, | ||
}) if filters.is_empty() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, if it has filters, we could add a LogicalPlan::Filter here I think
@@ -148,6 +149,7 @@ impl Optimizer { | |||
/// Create a new optimizer using the recommended list of rules | |||
pub fn new(config: &OptimizerConfig) -> Self { | |||
let mut rules: Vec<Arc<dyn OptimizerRule + Sync + Send>> = vec![ | |||
Arc::new(InlineTableScan::new()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
There seems to be some subtle bug with projection with alias being wrong. Needs more investigation |
Turns out there was a small bug being introduced recently with avoiding duplicate projections during projection pushdown |
080085c
to
275ef8e
Compare
@@ -527,7 +527,9 @@ fn optimize_plan( | |||
} | |||
|
|||
fn projection_equal(p: &Projection, p2: &Projection) -> bool { | |||
p.expr.len() == p2.expr.len() && p.expr.iter().zip(&p2.expr).all(|(l, r)| l == r) | |||
p.expr.len() == p2.expr.len() | |||
&& p.alias == p2.alias |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI @andygrove a small change because otherwise it also removes projections with (different) alias.
Going to merge this when everything is green |
* Inline TableScans for views and dataframes * Inline TableScans for views and dataframes * Inline TableScans for views and dataframes * WIP * WIP * WIP * WIP * WIP * WIP * fmt * doc * Fix test * Simplify * Fix * Rename test source * Use plan instead of projected schema * Docs * Use SubqueryAlias * Revert "Use SubqueryAlias" This reverts commit 207c2a9. * WIP * Fix issue * Clippy * Format
Which issue does this PR close?
Closes #3913
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?