Introduce ProjectionExprs::unproject_exprs/project_exprs and improve docs#20193
Introduce ProjectionExprs::unproject_exprs/project_exprs and improve docs#20193alamb wants to merge 1 commit intoapache:mainfrom
Conversation
| ) | ||
| }) | ||
| }) | ||
| .map(|filter| projection.unproject_expr(&filter)) |
There was a problem hiding this comment.
The whole point of the PR is to pull this function call into a method on ProjectionExprs and add documentation
| }; | ||
| // Now remap column indices to match the table schema. | ||
| let remapped_filters: Result<Vec<_>> = filters_to_remap | ||
| let remapped_filters = filters_to_remap |
There was a problem hiding this comment.
driveby cleanup (no functional change)
| expr: &Arc<dyn PhysicalExpr>, | ||
| projected_exprs: &[ProjectionExpr], | ||
| sync_with_child: bool, | ||
| unproject: bool, |
There was a problem hiding this comment.
I found this parameter name to be super confusing as there are no "children" involved. I suspect this is historical and predates the work to push expressions down
unproject as the best I could come up with -- but would love some thoughts on better names
e7d571b to
ee658ae
Compare
| // For example, the filter being pushed down is `c1_c2 > 5` and it was created | ||
| // against the output schema of the this `DataSource` which has projection `c1 + c2 as c1_c2`. | ||
| // Thus we need to rewrite the filter back to `c1 + c2 > 5` before passing it to the file source. | ||
| // This is necessary because filters refer to the output schema of this `DataSource` |
There was a problem hiding this comment.
Clarified that the "different schema" is the "table schema"
| @@ -842,7 +901,7 @@ pub fn combine_projections( | |||
| pub fn update_expr( | |||
There was a problem hiding this comment.
I contemplated going even farther and deprecating this function (and routing everything through ProjectionExprs) but I think that will result in some non trivial API churn and I figured we could always do it later
|
I have not been able to wrap my head around |
Thanks -- no rush. |
Which issue does this PR close?
Rationale for this change
I am going through how the various layers of expression pushdown and schema rewrites work, and I spent a long time confused about what the
sync_with_childparameter onupdate_exprdid -- like what was the different betweenupdate_expr(.., true)vsupdate_expr(..., false)😕After some study I concluded it controls which way the rewrite is done (either project the expressions to refer to the projection expressions, or the opposite, 'unproject' an expression and substitute the projection definitions back in)
What changes are included in this PR?
sync_with_childtounprojectAre these changes tested?
Yes by existing ci
Are there any user-facing changes?
There are new APIs, but no changes to existing APIs