Skip to content

Introduce ProjectionExprs::unproject_exprs/project_exprs and improve docs#20193

Open
alamb wants to merge 1 commit intoapache:mainfrom
alamb:alamb/consolidate_remapping
Open

Introduce ProjectionExprs::unproject_exprs/project_exprs and improve docs#20193
alamb wants to merge 1 commit intoapache:mainfrom
alamb:alamb/consolidate_remapping

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Feb 6, 2026

Which issue does this PR close?

Rationale for this change

I am going through how the various layers of expression pushdown and schema rewrites work, and I spent a long time confused about what the sync_with_child parameter on update_expr did -- like what was the different betweenupdate_expr(.., true) vs update_expr(..., false) 😕

After some study I concluded it controls which way the rewrite is done (either project the expressions to refer to the projection expressions, or the opposite, 'unproject' an expression and substitute the projection definitions back in)

What changes are included in this PR?

  1. Introduce ProjectionExprs::unproject_exprs/project_exprs
  2. rename sync_with_child to unproject
  3. Improve documentation about what is done

Are these changes tested?

Yes by existing ci

Are there any user-facing changes?

There are new APIs, but no changes to existing APIs

@github-actions github-actions bot added physical-expr Changes to the physical-expr crates datasource Changes to the datasource crate labels Feb 6, 2026
)
})
})
.map(|filter| projection.unproject_expr(&filter))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole point of the PR is to pull this function call into a method on ProjectionExprs and add documentation

};
// Now remap column indices to match the table schema.
let remapped_filters: Result<Vec<_>> = filters_to_remap
let remapped_filters = filters_to_remap
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

driveby cleanup (no functional change)

expr: &Arc<dyn PhysicalExpr>,
projected_exprs: &[ProjectionExpr],
sync_with_child: bool,
unproject: bool,
Copy link
Contributor Author

@alamb alamb Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this parameter name to be super confusing as there are no "children" involved. I suspect this is historical and predates the work to push expressions down

unproject as the best I could come up with -- but would love some thoughts on better names

@alamb alamb force-pushed the alamb/consolidate_remapping branch from e7d571b to ee658ae Compare February 6, 2026 16:51
// For example, the filter being pushed down is `c1_c2 > 5` and it was created
// against the output schema of the this `DataSource` which has projection `c1 + c2 as c1_c2`.
// Thus we need to rewrite the filter back to `c1 + c2 > 5` before passing it to the file source.
// This is necessary because filters refer to the output schema of this `DataSource`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified that the "different schema" is the "table schema"

@alamb alamb marked this pull request as ready for review February 6, 2026 17:11
@@ -842,7 +901,7 @@ pub fn combine_projections(
pub fn update_expr(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I contemplated going even farther and deprecating this function (and routing everything through ProjectionExprs) but I think that will result in some non trivial API churn and I figured we could always do it later

@adriangb
Copy link
Contributor

adriangb commented Feb 6, 2026

I have not been able to wrap my head around update_expr either. I will review this later today or tomorrow 😄

@alamb
Copy link
Contributor Author

alamb commented Feb 6, 2026

I have not been able to wrap my head around update_expr either. I will review this later today or tomorrow 😄

Thanks -- no rush.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

datasource Changes to the datasource crate physical-expr Changes to the physical-expr crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants