-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Support inferring new predicates to push down #15906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
logical_plan | ||
01)Inner Join: t1.a = t2.a | ||
02)--Projection: t1.a, t1.b | ||
03)----Filter: __common_expr_4 >= Int64(3) AND __common_expr_4 <= Int64(5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inferred predicate, which can be pushed down to the t1 scan.
/// Infers new predicates by substituting equalities. | ||
/// For example, with predicates `t2.b = 3` and `t1.b > t2.b`, | ||
/// we can infer `t1.b > 3`. | ||
fn infer_predicates_from_equalities(predicates: Vec<Expr>) -> Result<Vec<Expr>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future, we can move the code into a dedicated optimizer rule, such as InferPredicates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might be a special case of the range analysis code in
https://docs.rs/datafusion/latest/datafusion/physical_expr/intervals/cp_solver/index.html
In other words, instead of this special case maybe we could use the cp_solver to create a more general framework for introducing inferred predicates 🤔
Now that we have predicate pushdown for ExecutionPlans maybe it is more realistic to do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check the cp_solver (didn't notice the part of code before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great suggestion. I can help if you need some directions or have any confusion
@@ -131,13 +131,25 @@ pub fn normalize_sorts( | |||
} | |||
|
|||
/// Recursively replace all [`Column`] expressions in a given expression tree with | |||
/// `Column` expressions provided by the hash map argument. | |||
pub fn replace_col(expr: Expr, replace_map: &HashMap<&Column, &Column>) -> Result<Expr> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't wanna write a similar method for the PR, so made the method generic
pub fn replace_col_with_expr(
expr: Expr,
replace_map: &HashMap<Column, &Expr>,
) -> Result<Expr> {
expr.transform(|expr| {
Ok({
if let Expr::Column(c) = &expr {
match replace_map.get(c) {
Some(new_expr) => Transformed::yes((**new_expr).to_owned()),
None => Transformed::no(expr),
}
} else {
Transformed::no(expr)
}
})
})
.data()
}
Perhaps running clickbench or equivalent (assuming clickbench wouldn't trigger this optimization) to showcase the difference would be good? |
🤖 |
🤖: Benchmark completed Details
|
Marking as draft as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is ready for another look |
Which issue does this PR close?
Rationale for this change
We can infer new predicates from existing predicates to push down to reduce IO and improve performance dramatically
What changes are included in this PR?
Infers new predicates by substituting equalities.
For example, with predicates
t2.b = 3
andt1.b > t2.b
, we can infert1.b > 3
.See sqllogictest for another case.
Are these changes tested?
Yes
Are there any user-facing changes?