fix: return ALL constants in EquivalenceProperties::constants#17404
fix: return ALL constants in EquivalenceProperties::constants#17404crepererum merged 3 commits intoapache:mainfrom
EquivalenceProperties::constants#17404Conversation
|
I wonder if this method here is actually dangerous: datafusion/datafusion/physical-expr/src/equivalence/class.rs Lines 195 to 199 in d83a290 It's clearly broken when 1 constant is shared by two columns. As far as I can tell, this is only used in 2 other places: datafusion/datafusion/physical-expr/src/equivalence/class.rs Lines 509 to 523 in d83a290 datafusion/datafusion/physical-expr/src/equivalence/properties/mod.rs Lines 1291 to 1295 in d83a290 So I'm wondering if these two methods have similar bugs 🤔 |
bf941bd to
e8a0c6e
Compare
|
🤖 |
I don't fully understand this comment. The code in question seems to do what the comments say (return the "canonical expr" so if the It seems like the bug fix in this PR is that the intention was to provide |
alamb
left a comment
There was a problem hiding this comment.
Thank you @crepererum -- I read this code carefully and I believe it is correct and fixes a subtle bug.
I also started some planning benchmarks to confirm we don't see any major changes (I don't expect any)
perhaps @berkaysynnada and/or @ozankabak would like to review this PR as well
| " | ||
| ); | ||
|
|
||
| assert_sanity_check(&plan, true); |
There was a problem hiding this comment.
I verified that this test covers the code in the PR by running it without the code changes and it fails like this
assertion `left == right` failed
left: false
right: true
Left: false
Right: true
<Click to see difference>
| c.constant.as_ref().and_then(|across| { | ||
| c.canonical_expr() | ||
| .map(|expr| ConstExpr::new(Arc::clone(expr), across.clone())) | ||
| .flat_map(|c| { |
There was a problem hiding this comment.
The bug fix here is that this method now adds an entry for each equivalent expression rather than just the first "canonical" one
For example, if the equivalent expressions are (a, b) and the constant is known to be 1, now the code will return
a=1, b=1 but before the fix it would only return a=1)
|
🤖: Benchmark completed Details
|
|
#17404 (comment) looks like noise to me |
…he#17404) * test: regression test for apache#17372 * test: add more direct regression for apache#17372 * fix: return ALL constants in `EquivalenceProperties::constants`
…he#17404) * test: regression test for apache#17372 * test: add more direct regression for apache#17372 * fix: return ALL constants in `EquivalenceProperties::constants`
…he#17404) * test: regression test for apache#17372 * test: add more direct regression for apache#17372 * fix: return ALL constants in `EquivalenceProperties::constants`
Which issue does this PR close?
Rationale for this change
This trips the sanity checker for totally valid plans, see tests.
What changes are included in this PR?
Consider all constants in properties, not only the first one.
Are these changes tested?
Regression tests on two levels.
Are there any user-facing changes?
It's a bug fix.