-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Window Functions Order Conservation -- Follow-up On Set Monotonicity #14813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @@ -222,208 +227,6 @@ async fn test_remove_unnecessary_sort5() -> Result<()> { | |||
| Ok(()) | |||
| } | |||
|
|
|||
| #[tokio::test] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coverage of these tests are preserved in test_window_partial_constant_and_set_monotonicity()
54a53ed to
ace28b6
Compare
ace28b6 to
9b55465
Compare
| " BoundedWindowAggExec: wdw=[count: Ok(Field { name: \"count\", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Range, start_bound: Preceding(NULL), end_bound: CurrentRow, is_causal: false }], mode=[Sorted]", | ||
| " SortExec: expr=[nullable_col@0 ASC, non_nullable_col@1 ASC], preserve_partitioning=[false]", | ||
| " DataSourceExec: partitions=1, partition_sizes=[0]"]; | ||
| " SortExec: expr=[nullable_col@0 ASC, non_nullable_col@1 ASC], preserve_partitioning=[false]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the reviewers, please double check here. While 2nd BoundedWindowAggExec requires [nullable_col, non_nullable_col] ordering, it is not guaranteed in the previous version. If I'm not missing something, now it is, with a newly added partial sort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, it seems like we had a bug before which is now fixed.
| @@ -2280,3 +2086,1265 @@ async fn test_not_replaced_with_partial_sort_for_unbounded_input() -> Result<()> | |||
| assert_optimized!(expected_input, expected_no_change, physical_plan, true); | |||
| Ok(()) | |||
| } | |||
|
|
|||
| #[tokio::test] | |||
| async fn test_window_partial_constant_and_set_monotonicity() -> Result<()> { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing this with rs-test seems hard to debug, and the relation between the arguments and initial-final plans is not very clear. So I prefer this
| None | ||
| } | ||
|
|
||
| fn all_possible_sort_options(expr: Arc<dyn PhysicalExpr>) -> Vec<PhysicalSortExpr> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be init once maybe
|
PTAL @ozankabak, @alamb |
e698fef to
8425c17
Compare
ozankabak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed both the code and the tests carefully, and sent a commit for some improvements. This PR greatly improves the test coverage and fills some gaps in terms of set monotonicity optimizations.
I will wait for some more time before merging in case we get more eyes on this (which would be great).
|
I will go ahead and merge this since it is a follow-up to a previously discussed (and reviewed) PR/feature. It would still be great to have some post-merge review on this when you have some time on your hands @alamb. |
THanks -- I'll try to to look at it but I am currently quite tied up with triyng to get DataFusion 46 ready for release (and dealing with some unexpected fallout from #14224). May not be until next week |
| .map(|pb_order| sort_options_resolving_constant(Arc::clone(pb_order))); | ||
| let all_satisfied_lexs = partition_by_orders | ||
| .multi_cartesian_product() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort_options_resolving_constant returns pair of values for every input expression
then multi_cartesian_product calculates all combinations when sourcing one element from each of the pairs
this produces exponential number of combinations (as a function of # input expressions). is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is actually a kind of workaround to represent the constancy of partitions_by columns along their partitions. So, we need to compare the existing ordering against every possible ordering of alternatives of constant columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the desire, but exponential planning time is hardly acceptable in our use-case.
For a real production query, DF 45 works in a snap, and DF 46+ never exists the planner. I had to chop off a bunch of columns from a window to get it to completion.
we need to compare the existing ordering against every possible ordering
the sort_options_resolving_constant returns only 2 options out of 4 possible.
is this correctness problem, or a missed optimization 'problem'?
define "need"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the desire, but exponential planning time is hardly acceptable in our use-case. For a real production query, DF 45 works in a snap, and DF 46+ never exists the planner. I had to chop off a bunch of columns from a window to get it to completion.
The chance of skipping this complex part can be detected earlier before (for example, if there is no order requirement coming from downstream), and there wouldn't be any order calculation logic specific to window expressions.
we need to compare the existing ordering against every possible ordering
the
sort_options_resolving_constantreturns only 2 options out of 4 possible. is this correctness problem, or a missed optimization 'problem'?define "need"
I checked the code, and I believe one of the three usages of sort_options_resolving_constant should be updated to generate all 4 possibilities (where it is used over partitioning expressions, not window/aggregate functions). The reason for generating only 2 of them is that set monotonicity is broken if the data has an increasing order but nulls come first, and vice versa, if the data has a decreasing order but nulls come last. So, it's not a correctness problem but a missed optimization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The chance of skipping this complex part can be detected earlier before (for example, if there is no order requirement coming from downstream),
That is simple.
However, it's likely to help with simple queries only. I.e. it will help with test queries, but more complex production workloads will still end up doing exponential (multi-minutes) planning.
We need an approach that's better than O(n²) (and obviously current O(2ⁿ) is much much worse).
From query execution perspective, those minutes spent in planning are minutes wasted, if query can be executed in seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got an idea: check satisfy ordering one by one. I mean:
partition_by_cols: [a,b,c]
ordering_req: [x,y,z]
create a new ordering from the first N(1 initially, and increase 1 by 1) elements.
compared_ordering: [x]
compare the variations of the first element of partition by [a INC NL] & [a INC NF] & [a DEC NL] & [a DEC NF]
only one of them can survive (and most of the cases, none of them, and skip the further computation)
store the surviving one, and append the next ordering element variations and partition by column, and continue comparison until either ordering or partition_by elements are over, or there isnt any survivor left
This algorithm would decrease the complexity to O(n) I guess
Which issue does this PR close?
Rationale for this change
#14271 had introduced set-monotonicity term for window functions. After that PR we have some false negatives for order conservation, and we don't have a fully complete test coverage for window functions in terms of set-monotonicity, partitioning type, frame type, and order of the function inputs.
What changes are included in this PR?
This PR enriches the ordering properties of WindowAggExec and BoundedWindowAggExec, and I've written a test function which covers all cases for a window function, both positive and negative cases.
Are these changes tested?
Yes
Are there any user-facing changes?
Order conservation in more cases