-
Notifications
You must be signed in to change notification settings - Fork 1.5k
[MAJOR] Equivalence System Overhaul #16217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
remove usage of prefer_existing_sort as default set requirements Hard set soft on AggregateExec and BoundedWindowExec since they have InputOrderMode functionalities
simplify indentation
simplify indentation
remove prefer_existing_sort based test cases
# Conflicts: # datafusion/core/src/datasource/listing/table.rs # datafusion/physical-optimizer/src/enforce_sorting/mod.rs
# Conflicts: # datafusion-examples/examples/custom_file_format.rs # datafusion/core/src/datasource/file_format/csv.rs # datafusion/core/src/datasource/file_format/json.rs # datafusion/core/src/datasource/file_format/mod.rs # datafusion/core/src/datasource/file_format/parquet.rs # datafusion/core/tests/physical_optimizer/enforce_distribution.rs # datafusion/core/tests/physical_optimizer/test_utils.rs # datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs
return alternative on BoundedWindowAggExec
# Conflicts: # datafusion/catalog/src/stream.rs # datafusion/core/tests/physical_optimizer/enforce_distribution.rs # datafusion/physical-plan/src/joins/sort_merge_join.rs # datafusion/physical-plan/src/joins/symmetric_hash_join.rs
simplify sort_pushdown.rs
add requirements compatible test cases
🤖: Benchmark completed Details
|
I think the planning slowdown is real and reproducible. I did some profiling and it seems it should be fairly easy to fix (we could do it in a follow on even) by optimizing the projection removal code: ![]() ![]() |
Thanks for all the reviews. I will address them today 🚀 With the computational complexity issue removed, I agree that we can progressively reduce the "constant factor" in running time over a few future PRs. To give reviewers some context: Complexity issue is solved by computing normalizations only once (when equivalences are modified) and handling constants on an equal footing with other equivalences. This makes ordering checks fast, but some parts of the current codebase probably make modifications eagerly, which is the likely source of the small slowdown @alamb is seeing. I expect refactoring those parts of the code a little bit will mitigate this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is good to merge as is -- thank you again for all the work
I think the code is much clearer and faster (and will be easier to work with going forward).
I also verified that the reproducer on #13748
If we are not able to resolve the planning time regression before merge, I think we should file a follow on ticket and I can help try and resolve it prior to the 49.0.0 release
Finally, here is a proposed small improvement I ran across while reviewing this PR
Performance Results
on main ( at 900279c)
Running with 10 columns...completed in 31.865667ms
Running with 20 columns...completed in 50.299667ms
Running with 30 columns...completed in 121.58075ms
Running with 40 columns...completed in 253.701875ms
Running with 50 columns...completed in 489.564584ms
Running with 60 columns...completed in 768.394167ms
Running with 70 columns...completed in 1.291121416s
Running with 80 columns...completed in 1.876029208s
Running with 90 columns...completed in 2.652548541s
Running with 100 columns...completed in 3.843196875s
on this branch 🚀
Running with 10 columns...completed in 23.371333ms
Running with 20 columns...completed in 35.933167ms
Running with 30 columns...completed in 74.359083ms
Running with 40 columns...completed in 125.760666ms
Running with 50 columns...completed in 198.790375ms
Running with 60 columns...completed in 285.836584ms
Running with 70 columns...completed in 408.980167ms
Running with 80 columns...completed in 557.148791ms
Running with 90 columns...completed in 710.347208ms
Running with 100 columns...completed in 880.061541ms
Minor: remove an unecessary clone in common_sort_prefix_length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments after (roughly) reviewing followed after @alamb's call on dev. 🙂
It is roughly because the size of the changes. A 7000+ lines PR is too hard to review very in details.
Overall the change looks good and it solves the complexity issue reported. 👍
Although the PR claims that it couldn't find an effective way to split it to smaller ones. I'm not very sure but I can see there are changes that are good to have but maybe not so urgent in just one shot, e.g., a lot of code refactors seems not core change, code comment rephrasing, API change that seems can be in separate works.
For such big overhaul, I still prefer and suggest we can it in incremental approach. It will help reviewing a lot.
Thanks for all the reviews. I went through each and sent a commit that addresses all the suggestions, I also tried to answer any questions I saw. I will later resolve the conflicts and get this ready to go. I see that we are close to getting ready for the 48 release too, so it seems like we are making a good effort to align them in the way we want. Thanks all 🚀 |
I removed the conflicts, this is in a good state now. Did we get to the 48 cut-off point yet? |
I plan on running my test suite against this side branch hopefully tomorrow or early next week to verify it. |
Update yes I think main is open for new features and we can merge this one (though maybe it would be a good idea for @Omega359 to test before we do so) |
I think there are some incoming big PRs, so I'd like to go ahead with this and help with fixing any issues @Omega359 finds after merge. In addition to the time cost of going through merges, I am fearful of accidentally introducing bugs every time I merge. |
OK, I resolved the conflicts and all seems OK. Should we go ahead after CI passes? |
I'll be able to do my tests Sat afternoon or Sun morning. I'm good with doing that against either main or the branch. |
Thanks @Omega359. Given that we didn't hear any concerns yesterday, I will go ahead and merge this. Thanks everyone for all the reviews! |
🚀 woohoo! |
fyi I did run this through my test suite and it seemed to work correctly :) |
* introduce Soft & Hard RequiredInputOrderings remove usage of prefer_existing_sort as default set requirements Hard set soft on AggregateExec and BoundedWindowExec since they have InputOrderMode functionalities * add documentation to replace_with_partial_sort simplify indentation * add documentation to analyze_immediate_sort_removal simplify indentation * remove prefer_existing_sort effects remove prefer_existing_sort based test cases * remove prefer_existing_sort configuration * remove prefer_existing_sort configuration * add documentation * add documentation * add documentation * fix imports and test cases * fix imports and test cases * implement RequiredInputOrdering as vectors * implement RequiredInputOrdering as vectors return alternative on BoundedWindowAggExec * fix test cases * change doc * revert prefer_existing_sort flag * fix changes * fix test case * make LexRequirement private * ensure RequiredInputOrdering inner requirement can not be empty simplify sort_pushdown.rs * add default test cases add requirements compatible test cases * doc fixes * fix clippy and docs * format code * format code * doc fix * add TODO test cases with test_soft_hard_requirements prefix * Review Part 1 * Review Part 2 * Review Part 3 * Review Part 4 * Review Part 5 * Review Part 6 * Enforce non-degeneracy for LexRequirement * Enforce non-degeneracy for LexOrdering (Part 1) * Enforce non-degeneracy for LexOrdering (Part 2) * fix first phase of merge conflicts and other bugs * Fix sqllogictests except the schema mismatch * Cleanup Part 1 * Cleanup Part 2 * Cleanup Part 3 * do not initialize Trivial accumulators if ordering is set * initialize TrivialFirstPrimitiveGroupsAccumulator struct and return * fix clippy * fix merge conflicts * fix typos remove TrivialFirstPrimitiveGroupsAccumulator make groups accumulator available only when order requirement is set * format code * Add requirement_satisfied back in * Replace AsRef with ordinary & for LexOrdering * Further cleanup * add OutputRequirementExec fetches to sort adding * Simplify remove_redundant_entries * Work with iterators in ordering_satisfy_requirement * Fix doctests * Cleanup LexOrdering APIs * Cleanup LexOrdering APIs 2 * Add reverse_each to LexOrdering * Use LexOrdering instead of Arc<[PhysicalSortExpr]> * Use PhysicalSortExpr slices in contexts where we simply list sort expressions * Generalize add_new_ordering APIs * Simplifications * More cleanups * API Simplifications * Improve comments * Use vector in Expr structs * Fix doctests * Simplify sort * Simplify the get_finer_aggregate_exprs_requirement function * Avoid hidden clones * bugfix * Simplify the get_finer_aggregate_exprs_requirement function * Simplify the function with_reorder * Fix with_reorder bug * Simplify the function with_reorder (Part 2) * Simplify * DRY * Simplifications * Improve add_equal_condition * Improve docs * Simplifications * Simplifications * RequiredInputOrdering -> OrderingAlternatives * Simplify new_with_orderings * Transition to fallible LexOrdering constructor * Transition to fallible LexOrdering constructor - 2 * Transition to fallible LexOrdering constructor - 3 * Transition to fallible LexOrdering constructor - 4 * Transition to fallible LexOrdering constructor - 5 * Transition to fallible LexOrdering constructor - 6 * Transition to fallible LexOrdering constructor - 7 * Transition to fallible LexOrdering constructor - 8 * Transition to fallible LexOrdering constructor - 9 * Transition to fallible LexOrdering constructor - 10 * Transition to fallible LexOrdering constructor - 11 * Simplify constant expressions * Simplify constant expressions - 2 * Simplify constant expressions - 3 * Simplify constant expressions - 4 * Simplify constant expressions - 5 * Simplify constant expressions - 6 * Simplify constant expressions - 7 * Simplify constant expressions - 8 * Simplify constant expressions - 9 * Fix imports * Remove explicit constant tracking from equivalences * Resolve logical conflict * Remove the unusual take API, instead use the from trait * Simplify projection mapping - 1 * Use a map instead of a vector in ProjectionMapping * Simplify DependencyMap * Simplify DependencyMap - 2 * Simplify DependencyMap - 3 * Incorporate Jay's suggestions * Simplifications * Fix doctest * Improve docstrings * Update/cast the constant value accordingly when schema changes * Improve ProjectionMapping * Remove DerefMut from ProjectionTargets to preserve non-emptiness * Docstring * Optimize project_expr by fetching equivalence classes only once * Project multiple expressions more efficiently at once * Project multiple expressions more efficiently at once - 2 * Project multiple expressions more efficiently at once - 3 * Project multiple expressions more efficiently at once - 4 * Move normalization of sort expressions to equivalence group * Improve comments * Improve display for EquivalenceProperties * More idiomatic code * More succinct code * Remove extend_orderings from EquivalenceProperties * Simplify with_reorder * Store normalized orderings - 1 * Reduce time complexity of normalization w.r.t. number of equivalence classes * Simplify bridge_classes logic * Remove TODOs * Simplify generate_dependency_orderings * normalized orderings - 2 * normalized orderings - 3 * undo normalized orderings * Fix logical conflicts * Fix imports * Remove noop code * Move add_offset_to_expr * Remove mutation from LexOrdering * Remove unwraps * Remove unwraps - 2 * Remove unwraps - 3 * Remove unwraps - 4 * Remove collapse from LexOrdering * Remove unwraps - 5 * Remove unwraps - 6 * Remove unwraps - 7 * Remove unwraps - 8 * Remove unwraps - 9 * Remove unwraps - 10 * Remove collapse from LexRequirement * Simplify ordering_satisfy * Enforce uniqueness in LexOrdering * Fix with_reorder * Use tee * Fix reorder api * Comment grammar * Remove unwraps * Cache normalized orderings * Minor: remove an unecessary clone in common_sort_prefix_length * Address reviews --------- Co-authored-by: mertak-synnada <mertak67+synaada@gmail.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Berkay Şahin <124376117+berkaysynnada@users.noreply.github.com>
Which issue does this PR close?
UNION
andORDER BY
queries #13748.Rationale for this change
DF's theoretical approach to representing orderings, equivalences and constants is sound, but it suffers from many implementation problems that lead to various issues. Some of these issues include:
None
value for a typeOption<LexOrdering>
, and sometimes with aSome
value with the payload being "empty").EnforceSorting
, which results in missed optimization opportunities. For example, a windowing operator prefers its input to be sorted w.r.t. both PARTITION BY and ORDER BY keys, but if the former is not available, it is still preferable to have just the latter instead of a fully unsorted input.What changes are included in this PR?
ordering_satisfy
checks. Also, the new system caches normalization results for equivalent orderings, which was one of the major sources of excessive computational complexity during sort enforcement.LexOrdering
andLexRequirement
objects do not allow degenerate orderings anymore, whenever you see this object, you can be sure there will be some ordering. Possible absence of an ordering will henceforth be properly represented by anOption<LexOrdering>
.OrderingRequirements
object is introduced to capture ordering preferences of operators.Are these changes tested?
Yes.
Are there any user-facing changes?
Yes -- this is a major API change and will break old code. We spent months looking for ways to (1) break this PR down into smaller ones, (2) yet lump all API changes in one PR -- but couldn't find an effective way to do this.
Therefore, I invite all contributors who are also downstream stakeholders to help by participating in a "pre-scream" test. Please test and report how this PR breaks your code, so we can collectively prepare an upgrade guide for the community at large.
Tips and Tricks for Reviewing
The "meat" of the code is in the following files:
datafusion/physical-expr-common/src/sort_expr.rs
: Definition ofOrderingRequirements
, changes toLexOrdering
and friends.datafusion/physical-plan/src/execution_plan.rs
: API change torequired_input_ordering
.datafusion/physical-expr/src/equivalence
: Re-implementation of the equivalence system.datafusion/functions-aggregate/src
: Effects of the above on aggregate function implementations.The rest of the changes are "mandatory" propagative effects of fixing a fundamental low-level mechanism in a large codebase, and mostly trivial.
@alamb, @andygrove, @berkaysynnada -- feel free to tag others.