Fix TopK Sort incorrectly pushed down past Join with anti join #16641
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -663,15 +663,14 @@ logical_plan | |
| physical_plan | ||
| 01)GlobalLimitExec: skip=4, fetch=10 | ||
| 02)--SortPreservingMergeExec: [c@0 DESC], fetch=14 | ||
| 03)----UnionExec | ||
| 04)------SortExec: TopK(fetch=14), expr=[c@0 DESC], preserve_partitioning=[true] | ||
| 03)----SortExec: TopK(fetch=14), expr=[c@0 DESC], preserve_partitioning=[true] | ||
|
Contributor
Author
This should be right: we should union the inputs first, then apply the sort and limit. |
||
| 04)------UnionExec | ||
| 05)--------ProjectionExec: expr=[CAST(c@0 AS Int64) as c] | ||
| 06)----------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 | ||
| 07)------------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[c], output_ordering=[c@0 ASC NULLS LAST], file_type=csv, has_header=true | ||
| 08)------SortExec: TopK(fetch=14), expr=[c@0 DESC], preserve_partitioning=[true] | ||
| 09)--------ProjectionExec: expr=[CAST(d@0 AS Int64) as c] | ||
| 10)----------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 | ||
| 11)------------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[d], file_type=csv, has_header=true | ||
| 08)--------ProjectionExec: expr=[CAST(d@0 AS Int64) as c] | ||
| 09)----------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 | ||
| 10)------------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[d], file_type=csv, has_header=true | ||
|
|
||
| # Applying LIMIT & OFFSET to subquery. | ||
| query III | ||
|
|
||
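The plan above places a single `SortExec: TopK(fetch=14)` over the `UnionExec`, with `GlobalLimitExec: skip=4, fetch=10` on top. The fetch of 14 is consistent with the TopK keeping `skip + fetch` rows so that the offset can be applied afterwards. The following is a minimal Python sketch of that plan shape, not DataFusion's implementation; the function name and the sample data are hypothetical.

```python
import heapq

def limit_offset_over_union(inputs, skip, fetch):
    """Union the inputs, keep the top (skip + fetch) rows in descending
    order, then drop the first `skip` rows (hypothetical helper)."""
    topk_fetch = skip + fetch                      # 4 + 10 = 14 in the plan above
    unioned = [row for rows in inputs for row in rows]
    top = heapq.nlargest(topk_fetch, unioned)      # sorted descending, like c@0 DESC
    return top[skip:skip + fetch]

# Hypothetical stand-ins for the two union branches.
left = list(range(0, 20))
right = list(range(100, 120))
result = limit_offset_over_union([left, right], skip=4, fetch=10)
```

Because the TopK runs once over the unioned rows, no branch needs its own sort in this shape.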
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1258,13 +1258,12 @@ logical_plan | |
| 08)--------TableScan: ordered_table projection=[a0, b, c, d] | ||
| physical_plan | ||
| 01)SortPreservingMergeExec: [d@4 ASC NULLS LAST, c@1 ASC NULLS LAST, a@2 ASC NULLS LAST, a0@3 ASC NULLS LAST, b@0 ASC NULLS LAST], fetch=2 | ||
| 02)--UnionExec | ||
| 03)----SortExec: TopK(fetch=2), expr=[d@4 ASC NULLS LAST, c@1 ASC NULLS LAST, a@2 ASC NULLS LAST, b@0 ASC NULLS LAST], preserve_partitioning=[false] | ||
| 02)--SortExec: TopK(fetch=2), expr=[d@4 ASC NULLS LAST, c@1 ASC NULLS LAST, a@2 ASC NULLS LAST, a0@3 ASC NULLS LAST, b@0 ASC NULLS LAST], preserve_partitioning=[true] | ||
|
Contributor
Author
Same as above. |
||
| 03)----UnionExec | ||
| 04)------ProjectionExec: expr=[b@1 as b, c@2 as c, a@0 as a, NULL as a0, d@3 as d] | ||
| 05)--------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[a, b, c, d], output_ordering=[c@2 ASC NULLS LAST], file_type=csv, has_header=true | ||
| 06)----SortExec: TopK(fetch=2), expr=[d@4 ASC NULLS LAST, c@1 ASC NULLS LAST, a0@3 ASC NULLS LAST, b@0 ASC NULLS LAST], preserve_partitioning=[false] | ||
| 07)------ProjectionExec: expr=[b@1 as b, c@2 as c, NULL as a, a0@0 as a0, d@3 as d] | ||
| 08)--------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[a0, b, c, d], output_ordering=[c@2 ASC NULLS LAST], file_type=csv, has_header=true | ||
| 06)------ProjectionExec: expr=[b@1 as b, c@2 as c, NULL as a, a0@0 as a0, d@3 as d] | ||
| 07)--------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[a0, b, c, d], output_ordering=[c@2 ASC NULLS LAST], file_type=csv, has_header=true | ||
|
|
||
| # Test: run the query from above | ||
| query IIIII | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -53,7 +53,7 @@ query I | |
| select * from (select * from topk limit 8) order by x limit 3; | ||
| ---- | ||
| 0 | ||
|
Contributor
Author
This is right, because the subquery `select * from topk limit 8` has a limit but no ORDER BY; the original plan pushed the sort limit down into it. |
||
| 1 | ||
| 2 | ||
| 2 | ||
|
|
||
|
|
||
|
|
||
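To see why a sort limit cannot be moved below an inner `LIMIT` that has no `ORDER BY`, consider the rough sketch below. It is an illustration under stated assumptions, not the engine's logic: the rows are hypothetical (they are not the contents of the `topk` table), and "limit without order" is modeled as taking the first rows in storage order, which a real engine does not guarantee.

```python
def limit_then_sort(rows, inner_limit, outer_limit):
    """Intended semantics: take some `inner_limit` rows first (modeled
    here as storage order), then sort those rows and keep `outer_limit`."""
    inner = rows[:inner_limit]
    return sorted(inner)[:outer_limit]

def sort_pushed_below_limit(rows, inner_limit, outer_limit):
    """Hypothetical over-eager pushdown: the sort is applied before the
    inner limit, so the inner limit sees the globally smallest rows."""
    inner = sorted(rows)[:inner_limit]
    return inner[:outer_limit]

# Hypothetical table contents in storage order.
rows = [5, 9, 7, 3, 8, 6, 4, 10, 1, 2]
expected = limit_then_sort(rows, inner_limit=8, outer_limit=3)
pushed = sort_pushed_below_limit(rows, inner_limit=8, outer_limit=3)
```

With this data the two functions disagree, because the rows `1` and `2` fall outside the first eight rows but are pulled in when the sort runs first.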
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -413,15 +413,14 @@ logical_plan | |
| 06)------TableScan: aggregate_test_100 projection=[c1, c3] | ||
| physical_plan | ||
| 01)SortPreservingMergeExec: [c9@1 DESC], fetch=5 | ||
| 02)--UnionExec | ||
| 03)----SortExec: TopK(fetch=5), expr=[c9@1 DESC], preserve_partitioning=[true] | ||
| 02)--SortExec: TopK(fetch=5), expr=[c9@1 DESC], preserve_partitioning=[true] | ||
|
Contributor
Author
Same as above. |
||
| 03)----UnionExec | ||
| 04)------ProjectionExec: expr=[c1@0 as c1, CAST(c9@1 AS Decimal128(20, 0)) as c9] | ||
| 05)--------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 | ||
| 06)----------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/testing/data/csv/aggregate_test_100.csv]]}, projection=[c1, c9], file_type=csv, has_header=true | ||
| 07)----SortExec: TopK(fetch=5), expr=[c9@1 DESC], preserve_partitioning=[true] | ||
| 08)------ProjectionExec: expr=[c1@0 as c1, CAST(c3@1 AS Decimal128(20, 0)) as c9] | ||
| 09)--------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 | ||
| 10)----------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/testing/data/csv/aggregate_test_100.csv]]}, projection=[c1, c3], file_type=csv, has_header=true | ||
| 07)------ProjectionExec: expr=[c1@0 as c1, CAST(c3@1 AS Decimal128(20, 0)) as c9] | ||
| 08)--------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 | ||
| 09)----------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/testing/data/csv/aggregate_test_100.csv]]}, projection=[c1, c3], file_type=csv, has_header=true | ||
|
|
||
| query TR | ||
| SELECT c1, c9 FROM aggregate_test_100 UNION ALL SELECT c1, c3 FROM aggregate_test_100 ORDER BY c9 DESC LIMIT 5 | ||
|
|
||
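In the plan for this query, `SortExec: TopK(fetch=5)` runs with `preserve_partitioning=[true]`, so each partition produces its own top 5, and `SortPreservingMergeExec: [c9@1 DESC], fetch=5` merges those sorted streams into the final 5 rows. A small Python sketch of that two-step shape follows; the helper names and partition contents are hypothetical, and `heapq.merge` stands in for the merge operator.

```python
import heapq
from itertools import islice

def topk_per_partition(partitions, k):
    """Keep the k largest values of each partition, sorted descending."""
    return [heapq.nlargest(k, part) for part in partitions]

def sort_preserving_merge(sorted_partitions, fetch):
    """Merge streams that are already sorted descending and stop
    after `fetch` rows."""
    merged = heapq.merge(*sorted_partitions, reverse=True)
    return list(islice(merged, fetch))

# Hypothetical partitions of the unioned input.
partitions = [[3, 41, 7], [99, 12], [58, 58, 1], [20]]
top5 = sort_preserving_merge(topk_per_partition(partitions, 5), fetch=5)
```

The merge only needs `fetch` rows from each partition, since no partition can contribute more than that to the final result.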