[enhancement](Nereids) check multiple distinct functions that cannot be transformed into muti_distinct #21626
.filter(AggregateFunction::isDistinct)
.collect(Collectors.toList());

Set<Expression> arguments = distinctFuncs.stream().flatMap(expr -> expr.children().stream())
We already have a helper function for this: agg.getDistinctArguments().
fixed
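The helper named in the review comment is not shown in this thread. As a rough illustration of what the reviewed lines compute, the sketch below collects the arguments of all DISTINCT aggregate calls into a deduplicated set. `AggCall` and `distinctArguments` are hypothetical stand-ins written for this example, not the Nereids classes or the actual helper.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DistinctArgumentsSketch {
    /** Hypothetical stand-in for an aggregate function call; not a Nereids class. */
    static final class AggCall {
        final String name;
        final boolean distinct;
        final List<String> arguments;

        AggCall(String name, boolean distinct, List<String> arguments) {
            this.name = name;
            this.distinct = distinct;
            this.arguments = arguments;
        }

        boolean isDistinct() {
            return distinct;
        }
    }

    /** Collects the arguments of every DISTINCT call, deduplicated, in encounter order. */
    static Set<String> distinctArguments(List<AggCall> calls) {
        return calls.stream()
                .filter(AggCall::isDistinct)
                .flatMap(call -> call.arguments.stream())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        List<AggCall> calls = Arrays.asList(
                new AggCall("count", true, Arrays.asList("c1")),
                new AggCall("sum", true, Arrays.asList("c2")),
                new AggCall("max", false, Arrays.asList("c4")));
        System.out.println(distinctArguments(calls)); // prints [c1, c2]
    }
}
```

Keeping this logic in one helper means every rule that needs the distinct arguments applies the same filter, which is presumably the motivation behind the review suggestion.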
… multi_distinct will result in an error.
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
…formed into muti_distinct (#21626)

This commit introduces a transformation for SQL queries that contain multiple distinct aggregate functions. When the number of distinct values processed by these functions is greater than 1, they are converted into multi_distinct functions for more efficient handling.

Example:
```
SELECT COUNT(DISTINCT c1), SUM(DISTINCT c2) FROM tbl GROUP BY c3
-- Transformed to
SELECT MULTI_DISTINCT_COUNT(c1), MULTI_DISTINCT_SUM(c2) FROM tbl GROUP BY c3
```

The following functions can be transformed:
- COUNT
- SUM
- AVG
- GROUP_CONCAT

If any unsupported functions are encountered, an error is now reported during the optimization phase. To ensure the absence of such cases, a final check has been implemented after the rewriting phase.
Proposed changes
This commit introduces a transformation for SQL queries that contain multiple distinct aggregate functions. When these functions operate on more than one distinct argument, they are converted into multi_distinct functions for more efficient handling.
Example: `SELECT COUNT(DISTINCT c1), SUM(DISTINCT c2) FROM tbl GROUP BY c3` is transformed to `SELECT MULTI_DISTINCT_COUNT(c1), MULTI_DISTINCT_SUM(c2) FROM tbl GROUP BY c3`.
The following functions can be transformed: COUNT, SUM, AVG, and GROUP_CONCAT.
If a distinct aggregate function outside this list is encountered, an error is now reported during the optimization phase.
To guarantee that no such case slips through, a final check runs after the rewriting phase.
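The two steps described above, a rewrite followed by a final check, can be sketched in a few lines. This is a simplified model under stated assumptions, not the Nereids rule: `AggCall`, `rewrite`, and `checkAfterRewrite` are hypothetical names, each call takes a single argument, and the name mapping covers only COUNT and SUM because those are the only rewritten names given in the example. AVG and GROUP_CONCAT are left out of the sketch since their rewritten forms are not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MultiDistinctSketch {
    /** Hypothetical stand-in for an aggregate call with a single argument. */
    static final class AggCall {
        final String name;
        final boolean distinct;
        final String argument;

        AggCall(String name, boolean distinct, String argument) {
            this.name = name;
            this.distinct = distinct;
            this.argument = argument;
        }

        @Override
        public String toString() {
            return name + "(" + (distinct ? "DISTINCT " : "") + argument + ")";
        }
    }

    // Assumption: only the two mappings that appear in the example are modelled.
    private static final Map<String, String> MULTI_DISTINCT = new HashMap<>();
    static {
        MULTI_DISTINCT.put("COUNT", "MULTI_DISTINCT_COUNT");
        MULTI_DISTINCT.put("SUM", "MULTI_DISTINCT_SUM");
    }

    /** Counts how many different arguments the DISTINCT calls use. */
    static int distinctArgumentCount(List<AggCall> calls) {
        Set<String> args = new HashSet<>();
        for (AggCall call : calls) {
            if (call.distinct) {
                args.add(call.argument);
            }
        }
        return args.size();
    }

    /** Rewrites supported DISTINCT calls when more than one distinct argument is present. */
    static List<AggCall> rewrite(List<AggCall> calls) {
        if (distinctArgumentCount(calls) <= 1) {
            return calls; // nothing to do: at most one distinct argument
        }
        List<AggCall> result = new ArrayList<>();
        for (AggCall call : calls) {
            String target = call.distinct ? MULTI_DISTINCT.get(call.name) : null;
            // Unsupported or non-distinct calls pass through unchanged.
            result.add(target == null ? call : new AggCall(target, false, call.argument));
        }
        return result;
    }

    /** Final check: fails if several distinct arguments survive the rewrite. */
    static void checkAfterRewrite(List<AggCall> calls) {
        if (distinctArgumentCount(calls) > 1) {
            throw new IllegalStateException(
                    "distinct functions could not be transformed into multi_distinct: " + calls);
        }
    }

    public static void main(String[] args) {
        List<AggCall> rewritten = rewrite(Arrays.asList(
                new AggCall("COUNT", true, "c1"),
                new AggCall("SUM", true, "c2")));
        checkAfterRewrite(rewritten);
        System.out.println(rewritten); // prints [MULTI_DISTINCT_COUNT(c1), MULTI_DISTINCT_SUM(c2)]
    }
}
```

Separating the check from the rewrite means the rewrite can stay lenient and pass unknown functions through, while the check reports the error once, after all rewriting has finished.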