-
Notifications
You must be signed in to change notification settings - Fork 1.8k
fix: Implement AggregateUDFImpl::reverse_expr for StringAgg #17165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 03)----DataSourceExec: partitions=1, partition_sizes=[1] | ||
|
|
||
| query TT | ||
| explain select string_agg(k, ',' order by v desc) from t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is single aggregation enough to exercise the reverse_expr API?
Or maybe we need two functions, one using reverse ordering of the other?
I checked that the test does not pass without changes in string_agg.rs and passes with them.
This is likely related to
| if let Some(reverse_aggr_expr) = aggr_expr.reverse_expr() { |
What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without reverse_expr the requirement in that function will stay None, which means get_finer_aggregate_exprs_requirement returns an empty list. Maybe @berkaysynnada can clarify the behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we hitting this loop exit path as expected?
datafusion/datafusion/physical-plan/src/aggregates/mod.rs
Lines 1188 to 1195 in e1a5cdf
| // If the common requirement is finer than the current expression's, | |
| // we can skip this expression. If the latter is finer than the former, | |
| // adopt it if it is satisfied by the equivalence properties. Otherwise, | |
| // defer the analysis to the reverse expression. | |
| let forward_finer = determine_finer(&requirement, &aggr_req); | |
| if let Some(finer) = forward_finer { | |
| if !finer { | |
| continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like it, determine_finer returns true. Should it exit there? If it did I think we would have the same problem, i.e., requirement would remain None.
|
This LGTM but i clearly don't understand fully how this works. |
|
I added a test using Thank you @nuno-faria and @findepi |
|
I believe this should be included in DataFusion 50 as well so that we don't have a regression after DF 49. I will merge it now and make a backport |
…7165) * fix: Implement AggregateUDFImpl::reverse_expr for StringAgg * Add a test with two invocations of aggregateion --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Which issue does this PR close?
string_aggdoes not respectORDER BYon49.0.0#17011.Rationale for this change
The
SortExecis missing from the physical plan ofstring_aggfunctions. In the current branch it works since theAggregateExecends up sorting the results, but on49.0.0its broken. This PR helps avoid future regressions by implementingAggregateUDFImpl::reverse_exprforStringAgg, which brings back theSortExec.What changes are included in this PR?
AggregateUDFImpl::reverse_exprforStringAgg.Are these changes tested?
Yes.
Are there any user-facing changes?
No.