-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Replace supports_bounded_execution with supports_retract_batch #6695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Rationale: The default implementation of the `Accumulator` trait returns an error for the `retract_batch` API.
# Conflicts: # datafusion/core/src/physical_plan/udaf.rs # datafusion/core/src/physical_plan/windows/mod.rs # datafusion/core/tests/user_defined_aggregates.rs
| fn uses_bounded_memory(&self) -> bool { | ||
| self.aggregate.supports_bounded_execution() | ||
| && !self.window_frame.end_bound.is_unbounded() | ||
| !self.window_frame.end_bound.is_unbounded() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking through this logic. Actually, as long as end_bound is not bounded (not UNBOUNDED FOLLOWING such as in the form N FOLLOWING). We can produce results without waiting for the whole data to come (If accumulator do not support retract_batch method. We wouldn't be able to run queries in the form M PRECEDING and N FOLLOWING, in this case we will give an error anyway.). Hence here we do not need to check for self.aggregate.supports_bounded_execution() (acc.supports_retract_batch() method with the new API.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me
| fn uses_bounded_memory(&self) -> bool { | ||
| self.aggregate.supports_bounded_execution() | ||
| && !self.window_frame.end_bound.is_unbounded() | ||
| !self.window_frame.end_bound.is_unbounded() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the case above.
| /// ] | ||
| /// ``` | ||
| fn evaluate_with_rank_all( | ||
| fn evaluate_all_with_rank( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, evaluate_all_with_rank is better name. As part of this PR, I changed the method from evaluate_with_rank_all to evaluate_all_with_rank.
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @mustafasrepo 🙏
| false | ||
| } | ||
|
|
||
| /// Specifies whether this aggregate function can run using bounded memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THis is nice to have moved this logic entirely to the accumulator 👍
Which issue does this PR close?
Related to #5781
Rationale for this change
With the changes in the #6671,
supports_retract_batchmethod is introduced to theAccumulatortrait. With the introduction ofsupports_retract_batchmethod,supports_bounded_executionmethod is no longer necessary for theAggregateExprtrait. (Similar to the case we have movesupports_bounded_executiontrait fromBuiltinWindowFunctionExprtoPartitionEvalautortrait.)This PR removes
supports_bounded_executionmethod fromAggregateExprand moves its functionality tosupports_retract_batchmethod in the Accumulator` trait for existing accumulators.What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?