Replace supports_bounded_execution with supports_retract_batch #6695

mustafasrepo · 2023-06-16T06:40:25Z

Which issue does this PR close?

Related to #5781

Rationale for this change

With the changes in #6671, the supports_retract_batch method was introduced to the Accumulator trait. With this method in place, the supports_bounded_execution method is no longer necessary on the AggregateExpr trait. (This is similar to how we moved the supports_bounded_execution method from the BuiltinWindowFunctionExpr trait to the PartitionEvaluator trait.)

This PR removes the supports_bounded_execution method from AggregateExpr and moves its functionality to the supports_retract_batch method of the Accumulator trait for existing accumulators.
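
To illustrate the idea, here is a minimal sketch of an accumulator that advertises retraction support next to one that relies on the defaults. The trait below is a simplified stand-in, not the real DataFusion `Accumulator` signature: it works on `&[i64]` instead of Arrow arrays, and `NotImplemented`, `SumAccumulator`, and `MaxAccumulator` are hypothetical names used only for this example. The default of `false` for `supports_retract_batch` is an assumption of the sketch.

```rust
/// Simplified error type standing in for the engine's error type (hypothetical).
#[derive(Debug, PartialEq)]
pub struct NotImplemented(pub String);

/// Simplified stand-in for the `Accumulator` trait; the real trait
/// operates on Arrow arrays rather than `&[i64]`.
pub trait Accumulator {
    /// Add a batch of values to the accumulator state.
    fn update_batch(&mut self, values: &[i64]);

    /// Remove a batch of previously added values from the state.
    /// By default this returns an error, as described in the PR.
    fn retract_batch(&mut self, _values: &[i64]) -> Result<(), NotImplemented> {
        Err(NotImplemented(
            "retract_batch is not implemented for this accumulator".to_string(),
        ))
    }

    /// Whether `retract_batch` is implemented (assumed to default to `false`).
    fn supports_retract_batch(&self) -> bool {
        false
    }

    /// Current aggregate value, or `None` if no value has been seen.
    fn evaluate(&self) -> Option<i64>;
}

/// A sum can be retracted by subtracting the values that leave the window.
#[derive(Default)]
pub struct SumAccumulator {
    sum: i64,
    count: usize,
}

impl Accumulator for SumAccumulator {
    fn update_batch(&mut self, values: &[i64]) {
        self.sum += values.iter().sum::<i64>();
        self.count += values.len();
    }

    fn retract_batch(&mut self, values: &[i64]) -> Result<(), NotImplemented> {
        self.sum -= values.iter().sum::<i64>();
        self.count -= values.len();
        Ok(())
    }

    fn supports_retract_batch(&self) -> bool {
        true
    }

    fn evaluate(&self) -> Option<i64> {
        if self.count == 0 { None } else { Some(self.sum) }
    }
}

/// A naive max keeps only the running maximum, so it cannot undo an update
/// and relies on the default (erroring) `retract_batch`.
#[derive(Default)]
pub struct MaxAccumulator {
    max: Option<i64>,
}

impl Accumulator for MaxAccumulator {
    fn update_batch(&mut self, values: &[i64]) {
        for &v in values {
            self.max = Some(self.max.map_or(v, |m| m.max(v)));
        }
    }

    fn evaluate(&self) -> Option<i64> {
        self.max
    }
}
```

With this shape, the capability lives on the accumulator itself, so the aggregate expression does not need a separate flag.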

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Rationale: The default implementation of the `Accumulator` trait returns an error for the `retract_batch` API.

# Conflicts: # datafusion/core/src/physical_plan/udaf.rs # datafusion/core/src/physical_plan/windows/mod.rs # datafusion/core/tests/user_defined_aggregates.rs

mustafasrepo · 2023-06-16T07:31:25Z

datafusion/physical-expr/src/window/aggregate.rs

    fn uses_bounded_memory(&self) -> bool {
-        self.aggregate.supports_bounded_execution()
-            && !self.window_frame.end_bound.is_unbounded()
+        !self.window_frame.end_bound.is_unbounded()


After thinking through this logic: as long as end_bound is not unbounded (that is, not UNBOUNDED FOLLOWING, but of a form such as N FOLLOWING), we can produce results without waiting for the whole input to arrive. If the accumulator does not support the retract_batch method, we cannot run queries of the form M PRECEDING and N FOLLOWING, but in that case we return an error anyway. Hence we do not need to check self.aggregate.supports_bounded_execution() here (acc.supports_retract_batch() with the new API).

This makes sense to me
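
The check discussed in this thread can be sketched with simplified types. `FrameBound` and `WindowFrame` below are hypothetical stand-ins for the real window frame types (the real bounds carry scalar values, not `u64` offsets); only the `end_bound.is_unbounded()` test mirrors the diff above.

```rust
/// Simplified stand-in for a window frame bound (hypothetical type).
/// `None` means UNBOUNDED, `Some(n)` means an offset of `n` rows.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FrameBound {
    Preceding(Option<u64>),
    CurrentRow,
    Following(Option<u64>),
}

impl FrameBound {
    /// True for UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING.
    pub fn is_unbounded(&self) -> bool {
        matches!(
            self,
            FrameBound::Preceding(None) | FrameBound::Following(None)
        )
    }
}

/// Simplified stand-in for a window frame (hypothetical type).
pub struct WindowFrame {
    pub start_bound: FrameBound,
    pub end_bound: FrameBound,
}

/// Results can be emitted incrementally whenever the frame end is bounded,
/// regardless of whether the accumulator supports retraction.
pub fn uses_bounded_memory(frame: &WindowFrame) -> bool {
    !frame.end_bound.is_unbounded()
}
```

Under these assumptions, a frame such as `ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING` or `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` reports bounded memory, while any frame ending in `UNBOUNDED FOLLOWING` does not.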

mustafasrepo · 2023-06-16T07:31:36Z

datafusion/physical-expr/src/window/sliding_aggregate.rs

    fn uses_bounded_memory(&self) -> bool {
-        self.aggregate.supports_bounded_execution()
-            && !self.window_frame.end_bound.is_unbounded()
+        !self.window_frame.end_bound.is_unbounded()


Similar to the case above.
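
For context on why a sliding frame depends on retraction, here is a rough sketch of evaluating a sum over `ROWS BETWEEN m PRECEDING AND n FOLLOWING`. The function `sliding_sum` is a hypothetical helper on plain `i64` slices, not DataFusion code; the two marked steps play the roles of `update_batch` and `retract_batch`.

```rust
/// Hypothetical helper: SUM over `ROWS BETWEEN m PRECEDING AND n FOLLOWING`
/// for every row of `values`, maintained incrementally.
pub fn sliding_sum(values: &[i64], m: usize, n: usize) -> Vec<i64> {
    let mut sum = 0i64; // accumulator state
    let mut start = 0usize; // first row of the current frame (inclusive)
    let mut end = 0usize; // one past the last row of the current frame
    let mut out = Vec::with_capacity(values.len());

    for i in 0..values.len() {
        let new_start = i.saturating_sub(m);
        let new_end = (i + n + 1).min(values.len());

        // Role of `update_batch`: rows that enter the frame at its end.
        sum += values[end..new_end].iter().sum::<i64>();
        // Role of `retract_batch`: rows that leave the frame at its start.
        sum -= values[start..new_start].iter().sum::<i64>();

        start = new_start;
        end = new_end;
        out.push(sum);
    }
    out
}
```

Each result only needs the rows up to `n` positions ahead of the current row, which is why a bounded frame end is enough to emit output incrementally. Without the retraction step, the sum would have to be recomputed from scratch for every frame.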

mustafasrepo · 2023-06-16T14:20:21Z

datafusion/physical-expr/src/window/partition_evaluator.rs

    /// ]
    /// ```
-    fn evaluate_with_rank_all(
+    fn evaluate_all_with_rank(


I think evaluate_all_with_rank is a better name. As part of this PR, I renamed the method from evaluate_with_rank_all to evaluate_all_with_rank.

alamb

Thank you @mustafasrepo 🙏

alamb · 2023-06-16T15:43:33Z

datafusion/physical-expr/src/aggregate/mod.rs

        false
    }

-    /// Specifies whether this aggregate function can run using bounded memory.


It is nice to have moved this logic entirely to the accumulator 👍

stuartcarnie and others added 3 commits June 14, 2023 17:12

feat: support sliding window accumulators

d53d6c4

Rationale: The default implementation of the `Accumulator` trait returns an error for the `retract_batch` API.

Allow AggregateUDF to define retractable batch

579b4d9

Replace supports_bounded_execution with supports_retract_batch

ce89853

mustafasrepo marked this pull request as draft June 16, 2023 06:40

github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates labels Jun 16, 2023

Merge branch 'main' into feature/6671_exp

2f9fc40

# Conflicts: # datafusion/core/src/physical_plan/udaf.rs # datafusion/core/src/physical_plan/windows/mod.rs # datafusion/core/tests/user_defined_aggregates.rs

github-actions bot removed logical-expr Logical plan and expressions core Core DataFusion crate labels Jun 16, 2023

simplifications

fbc978e

mustafasrepo marked this pull request as ready for review June 16, 2023 06:50

simplifications

65d00e4

mustafasrepo commented Jun 16, 2023


Rename evalaute_with_rank_all

9a1bfa6

mustafasrepo commented Jun 16, 2023


alamb approved these changes Jun 16, 2023


alamb merged commit 8da5f26 into apache:main Jun 16, 2023

mustafasrepo deleted the feature/6671_exp branch July 14, 2023 07:01
