-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[carnot] Implement execution of partial aggregates. #1440
Comments
JamesMBartlett
changed the title
Implement execution of partial aggregates.
[carnot] Implement execution of partial aggregates.
Jun 6, 2023
This was referenced Jun 6, 2023
JamesMBartlett
added a commit
that referenced
this issue
Jun 7, 2023
Summary: In preparation for implementing partial aggregates on the exec side, this PR adds Serialize/Deserialize methods to the UDADefinition. This will allow the exec side to execute udas' Serialize and Deserialize methods in a type erased way, just as we do for Update, Merge, Finalize, etc. Relevant Issues: #1440 Type of change: /kind cleanup Test Plan: Added a test for the SerializeArrow and Deserialize methods. Also tested as part of broader partial aggregate implmentation. Signed-off-by: James Bartlett <jamesbartlett@pixielabs.ai>
JamesMBartlett
added a commit
that referenced
this issue
Jun 7, 2023
Summary: Prior to this change, the compiler would output a single column called `serialized_expressions` for any partial agg. This isn't particularly convenient for the execution side. Instead, the compiler will now output a column per UDA involved in the agg node. For example, if you have an agg node that has a `min` and `max` output the partial agg will now output two columns (plus any group columns) called `serialized_min` and `serialized_max`, where before it would output an unspecified `serialized_expressions` column. Relevant Issues: #1440 Type of change: /kind cleanup Test Plan: Modified existing tests to test the new behaviour. Also tested as part of broader partial aggregate changes. Signed-off-by: James Bartlett <jamesbartlett@pixielabs.ai>
vihangm
pushed a commit
that referenced
this issue
Jun 13, 2023
Summary: Implements support for executing partial aggregates in the AggNode. The planner still has partial aggs disabled. A separate PR will flip the switch on the planner side. Relevant Issues: #1440 Type of change: /kind feature Test Plan: Added tests in `agg_node_test` that exercise the partial aggregate code path. Also, tested on a cluster with partial aggregates enabled on the planner side. Signed-off-by: James Bartlett <jamesbartlett@pixielabs.ai>
JamesMBartlett
added a commit
that referenced
this issue
Aug 16, 2023
Summary: Flips the switch to enable partial aggregates. This should improve query performance. Relevant Issues: Fixes #1440 Type of change: /kind feature Test Plan: Tested by skaffolding to a cluster. Saw that `px/cluster` and other scripts return reasonable results. Compared the node CPU % from `px/cluster` (calculated through an aggregate) to the actual usage on the node. Also, used `px:set explain=true` `px:set analyze=true` to check that the query plan was using partial aggregates. Changelog Message: ```release-note Queries now support partial aggregates which should improve query performance significantly in some cases. ``` Signed-off-by: James Bartlett <jamesbartlett@pixielabs.ai>
Reopening since this was reverted in #1692. |
vihangm
pushed a commit
that referenced
this issue
Oct 3, 2023
Summary: Revert partial aggregate opt in This change reverts the partial aggregate opt in from 2645963. During the testing of 0.14.7-pre-main.0, I noticed that the flame graphs generated were lacking the typical towers seen. <img width="1728" alt="Java flame graph on pre-release" src="https://github.com/pixie-io/pixie/assets/5855593/5e1f15f0-97ce-4263-a418-9986f675fa55"> <img width="1679" alt="flame graph on pre-release" src="https://github.com/pixie-io/pixie/assets/5855593/f2b7d988-471c-48e0-bed1-339896574940"> Relevant Issues: #1440 Type of change: /kind bug Test Plan: `skaffold`ed the changes and verified that the flame graph looks correct now (as seen below) <img width="1718" alt="flame graph with partial agg revert" src="https://github.com/pixie-io/pixie/assets/5855593/f6c947fe-1930-4bc5-bdb9-89f33cf0e19a"> Signed-off-by: Dom Del Nano <ddelnano@pixielabs.ai>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the Task
We already have support for planning queries with partial aggregates (Partial aggregates means some of the aggregation is done on the agent nodes before sending the partially aggregated results to Kelvin for final aggregation). However, the query execution engine doesn't yet support executing partial aggregate nodes. This issue tracks the implementation on the execution side.
Subtasks
The text was updated successfully, but these errors were encountered: