forked from apache/datafusion
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Plan Properties caching #1
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
mustafasrepo
pushed a commit
that referenced
this pull request
Feb 16, 2024
…aTypes) (apache#8985) * ScalarValue return types from argument values * change file name * try using ?Sized * use Ok * move method default impl outside trait * Use type trait for ExprSchemable * fix nit * Proposed Return Type from Expr suggestions (#1) * Improve return_type_from_args * Rework example * Update datafusion/core/tests/user_defined/user_defined_scalar_functions.rs --------- Co-authored-by: Junhao Liu <junhaoliu2023@gmail.com> * Apply suggestions from code review Co-authored-by: Alex Huang <huangweijun1001@gmail.com> * Fix tests + clippy * rework types to use dyn trait * fmt * docs * Apply suggestions from code review Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com> * Add docs explaining what happens when both `return_type` and `return_type_from_exprs` are called * clippy * fix doc -- comedy of errors --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Alex Huang <huangweijun1001@gmail.com> Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
# Conflicts: # datafusion/physical-plan/src/streaming.rs
# Conflicts: # datafusion/physical-plan/src/projection.rs
Merged upstream. |
mustafasrepo
added a commit
that referenced
this pull request
Mar 5, 2024
* refactor `TreeNode::rewrite()` * use handle_tree_recursion in `Expr` * use macro for transform recursions * fix api * minor fixes * fix * don't trust `t.transformed` coming from transformation closures, keep the old way of detecting if changes were made * rephrase todo comment, always propagate up `t.transformed` from the transformation closure, fix projection pushdown closure * Fix `TreeNodeRecursion` docs * extend Skip (Prune) functionality to Jump as it is defined in https://synnada.notion.site/synnada/TreeNode-Design-Proposal-bceac27d18504a2085145550e267c4c1 * fix Jump and add tests * jump test fixes * fix clippy * unify "transform" traversals using macros, fix "visit" traversal jumps, add visit jump tests, ensure consistent naming `f` instead of `op`, `f_down` instead of `pre_visit` and `f_up` instead of `post_visit` * fix macro rewrite * minor fixes * minor fix * refactor tests * add transform tests * add apply, transform_down and transform_up tests * refactor tests * test jump on both a and e nodes in both top-down and bottom-up traversals * better transform/rewrite tests * minor fix * simplify tests * add stop tests, reorganize tests * fix previous merges and remove leftover file * Review TreeNode Refactor (#1) * Minor changes * Jump doesn't ignore f_up * update test * Update rewriter * LogicalPlan visit update and propagate from children flags * Update tree_node.rs * Update map_children's --------- Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai> * fix * minor fixes * fix f_up call when f_down returns jump * simplify code * minor fix * revert unnecessary changes * fix `DynTreeNode` and `ConcreteTreeNode` `transformed` and `tnr` propagation * introduce TransformedResult helper * fix docs * restore transform as alias to trassform_up * restore transform as alias to trassform_up 2 * Simplifications and comment improvements (#2) --------- Co-authored-by: Berkay Şahin <124376117+berkaysynnada@users.noreply.github.com> Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai> Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #.
Rationale for this change
In great analysis by [MENTION NAME] at the issue 9084. [MENTION NAME] recognized that stack usage (depth) increases a lot during logical and physical planning. The root cause of aggressive stack usage in the logical planning is excessive use of
.clone
ofLogicalPlan
enum.In physical planning the problem stems from recursive function calls in the getter
API
s of theArc<dyn ExecutionPlan>
, such asEquivalenceProperties
,output_partitioning
,output_ordering
, etc.In the PR9084, [MENTION NAME] could reduce physical plan stack usage by caching
equivalence_properties
forProjectionExec
.This PR introduces a new struct to cache PlanProperties (
PlanPropertiesCache
). This this struct,schema
,output_partitioning
,equivalence_properties
,output_ordering
is cached. This removes recursive calls during getter methods. Also, given.cache
method is implemented, default implementations of the.output_partitioning
,.equivalence_properties
,output_ordering
works.With these changes flame graph for the query 54 convert from following graph
flamegraph_main_q54.svg.zip (github couldn't upload it, hence loaded as .zip)
to following graph
where stack usage decreases.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
api change