Wrap immutable plan parts into Arc (make creating ExecutionPlans less costly)#19893
Wrap immutable plan parts into Arc (make creating ExecutionPlans less costly)#19893alamb merged 1 commit intoapache:mainfrom
ExecutionPlans less costly)#19893Conversation
b9caf1d to
bb97763
Compare
|
run benchmark reset_plan_states |
|
🤖 Hi @askalt, thanks for the request (#19893 (comment)). |
|
@xudong963 could you please |
|
run benchmark reset_plan_states |
|
🤖 |
|
🤖: Benchmark completed Details
|
70a396a to
3614687
Compare
3614687 to
bb97763
Compare
bb97763 to
f2d829a
Compare
20b3042 to
e6fed67
Compare
|
I decided to simplify |
1e27378 to
2c0746c
Compare
There was a problem hiding this comment.
I am sorry this is taking so long, but I a very worried about API churn and it takes me a non trivial amount of time to review the potential implications of this change
I have two major points:
- I really think it is important to not force downstream users to change how they call
project_schemaas it is a very common operation and there is no technical reason we can't keep their code the same (though it is harder in Datafusion). I made a PR to do so (and backout many of the changes currently required in this PR): askalt#5 as suggested by @crepererum - As I mentioned previously
OptionProjectionRefseems overly complicated to me to add to the public API -- and because theOptionis wrapped, it means that the normal functions forOption(like match and .and_then, etc) don't work. I think it would be simpler, and follow the rest of the code more closely if it wereOption<ProjectionRef>and the special projection application code is put into ProjectionRef
Thank you for your work on this @askalt
Ok, I will do it. |
2c0746c to
6053a7c
Compare
6053a7c to
5d11e60
Compare
@alamb, done as suggested, please check
|
alamb
left a comment
There was a problem hiding this comment.
Thank you @askalt - this looks really nice to me now. I think the code is faster and easier to work with 🏆 (HashJoinExecBuilder, NestedLoopJoinExecBuilder are especially nice)
I'll kick off a final benchmark run but I think this should be good to merge tomorrow
| hash_join.null_aware, | ||
| )?))) | ||
| Ok(Some(Arc::new( | ||
| HashJoinExecBuilder::from(hash_join) |
| @@ -1370,9 +1377,9 @@ impl ExecutionPlan for AggregateExec { | |||
| ) -> Result<Arc<dyn ExecutionPlan>> { | |||
| let mut me = AggregateExec::try_new_with_schema( | |||
There was a problem hiding this comment.
As a follow on PR, I think we can probably avoid a lot of redundant work here -- try_new_with_schema does many things, including re-compute plan properties
| } | ||
| } | ||
|
|
||
| /// Helps to build [`HashJoinExec`]. |
|
run benchmark sql_planner |
1 similar comment
|
run benchmark sql_planner |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
- Closes apache#19852 Improve performance of query planning and plan state re-set by making node clone cheap. - Store projection as `Option<Arc<[usize]>>` instead of `Option<Vec<usize>>` in `FilterExec`, `HashJoinExec`, `NestedLoopJoinExec`. - Store exprs as `Arc<[ProjectionExpr]>` instead of Vec in `ProjectionExprs`. - Store arced aggregation, filter, group by expressions within `AggregateExec`.
5d11e60 to
a4bbb71
Compare
ExecutionPlans less costly)
Which issue does this PR close?
Arc#19852Rationale for this change
Improve performance of query planning and plan state re-set by making node clone cheap.
What changes are included in this PR?
Option<Arc<[usize]>>instead ofOption<Vec<usize>>inFilterExec,HashJoinExec,NestedLoopJoinExec.Arc<[ProjectionExpr]>instead of Vec inProjectionExprs.AggregateExec.