[Improvement](Nereids) Support to query rewrite by materialized view when join input has aggregate #30230
run buildall

TPC-H: Total hot run time: 39222 ms
TPC-DS: Total hot run time: 176362 ms
ClickBench: Total hot run time: 30.1 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

run buildall

TPC-H: Total hot run time: 38739 ms
TPC-DS: Total hot run time: 176609 ms
PR approved by anyone and no changes requested.
ClickBench: Total hot run time: 30.38 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

    if (!collector.key()) {
        return null;
    }
    return super.visit(groupPlan.getGroup().getLogicalExpression().getPlan(), collector);
Suggested change:
- return super.visit(groupPlan.getGroup().getLogicalExpression().getPlan(), collector);
+ groupPlan.getGroup().getLogicalExpression().get(0).getPlan().accept(this, collector);
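
One plausible reading of this suggestion is that it swaps a direct call to the generic `visit` method for `accept(this, ...)`, so that the visited node picks the type-specific visitor method (double dispatch). The sketch below illustrates that difference only. `Node`, `NodeVisitor`, `JoinNode`, `AggregateNode`, `TraceVisitor`, and `VisitorDispatchDemo` are hypothetical stand-ins invented for illustration; they are not the Doris Nereids classes, and the real visitor hierarchy may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the Doris Nereids classes.
interface Node {
    <R, C> R accept(NodeVisitor<R, C> visitor, C context);
}

abstract class NodeVisitor<R, C> {
    // Generic fallback, reached when no type-specific method is overridden.
    public abstract R visit(Node node, C context);

    public R visitJoin(JoinNode join, C context) {
        return visit(join, context);
    }

    public R visitAggregate(AggregateNode aggregate, C context) {
        return visit(aggregate, context);
    }
}

final class JoinNode implements Node {
    @Override
    public <R, C> R accept(NodeVisitor<R, C> visitor, C context) {
        return visitor.visitJoin(this, context);
    }
}

final class AggregateNode implements Node {
    @Override
    public <R, C> R accept(NodeVisitor<R, C> visitor, C context) {
        return visitor.visitAggregate(this, context);
    }
}

// Records which visitor method handled the node.
final class TraceVisitor extends NodeVisitor<Void, List<String>> {
    @Override
    public Void visit(Node node, List<String> trace) {
        trace.add("generic");
        return null;
    }

    @Override
    public Void visitAggregate(AggregateNode aggregate, List<String> trace) {
        trace.add("aggregate");
        return null;
    }
}

final class VisitorDispatchDemo {
    // Calls the generic method directly, bypassing the node's own dispatch.
    static List<String> viaGenericVisit(Node node) {
        List<String> trace = new ArrayList<>();
        new TraceVisitor().visit(node, trace);
        return trace;
    }

    // Lets the node choose the type-specific visitor method.
    static List<String> viaAccept(Node node) {
        List<String> trace = new ArrayList<>();
        node.accept(new TraceVisitor(), trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(viaGenericVisit(new AggregateNode())); // [generic]
        System.out.println(viaAccept(new AggregateNode()));       // [aggregate]
    }
}
```

In this toy model, a visitor that overrides `visitAggregate` only sees aggregate nodes when traversal goes through `accept`; whether the same holds for the collector in this PR depends on the actual class hierarchy.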
run buildall

Branch updated from 0233a95 to c7af748.

run buildall

TPC-H: Total hot run time: 38594 ms
TPC-DS: Total hot run time: 186722 ms
ClickBench: Total hot run time: 30.89 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

run buildall

TPC-H: Total hot run time: 38649 ms
TPC-DS: Total hot run time: 186585 ms
ClickBench: Total hot run time: 30.95 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

PR approved by at least one committer and no changes requested.

…when join input has aggregate (apache#30230)

Support query rewrite by materialized view when a join input contains an aggregate. The aggregate must be simple.

For example, the materialized view definition is:

> select
>   l_linenumber,
>   count(distinct l_orderkey),
>   sum(case when l_orderkey in (1,2,3) then l_suppkey * l_linenumber else 0 end),
>   max(case when l_orderkey in (4, 5) then (l_quantity *2 + part_supp_a.qty_max) * 0.88 else 100 end),
>   avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end)
> from lineitem
> left join orders on l_orderkey = o_orderkey
> left join
>   (select ps_partkey, ps_suppkey, sum(ps_availqty) qty_sum, max(ps_availqty) qty_max,
>     min(ps_availqty) qty_min,
>     avg(ps_supplycost) cost_avg
>   from partsupp
>   group by ps_partkey,ps_suppkey) part_supp_a
> on l_partkey = part_supp_a.ps_partkey
> and l_suppkey = part_supp_a.ps_suppkey
> group by l_linenumber;

A query like the following can be rewritten by the materialized view above:

> select
>   l_linenumber,
>   sum(case when l_orderkey in (1,2,3) then l_suppkey * l_linenumber else 0 end),
>   avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end)
> from lineitem
> left join orders on l_orderkey = o_orderkey
> left join
>   (select ps_partkey, ps_suppkey, sum(ps_availqty) qty_sum, max(ps_availqty) qty_max,
>     min(ps_availqty) qty_min,
>     avg(ps_supplycost) cost_avg
>   from partsupp
>   group by ps_partkey,ps_suppkey) part_supp_a
> on l_partkey = part_supp_a.ps_partkey
> and l_suppkey = part_supp_a.ps_suppkey
> group by l_linenumber;

Proposed changes
Support query rewrite by materialized view when a join input contains an aggregate. The aggregate must be simple.
For example, given the materialized view definition shown in the commit message above, the query that follows it there can be rewritten to use that materialized view.
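
The example in this PR can be read as a containment check: the query and the materialized view share the same joins (including the aggregated `partsupp` subquery) and the same `group by l_linenumber`, and every aggregate the query asks for already appears in the view. The sketch below models only that last step. `AggregateShape` and `MvContainmentSketch` are hypothetical names, expressions are compared as plain strings, and join matching is assumed to have already succeeded; this is an illustration, not how Nereids implements the rewrite.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: expressions are compared as plain strings, and the join
// structure (including the aggregated partsupp subquery) is assumed to match.
final class AggregateShape {
    final Set<String> groupByKeys;
    final Set<String> aggregateCalls;

    AggregateShape(List<String> groupByKeys, List<String> aggregateCalls) {
        this.groupByKeys = new HashSet<>(groupByKeys);
        this.aggregateCalls = new HashSet<>(aggregateCalls);
    }
}

final class MvContainmentSketch {
    // Aggregate expressions copied from the PR's example SQL.
    static final String COUNT_DISTINCT = "count(distinct l_orderkey)";
    static final String SUM_CASE =
            "sum(case when l_orderkey in (1,2,3) then l_suppkey * l_linenumber else 0 end)";
    static final String MAX_CASE =
            "max(case when l_orderkey in (4, 5) then (l_quantity *2 + part_supp_a.qty_max) * 0.88 else 100 end)";
    static final String AVG_CASE =
            "avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end)";

    static AggregateShape exampleView() {
        return new AggregateShape(
                Arrays.asList("l_linenumber"),
                Arrays.asList(COUNT_DISTINCT, SUM_CASE, MAX_CASE, AVG_CASE));
    }

    static AggregateShape exampleQuery() {
        return new AggregateShape(
                Arrays.asList("l_linenumber"),
                Arrays.asList(SUM_CASE, AVG_CASE));
    }

    // True when the query groups by the same keys as the view and requests
    // only aggregates that the view already stores (no roll-up needed).
    static boolean coveredBy(AggregateShape query, AggregateShape view) {
        return query.groupByKeys.equals(view.groupByKeys)
                && view.aggregateCalls.containsAll(query.aggregateCalls);
    }

    public static void main(String[] args) {
        System.out.println(coveredBy(exampleQuery(), exampleView())); // true
        System.out.println(coveredBy(exampleView(), exampleQuery())); // false
    }
}
```

A real optimizer would compare normalized expression trees rather than strings and would also handle roll-up to coarser group-by keys; both are deliberately left out here.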