[Improvement](Nereids) Support to query rewrite by materialized view when join input has aggregate #30230

Merged
merged 7 commits into from
Jan 24, 2024

seawinde
Contributor

Proposed changes

Support query rewrite by materialized view when a join input contains an aggregate. The aggregate must be simple.
For example, given the following materialized view definition:

    select
        l_linenumber,
        count(distinct l_orderkey),
        sum(case when l_orderkey in (1, 2, 3) then l_suppkey * l_linenumber else 0 end),
        max(case when l_orderkey in (4, 5) then (l_quantity * 2 + part_supp_a.qty_max) * 0.88 else 100 end),
        avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end)
    from lineitem
    left join orders on l_orderkey = o_orderkey
    left join (
        select
            ps_partkey,
            ps_suppkey,
            sum(ps_availqty) qty_sum,
            max(ps_availqty) qty_max,
            min(ps_availqty) qty_min,
            avg(ps_supplycost) cost_avg
        from partsupp
        group by ps_partkey, ps_suppkey
    ) part_supp_a
        on l_partkey = part_supp_a.ps_partkey
        and l_suppkey = part_supp_a.ps_suppkey
    group by l_linenumber;

A query like the following can then be rewritten to use the materialized view above:

    select
        l_linenumber,
        sum(case when l_orderkey in (1, 2, 3) then l_suppkey * l_linenumber else 0 end),
        avg(case when l_partkey in (2, 3, 4) then l_discount + o_totalprice + part_supp_a.qty_sum else 50 end)
    from lineitem
    left join orders on l_orderkey = o_orderkey
    left join (
        select
            ps_partkey,
            ps_suppkey,
            sum(ps_availqty) qty_sum,
            max(ps_availqty) qty_max,
            min(ps_availqty) qty_min,
            avg(ps_supplycost) cost_avg
        from partsupp
        group by ps_partkey, ps_suppkey
    ) part_supp_a
        on l_partkey = part_supp_a.ps_partkey
        and l_suppkey = part_supp_a.ps_suppkey
    group by l_linenumber;
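
The idea behind the rewrite can be tried out on a toy scale. The sketch below is only an illustration under stated assumptions: it uses SQLite (not Doris), emulates the materialized view with a plain `create table ... as select`, and trims the example to two aggregate columns. The table `mv_toy`, the column aliases `sum_a` and `max_b`, and the sample rows are all hypothetical. It shows that a query sharing the view's join-with-aggregate shape can be answered by projecting the matching column from the precomputed result.

```python
import sqlite3

# Join shape shared by the view and the query (reduced from the PR example).
FROM_CLAUSE = """
from lineitem
left join orders on l_orderkey = o_orderkey
left join (
    select ps_partkey, ps_suppkey,
           sum(ps_availqty) qty_sum,
           max(ps_availqty) qty_max
    from partsupp
    group by ps_partkey, ps_suppkey
) part_supp_a
    on l_partkey = part_supp_a.ps_partkey
    and l_suppkey = part_supp_a.ps_suppkey
group by l_linenumber
"""

# Hypothetical, trimmed-down view definition; aliases are for illustration only.
MV_DEF = """
select
    l_linenumber,
    sum(case when l_orderkey in (1, 2, 3) then l_suppkey * l_linenumber else 0 end) as sum_a,
    max(case when l_orderkey in (4, 5) then l_quantity * 2 + part_supp_a.qty_max else 100 end) as max_b
""" + FROM_CLAUSE

# The query asks for a subset of the view's aggregates over the same join.
QUERY = """
select
    l_linenumber,
    sum(case when l_orderkey in (1, 2, 3) then l_suppkey * l_linenumber else 0 end)
""" + FROM_CLAUSE


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table lineitem (l_orderkey int, l_partkey int, l_suppkey int,
                               l_linenumber int, l_quantity int);
        create table orders (o_orderkey int, o_totalprice int);
        create table partsupp (ps_partkey int, ps_suppkey int, ps_availqty int);
        insert into lineitem values
            (1, 10, 100, 1, 5), (2, 10, 100, 1, 7), (4, 20, 200, 2, 3), (6, 30, 300, 2, 9);
        insert into orders values (1, 50), (2, 60), (4, 70);
        insert into partsupp values (10, 100, 8), (10, 100, 12), (20, 200, 4);
    """)
    # SQLite has no materialized views; a snapshot table stands in for one here.
    conn.execute("create table mv_toy as " + MV_DEF)
    direct = conn.execute(QUERY + " order by l_linenumber").fetchall()
    rewritten = conn.execute(
        "select l_linenumber, sum_a from mv_toy order by l_linenumber"
    ).fetchall()
    conn.close()
    return direct, rewritten


if __name__ == "__main__":
    direct, rewritten = run_demo()
    print(direct, rewritten)
```

Both result sets should be identical, which is the property a rewrite has to preserve. A real optimizer additionally has to prove that the two plans match structurally, which this sketch does not attempt.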

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39222 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2358ba5ee3eb63595b2b927d84356128dd550c37, data reload: false

------ Round 1 ----------------------------------
q1	17698	5309	5298	5298
q2	2047	139	136	136
q3	10646	1156	1148	1148
q4	10223	814	828	814
q5	7733	3232	3193	3193
q6	200	124	125	124
q7	863	501	496	496
q8	9255	1942	1957	1942
q9	7310	6433	6408	6408
q10	8224	3070	3047	3047
q11	405	222	200	200
q12	358	192	193	192
q13	17985	3371	3386	3371
q14	246	215	216	215
q15	545	519	512	512
q16	430	379	386	379
q17	941	558	514	514
q18	7643	6993	6817	6817
q19	1593	1397	1391	1391
q20	569	313	294	294
q21	2788	2430	2499	2430
q22	372	328	301	301
Total cold run time: 108074 ms
Total hot run time: 39222 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5521	5287	5310	5287
q2	324	218	210	210
q3	3324	3232	3226	3226
q4	2086	2071	2057	2057
q5	6029	6227	5982	5982
q6	203	116	119	116
q7	2291	1878	1854	1854
q8	3250	3389	3393	3389
q9	8939	8992	8875	8875
q10	3896	3857	3844	3844
q11	560	451	456	451
q12	781	632	620	620
q13	16939	3167	3192	3167
q14	299	251	258	251
q15	583	519	510	510
q16	525	461	451	451
q17	1872	1860	1843	1843
q18	9470	17409	9679	9679
q19	26605	1665	1562	1562
q20	4607	1948	1935	1935
q21	14489	5337	5391	5337
q22	968	562	528	528
Total cold run time: 113561 ms
Total hot run time: 61174 ms
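
The bot output does not label its columns. A reading that is consistent with the reported totals is: query name, cold run, two hot runs, and the faster of the two hot runs. The sketch below checks that reading against the Round 1 rows of the TPC-H report above; the column interpretation is an assumption, and the function name `summarize` is made up for this example.

```python
# Round 1 rows copied from the TPC-H report above (times in ms).
ROUND_1 = """
q1 17698 5309 5298 5298
q2 2047 139 136 136
q3 10646 1156 1148 1148
q4 10223 814 828 814
q5 7733 3232 3193 3193
q6 200 124 125 124
q7 863 501 496 496
q8 9255 1942 1957 1942
q9 7310 6433 6408 6408
q10 8224 3070 3047 3047
q11 405 222 200 200
q12 358 192 193 192
q13 17985 3371 3386 3371
q14 246 215 216 215
q15 545 519 512 512
q16 430 379 386 379
q17 941 558 514 514
q18 7643 6993 6817 6817
q19 1593 1397 1391 1391
q20 569 313 294 294
q21 2788 2430 2499 2430
q22 372 328 301 301
"""


def summarize(report):
    """Return (cold_total, hot_total) under the assumed column meaning."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        # Assumption: the last column is the faster of the two hot runs.
        if int(best) != min(int(hot1), int(hot2)):
            raise ValueError("unexpected last column for " + name)
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


if __name__ == "__main__":
    print(summarize(ROUND_1))
```

Under this reading the sums reproduce the reported "Total cold run time: 108074 ms" and "Total hot run time: 39222 ms".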

@doris-robot

TPC-DS: Total hot run time: 176362 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2358ba5ee3eb63595b2b927d84356128dd550c37, data reload: false

query1	922	333	332	332
query2	6556	1878	1772	1772
query3	6698	202	192	192
query4	32210	22215	22239	22215
query5	4449	365	371	365
query6	260	168	162	162
query7	4603	270	261	261
query8	228	169	177	169
query9	8341	2557	2529	2529
query10	418	243	232	232
query11	16729	15648	15533	15533
query12	125	72	68	68
query13	1686	374	372	372
query14	10501	6938	7043	6938
query15	213	188	186	186
query16	5794	260	264	260
query17	935	473	461	461
query18	1781	260	259	259
query19	176	129	138	129
query20	75	79	68	68
query21	180	118	120	118
query22	4973	4817	4702	4702
query23	31817	30888	30643	30643
query24	12569	2807	2810	2807
query25	559	315	310	310
query26	1820	144	144	144
query27	3198	294	290	290
query28	7090	1829	1815	1815
query29	2144	623	637	623
query30	285	136	137	136
query31	941	763	767	763
query32	88	50	50	50
query33	709	217	205	205
query34	1160	464	470	464
query35	885	810	749	749
query36	1314	1164	1112	1112
query37	96	55	57	55
query38	3317	3238	3229	3229
query39	1322	1259	1279	1259
query40	346	87	86	86
query41	40	36	35	35
query42	90	81	90	81
query43	521	466	480	466
query44	1124	691	694	691
query45	192	179	174	174
query46	1080	661	657	657
query47	1650	1582	1562	1562
query48	401	297	310	297
query49	1223	286	280	280
query50	697	310	313	310
query51	5361	5270	5175	5175
query52	87	82	74	74
query53	326	259	255	255
query54	244	178	189	178
query55	90	77	76	76
query56	181	163	166	163
query57	1016	934	928	928
query58	203	169	165	165
query59	2844	2655	2674	2655
query60	212	182	187	182
query61	81	84	80	80
query62	587	367	357	357
query63	278	249	271	249
query64	6113	1772	1730	1730
query65	3329	3252	3240	3240
query66	1453	317	310	310
query67	15525	15170	15173	15170
query68	16549	536	506	506
query69	604	306	299	299
query70	2229	1468	1477	1468
query71	500	212	211	211
query72	4802	2815	2830	2815
query73	4592	324	325	324
query74	7205	6418	6423	6418
query75	4953	2320	2339	2320
query76	6328	940	988	940
query77	1020	234	233	233
query78	9165	8818	8596	8596
query79	2468	522	491	491
query80	580	319	319	319
query81	448	207	204	204
query82	206	78	76	76
query83	278	117	120	117
query84	278	76	71	71
query85	1102	347	331	331
query86	390	392	372	372
query87	3533	3329	3328	3328
query88	2746	2201	2203	2201
query89	422	343	346	343
query90	2012	183	183	183
query91	153	127	134	127
query92	50	43	44	43
query93	1335	470	458	458
query94	1292	163	158	158
query95	499	448	450	448
query96	621	319	328	319
query97	4276	4151	4190	4151
query98	204	204	189	189
query99	1080	700	717	700
Total cold run time: 304381 ms
Total hot run time: 176362 ms

@doris-robot

ClickBench: Total hot run time: 30.1 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2358ba5ee3eb63595b2b927d84356128dd550c37, data reload: false

query1	0.03	0.03	0.02
query2	0.06	0.02	0.02
query3	0.22	0.05	0.05
query4	1.70	0.08	0.08
query5	0.53	0.52	0.52
query6	1.27	0.64	0.64
query7	0.02	0.01	0.01
query8	0.04	0.02	0.03
query9	0.53	0.49	0.49
query10	0.54	0.58	0.56
query11	0.12	0.09	0.09
query12	0.11	0.09	0.09
query13	0.60	0.61	0.62
query14	0.77	0.80	0.82
query15	0.77	0.77	0.78
query16	0.37	0.38	0.36
query17	0.99	1.01	1.02
query18	0.24	0.25	0.25
query19	1.89	1.80	1.76
query20	0.02	0.01	0.01
query21	15.43	0.57	0.57
query22	2.43	2.25	1.86
query23	17.04	0.76	0.82
query24	2.59	1.19	0.19
query25	0.40	0.27	0.19
query26	0.42	0.14	0.13
query27	0.06	0.05	0.05
query28	12.29	0.77	0.76
query29	12.49	3.20	3.09
query30	0.53	0.49	0.45
query31	2.78	0.36	0.35
query32	3.36	0.48	0.49
query33	3.23	3.23	3.26
query34	15.83	4.29	4.26
query35	4.45	4.28	4.26
query36	1.10	1.06	1.07
query37	0.06	0.04	0.04
query38	0.03	0.03	0.03
query39	0.02	0.01	0.02
query40	0.17	0.13	0.13
query41	0.06	0.01	0.02
query42	0.02	0.02	0.01
query43	0.02	0.02	0.02
Total cold run time: 105.63 s
Total hot run time: 30.1 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 2358ba5ee3eb63595b2b927d84356128dd550c37 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       14.9 seconds inserted 10000000 Rows, about 671K ops/s

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38739 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4031dba548c2745343fffbb41a5b2dc6cb8fa743, data reload: false

------ Round 1 ----------------------------------
q1	17677	5186	5112	5112
q2	2035	141	131	131
q3	10654	1129	1151	1129
q4	10229	749	836	749
q5	7720	3116	3142	3116
q6	191	123	122	122
q7	865	496	489	489
q8	9214	1921	1934	1921
q9	7270	6352	6338	6338
q10	8188	3021	3087	3021
q11	417	213	208	208
q12	354	192	189	189
q13	17991	3406	3347	3347
q14	252	223	204	204
q15	544	507	509	507
q16	468	380	391	380
q17	937	532	507	507
q18	7593	6840	6865	6840
q19	2536	1415	1451	1415
q20	612	310	291	291
q21	2807	2405	2425	2405
q22	358	318	319	318
Total cold run time: 108912 ms
Total hot run time: 38739 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5578	5333	5255	5255
q2	327	216	213	213
q3	3378	3240	3194	3194
q4	2032	2066	2016	2016
q5	6015	5829	5896	5829
q6	194	120	115	115
q7	2287	1809	1881	1809
q8	3217	3371	3369	3369
q9	8859	8996	12743	8996
q10	3911	3867	3760	3760
q11	548	476	437	437
q12	786	628	631	628
q13	16921	3184	3250	3184
q14	299	260	282	260
q15	544	495	499	495
q16	525	498	501	498
q17	1886	1821	1873	1821
q18	9576	13357	9707	9707
q19	27166	1621	1507	1507
q20	4612	1927	1949	1927
q21	15880	5358	5283	5283
q22	1584	521	523	521
Total cold run time: 116125 ms
Total hot run time: 60824 ms

@doris-robot

TPC-DS: Total hot run time: 176609 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4031dba548c2745343fffbb41a5b2dc6cb8fa743, data reload: false

query1	940	328	327	327
query2	6570	1947	1975	1947
query3	6705	198	197	197
query4	31869	22273	22261	22261
query5	4452	373	422	373
query6	245	153	156	153
query7	4604	270	259	259
query8	219	175	182	175
query9	8433	2528	2511	2511
query10	427	236	250	236
query11	17037	15665	15674	15665
query12	125	71	63	63
query13	1684	378	374	374
query14	10548	6844	6970	6844
query15	215	179	182	179
query16	5800	252	245	245
query17	956	463	483	463
query18	1797	262	255	255
query19	178	136	130	130
query20	74	74	66	66
query21	191	136	128	128
query22	4704	4731	4672	4672
query23	31644	31059	30850	30850
query24	11663	2805	2816	2805
query25	579	330	306	306
query26	1596	142	142	142
query27	3226	284	289	284
query28	7196	1828	1825	1825
query29	1544	626	614	614
query30	284	137	136	136
query31	933	749	762	749
query32	87	51	53	51
query33	694	213	216	213
query34	1137	460	459	459
query35	856	798	742	742
query36	1386	1278	1176	1176
query37	92	59	55	55
query38	3341	3271	3243	3243
query39	1319	1271	1265	1265
query40	201	91	83	83
query41	36	34	34	34
query42	97	84	90	84
query43	512	466	480	466
query44	1042	686	672	672
query45	196	178	179	178
query46	1051	650	650	650
query47	1640	1605	1508	1508
query48	370	320	311	311
query49	1120	283	276	276
query50	673	309	305	305
query51	5308	5275	5202	5202
query52	96	78	82	78
query53	315	256	255	255
query54	241	180	190	180
query55	83	76	77	76
query56	174	162	170	162
query57	985	940	932	932
query58	201	164	169	164
query59	2749	2728	2738	2728
query60	207	184	182	182
query61	84	84	83	83
query62	603	358	368	358
query63	281	249	261	249
query64	5065	1792	1764	1764
query65	3331	3255	3244	3244
query66	1270	324	313	313
query67	15629	14936	15027	14936
query68	13408	536	499	499
query69	608	294	295	294
query70	1807	1524	1521	1521
query71	512	211	213	211
query72	5525	2821	2850	2821
query73	3215	315	313	313
query74	6932	6458	6384	6384
query75	5204	2324	2264	2264
query76	6555	1055	982	982
query77	703	234	235	234
query78	9331	8850	8685	8685
query79	1093	499	493	493
query80	563	320	307	307
query81	469	202	199	199
query82	204	78	78	78
query83	140	119	122	119
query84	279	70	67	67
query85	1090	332	317	317
query86	405	406	376	376
query87	3530	3382	3347	3347
query88	2968	2183	2180	2180
query89	432	373	355	355
query90	1987	186	180	180
query91	157	128	122	122
query92	54	44	42	42
query93	1409	458	417	417
query94	1235	152	156	152
query95	502	462	441	441
query96	623	316	318	316
query97	4270	4165	4130	4130
query98	212	191	189	189
query99	1039	710	663	663
Total cold run time: 295371 ms
Total hot run time: 176609 ms

Contributor

PR approved by anyone and no changes requested.

@doris-robot

ClickBench: Total hot run time: 30.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4031dba548c2745343fffbb41a5b2dc6cb8fa743, data reload: false

query1	0.03	0.03	0.03
query2	0.05	0.02	0.02
query3	0.22	0.05	0.05
query4	1.71	0.06	0.07
query5	0.53	0.52	0.52
query6	1.22	0.64	0.64
query7	0.03	0.01	0.02
query8	0.03	0.02	0.02
query9	0.55	0.51	0.49
query10	0.56	0.57	0.56
query11	0.12	0.08	0.08
query12	0.12	0.09	0.09
query13	0.60	0.61	0.61
query14	0.78	0.81	0.81
query15	0.79	0.77	0.77
query16	0.37	0.38	0.38
query17	1.01	1.00	1.04
query18	0.25	0.24	0.26
query19	1.84	1.80	1.82
query20	0.01	0.01	0.01
query21	15.41	0.57	0.54
query22	2.76	2.28	1.77
query23	17.57	0.90	0.72
query24	2.38	0.49	0.92
query25	0.49	0.20	0.15
query26	0.40	0.14	0.13
query27	0.04	0.05	0.04
query28	12.50	0.76	0.77
query29	12.76	3.20	3.17
query30	0.56	0.49	0.51
query31	2.77	0.35	0.34
query32	3.38	0.48	0.48
query33	3.19	3.27	3.22
query34	15.90	4.30	4.28
query35	4.28	4.37	4.28
query36	1.12	1.07	1.08
query37	0.07	0.05	0.05
query38	0.04	0.03	0.03
query39	0.02	0.02	0.01
query40	0.16	0.13	0.13
query41	0.06	0.01	0.02
query42	0.02	0.02	0.01
query43	0.03	0.02	0.02
Total cold run time: 106.73 s
Total hot run time: 30.38 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 4031dba548c2745343fffbb41a5b2dc6cb8fa743 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          60 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      33 seconds loaded 861443392 Bytes, about 24 MB/s
Insert into select:       14.2 seconds inserted 10000000 Rows, about 704K ops/s

if (!collector.key()) {
    return null;
}
return super.visit(groupPlan.getGroup().getLogicalExpression().getPlan(), collector);
Contributor

Suggested change
- return super.visit(groupPlan.getGroup().getLogicalExpression().getPlan(), collector);
+ groupPlan.getGroup().getLogicalExpression().get(0).getPlan().accept(this, collector);
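
The reviewer does not state a reason for this suggestion. One plausible reading, assuming the base visitor's generic `visit` only walks a node's children, is that `accept` dispatches on the unwrapped plan's own type, while `super.visit` skips that node and goes straight to its children. The sketch below illustrates the difference with hypothetical Python classes (`Plan`, `DefaultVisitor`, `AggregateCollector`); it is not the Doris code.

```python
class Plan:
    """Hypothetical plan node standing in for a Java plan class."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def accept(self, visitor, context):
        # Double dispatch: use a node-specific method when the visitor has one,
        # otherwise fall back to the generic visit.
        method = getattr(visitor, "visit_" + self.name, visitor.visit)
        return method(self, context)


class DefaultVisitor:
    def visit(self, plan, context):
        # Generic fallback: descends into children without inspecting `plan` itself.
        result = None
        for child in plan.children:
            result = child.accept(self, context)
        return result


class AggregateCollector(DefaultVisitor):
    def visit_aggregate(self, plan, context):
        # Record the aggregate node, then keep walking its children.
        context.append(plan.name)
        return self.visit(plan, context)


def via_super_visit(root):
    """Mimics calling the base class's generic visit on the unwrapped plan."""
    seen = []
    DefaultVisitor.visit(AggregateCollector(), root, seen)
    return seen


def via_accept(root):
    """Mimics calling accept on the unwrapped plan."""
    seen = []
    root.accept(AggregateCollector(), seen)
    return seen


if __name__ == "__main__":
    root = Plan("aggregate", [Plan("scan")])
    print(via_super_visit(root), via_accept(root))
```

In this toy model the aggregate at the root is only recorded when the traversal starts with `accept`.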

@seawinde
Contributor Author

run buildall

@seawinde force-pushed the support_join_input_has_agg branch from 0233a95 to c7af748 on January 23, 2024 12:02

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38594 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c7af748e76558ba3591d6e2e8e55064570664b16, data reload: false

------ Round 1 ----------------------------------
q1	17642	5338	5204	5204
q2	2047	143	129	129
q3	10755	1136	1160	1136
q4	10307	740	803	740
q5	7737	3143	3144	3143
q6	193	121	119	119
q7	838	497	477	477
q8	9211	1904	1930	1904
q9	7290	6392	6359	6359
q10	8197	3031	2994	2994
q11	406	201	209	201
q12	376	191	194	191
q13	17994	3340	3350	3340
q14	242	210	217	210
q15	556	505	492	492
q16	421	374	361	361
q17	944	551	559	551
q18	7517	6926	6738	6738
q19	1563	1361	1464	1361
q20	556	281	315	281
q21	2810	2369	2411	2369
q22	352	318	294	294
Total cold run time: 107954 ms
Total hot run time: 38594 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5383	5389	5294	5294
q2	328	221	215	215
q3	3354	3205	3224	3205
q4	2098	2009	2018	2009
q5	6264	6189	5954	5954
q6	196	117	115	115
q7	2306	1941	1821	1821
q8	3239	3356	3374	3356
q9	8834	8853	8812	8812
q10	3869	3797	3832	3797
q11	565	451	449	449
q12	819	604	639	604
q13	16937	3191	3164	3164
q14	286	256	265	256
q15	565	504	506	504
q16	503	480	486	480
q17	1864	1801	1834	1801
q18	9420	19604	9777	9777
q19	23710	1590	1514	1514
q20	4587	1917	1915	1915
q21	14438	5374	5347	5347
q22	975	521	560	521
Total cold run time: 110540 ms
Total hot run time: 60910 ms

@doris-robot

TPC-DS: Total hot run time: 186722 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c7af748e76558ba3591d6e2e8e55064570664b16, data reload: false

query1	941	340	327	327
query2	6556	1954	1933	1933
query3	7203	213	209	209
query4	34866	22490	22695	22490
query5	4543	462	383	383
query6	263	182	192	182
query7	5062	284	262	262
query8	230	202	186	186
query9	9838	2540	2548	2540
query10	483	236	230	230
query11	17565	15576	15486	15486
query12	125	68	69	68
query13	1673	380	370	370
query14	10581	6962	6939	6939
query15	217	188	186	186
query16	5794	256	266	256
query17	947	478	468	468
query18	1792	251	252	251
query19	173	132	132	132
query20	74	71	69	69
query21	180	131	119	119
query22	4879	4810	4606	4606
query23	32316	30754	30829	30754
query24	12946	2794	2801	2794
query25	566	329	318	318
query26	1828	147	143	143
query27	3379	286	289	286
query28	7407	1839	1822	1822
query29	2203	640	626	626
query30	280	141	134	134
query31	937	745	771	745
query32	74	51	51	51
query33	714	217	208	208
query34	1256	459	468	459
query35	869	806	785	785
query36	1346	1211	1204	1204
query37	171	57	58	57
query38	3367	3256	3259	3256
query39	1300	1269	1268	1268
query40	350	83	88	83
query41	36	36	35	35
query42	99	82	85	82
query43	501	496	490	490
query44	1104	682	689	682
query45	192	176	175	175
query46	1076	652	661	652
query47	1644	1495	1507	1495
query48	379	298	302	298
query49	1224	296	285	285
query50	681	309	317	309
query51	5333	5197	5216	5197
query52	91	74	81	74
query53	320	278	261	261
query54	234	179	184	179
query55	81	75	79	75
query56	178	168	168	168
query57	1005	924	913	913
query58	195	161	161	161
query59	2885	2738	2638	2638
query60	215	184	186	184
query61	84	85	84	84
query62	633	358	361	358
query63	281	266	261	261
query64	6074	1773	1769	1769
query65	3327	3251	3223	3223
query66	1428	321	317	317
query67	15607	15323	15105	15105
query68	9733	532	508	508
query69	570	300	295	295
query70	1594	1488	1515	1488
query71	10425	10198	10194	10194
query72	4018	2829	2815	2815
query73	1595	315	315	315
query74	6805	6358	6375	6358
query75	4359	2319	2266	2266
query76	6649	1025	959	959
query77	845	237	239	237
query78	9151	8847	8639	8639
query79	1032	500	499	499
query80	626	322	322	322
query81	454	200	196	196
query82	581	79	78	78
query83	138	117	121	117
query84	276	68	68	68
query85	1421	338	343	338
query86	417	375	387	375
query87	3537	3338	3261	3261
query88	2716	2221	2203	2203
query89	409	353	349	349
query90	2122	183	188	183
query91	154	122	126	122
query92	51	46	43	43
query93	1183	411	407	407
query94	1286	159	158	158
query95	495	452	451	451
query96	608	320	335	320
query97	4254	4172	4138	4138
query98	218	194	188	188
query99	1306	689	677	677
Total cold run time: 309229 ms
Total hot run time: 186722 ms

@doris-robot

ClickBench: Total hot run time: 30.89 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c7af748e76558ba3591d6e2e8e55064570664b16, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.02	0.02
query3	0.22	0.04	0.04
query4	1.70	0.06	0.07
query5	0.54	0.52	0.52
query6	1.30	0.64	0.63
query7	0.01	0.01	0.02
query8	0.04	0.03	0.03
query9	0.53	0.51	0.50
query10	0.57	0.58	0.55
query11	0.12	0.08	0.09
query12	0.12	0.09	0.09
query13	0.60	0.60	0.60
query14	0.79	0.81	0.78
query15	0.79	0.78	0.78
query16	0.38	0.37	0.37
query17	0.99	1.05	1.04
query18	0.23	0.27	0.23
query19	1.89	1.75	1.78
query20	0.01	0.02	0.01
query21	15.43	0.60	0.58
query22	2.50	2.14	2.11
query23	17.43	0.85	0.80
query24	2.41	0.72	1.17
query25	0.50	0.22	0.05
query26	0.46	0.16	0.15
query27	0.06	0.06	0.04
query28	11.59	0.78	0.75
query29	12.52	3.23	3.09
query30	0.50	0.50	0.53
query31	2.78	0.36	0.34
query32	3.37	0.47	0.49
query33	3.21	3.25	3.27
query34	15.81	4.36	4.34
query35	4.36	4.28	4.25
query36	1.09	1.07	1.06
query37	0.07	0.06	0.06
query38	0.03	0.03	0.03
query39	0.02	0.01	0.02
query40	0.15	0.13	0.14
query41	0.06	0.02	0.01
query42	0.03	0.01	0.02
query43	0.02	0.02	0.02
Total cold run time: 105.32 s
Total hot run time: 30.89 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit c7af748e76558ba3591d6e2e8e55064570664b16 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       14.8 seconds inserted 10000000 Rows, about 675K ops/s

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38649 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9071d12480b9842ac4cc61e0b5cda08340d245db, data reload: false

------ Round 1 ----------------------------------
q1	17628	6084	5307	5307
q2	2061	137	135	135
q3	10740	1150	1152	1150
q4	10284	733	796	733
q5	7736	3121	3149	3121
q6	200	121	120	120
q7	850	491	479	479
q8	9221	1958	1948	1948
q9	7301	6385	6308	6308
q10	8244	3077	3118	3077
q11	421	212	211	211
q12	353	192	191	191
q13	17980	3365	3345	3345
q14	252	211	218	211
q15	548	520	511	511
q16	441	370	376	370
q17	937	561	492	492
q18	7415	6982	6580	6580
q19	1676	1407	1402	1402
q20	581	292	309	292
q21	2821	2386	2367	2367
q22	370	312	299	299
Total cold run time: 108060 ms
Total hot run time: 38649 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5530	5280	5286	5280
q2	332	215	218	215
q3	3341	3277	3267	3267
q4	2092	2048	2046	2046
q5	5951	5913	5939	5913
q6	203	116	116	116
q7	2268	1901	1888	1888
q8	3233	3352	3371	3352
q9	8954	8780	8872	8780
q10	4290	3779	3856	3779
q11	561	445	454	445
q12	814	608	605	605
q13	16936	3166	3200	3166
q14	292	257	281	257
q15	553	510	505	505
q16	510	472	466	466
q17	1867	1876	1885	1876
q18	9597	9427	10645	9427
q19	22466	1552	1531	1531
q20	4520	1942	1916	1916
q21	13950	5503	5347	5347
q22	985	552	543	543
Total cold run time: 109245 ms
Total hot run time: 60720 ms

@doris-robot

TPC-DS: Total hot run time: 186585 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9071d12480b9842ac4cc61e0b5cda08340d245db, data reload: false

query1	932	345	356	345
query2	6531	2141	1927	1927
query3	6698	200	199	199
query4	33192	22108	22186	22108
query5	4474	378	374	374
query6	261	157	159	157
query7	4607	261	262	261
query8	232	172	181	172
query9	8318	2531	2520	2520
query10	423	222	246	222
query11	16998	15402	15456	15402
query12	121	65	67	65
query13	1692	373	380	373
query14	10523	6883	6944	6883
query15	215	179	185	179
query16	5791	255	245	245
query17	951	472	476	472
query18	1784	252	256	252
query19	177	131	132	131
query20	73	67	64	64
query21	203	134	132	132
query22	5031	4794	4644	4644
query23	31748	30815	30838	30815
query24	12876	2868	2864	2864
query25	567	308	312	308
query26	1828	141	144	141
query27	3333	287	292	287
query28	7132	1830	1812	1812
query29	2145	626	633	626
query30	283	140	141	140
query31	927	741	768	741
query32	78	53	49	49
query33	706	216	211	211
query34	1261	456	471	456
query35	866	743	751	743
query36	1219	1215	1191	1191
query37	177	57	58	57
query38	3325	3297	3242	3242
query39	1319	1260	1264	1260
query40	352	89	84	84
query41	38	34	35	34
query42	90	80	84	80
query43	520	470	477	470
query44	1094	690	699	690
query45	196	174	173	173
query46	1073	653	663	653
query47	1652	1539	1568	1539
query48	398	303	292	292
query49	1221	281	285	281
query50	690	310	321	310
query51	5341	5260	5250	5250
query52	94	79	73	73
query53	326	258	260	258
query54	242	176	174	174
query55	85	76	77	76
query56	178	170	173	170
query57	1010	911	924	911
query58	181	159	161	159
query59	2804	2719	2730	2719
query60	208	183	183	183
query61	80	80	79	79
query62	571	369	372	369
query63	279	260	264	260
query64	6100	1776	1745	1745
query65	3328	3269	3222	3222
query66	1430	317	310	310
query67	15753	15458	15163	15163
query68	9636	543	520	520
query69	588	303	296	296
query70	1721	1558	1462	1462
query71	10429	10195	10205	10195
query72	4029	2849	2806	2806
query73	1929	311	317	311
query74	7115	6501	6404	6404
query75	4126	2342	2295	2295
query76	6000	1000	1014	1000
query77	779	227	229	227
query78	8940	8987	8724	8724
query79	996	486	495	486
query80	696	329	318	318
query81	462	207	207	207
query82	236	76	79	76
query83	175	117	115	115
query84	275	66	67	66
query85	1121	325	320	320
query86	390	409	374	374
query87	3522	3373	3299	3299
query88	3013	2222	2201	2201
query89	433	364	350	350
query90	2136	189	187	187
query91	152	128	127	127
query92	55	41	44	41
query93	964	421	440	421
query94	1151	164	158	158
query95	511	456	446	446
query96	596	328	330	328
query97	4286	4131	4145	4131
query98	214	201	191	191
query99	1053	706	708	706
Total cold run time: 302080 ms
Total hot run time: 186585 ms

@doris-robot

ClickBench: Total hot run time: 30.95 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9071d12480b9842ac4cc61e0b5cda08340d245db, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.02	0.02
query3	0.23	0.05	0.05
query4	1.68	0.07	0.08
query5	0.54	0.53	0.53
query6	1.32	0.63	0.62
query7	0.03	0.01	0.01
query8	0.04	0.03	0.03
query9	0.54	0.50	0.49
query10	0.55	0.55	0.55
query11	0.12	0.08	0.08
query12	0.11	0.10	0.09
query13	0.61	0.62	0.60
query14	0.78	0.82	0.82
query15	0.79	0.78	0.77
query16	0.39	0.39	0.38
query17	1.00	1.01	1.03
query18	0.23	0.26	0.26
query19	1.87	1.81	1.78
query20	0.01	0.01	0.01
query21	15.39	0.58	0.56
query22	2.33	2.22	2.14
query23	17.02	1.01	0.81
query24	2.57	0.50	1.38
query25	0.32	0.30	0.12
query26	0.51	0.14	0.14
query27	0.07	0.05	0.04
query28	11.51	0.77	0.78
query29	12.55	3.16	3.15
query30	0.54	0.49	0.49
query31	2.78	0.36	0.34
query32	3.34	0.48	0.48
query33	3.22	3.23	3.20
query34	16.33	4.34	4.32
query35	4.42	4.37	4.35
query36	1.12	1.08	1.07
query37	0.07	0.05	0.05
query38	0.04	0.03	0.03
query39	0.02	0.01	0.01
query40	0.16	0.13	0.13
query41	0.07	0.02	0.01
query42	0.02	0.02	0.02
query43	0.02	0.02	0.02
Total cold run time: 105.35 s
Total hot run time: 30.95 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 9071d12480b9842ac4cc61e0b5cda08340d245db with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       14.4 seconds inserted 10000000 Rows, about 694K ops/s

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jan 24, 2024
Contributor

PR approved by at least one committer and no changes requested.

@morrySnow merged commit 4e48e19 into apache:master on Jan 24, 2024
25 checks passed
seawinde added a commit to seawinde/doris that referenced this pull request Jan 24, 2024
…when join input has aggregate (apache#30230)

yiguolei pushed a commit that referenced this pull request Jan 25, 2024
…when join input has aggregate (#30230)

Labels
approved (Indicates a PR has been approved by one committer.), dev/3.0.0-merged, reviewed
5 participants