[opt](nereids) set lower bound for range-selectivity #40089

Merged 1 commit on Aug 30, 2024

@englefly (Contributor) commented Aug 29, 2024

Proposed changes

Range-selectivity estimates are prone to outliers, so this change sets a lower bound on them.
The bound is derived from the selectivity of selecting one month out of fifty years.
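
To make the arithmetic concrete: fifty years contain 600 months, so one month out of fifty years corresponds to a selectivity of 1/600, roughly 0.0017. The sketch below (in Python for brevity) shows how such a lower bound might be applied to an estimate. All names are hypothetical, and the exact constant and clamping logic in the merged Nereids code may differ.

```python
# Hypothetical names for illustration; not taken from the Doris source.
MONTHS_PER_YEAR = 12
YEARS = 50

# Assumed reading of "one month out of fifty years": 1 / 600.
RANGE_SELECTIVITY_LOWER_BOUND = 1.0 / (YEARS * MONTHS_PER_YEAR)


def clamp_range_selectivity(estimated: float) -> float:
    """Raise a range-selectivity estimate to the lower bound and cap it at 1.0."""
    return min(1.0, max(estimated, RANGE_SELECTIVITY_LOWER_BOUND))


# An outlier estimate near zero is lifted to the bound;
# an ordinary estimate passes through unchanged.
tiny = clamp_range_selectivity(1e-9)
normal = clamp_range_selectivity(0.25)
```

With a floor like this, a single badly underestimated range predicate cannot drive the estimated row count of a filter to nearly zero.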

Issue Number: close #xxx

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@englefly
Contributor Author

run buildall

@englefly force-pushed the range-sel-threshold branch 2 times, most recently from 322111c to 054c1eb, on August 29, 2024 04:05
@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38676 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 054c1eb3c0924a3f495233df64f8ff469433327d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	18201	4520	4414	4414
q2	2813	183	180	180
q3	11306	1133	1204	1133
q4	10209	792	789	789
q5	7760	2897	2860	2860
q6	232	140	149	140
q7	989	627	610	610
q8	9573	2069	2080	2069
q9	7259	6551	6544	6544
q10	6994	2252	2178	2178
q11	460	243	250	243
q12	396	225	231	225
q13	17753	3010	3028	3010
q14	288	232	237	232
q15	527	478	492	478
q16	582	497	527	497
q17	998	739	738	738
q18	7373	7045	6869	6869
q19	1400	1089	1022	1022
q20	672	333	343	333
q21	3997	3097	3136	3097
q22	1151	1050	1015	1015
Total cold run time: 110933 ms
Total hot run time: 38676 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4359	4271	4253	4253
q2	375	277	277	277
q3	2898	2674	2683	2674
q4	1939	1653	1666	1653
q5	5442	5395	5383	5383
q6	221	133	132	132
q7	2072	1778	1777	1777
q8	3195	3409	3367	3367
q9	8505	8422	8416	8416
q10	3449	3221	3238	3221
q11	601	510	494	494
q12	810	620	607	607
q13	11540	3024	3043	3024
q14	304	282	274	274
q15	530	482	481	481
q16	601	567	556	556
q17	1797	1524	1478	1478
q18	7703	7449	7504	7449
q19	1674	1382	1513	1382
q20	2039	1823	1861	1823
q21	5526	5245	5284	5245
q22	1164	1078	1041	1041
Total cold run time: 66744 ms
Total hot run time: 55007 ms
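
The totals in this report are consistent with a simple rule: the cold total is the sum of the first numeric column, and the hot total is the sum of the last column, which holds the smaller of the two middle values for each query. A minimal sketch in Python, using the first three Round 1 rows as sample input. The function name is hypothetical, and this only reproduces the arithmetic; it is not the tpch-tools script itself.

```python
# Illustrative only: mirrors the arithmetic visible in the report.
def summarize(rows):
    """rows: list of (query, cold_ms, hot1_ms, hot2_ms) tuples.

    Returns (cold_total, hot_total), where the hot total takes the
    faster of the two hot runs for each query.
    """
    cold_total = sum(cold for _, cold, _, _ in rows)
    hot_total = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return cold_total, hot_total


# First three Round 1 rows from the report above.
sample = [
    ("q1", 18201, 4520, 4414),
    ("q2", 2813, 183, 180),
    ("q3", 11306, 1133, 1204),
]
cold_total, hot_total = summarize(sample)
```

Applied to all 22 Round 1 rows, the same rule gives 110933 ms cold and 38676 ms hot, matching the reported totals.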

@doris-robot

TPC-DS: Total hot run time: 188713 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 054c1eb3c0924a3f495233df64f8ff469433327d, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	915	370	366	366
query2	6451	2020	1926	1926
query3	6650	212	221	212
query4	33475	23538	23144	23144
query5	4145	538	520	520
query6	253	164	177	164
query7	4583	303	299	299
query8	255	215	213	213
query9	8679	2509	2502	2502
query10	424	288	290	288
query11	16985	14963	14911	14911
query12	149	108	98	98
query13	1623	400	380	380
query14	10253	6823	7066	6823
query15	225	175	176	175
query16	7675	476	486	476
query17	1606	571	555	555
query18	2019	292	288	288
query19	189	146	143	143
query20	117	111	109	109
query21	206	107	105	105
query22	4459	4282	4202	4202
query23	33965	33575	33569	33569
query24	11359	2963	2904	2904
query25	668	385	388	385
query26	1247	160	164	160
query27	2599	288	277	277
query28	7498	2135	2115	2115
query29	819	417	408	408
query30	313	165	153	153
query31	965	771	820	771
query32	93	62	58	58
query33	750	310	287	287
query34	981	493	484	484
query35	883	731	720	720
query36	1078	937	950	937
query37	165	94	97	94
query38	3960	3900	3915	3900
query39	1460	1390	1414	1390
query40	198	123	121	121
query41	49	49	46	46
query42	122	100	96	96
query43	522	489	480	480
query44	1201	774	767	767
query45	202	165	175	165
query46	1120	746	746	746
query47	1893	1754	1778	1754
query48	395	297	297	297
query49	1066	446	442	442
query50	827	431	424	424
query51	7298	7108	7069	7069
query52	98	88	88	88
query53	264	196	188	188
query54	972	467	462	462
query55	76	82	84	82
query56	282	255	267	255
query57	1167	1099	1080	1080
query58	250	221	229	221
query59	3004	2874	2858	2858
query60	298	280	277	277
query61	107	99	100	99
query62	826	665	633	633
query63	223	193	191	191
query64	5253	683	659	659
query65	3247	3233	3148	3148
query66	1393	337	361	337
query67	15728	15193	15205	15193
query68	3327	601	585	585
query69	407	288	304	288
query70	1213	1111	1101	1101
query71	346	316	286	286
query72	6290	4093	4007	4007
query73	768	334	338	334
query74	9282	8788	8929	8788
query75	3430	2686	2700	2686
query76	1897	1098	984	984
query77	536	331	320	320
query78	9644	9285	9180	9180
query79	1045	555	556	555
query80	749	551	552	551
query81	454	247	237	237
query82	246	153	149	149
query83	180	154	157	154
query84	235	84	84	84
query85	694	298	342	298
query86	305	303	275	275
query87	4446	4203	4345	4203
query88	2965	2383	2355	2355
query89	397	290	298	290
query90	1826	208	207	207
query91	130	106	106	106
query92	67	56	53	53
query93	1057	571	575	571
query94	741	301	308	301
query95	356	278	274	274
query96	606	274	272	272
query97	3208	3054	3062	3054
query98	235	209	197	197
query99	1476	1279	1291	1279
Total cold run time: 286778 ms
Total hot run time: 188713 ms

@doris-robot

ClickBench: Total hot run time: 32.09 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 054c1eb3c0924a3f495233df64f8ff469433327d, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.22	0.05	0.06
query4	1.67	0.08	0.08
query5	0.50	0.49	0.50
query6	1.13	0.73	0.73
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.54	0.48	0.47
query10	0.52	0.53	0.55
query11	0.15	0.11	0.11
query12	0.14	0.13	0.12
query13	0.61	0.58	0.59
query14	2.06	2.05	2.14
query15	0.90	0.83	0.81
query16	0.39	0.40	0.38
query17	0.97	0.98	0.99
query18	0.20	0.21	0.20
query19	1.98	1.87	1.82
query20	0.01	0.02	0.01
query21	15.41	0.69	0.67
query22	4.59	7.40	1.81
query23	18.29	1.40	1.30
query24	2.10	0.25	0.23
query25	0.16	0.09	0.09
query26	0.26	0.18	0.17
query27	0.08	0.07	0.07
query28	13.23	1.00	0.98
query29	12.64	3.40	3.36
query30	0.24	0.06	0.05
query31	2.87	0.39	0.38
query32	3.26	0.48	0.47
query33	2.94	3.00	3.03
query34	17.25	4.49	4.44
query35	4.49	4.54	4.48
query36	0.65	0.47	0.48
query37	0.18	0.16	0.15
query38	0.16	0.14	0.16
query39	0.05	0.03	0.03
query40	0.15	0.12	0.13
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 111.39 s
Total hot run time: 32.09 s

@englefly force-pushed the range-sel-threshold branch from 054c1eb to 547828d on August 29, 2024 06:10
@englefly changed the title from "[opt](nereids) set threshold for range-selectivity" to "[opt](nereids) set lower bound for range-selectivity" on Aug 29, 2024
@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run buildall

@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38006 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 547828d76907966fb3880d0a7ac6682382d9ff3d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17665	4877	4253	4253
q2	2012	177	180	177
q3	11931	928	1010	928
q4	10508	717	652	652
q5	7799	2854	2728	2728
q6	223	137	138	137
q7	959	633	627	627
q8	9309	2032	2110	2032
q9	7253	6519	6492	6492
q10	6992	2170	2208	2170
q11	450	254	243	243
q12	400	231	231	231
q13	17945	3092	3052	3052
q14	278	244	245	244
q15	522	481	506	481
q16	573	511	509	509
q17	987	637	712	637
q18	7452	6991	6964	6964
q19	1383	1074	1035	1035
q20	678	342	333	333
q21	4211	3065	3048	3048
q22	1150	1042	1033	1033
Total cold run time: 110680 ms
Total hot run time: 38006 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4409	4305	4347	4305
q2	394	276	276	276
q3	2947	2612	2735	2612
q4	1953	1628	1673	1628
q5	5690	5714	5790	5714
q6	228	147	147	147
q7	2228	1869	1829	1829
q8	3304	3460	3530	3460
q9	9083	9066	9039	9039
q10	3665	3513	3479	3479
q11	647	527	531	527
q12	886	703	693	693
q13	13386	3168	3151	3151
q14	316	289	280	280
q15	534	495	505	495
q16	634	594	619	594
q17	1846	1542	1546	1542
q18	8288	7753	7898	7753
q19	1749	1574	1534	1534
q20	2149	1942	1931	1931
q21	5604	5539	5480	5480
q22	1167	1116	1098	1098
Total cold run time: 71107 ms
Total hot run time: 57567 ms

@doris-robot

TPC-DS: Total hot run time: 193439 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 547828d76907966fb3880d0a7ac6682382d9ff3d, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1258	881	870	870
query2	6269	1955	1916	1916
query3	10628	3917	4047	3917
query4	59743	24409	23339	23339
query5	5502	527	499	499
query6	434	160	162	160
query7	5910	301	305	301
query8	305	212	210	210
query9	9049	2525	2488	2488
query10	504	284	270	270
query11	18330	14951	15210	14951
query12	168	104	98	98
query13	1588	408	406	406
query14	11396	7287	7454	7287
query15	231	183	198	183
query16	7611	462	480	462
query17	1156	588	577	577
query18	2105	318	311	311
query19	306	162	163	162
query20	127	122	118	118
query21	215	110	108	108
query22	4603	4708	4403	4403
query23	34546	33284	33195	33195
query24	5935	2847	2847	2847
query25	538	379	394	379
query26	689	159	154	154
query27	1801	283	284	283
query28	3732	2120	2103	2103
query29	699	415	414	414
query30	238	153	147	147
query31	932	759	787	759
query32	87	54	57	54
query33	462	316	287	287
query34	865	491	477	477
query35	840	710	732	710
query36	1066	947	921	921
query37	151	86	88	86
query38	3939	3827	3858	3827
query39	1485	1429	1401	1401
query40	200	118	118	118
query41	47	45	45	45
query42	116	102	99	99
query43	513	465	465	465
query44	1123	752	754	752
query45	190	163	162	162
query46	1074	757	751	751
query47	1899	1804	1805	1804
query48	386	303	299	299
query49	753	424	443	424
query50	834	426	425	425
query51	7308	7160	7117	7117
query52	97	86	90	86
query53	252	186	189	186
query54	592	480	462	462
query55	79	77	77	77
query56	274	258	268	258
query57	1174	1072	1098	1072
query58	222	272	261	261
query59	3071	2769	2713	2713
query60	292	270	266	266
query61	104	97	104	97
query62	748	666	652	652
query63	225	184	191	184
query64	2886	688	654	654
query65	3266	3127	3196	3127
query66	691	333	340	333
query67	15855	15591	15352	15352
query68	3069	601	592	592
query69	395	284	283	283
query70	1136	1164	1072	1072
query71	378	274	281	274
query72	5949	4085	3895	3895
query73	755	339	340	339
query74	9351	8929	8837	8837
query75	3389	2672	2683	2672
query76	1445	1054	1048	1048
query77	578	334	329	329
query78	9692	10609	9411	9411
query79	1043	549	544	544
query80	683	513	531	513
query81	461	232	234	232
query82	243	150	144	144
query83	173	151	156	151
query84	254	79	79	79
query85	677	299	286	286
query86	308	309	296	296
query87	4361	4304	4234	4234
query88	3100	2376	2338	2338
query89	385	292	343	292
query90	1966	199	199	199
query91	124	100	100	100
query92	64	50	52	50
query93	1028	547	545	545
query94	741	283	277	277
query95	341	271	266	266
query96	585	275	271	271
query97	3219	3114	3071	3071
query98	217	201	203	201
query99	1476	1277	1309	1277
Total cold run time: 307300 ms
Total hot run time: 193439 ms

@doris-robot

ClickBench: Total hot run time: 31.95 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 547828d76907966fb3880d0a7ac6682382d9ff3d, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.05	0.04
query2	0.08	0.04	0.04
query3	0.22	0.04	0.05
query4	1.68	0.08	0.09
query5	0.51	0.50	0.49
query6	1.13	0.74	0.73
query7	0.02	0.01	0.02
query8	0.06	0.04	0.04
query9	0.55	0.49	0.49
query10	0.55	0.54	0.54
query11	0.16	0.12	0.11
query12	0.16	0.11	0.12
query13	0.62	0.58	0.59
query14	2.05	2.05	2.08
query15	0.89	0.82	0.81
query16	0.35	0.38	0.38
query17	1.01	1.07	1.05
query18	0.22	0.20	0.19
query19	1.92	1.75	1.87
query20	0.01	0.02	0.01
query21	15.39	0.70	0.67
query22	4.56	7.04	1.90
query23	18.30	1.35	1.24
query24	2.13	0.24	0.23
query25	0.15	0.08	0.09
query26	0.26	0.18	0.17
query27	0.09	0.08	0.08
query28	13.21	1.01	1.00
query29	12.59	3.30	3.34
query30	0.24	0.06	0.06
query31	2.88	0.39	0.39
query32	3.26	0.50	0.48
query33	2.99	3.07	2.95
query34	17.16	4.44	4.41
query35	4.51	4.43	4.39
query36	0.65	0.49	0.48
query37	0.20	0.17	0.16
query38	0.17	0.15	0.16
query39	0.04	0.04	0.03
query40	0.16	0.13	0.12
query41	0.10	0.04	0.05
query42	0.06	0.05	0.04
query43	0.04	0.04	0.04
Total cold run time: 111.38 s
Total hot run time: 31.95 s

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Aug 30, 2024

PR approved by at least one committer and no changes requested.


PR approved by anyone and no changes requested.

@englefly merged commit d580a0a into apache:master on Aug 30, 2024
29 of 31 checks passed
@englefly deleted the range-sel-threshold branch on August 30, 2024 03:29
englefly added a commit to englefly/incubator-doris that referenced this pull request Sep 20, 2024
englefly added a commit that referenced this pull request Sep 21, 2024
pick #40089
englefly added a commit that referenced this pull request Sep 22, 2024
pick #40089
dataroaring pushed a commit that referenced this pull request Oct 9, 2024