[Fix](inverted index) fix wrong no need read data when need_remaining_after_evaluate #36637

Merged 2 commits on Jun 21, 2024

Conversation

airborne12
Member

Proposed changes

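When an equality predicate is applied to a column whose inverted index is built with a parser (tokenizer), the predicate is marked need_remaining_after_evaluate: the index lookup alone is not sufficient, and the predicate must still be evaluated against the column data. In that case the "no need to read data" optimization must not apply, so the column data has to be read.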
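The rule described above can be sketched as a small decision function. This is only an illustrative model, not the code changed in this PR: `ColumnPredicateInfo`, `need_remaining_after_evaluate`, and `need_read_data` are hypothetical names chosen for the example, and the conditions are simplified assumptions about when a column read can be skipped.

```cpp
// Hypothetical, simplified model of the decision; not Doris's actual API.
struct ColumnPredicateInfo {
    bool is_equality;        // predicate has the form `col = value`
    bool index_has_parser;   // the inverted index tokenizes the column
    bool resolved_by_index;  // index evaluation already produced a row set
};

// Assumption for this sketch: an equality predicate on a tokenized index
// must be re-checked against the raw column values after index evaluation.
constexpr bool need_remaining_after_evaluate(const ColumnPredicateInfo& p) {
    return p.is_equality && p.index_has_parser;
}

// The column read may be skipped only if the column is not returned,
// the index fully resolved the predicate, and no re-check is pending.
constexpr bool need_read_data(const ColumnPredicateInfo& p, bool column_in_output) {
    return column_in_output || !p.resolved_by_index ||
           need_remaining_after_evaluate(p);
}

// Equality on a tokenized index: data must be read even though the index was used.
static_assert(need_read_data(ColumnPredicateInfo{true, true, true}, false),
              "tokenized equality must read column data");
// Equality on an index without a parser: the read can be skipped.
static_assert(!need_read_data(ColumnPredicateInfo{true, false, true}, false),
              "untokenized equality may skip the read");
```

In this model, the bug class the PR title describes corresponds to dropping the `need_remaining_after_evaluate(p)` term, which would let the reader skip a column that a pending predicate still needs.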

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@airborne12
Member Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 39775 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 66c1dbe333c5d1dedbbbafb2bb1b6073079334eb, data reload: false

------ Round 1 ----------------------------------
q1	17696	5817	4358	4358
q2	2638	208	194	194
q3	11248	1138	1118	1118
q4	11288	792	804	792
q5	7557	2781	2668	2668
q6	224	142	138	138
q7	969	633	605	605
q8	9245	2087	2107	2087
q9	9393	6524	6450	6450
q10	8902	3654	3638	3638
q11	491	241	237	237
q12	406	236	223	223
q13	17774	3011	3007	3007
q14	280	232	223	223
q15	521	471	462	462
q16	487	398	376	376
q17	968	700	696	696
q18	8047	7434	7379	7379
q19	4193	1524	1433	1433
q20	652	319	339	319
q21	4832	3038	3158	3038
q22	378	336	334	334
Total cold run time: 118189 ms
Total hot run time: 39775 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4309	4234	4195	4195
q2	366	266	265	265
q3	2975	2699	2730	2699
q4	1863	1614	1669	1614
q5	5243	5256	5251	5251
q6	215	126	133	126
q7	2072	1719	1737	1719
q8	3198	3305	3307	3305
q9	8264	8333	8296	8296
q10	3844	3694	3671	3671
q11	582	490	483	483
q12	762	601	589	589
q13	17366	2978	2996	2978
q14	294	246	276	246
q15	521	482	476	476
q16	466	415	403	403
q17	1766	1473	1471	1471
q18	7516	7546	7503	7503
q19	1667	1464	1575	1464
q20	1994	1753	1759	1753
q21	4875	4777	4599	4599
q22	592	561	556	556
Total cold run time: 70750 ms
Total hot run time: 53662 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.48% (9009/24695)
Line Coverage: 28.03% (73887/263630)
Region Coverage: 27.52% (38381/139486)
Branch Coverage: 24.21% (19561/80790)
Coverage Report: http://coverage.selectdb-in.cc/coverage/66c1dbe333c5d1dedbbbafb2bb1b6073079334eb_66c1dbe333c5d1dedbbbafb2bb1b6073079334eb/report/index.html

@airborne12 airborne12 force-pushed the fix-need-read-data branch from 66c1dbe to d352d7e Compare June 21, 2024 07:32
@airborne12
Member Author

run buildall

@airborne12
Member Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.48% (9009/24699)
Line Coverage: 28.02% (73890/263737)
Region Coverage: 27.49% (38374/139592)
Branch Coverage: 24.19% (19564/80860)
Coverage Report: http://coverage.selectdb-in.cc/coverage/e0a8911176e1edbfb7c0a848393d35be029bb9f2_e0a8911176e1edbfb7c0a848393d35be029bb9f2/report/index.html

@doris-robot

TPC-H: Total hot run time: 39748 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e0a8911176e1edbfb7c0a848393d35be029bb9f2, data reload: false

------ Round 1 ----------------------------------
q1	17609	4340	4226	4226
q2	2042	196	194	194
q3	10463	1096	1042	1042
q4	10194	794	808	794
q5	7466	2670	2614	2614
q6	223	140	139	139
q7	965	626	620	620
q8	9225	2082	2086	2082
q9	8892	6469	6486	6469
q10	8960	3757	3730	3730
q11	469	257	243	243
q12	554	245	234	234
q13	18726	2983	3011	2983
q14	265	225	222	222
q15	517	465	482	465
q16	512	380	390	380
q17	971	681	684	681
q18	8040	7428	7370	7370
q19	4443	1449	1497	1449
q20	674	322	336	322
q21	5006	3149	3837	3149
q22	412	340	355	340
Total cold run time: 116628 ms
Total hot run time: 39748 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4383	4230	4239	4230
q2	363	257	271	257
q3	2969	2882	2915	2882
q4	2012	1764	1698	1698
q5	5624	5513	5457	5457
q6	225	134	132	132
q7	2207	1968	1825	1825
q8	3300	3445	3422	3422
q9	8663	8725	8783	8725
q10	4087	3857	3707	3707
q11	583	507	485	485
q12	837	672	657	657
q13	15858	3188	3197	3188
q14	307	291	281	281
q15	516	477	489	477
q16	488	430	447	430
q17	1809	1528	1475	1475
q18	8028	7777	7879	7777
q19	1802	1517	1606	1517
q20	3048	1980	1886	1886
q21	7819	4913	4893	4893
q22	657	588	572	572
Total cold run time: 75585 ms
Total hot run time: 55973 ms

@doris-robot

TPC-DS: Total hot run time: 174968 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e0a8911176e1edbfb7c0a848393d35be029bb9f2, data reload: false

query1	921	383	390	383
query2	7673	2316	2410	2316
query3	6645	215	222	215
query4	19262	17393	17350	17350
query5	3666	500	468	468
query6	244	162	165	162
query7	4590	305	289	289
query8	321	298	312	298
query9	8574	2452	2429	2429
query10	586	291	283	283
query11	10507	10087	10112	10087
query12	130	90	85	85
query13	1631	364	366	364
query14	10052	7175	7771	7175
query15	232	198	193	193
query16	7746	267	264	264
query17	1906	547	525	525
query18	1956	277	272	272
query19	190	162	169	162
query20	117	88	85	85
query21	208	136	131	131
query22	4361	4065	3956	3956
query23	33787	33757	33715	33715
query24	11105	3012	2948	2948
query25	640	405	383	383
query26	1220	164	159	159
query27	2484	330	334	330
query28	7278	2137	2129	2129
query29	906	653	662	653
query30	245	179	159	159
query31	973	804	783	783
query32	100	53	55	53
query33	749	284	284	284
query34	1045	486	495	486
query35	755	654	640	640
query36	1125	998	990	990
query37	161	70	76	70
query38	2973	2872	2848	2848
query39	928	825	851	825
query40	215	132	131	131
query41	55	55	55	55
query42	119	111	111	111
query43	608	558	548	548
query44	1269	737	756	737
query45	201	167	166	166
query46	1089	711	732	711
query47	1852	1755	1743	1743
query48	381	301	293	293
query49	870	441	417	417
query50	773	399	388	388
query51	6994	6839	6823	6823
query52	109	95	91	91
query53	366	302	298	298
query54	922	442	446	442
query55	76	73	72	72
query56	277	257	270	257
query57	1135	1036	1073	1036
query58	255	249	276	249
query59	3400	3291	3179	3179
query60	292	274	276	274
query61	97	93	92	92
query62	603	451	441	441
query63	330	297	308	297
query64	8825	2275	1757	1757
query65	3204	3104	3112	3104
query66	748	328	331	328
query67	15333	14937	14966	14937
query68	4530	520	532	520
query69	579	412	406	406
query70	1165	1148	1173	1148
query71	413	280	287	280
query72	7735	5560	5603	5560
query73	743	321	328	321
query74	5854	5531	5521	5521
query75	3510	2684	2685	2684
query76	2705	1000	967	967
query77	609	308	310	308
query78	10414	9721	9842	9721
query79	2576	532	516	516
query80	2199	476	461	461
query81	610	223	272	223
query82	1204	114	104	104
query83	290	176	177	176
query84	257	84	98	84
query85	1421	286	276	276
query86	479	308	303	303
query87	3226	3052	3066	3052
query88	3992	2366	2351	2351
query89	479	396	389	389
query90	1762	198	192	192
query91	131	99	104	99
query92	63	50	52	50
query93	2326	501	491	491
query94	1115	200	187	187
query95	424	324	328	324
query96	600	274	274	274
query97	3249	3082	3022	3022
query98	221	201	197	197
query99	1134	848	837	837
Total cold run time: 274405 ms
Total hot run time: 174968 ms

@doris-robot

ClickBench: Total hot run time: 30.6 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e0a8911176e1edbfb7c0a848393d35be029bb9f2, data reload: false

query1	0.04	0.04	0.03
query2	0.07	0.04	0.04
query3	0.22	0.06	0.06
query4	1.65	0.08	0.08
query5	0.50	0.50	0.47
query6	1.14	0.73	0.72
query7	0.02	0.02	0.01
query8	0.05	0.05	0.05
query9	0.56	0.49	0.48
query10	0.54	0.54	0.54
query11	0.16	0.11	0.12
query12	0.16	0.12	0.12
query13	0.60	0.58	0.60
query14	0.80	0.80	0.78
query15	0.85	0.81	0.81
query16	0.37	0.36	0.37
query17	1.01	1.00	1.01
query18	0.25	0.23	0.26
query19	1.93	1.77	1.72
query20	0.02	0.01	0.01
query21	15.43	0.67	0.66
query22	4.69	7.43	1.87
query23	18.26	1.46	1.25
query24	2.10	0.25	0.24
query25	0.16	0.09	0.08
query26	0.27	0.18	0.18
query27	0.08	0.08	0.08
query28	13.24	1.04	1.01
query29	12.61	3.27	3.37
query30	0.26	0.07	0.06
query31	2.88	0.41	0.38
query32	3.23	0.48	0.47
query33	2.90	2.92	2.91
query34	17.01	4.44	4.42
query35	4.46	4.44	4.55
query36	0.65	0.46	0.47
query37	0.18	0.17	0.16
query38	0.16	0.15	0.15
query39	0.04	0.03	0.04
query40	0.18	0.14	0.14
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.93 s
Total hot run time: 30.6 s

Member

@eldenmoon eldenmoon left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 21, 2024
Contributor

PR approved by anyone and no changes requested.

Contributor

@xiaokang xiaokang left a comment

LGTM

@xiaokang xiaokang merged commit 5f65daf into apache:master Jun 21, 2024
25 of 29 checks passed
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Jun 21, 2024
…_after_evaluate (apache#36637)

When using an equal predicate on a column that applies an inverted index
with a parser, it requires remaining_after_evaluate. In this situation,
we cannot optimize the column without reading the data.
airborne12 added a commit that referenced this pull request Jun 21, 2024
…ining_after_evaluate (#36684)

When using an equal predicate on a column that applies an inverted index
with a parser, it requires remaining_after_evaluate. In this situation,
we cannot optimize the column without reading the data.

## Proposed changes

From (#36637)
dataroaring pushed a commit that referenced this pull request Jun 21, 2024
…_after_evaluate (#36637)

When using an equal predicate on a column that applies an inverted index
with a parser, it requires remaining_after_evaluate. In this situation,
we cannot optimize the column without reading the data.
xiaokang pushed a commit that referenced this pull request Jun 23, 2024
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024