[feature](local-tvf) support local tvf on shared storage #33050

Merged: 2 commits merged into apache:master on Apr 1, 2024

Conversation

@morningman morningman commented Mar 29, 2024

Proposed changes

Previously, the local TVF could only query data on a single BE node.
However, if the storage is shared (e.g., NAS), the query can be executed on multiple nodes.

This PR mainly changes:

  1. Add a new property "shared_storage" = "false/true"

    The default is false. If set to true, "backend_id" is optional. If "backend_id" is set,
    the query is still executed on that BE. If it is not set, "shared_storage" must be "true",
    and the query is executed on multiple nodes.
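
The rule described above can be sketched as a small standalone helper. This is only an illustration of the wording in this description, not the code merged in this PR: the class `LocalTvfPlacement`, the enum `Mode`, and the method `resolve` are hypothetical names, and treating "backend_id" as a numeric string is an assumption.

```java
import java.util.Map;

// Hypothetical helper illustrating the rule in the PR description; not Doris code.
class LocalTvfPlacement {
    enum Mode { SINGLE_BACKEND, ALL_BACKENDS }

    static Mode resolve(Map<String, String> props) {
        // "shared_storage" defaults to false when the property is absent.
        boolean sharedStorage =
                Boolean.parseBoolean(props.getOrDefault("shared_storage", "false"));
        String backendId = props.get("backend_id");

        if (backendId != null) {
            // Assumption: the id is numeric; a non-numeric value throws NumberFormatException.
            Long.parseLong(backendId);
            return Mode.SINGLE_BACKEND;
        }
        if (!sharedStorage) {
            // Without a target BE, the storage must be declared as shared.
            throw new IllegalArgumentException(
                    "'backend_id' is required unless 'shared_storage' is true");
        }
        return Mode.ALL_BACKENDS;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of("backend_id", "10001")));
        System.out.println(resolve(Map.of("shared_storage", "true")));
    }
}
```

Under this reading, a query that sets neither property is rejected, because there is no BE to pin it to and no guarantee that every node sees the same files.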

Doc: apache/doris-website#494

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Comment on lines 86 to 88
} else if (backendId != -1 && sharedStorage) {
throw new AnalysisException("'shared_storage' should be false when 'backend_id' is set.");
}
Is this else-if logic inconsistent with the PR description?
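
To make the question concrete, here is a standalone sketch of only the quoted branch. `SharedStorageCheck` and `checkQuotedBranch` are hypothetical names, `IllegalArgumentException` stands in for `AnalysisException` so the sketch compiles outside Doris, and the preceding `if` branch is not shown in the snippet, so it is not reproduced.

```java
// Hypothetical standalone version of the quoted else-if branch; not the Doris source.
class SharedStorageCheck {
    // The quoted condition compares backendId against -1, which is read here as "not set".
    static final long UNSET = -1;

    static void checkQuotedBranch(long backendId, boolean sharedStorage) {
        if (backendId != UNSET && sharedStorage) {
            throw new IllegalArgumentException(
                    "'shared_storage' should be false when 'backend_id' is set.");
        }
    }

    public static void main(String[] args) {
        checkQuotedBranch(10001L, false); // accepted
        checkQuotedBranch(UNSET, true);   // accepted
        try {
            checkQuotedBranch(10001L, true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this branch, setting both "backend_id" and "shared_storage" = "true" is rejected, whereas the description says such a query still runs on the given BE.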

@BePPPower BePPPower left a comment

LGTM

PR approved by anyone and no changes requested.

@morningman

run buildall

@doris-robot

TPC-H: Total hot run time: 38552 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 224f6b29f9cd1fd90721969989961a14e2b95ef7, data reload: false

------ Round 1 ----------------------------------
q1	17642	4117	4048	4048
q2	2096	189	185	185
q3	10483	1223	1379	1223
q4	10203	861	1000	861
q5	7473	2924	2891	2891
q6	213	131	131	131
q7	1081	643	607	607
q8	9397	2045	2022	2022
q9	6710	6201	6101	6101
q10	8453	3488	3487	3487
q11	418	243	236	236
q12	391	221	214	214
q13	17778	2927	2908	2908
q14	271	243	235	235
q15	523	482	467	467
q16	500	382	383	382
q17	947	886	906	886
q18	7331	6571	6433	6433
q19	1674	1521	1517	1517
q20	612	319	309	309
q21	3526	3099	3104	3099
q22	350	310	312	310
Total cold run time: 108072 ms
Total hot run time: 38552 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4063	4073	4050	4050
q2	339	229	228	228
q3	2945	3039	2951	2951
q4	1882	1850	1765	1765
q5	5215	5186	5209	5186
q6	209	127	125	125
q7	2220	1828	1794	1794
q8	3204	3247	3268	3247
q9	8418	8434	8408	8408
q10	3748	3961	4018	3961
q11	550	469	483	469
q12	770	615	589	589
q13	16825	3049	3079	3049
q14	322	269	280	269
q15	537	499	493	493
q16	514	475	444	444
q17	1765	1735	1693	1693
q18	8171	7667	7854	7667
q19	1659	1660	1671	1660
q20	2037	1800	1840	1800
q21	5086	4942	4896	4896
q22	505	449	440	440
Total cold run time: 70984 ms
Total hot run time: 55184 ms

@doris-robot

TPC-DS: Total hot run time: 181777 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 224f6b29f9cd1fd90721969989961a14e2b95ef7, data reload: false

query1	1207	1131	1123	1123
query2	6339	2039	1943	1943
query3	6664	207	210	207
query4	25370	21536	21575	21536
query5	4177	402	398	398
query6	272	194	181	181
query7	4593	294	296	294
query8	233	174	176	174
query9	8478	2197	2227	2197
query10	447	249	254	249
query11	14983	14490	14539	14490
query12	143	96	98	96
query13	1630	376	395	376
query14	8545	6962	6942	6942
query15	211	177	179	177
query16	6872	287	279	279
query17	973	606	586	586
query18	1876	290	292	290
query19	209	164	164	164
query20	97	93	94	93
query21	201	138	132	132
query22	5017	4911	4837	4837
query23	33120	32534	32738	32534
query24	12974	3247	3280	3247
query25	733	465	446	446
query26	2002	167	162	162
query27	3188	390	398	390
query28	7049	1890	1853	1853
query29	1340	632	641	632
query30	310	176	165	165
query31	1079	774	761	761
query32	109	65	64	64
query33	702	267	260	260
query34	1190	508	524	508
query35	859	755	749	749
query36	1018	915	896	896
query37	216	81	76	76
query38	3748	3613	3544	3544
query39	1068	1046	1062	1046
query40	224	147	140	140
query41	49	46	45	45
query42	116	111	107	107
query43	458	414	411	411
query44	1153	727	736	727
query45	290	290	286	286
query46	1096	824	825	824
query47	1985	1910	1929	1910
query48	403	323	328	323
query49	954	387	395	387
query50	829	408	419	408
query51	6982	6661	6824	6661
query52	111	113	107	107
query53	380	305	295	295
query54	330	235	232	232
query55	90	85	89	85
query56	260	240	237	237
query57	1304	1197	1201	1197
query58	252	227	233	227
query59	2665	2424	2438	2424
query60	263	240	233	233
query61	92	90	90	90
query62	679	443	437	437
query63	312	280	290	280
query64	5828	3073	3104	3073
query65	3030	2987	3017	2987
query66	1327	335	332	332
query67	15479	15158	14842	14842
query68	9454	557	595	557
query69	589	331	336	331
query70	1409	1106	1119	1106
query71	517	271	285	271
query72	6419	2607	2415	2415
query73	1553	324	324	324
query74	6774	6440	6300	6300
query75	3815	2286	2305	2286
query76	5802	1070	1197	1070
query77	574	264	245	245
query78	11017	10087	10187	10087
query79	9488	530	526	526
query80	1778	416	422	416
query81	511	226	222	222
query82	334	102	104	102
query83	214	163	161	161
query84	269	88	96	88
query85	947	290	289	289
query86	364	288	275	275
query87	3682	3504	3506	3504
query88	3638	2313	2291	2291
query89	541	359	370	359
query90	2021	178	177	177
query91	136	108	107	107
query92	59	51	49	49
query93	6764	527	542	527
query94	1190	187	195	187
query95	423	326	331	326
query96	625	273	270	270
query97	2695	2442	2471	2442
query98	252	211	219	211
query99	1159	900	852	852
Total cold run time: 301196 ms
Total hot run time: 181777 ms

@doris-robot

ClickBench: Total hot run time: 29.82 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 224f6b29f9cd1fd90721969989961a14e2b95ef7, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.04	0.04
query3	0.24	0.05	0.05
query4	1.68	0.06	0.06
query5	0.48	0.47	0.50
query6	1.14	0.65	0.65
query7	0.01	0.01	0.01
query8	0.05	0.04	0.04
query9	0.57	0.50	0.52
query10	0.56	0.55	0.58
query11	0.14	0.12	0.11
query12	0.14	0.11	0.11
query13	0.61	0.60	0.59
query14	0.79	0.79	0.77
query15	0.85	0.82	0.84
query16	0.36	0.35	0.35
query17	0.97	0.97	0.99
query18	0.24	0.25	0.25
query19	1.79	1.74	1.71
query20	0.02	0.01	0.01
query21	15.53	0.78	0.66
query22	3.49	5.90	1.51
query23	17.54	1.26	1.15
query24	1.44	0.22	0.23
query25	0.13	0.10	0.08
query26	0.27	0.18	0.19
query27	0.08	0.08	0.08
query28	13.82	0.95	0.95
query29	12.59	3.39	3.40
query30	0.28	0.09	0.08
query31	2.80	0.39	0.39
query32	3.29	0.47	0.48
query33	2.84	2.84	2.81
query34	15.48	4.33	4.31
query35	4.38	4.39	4.34
query36	0.66	0.47	0.47
query37	0.20	0.16	0.18
query38	0.17	0.17	0.16
query39	0.04	0.04	0.04
query40	0.18	0.15	0.14
query41	0.10	0.06	0.05
query42	0.06	0.06	0.05
query43	0.04	0.04	0.05
Total cold run time: 106.16 s
Total hot run time: 29.82 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 224f6b29f9cd1fd90721969989961a14e2b95ef7 with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       15.6 seconds inserted 10000000 Rows, about 641K ops/s

github-actions bot commented Apr 1, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 1, 2024
@morningman morningman merged commit 8506df5 into apache:master Apr 1, 2024
27 of 30 checks passed
morningman added a commit to morningman/doris that referenced this pull request Apr 7, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.2-merged reviewed