[fix](kerberos) optimize kerbero options, remove BE kinit, just use UGI login by newInstanceFromKeytab #29291

Merged (1 commit) on Jan 12, 2024

@wsjz wsjz commented Dec 29, 2023

Proposed changes

  1. Remove the BE-side kinit and log in through JNI with a keytab instead, because kinit cannot renew the TGT for Doris in many complex cases.

     apache/doris-thirdparty#173 adds support for creating a new instance from a keytab, so the kinit command is no longer needed; logging in with the keytab and principal is enough.

  2. Add kerberos_ccache_path to set the Kerberos credentials cache path manually.

  3. Add max_hdfs_file_handle_cache_time_ms to set the HDFS file handle cache time.
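
As a rough illustration of the inputs this change relies on, the sketch below checks that a principal and keytab are supplied together and resolves an optional, manually set credentials cache path. It is not code from this PR: `KerberosLoginOptions`, `check_kerberos_options`, `uses_keytab_login`, `effective_ccache_path` and all sample values are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical bundle of the login inputs named in the description.
// None of these field or function names come from the Doris source.
struct KerberosLoginOptions {
    std::string principal;    // e.g. "doris/host@EXAMPLE.COM" (made-up value)
    std::string keytab_path;  // keytab file readable by the BE process
    std::string ccache_path;  // optional manual override; empty means "not set"
};

// Returns an empty string when the options are consistent, otherwise a
// human-readable reason. A keytab login needs both principal and keytab.
std::string check_kerberos_options(const KerberosLoginOptions& opts) {
    const bool has_principal = !opts.principal.empty();
    const bool has_keytab = !opts.keytab_path.empty();
    if (has_principal != has_keytab) {
        return "principal and keytab must be set together";
    }
    return "";
}

// True when both inputs for a keytab login are present.
bool uses_keytab_login(const KerberosLoginOptions& opts) {
    return !opts.principal.empty() && !opts.keytab_path.empty();
}

// Picks the manually configured cache path when present, else the fallback
// supplied by the caller (the fallback value is an assumption of this sketch).
std::string effective_ccache_path(const KerberosLoginOptions& opts,
                                  const std::string& fallback_path) {
    assert(!fallback_path.empty());
    return opts.ccache_path.empty() ? fallback_path : opts.ccache_path;
}
```

With checks like these done up front, the principal and keytab can be handed to the login call directly, and no external kinit process has to be started or renewed.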

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@wsjz wsjz marked this pull request as ready for review January 9, 2024 08:43
@wsjz wsjz changed the title from "[fix](kerberos) optimize kerbero options" to "[fix](kerberos) optimize kerbero options, remove BE kinit, just use UGI login by call newInstanceFromKeytab" Jan 9, 2024
@wsjz wsjz changed the title from "[fix](kerberos) optimize kerbero options, remove BE kinit, just use UGI login by call newInstanceFromKeytab" to "[fix](kerberos) optimize kerbero options, remove BE kinit, just use UGI login by newInstanceFromKeytab" Jan 9, 2024

wsjz commented Jan 12, 2024

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -105,7 +104,9 @@ class HdfsFileHandleCache {

 private:
     FileHandleCache _cache;
-    HdfsFileHandleCache() : _cache(config::max_hdfs_file_handle_cache_num, 16, 3600 * 1000L) {}
+    HdfsFileHandleCache()

warning: use '= default' to define a trivial default constructor [modernize-use-equals-default]

be/src/io/fs/hdfs_file_system.cpp:108:

-                      config::max_hdfs_file_handle_cache_time_sec * 1000L) {};
+                      config::max_hdfs_file_handle_cache_time_sec * 1000L) = default;;
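
For context on this suggestion: a constructor that carries a member initializer list cannot be declared `= default`, so the suggested line would not compile as written. The sketch below shows the shape of the change under review, replacing the hard-coded `3600 * 1000L` with a configurable value converted to milliseconds. The diff reads `max_hdfs_file_handle_cache_time_sec` and multiplies by 1000, so the option appears to be expressed in seconds even though the description spells the name with an `_ms` suffix. The `config` namespace, its default values, the stand-in `FileHandleCache`, and the class name `HdfsFileHandleCacheSketch` (with a public constructor for easy testing) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the BE config namespace. The names mirror the
// identifiers visible in the diff; the default values are assumptions.
namespace config {
static int64_t max_hdfs_file_handle_cache_num = 1000;
static int64_t max_hdfs_file_handle_cache_time_sec = 3600;
} // namespace config

// Minimal stand-in for FileHandleCache: it only records its settings.
class FileHandleCache {
public:
    FileHandleCache(int64_t capacity, int num_partitions, int64_t timeout_ms)
            : _capacity(capacity), _num_partitions(num_partitions), _timeout_ms(timeout_ms) {
        assert(timeout_ms >= 0);
    }
    int64_t capacity() const { return _capacity; }
    int num_partitions() const { return _num_partitions; }
    int64_t timeout_ms() const { return _timeout_ms; }

private:
    int64_t _capacity;
    int _num_partitions;
    int64_t _timeout_ms;
};

// The constructor must forward arguments to _cache, so it needs a member
// initializer list and therefore cannot be written as "= default".
class HdfsFileHandleCacheSketch {
public:
    HdfsFileHandleCacheSketch()
            : _cache(config::max_hdfs_file_handle_cache_num, 16,
                     config::max_hdfs_file_handle_cache_time_sec * 1000L) {}

    const FileHandleCache& cache() const { return _cache; }

private:
    FileHandleCache _cache;
};
```

Setting `config::max_hdfs_file_handle_cache_time_sec` to 120 before construction would give the cache a 120000 ms timeout, which is the behaviour the hard-coded constant could not offer.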

wsjz commented Jan 12, 2024

run buildall

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved label Jan 12, 2024

PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.71% (8649/23561)
Line Coverage: 28.73% (70449/245202)
Region Coverage: 27.65% (36411/131694)
Branch Coverage: 24.35% (18618/76450)
Coverage Report: http://coverage.selectdb-in.cc/coverage/41409c18b1efb2c7db26fa2c9813ad53047bec52_41409c18b1efb2c7db26fa2c9813ad53047bec52/report/index.html

@morningman

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.71% (8649/23561)
Line Coverage: 28.72% (70433/245202)
Region Coverage: 27.65% (36415/131694)
Branch Coverage: 24.37% (18629/76450)
Coverage Report: http://coverage.selectdb-in.cc/coverage/c6346dc52ca171ee65491940d11bce3fe55ebdf3_c6346dc52ca171ee65491940d11bce3fe55ebdf3/report/index.html

@doris-robot

TPC-H: Total hot run time: 39283 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c6346dc52ca171ee65491940d11bce3fe55ebdf3, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	18094	5193	5012	5012
q2	2039	142	136	136
q3	11009	1148	1144	1144
q4	11256	807	804	804
q5	7905	3181	3245	3181
q6	210	129	127	127
q7	915	528	516	516
q8	10035	2030	2028	2028
q9	8140	6669	6621	6621
q10	8299	3063	3055	3055
q11	433	222	210	210
q12	358	193	194	193
q13	18141	3468	3463	3463
q14	239	214	209	209
q15	558	513	512	512
q16	467	381	386	381
q17	971	586	541	541
q18	7483	6805	6753	6753
q19	1565	1386	1393	1386
q20	481	293	291	291
q21	2776	2409	2419	2409
q22	369	311	312	311
Total cold run time: 111743 ms
Total hot run time: 39283 ms
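
The totals above are consistent with a simple rule: in each row the last value is the smaller of the two values before it, the hot total is the sum of those last values, and the cold total is the sum of the first value in each row. Summing the Round 1 rows this way reproduces 39283 ms and 111743 ms. A minimal sketch of that arithmetic follows; `QueryTiming`, `best_hot_ms`, `total_hot_ms` and `total_cold_ms` are names invented for the example, not part of the benchmark scripts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One benchmark row: a cold run followed by two hot runs, in milliseconds.
struct QueryTiming {
    int64_t cold_ms;
    int64_t hot1_ms;
    int64_t hot2_ms;
};

// Best hot time for a query: the faster of the two hot runs.
int64_t best_hot_ms(const QueryTiming& t) {
    assert(t.hot1_ms >= 0 && t.hot2_ms >= 0);
    return std::min(t.hot1_ms, t.hot2_ms);
}

// Total hot run time: sum of the best hot time of every query.
int64_t total_hot_ms(const std::vector<QueryTiming>& rows) {
    int64_t total = 0;
    for (const QueryTiming& t : rows) {
        total += best_hot_ms(t);
    }
    return total;
}

// Total cold run time: sum of the cold run of every query.
int64_t total_cold_ms(const std::vector<QueryTiming>& rows) {
    int64_t total = 0;
    for (const QueryTiming& t : rows) {
        total += t.cold_ms;
    }
    return total;
}
```

For example, feeding in only the q1, q2 and q5 rows from Round 1 gives a hot total of 5012 + 136 + 3181 = 8329 ms and a cold total of 18094 + 2039 + 7905 = 28038 ms.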

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4988	4936	4946	4936
q2	309	197	208	197
q3	3356	3313	3301	3301
q4	2246	2249	2247	2247
q5	5829	5805	5808	5805
q6	190	117	120	117
q7	2322	1872	1841	1841
q8	3452	3552	3772	3552
q9	8959	8877	8855	8855
q10	3755	3847	3851	3847
q11	565	447	413	413
q12	789	602	609	602
q13	3915	3264	3221	3221
q14	292	265	271	265
q15	567	506	507	506
q16	508	469	468	468
q17	2069	2026	2026	2026
q18	8929	8334	8451	8334
q19	1615	1606	1605	1605
q20	2214	1943	1947	1943
q21	6065	5786	5680	5680
q22	571	464	481	464
Total cold run time: 63505 ms
Total hot run time: 60225 ms

@doris-robot

TPC-DS: Total hot run time: 179629 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c6346dc52ca171ee65491940d11bce3fe55ebdf3, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	924	340	324	324
query2	6729	1777	1858	1777
query3	6692	226	212	212
query4	26130	22262	22050	22050
query5	7366	561	596	561
query6	258	191	197	191
query7	4617	276	276	276
query8	221	201	198	198
query9	9217	2799	2799	2799
query10	533	242	241	241
query11	16115	15485	15362	15362
query12	126	75	69	69
query13	1691	376	391	376
query14	12011	7268	7364	7268
query15	264	194	203	194
query16	5347	255	250	250
query17	1906	498	493	493
query18	1946	264	259	259
query19	282	155	155	155
query20	79	84	72	72
query21	201	131	124	124
query22	5081	4938	4614	4614
query23	32050	31329	31792	31329
query24	13738	2957	2903	2903
query25	585	353	362	353
query26	2460	289	180	180
query27	3706	327	297	297
query28	8242	1891	1856	1856
query29	2695	685	673	673
query30	294	138	140	138
query31	987	777	773	773
query32	95	66	60	60
query33	746	243	246	243
query34	1227	484	476	476
query35	933	802	774	774
query36	1447	1368	1391	1368
query37	460	73	101	73
query38	3422	3404	3328	3328
query39	1343	1286	1265	1265
query40	339	102	93	93
query41	39	35	35	35
query42	105	94	103	94
query43	553	516	512	512
query44	1089	705	706	705
query45	217	183	181	181
query46	1059	672	665	665
query47	1691	1556	1551	1551
query48	404	335	331	331
query49	1289	307	312	307
query50	717	320	327	320
query51	5297	5227	5214	5214
query52	100	95	101	95
query53	373	290	288	288
query54	925	458	458	458
query55	99	93	95	93
query56	214	197	191	191
query57	992	970	943	943
query58	219	189	208	189
query59	2728	2517	2595	2517
query60	239	235	204	204
query61	83	82	82	82
query62	674	399	448	399
query63	308	286	292	286
query64	5771	1655	1667	1655
query65	3344	3264	3263	3263
query66	1412	332	331	331
query67	15717	15208	15363	15208
query68	10332	514	508	508
query69	698	400	391	391
query70	1688	1618	1547	1547
query71	561	247	237	237
query72	4686	2834	2859	2834
query73	2045	326	316	316
query74	6930	6459	6366	6366
query75	4958	2272	2361	2272
query76	6285	1119	1079	1079
query77	725	283	267	267
query78	9724	8726	8837	8726
query79	1029	503	497	497
query80	526	336	344	336
query81	448	207	210	207
query82	242	91	95	91
query83	141	120	119	119
query84	284	77	72	72
query85	1021	333	325	325
query86	392	397	406	397
query87	3550	3427	3404	3404
query88	2978	2291	2336	2291
query89	447	387	412	387
query90	2089	206	213	206
query91	164	122	130	122
query92	65	55	53	53
query93	945	398	426	398
query94	1211	178	178	178
query95	532	494	482	482
query96	627	335	327	327
query97	4285	4159	4176	4159
query98	199	188	185	185
query99	975	766	686	686
Total cold run time: 298945 ms
Total hot run time: 179629 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.73 seconds
stream load tsv: 562 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 21.2 seconds inserted 10000000 Rows, about 471K ops/s
storage size: 17183905520 Bytes

@doris-robot

ClickBench: Total hot run time: 30.26 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c6346dc52ca171ee65491940d11bce3fe55ebdf3, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.06	0.05
query2	0.05	0.02	0.02
query3	0.25	0.11	0.10
query4	1.76	0.13	0.11
query5	0.52	0.51	0.52
query6	1.34	0.64	0.63
query7	0.01	0.01	0.02
query8	0.03	0.02	0.02
query9	0.56	0.50	0.50
query10	0.54	0.55	0.54
query11	0.12	0.09	0.09
query12	0.12	0.10	0.09
query13	0.61	0.60	0.59
query14	0.77	0.79	0.78
query15	0.82	0.80	0.80
query16	0.34	0.36	0.35
query17	0.99	0.96	0.97
query18	0.25	0.26	0.24
query19	1.85	1.78	1.72
query20	0.02	0.01	0.01
query21	15.42	0.57	0.57
query22	2.69	2.38	1.61
query23	17.25	0.87	0.80
query24	16.20	0.60	0.61
query25	2.15	0.15	0.16
query26	0.13	0.14	0.13
query27	0.15	0.16	0.15
query28	7.47	0.83	0.80
query29	12.54	3.10	3.21
query30	0.52	0.49	0.48
query31	2.78	0.35	0.36
query32	3.36	0.49	0.48
query33	3.21	3.25	3.25
query34	15.86	4.21	4.19
query35	4.20	4.18	4.21
query36	1.09	1.04	1.06
query37	0.06	0.05	0.05
query38	0.03	0.03	0.03
query39	0.02	0.01	0.02
query40	0.15	0.13	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.01
query43	0.02	0.02	0.02
Total cold run time: 116.42 s
Total hot run time: 30.26 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit c6346dc52ca171ee65491940d11bce3fe55ebdf3 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       12.4 seconds inserted 10000000 Rows, about 806K ops/s

@morningman morningman merged commit df30de0 into apache:master Jan 12, 2024
17 of 18 checks passed
seawinde pushed a commit to seawinde/doris that referenced this pull request Jan 15, 2024
apache#29291)

1. Remove the BE-side kinit and log in through JNI with a keytab instead, because kinit cannot renew the TGT for Doris in many complex cases.
> apache/doris-thirdparty#173 adds support for creating a new instance from a keytab, so the kinit command is no longer needed; logging in with the keytab and principal is enough.

2. Add `kerberos_ccache_path` to set the Kerberos credentials cache path manually.

3. Add `max_hdfs_file_handle_cache_time_ms` to set the HDFS file handle cache time.
seawinde pushed a commit to seawinde/doris that referenced this pull request Jan 15, 2024
yiguolei pushed a commit that referenced this pull request Jan 16, 2024
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Jan 22, 2024
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Jan 30, 2024
@wsjz wsjz deleted the krb_opt branch November 18, 2024 01:19
Labels
approved, dev/2.0.4-merged, dev/3.0.0-merged, reviewed
6 participants