[chore](conf) Specify UTF8 as the default charset. #39521

Merged
merged 1 commit into from
Sep 2, 2024

Conversation

yagagagaga
Contributor

@yagagagaga yagagagaga commented Aug 17, 2024

According to JEP 400, UTF-8 is the default charset as of Java SE 18. On versions below 18, the default charset depends on your locale. This works well in many scenarios, but Apache Doris only supports UTF-8 as its charset, so a different default may cause incorrect decoding.

So it is necessary to set UTF-8 as the default JDK charset.
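
To illustrate the decoding problem described above, here is a minimal Java sketch. It is not taken from this PR's diff; the class name `CharsetDemo` and the helper `roundTrip` are hypothetical. It encodes a string as UTF-8 and then decodes the bytes with another charset, which stands in for a locale-dependent default on a JDK below 18.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Apache Doris.
class CharsetDemo {
    // Encode the text as UTF-8, then decode the bytes with the given charset.
    // Passing a non-UTF-8 charset mimics a JVM whose default charset differs.
    static String roundTrip(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "caf\u00e9" uses a Unicode escape so the source file itself is pure ASCII.
        String original = "caf\u00e9";

        // Shows which charset this JVM would use when none is specified.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Decoding with UTF-8 recovers the original text.
        System.out.println("UTF-8 round trip intact: "
                + roundTrip(original, StandardCharsets.UTF_8).equals(original));

        // Decoding with ISO-8859-1 turns the two-byte sequence into two characters.
        System.out.println("ISO-8859-1 round trip intact: "
                + roundTrip(original, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```

On older JDKs, one common way to pin the default is to pass `-Dfile.encoding=UTF-8` to the JVM. Whether this PR uses exactly that option is an assumption, since the diff is not shown in this conversation.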

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@yagagagaga
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38886 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6085bda4f6b89a675712f8808f1d319e9b9f0c6e, data reload: false

------ Round 1 ----------------------------------
q1	17938	4389	4256	4256
q2	2049	220	226	220
q3	12035	977	1195	977
q4	10525	716	738	716
q5	7785	2878	2818	2818
q6	263	161	155	155
q7	1015	677	670	670
q8	11111	2104	2078	2078
q9	8830	6563	6595	6563
q10	7100	2247	2213	2213
q11	492	271	269	269
q12	436	259	255	255
q13	18544	3029	3170	3029
q14	318	280	272	272
q15	598	533	540	533
q16	568	443	434	434
q17	1027	782	737	737
q18	8448	7324	7261	7261
q19	7625	1141	1116	1116
q20	849	370	362	362
q21	3984	3145	2904	2904
q22	1172	1048	1065	1048
Total cold run time: 122712 ms
Total hot run time: 38886 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6509	4351	4293	4293
q2	428	327	328	327
q3	2974	2784	2730	2730
q4	1977	1707	1707	1707
q5	5592	5610	5560	5560
q6	242	143	147	143
q7	2237	1838	1772	1772
q8	3325	3455	3423	3423
q9	8788	8754	8841	8754
q10	3500	3264	3244	3244
q11	628	550	519	519
q12	862	659	648	648
q13	15938	3060	3155	3060
q14	335	305	305	305
q15	596	519	527	519
q16	507	461	460	460
q17	1831	1555	1527	1527
q18	8289	7761	7623	7623
q19	10004	1614	1585	1585
q20	2180	1882	1887	1882
q21	5519	5342	5272	5272
q22	1229	1076	1097	1076
Total cold run time: 83490 ms
Total hot run time: 56429 ms

@doris-robot

TPC-DS: Total hot run time: 195283 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6085bda4f6b89a675712f8808f1d319e9b9f0c6e, data reload: false

query1	1295	901	880	880
query2	6624	2013	1964	1964
query3	10615	3966	3853	3853
query4	58824	23996	23297	23297
query5	6222	705	719	705
query6	546	213	216	213
query7	6456	324	332	324
query8	544	430	433	430
query9	9246	2596	2579	2579
query10	626	339	336	336
query11	18655	15111	15547	15111
query12	197	127	132	127
query13	1694	438	431	431
query14	12650	6752	7647	6752
query15	337	194	191	191
query16	7184	527	519	519
query17	1210	635	591	591
query18	2075	335	341	335
query19	311	170	167	167
query20	144	138	155	138
query21	250	145	144	144
query22	4584	4313	4322	4313
query23	34274	33677	33873	33677
query24	5779	3044	2976	2976
query25	592	447	443	443
query26	745	189	180	180
query27	1848	300	300	300
query28	3835	2199	2118	2118
query29	718	433	435	433
query30	233	181	185	181
query31	1037	805	822	805
query32	123	78	79	78
query33	546	337	354	337
query34	901	515	495	495
query35	891	768	761	761
query36	1055	934	954	934
query37	154	103	111	103
query38	4060	3871	3951	3871
query39	1575	1490	1453	1453
query40	239	152	157	152
query41	139	138	137	137
query42	137	116	117	116
query43	533	513	514	513
query44	1130	789	787	787
query45	226	195	188	188
query46	1112	785	762	762
query47	1916	1821	1896	1821
query48	407	336	334	334
query49	918	577	575	575
query50	861	469	461	461
query51	6794	6793	6729	6729
query52	122	108	108	108
query53	298	227	229	227
query54	615	505	497	497
query55	92	91	89	89
query56	328	312	307	307
query57	1228	1127	1147	1127
query58	298	298	312	298
query59	2983	2826	2757	2757
query60	346	331	328	328
query61	149	147	145	145
query62	752	674	679	674
query63	286	225	225	225
query64	4381	2345	1857	1857
query65	3239	3168	3180	3168
query66	1050	664	672	664
query67	15375	14998	15175	14998
query68	6045	567	582	567
query69	523	360	330	330
query70	1207	1128	1064	1064
query71	490	317	308	308
query72	6755	2368	1978	1978
query73	824	353	354	353
query74	9402	8780	9096	8780
query75	3591	2747	2747	2747
query76	3185	984	1049	984
query77	654	440	447	440
query78	9806	9342	9080	9080
query79	7428	557	561	557
query80	1706	616	608	608
query81	587	267	259	259
query82	906	162	158	158
query83	343	212	215	212
query84	287	95	99	95
query85	1018	355	358	355
query86	421	331	325	325
query87	4373	4241	4246	4241
query88	4439	2486	2469	2469
query89	560	322	325	322
query90	1985	223	228	223
query91	153	126	124	124
query92	83	72	75	72
query93	5200	549	546	546
query94	789	321	318	318
query95	384	296	295	295
query96	612	285	285	285
query97	3316	3088	3115	3088
query98	250	230	243	230
query99	1634	1311	1281	1281
Total cold run time: 333565 ms
Total hot run time: 195283 ms

@doris-robot

ClickBench: Total hot run time: 30.9 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6085bda4f6b89a675712f8808f1d319e9b9f0c6e, data reload: false

query1	0.05	0.05	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.06
query4	1.67	0.08	0.06
query5	0.50	0.48	0.50
query6	1.12	0.73	0.74
query7	0.02	0.02	0.02
query8	0.05	0.05	0.05
query9	0.55	0.49	0.48
query10	0.56	0.54	0.55
query11	0.16	0.12	0.12
query12	0.15	0.13	0.13
query13	0.61	0.61	0.60
query14	0.78	0.78	0.79
query15	0.90	0.83	0.82
query16	0.38	0.38	0.39
query17	1.01	0.98	1.03
query18	0.22	0.21	0.22
query19	1.84	1.66	1.72
query20	0.02	0.01	0.01
query21	15.42	0.68	0.67
query22	3.68	8.09	1.74
query23	18.27	1.45	1.33
query24	2.09	0.22	0.23
query25	0.16	0.09	0.08
query26	0.31	0.24	0.24
query27	0.46	0.23	0.23
query28	13.29	1.02	1.02
query29	12.69	3.34	3.27
query30	0.44	0.24	0.19
query31	2.81	0.40	0.39
query32	3.24	0.48	0.48
query33	2.96	2.96	2.98
query34	17.09	4.34	4.38
query35	4.39	4.39	4.44
query36	0.66	0.50	0.47
query37	0.21	0.18	0.17
query38	0.16	0.16	0.16
query39	0.07	0.05	0.05
query40	0.17	0.15	0.15
query41	0.11	0.06	0.06
query42	0.07	0.07	0.06
query43	0.06	0.06	0.05
Total cold run time: 109.71 s
Total hot run time: 30.9 s

Contributor

@morningman morningman left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Aug 21, 2024
Contributor

PR approved by anyone and no changes requested.

@morningman morningman added the kind/behavior-changed, dev/2.0.x, and dev/2.1.x labels and removed the approved (Indicates a PR has been approved by one committer.) and reviewed labels Aug 21, 2024
@morningman
Contributor

run buildall

@yagagagaga yagagagaga changed the title from "[chore](conf) Using UTF8 as the default character" to "[chore](conf) Using UTF8 as the default charset." Aug 21, 2024
@yagagagaga yagagagaga changed the title from "[chore](conf) Using UTF8 as the default charset." to "[chore](conf) Specify UTF8 as the default charset." Aug 21, 2024
@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Aug 21, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 38760 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit bf60f367dbc645047c814abdfa22752b9281bb9e, data reload: false

------ Round 1 ----------------------------------
q1	17898	4491	4332	4332
q2	2046	220	221	220
q3	11583	1007	1203	1007
q4	10553	761	725	725
q5	7777	2854	2879	2854
q6	264	164	157	157
q7	1009	654	653	653
q8	9384	2144	2151	2144
q9	7003	6619	6556	6556
q10	7089	2261	2243	2243
q11	517	269	275	269
q12	435	268	263	263
q13	17796	3003	3064	3003
q14	305	259	267	259
q15	569	536	540	536
q16	549	440	417	417
q17	998	758	734	734
q18	7433	6894	6904	6894
q19	1464	1185	1204	1185
q20	683	369	360	360
q21	4127	3223	2879	2879
q22	1159	1070	1074	1070
Total cold run time: 110641 ms
Total hot run time: 38760 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4377	4361	4357	4357
q2	418	338	311	311
q3	2887	2693	2726	2693
q4	1973	1721	1736	1721
q5	5730	5762	5727	5727
q6	245	158	153	153
q7	2248	1884	1858	1858
q8	3295	3453	3431	3431
q9	8822	8864	8822	8822
q10	3635	3370	3345	3345
q11	667	571	567	567
q12	877	696	662	662
q13	16366	3235	3252	3235
q14	344	313	311	311
q15	581	529	527	527
q16	522	459	448	448
q17	1833	1577	1535	1535
q18	8266	8018	7810	7810
q19	1797	1538	1511	1511
q20	2168	1937	1975	1937
q21	5668	5555	5504	5504
q22	1229	1108	1113	1108
Total cold run time: 73948 ms
Total hot run time: 57573 ms

@doris-robot

TPC-DS: Total hot run time: 198371 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit bf60f367dbc645047c814abdfa22752b9281bb9e, data reload: false

query1	1343	910	879	879
query2	6534	2056	2020	2020
query3	10612	4090	3980	3980
query4	59509	27634	23463	23463
query5	5588	720	735	720
query6	465	219	219	219
query7	5897	332	323	323
query8	510	440	431	431
query9	8978	2571	2545	2545
query10	567	348	350	348
query11	17994	15100	15711	15100
query12	189	146	139	139
query13	1648	460	439	439
query14	10832	7030	6740	6740
query15	250	200	197	197
query16	7535	527	553	527
query17	1380	633	619	619
query18	1946	354	371	354
query19	314	189	178	178
query20	158	143	146	143
query21	245	147	151	147
query22	4739	4640	4518	4518
query23	34619	34165	34227	34165
query24	6314	3096	2952	2952
query25	586	433	423	423
query26	712	181	183	181
query27	1988	316	317	316
query28	4089	2184	2174	2174
query29	730	464	456	456
query30	236	210	191	191
query31	1085	828	824	824
query32	108	79	78	78
query33	538	347	349	347
query34	909	528	520	520
query35	881	772	754	754
query36	1108	974	1019	974
query37	164	104	106	104
query38	4069	3930	4013	3930
query39	1531	1494	1482	1482
query40	228	156	154	154
query41	143	138	139	138
query42	135	119	113	113
query43	571	555	510	510
query44	1157	795	805	795
query45	231	197	197	197
query46	1159	793	767	767
query47	1968	1910	1891	1891
query48	420	342	339	339
query49	932	594	597	594
query50	896	472	478	472
query51	7256	7269	7021	7021
query52	121	108	107	107
query53	300	225	227	225
query54	624	511	519	511
query55	94	89	89	89
query56	339	328	325	325
query57	1259	1144	1126	1126
query58	305	339	298	298
query59	3140	2996	2926	2926
query60	399	333	330	330
query61	158	156	153	153
query62	846	696	728	696
query63	258	227	229	227
query64	3500	1910	1882	1882
query65	3286	3187	3220	3187
query66	1025	682	672	672
query67	15489	15381	15312	15312
query68	4487	611	599	599
query69	452	339	326	326
query70	1176	1174	1124	1124
query71	389	320	313	313
query72	6574	2369	2157	2157
query73	802	370	368	368
query74	9202	8899	9075	8899
query75	3453	2795	2804	2795
query76	1793	1092	1003	1003
query77	660	450	452	450
query78	9838	9335	9132	9132
query79	1081	563	559	559
query80	880	628	675	628
query81	584	270	263	263
query82	276	162	164	162
query83	235	214	215	214
query84	288	103	99	99
query85	797	362	368	362
query86	339	333	323	323
query87	4489	4392	4271	4271
query88	3514	2542	2533	2533
query89	424	321	325	321
query90	2031	230	233	230
query91	155	129	133	129
query92	88	77	75	75
query93	1083	555	556	555
query94	800	350	337	337
query95	396	301	309	301
query96	608	293	295	293
query97	3250	3088	3197	3088
query98	242	270	224	224
query99	1590	1281	1349	1281
Total cold run time: 315115 ms
Total hot run time: 198371 ms

@doris-robot

ClickBench: Total hot run time: 31.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit bf60f367dbc645047c814abdfa22752b9281bb9e, data reload: false

query1	0.05	0.04	0.05
query2	0.08	0.04	0.04
query3	0.23	0.06	0.06
query4	1.65	0.08	0.08
query5	0.55	0.49	0.50
query6	1.13	0.72	0.74
query7	0.02	0.01	0.02
query8	0.05	0.06	0.05
query9	0.54	0.50	0.48
query10	0.54	0.54	0.55
query11	0.16	0.13	0.12
query12	0.16	0.13	0.13
query13	0.63	0.60	0.58
query14	0.77	0.79	0.78
query15	0.84	0.85	0.84
query16	0.38	0.38	0.38
query17	1.07	1.08	1.06
query18	0.22	0.21	0.22
query19	1.86	1.84	1.80
query20	0.01	0.01	0.01
query21	15.39	0.66	0.66
query22	4.60	6.71	1.93
query23	18.31	1.36	1.26
query24	2.12	0.22	0.22
query25	0.16	0.09	0.08
query26	0.28	0.19	0.18
query27	0.09	0.09	0.09
query28	13.33	1.03	1.02
query29	12.68	3.44	3.39
query30	0.43	0.19	0.19
query31	2.81	0.40	0.39
query32	3.26	0.48	0.48
query33	2.99	3.01	3.01
query34	16.92	4.42	4.44
query35	4.46	4.46	4.54
query36	0.67	0.48	0.48
query37	0.21	0.18	0.17
query38	0.17	0.16	0.16
query39	0.06	0.05	0.06
query40	0.20	0.15	0.14
query41	0.11	0.06	0.06
query42	0.08	0.06	0.06
query43	0.06	0.06	0.06
Total cold run time: 110.33 s
Total hot run time: 31.38 s

@yagagagaga
Contributor Author

run cloud_p0

@yagagagaga
Contributor Author

run cloud_p1

@yagagagaga
Contributor Author

run external

@yagagagaga
Contributor Author

run external

@yagagagaga
Contributor Author

run external

@gavinchou gavinchou merged commit 48991df into apache:master Sep 2, 2024
28 of 30 checks passed
dataroaring pushed a commit that referenced this pull request Sep 3, 2024
According to [JEP 400](https://openjdk.org/jeps/400), UTF-8 is the default
charset as of Java SE 18. On versions below 18, the default charset
depends on your locale. This works well in many scenarios, but Apache
Doris only supports UTF-8 as its charset, so a different default may
cause incorrect decoding.

So it is necessary to set UTF-8 as the default JDK charset.
yiguolei pushed a commit that referenced this pull request Sep 12, 2024
## Proposed changes

pick from  #39521

@yagagagaga yagagagaga deleted the chore_20240817 branch February 25, 2025 02:42
6 participants