Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[enhancement](group_commit): refector relay wal code #29183

Merged
merged 7 commits into from
Dec 30, 2023

Conversation

hust-hhb
Copy link
Contributor

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@hust-hhb
Copy link
Contributor Author

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -387,6 +387,66 @@ Status HttpStreamAction::_process_put(HttpRequest* http_req,
return _exec_env->stream_load_executor()->execute_plan_fragment(ctx);
}

Status HttpStreamAction::process_wal_relay(ExecEnv* exec_env, int64_t wal_id, std::string& sql_str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'process_wal_relay' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status HttpStreamAction::process_wal_relay(ExecEnv* exec_env, int64_t wal_id, std::string& sql_str,
static Status HttpStreamAction::process_wal_relay(ExecEnv* exec_env, int64_t wal_id, std::string& sql_str,


#include "olap/wal_info.h"
namespace doris {
WalInfo::WalInfo(std::string wal_path, int64_t retry_num, int64_t start_time_ms, bool relaying)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pass by value and use std::move [modernize-pass-by-value]

be/src/olap/wal_info.cpp:18:

- namespace doris {
+ 
+ #include <utility>
+ namespace doris {

be/src/olap/wal_info.cpp:20:

-         : _wal_path(wal_path),
+         : _wal_path(std::move(wal_path)),

_retry_num(retry_num),
_start_time_ms(start_time_ms),
_relaying(relaying) {}
WalInfo::~WalInfo() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use '= default' to define a trivial destructor [modernize-use-equals-default]

Suggested change
WalInfo::~WalInfo() {}
WalInfo::~WalInfo() = default;

std::string WalInfo::get_wal_path() {
return _wal_path;
}
int64_t WalInfo::get_retry_num() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'get_retry_num' can be made const [readability-make-member-function-const]

Suggested change
int64_t WalInfo::get_retry_num() {
int64_t WalInfo::get_retry_num() const {

be/src/olap/wal_info.h:24:

-     int64_t get_retry_num();
+     int64_t get_retry_num() const;

int64_t WalInfo::get_retry_num() {
return _retry_num;
}
int64_t WalInfo::get_start_time_ms() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'get_start_time_ms' can be made const [readability-make-member-function-const]

Suggested change
int64_t WalInfo::get_start_time_ms() {
int64_t WalInfo::get_start_time_ms() const {

be/src/olap/wal_info.h:25:

-     int64_t get_start_time_ms();
+     int64_t get_start_time_ms() const;

int64_t WalInfo::get_start_time_ms() {
return _start_time_ms;
}
bool WalInfo::get_relaying() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'get_relaying' can be made const [readability-make-member-function-const]

Suggested change
bool WalInfo::get_relaying() {
bool WalInfo::get_relaying() const {

be/src/olap/wal_info.h:27:

-     bool get_relaying();
+     bool get_relaying() const;

Status _parse_wal_path(const std::string& wal, std::shared_ptr<std::pair<int64_t, std::string>>&);
bool _rename_to_tmp_path(const std::string wal);
Status _replay_one_txn_with_stremaload(int64_t wal_id, const std::string& wal,
const std::string& label);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: parameter 'wal' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]

Suggested change
const std::string& label);
bool _rename_to_tmp_path(std::string wal);

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Status _send_request(int64_t wal_id, const std::string& wal, const std::string& label);
Status _abort_txn(int64_t db_id, int64_t wal_id);
Status _parse_wal_path(const std::string& wal, std::shared_ptr<std::pair<int64_t, std::string>>&);
bool _rename_to_tmp_path(const std::string wal);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: parameter 'wal' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]

Suggested change
bool _rename_to_tmp_path(const std::string wal);
bool _rename_to_tmp_path(std::string wal);

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 9d5065038a76db4ddbb71d388652172e337f95c1, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4998	4643	4639	4639
q2	361	155	158	155
q3	1472	1312	1199	1199
q4	1138	943	922	922
q5	3164	3149	3150	3149
q6	249	128	130	128
q7	1016	508	492	492
q8	2283	2253	2290	2253
q9	6735	6667	6675	6667
q10	3193	3284	3279	3279
q11	334	213	202	202
q12	350	213	217	213
q13	4162	3428	3409	3409
q14	241	212	220	212
q15	584	526	530	526
q16	441	384	384	384
q17	1047	776	573	573
q18	7102	6825	6923	6825
q19	1623	1637	1648	1637
q20	556	346	302	302
q21	3091	2693	2734	2693
q22	368	297	297	297
Total cold run time: 44508 ms
Total hot run time: 40156 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4594	4572	4552	4552
q2	271	170	171	170
q3	3399	3365	3373	3365
q4	2224	2205	2199	2199
q5	5705	5718	5734	5718
q6	238	119	118	118
q7	2372	1827	1852	1827
q8	3609	3623	3615	3615
q9	9042	8925	9015	8925
q10	3788	3885	3896	3885
q11	487	372	387	372
q12	771	611	605	605
q13	3904	3197	3188	3188
q14	286	261	266	261
q15	572	528	524	524
q16	517	456	452	452
q17	1986	1967	1967	1967
q18	8625	8225	8264	8225
q19	1934	1771	1764	1764
q20	2267	1938	1921	1921
q21	6164	5790	5762	5762
q22	558	443	458	443
Total cold run time: 63313 ms
Total hot run time: 59858 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 48.75 seconds
stream load tsv: 574 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 67 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.7 seconds inserted 10000000 Rows, about 336K ops/s
storage size: 17184225588 Bytes

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


#include "olap/wal_info.h"
namespace doris {
WalInfo::WalInfo(int64_t wal_id, std::string wal_path, int64_t retry_num, int64_t start_time_ms)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pass by value and use std::move [modernize-pass-by-value]

be/src/olap/wal_info.cpp:18:

- namespace doris {
+ 
+ #include <utility>
+ namespace doris {

be/src/olap/wal_info.cpp:21:

-           _wal_path(wal_path),
+           _wal_path(std::move(wal_path)),

_retry_num(retry_num),
_start_time_ms(start_time_ms) {}
WalInfo::~WalInfo() {}
int64_t WalInfo::get_wal_id() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'get_wal_id' can be made const [readability-make-member-function-const]

Suggested change
int64_t WalInfo::get_wal_id() {
int64_t WalInfo::get_wal_id() const {

be/src/olap/wal_info.h:24:

-     int64_t get_wal_id();
+     int64_t get_wal_id() const;

std::string WalInfo::get_wal_path() {
return _wal_path;
}
int64_t WalInfo::get_retry_num() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'get_retry_num' can be made const [readability-make-member-function-const]

Suggested change
int64_t WalInfo::get_retry_num() {
int64_t WalInfo::get_retry_num() const {

be/src/olap/wal_info.h:25:

-     int64_t get_retry_num();
+     int64_t get_retry_num() const;

int64_t WalInfo::get_retry_num() {
return _retry_num;
}
int64_t WalInfo::get_start_time_ms() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'get_start_time_ms' can be made const [readability-make-member-function-const]

Suggested change
int64_t WalInfo::get_start_time_ms() {
int64_t WalInfo::get_start_time_ms() const {

be/src/olap/wal_info.h:26:

-     int64_t get_start_time_ms();
+     int64_t get_start_time_ms() const;

@@ -164,15 +166,15 @@ Status WalManager::_init_wal_dirs_info() {
&_update_wal_dirs_info_thread);
}

void WalManager::add_wal_status_queue(int64_t table_id, int64_t wal_id, WAL_STATUS wal_status) {
void WalManager::add_wal_status_queue(int64_t table_id, int64_t wal_id, WalStatus wal_status) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'add_wal_status_queue' can be made static [readability-convert-member-functions-to-static]

be/src/olap/wal_manager.h:73:

-     void add_wal_status_queue(int64_t table_id, int64_t wal_id, WalStatus wal_status);
+     static void add_wal_status_queue(int64_t table_id, int64_t wal_id, WalStatus wal_status);

#else
return true;
#endif
}

Status WalTable::_abort_txn(int64_t db_id, int64_t wal_id) {
Status WalTable::_try_abort_txn(int64_t db_id, int64_t wal_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method '_try_abort_txn' can be made static [readability-convert-member-functions-to-static]

be/src/olap/wal_table.h:53:

-     Status _try_abort_txn(int64_t db_id, int64_t wal_id);
+     static Status _try_abort_txn(int64_t db_id, int64_t wal_id);

return Status::OK();
}

Status WalTable::_get_wal_info(const std::string& wal,
std::shared_ptr<std::pair<int64_t, std::string>>& pair) {
Status WalTable::_parse_wal_path(const std::string& wal,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method '_parse_wal_path' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status WalTable::_parse_wal_path(const std::string& wal,
static Status WalTable::_parse_wal_path(const std::string& wal,

evhttp_add_header(req->output_headers, HTTP_LABEL_KEY.c_str(), label.c_str());
evhttp_add_header(req->output_headers, HTTP_AUTH_CODE.c_str(), std::to_string(wal_id).c_str());
evhttp_add_header(req->output_headers, HTTP_WAL_ID_KY.c_str(), std::to_string(wal_id).c_str());
Status WalTable::_construct_sql_str(const std::string& wal, const std::string& label,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method '_construct_sql_str' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status WalTable::_construct_sql_str(const std::string& wal, const std::string& label,
static Status WalTable::_construct_sql_str(const std::string& wal, const std::string& label,

return st;
}

Status WalTable::_replay_one_txn_with_stremaload(int64_t wal_id, const std::string& wal,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method '_replay_one_txn_with_stremaload' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status WalTable::_replay_one_txn_with_stremaload(int64_t wal_id, const std::string& wal,
static Status WalTable::_replay_one_txn_with_stremaload(int64_t wal_id, const std::string& wal,

Status _abort_txn(int64_t db_id, int64_t wal_id);
Status _parse_wal_path(const std::string& wal,
std::shared_ptr<std::pair<int64_t, std::string>>&);
Status _rename_to_tmp_path(const std::string wal);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: parameter 'wal' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]

Suggested change
Status _rename_to_tmp_path(const std::string wal);
Status _rename_to_tmp_path(std::string wal);

@hust-hhb
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 4be61af3c764b6fbafb0e33a4d6dedc9b48f11f2, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5485	5092	5061	5061
q2	400	160	158	158
q3	1463	1163	1116	1116
q4	1099	802	811	802
q5	3143	2999	3124	2999
q6	235	141	137	137
q7	972	521	565	521
q8	2167	2237	2206	2206
q9	6860	6824	6823	6823
q10	3193	3149	3134	3134
q11	351	227	225	225
q12	389	251	245	245
q13	4379	3625	3624	3624
q14	256	224	225	224
q15	608	547	574	547
q16	470	412	397	397
q17	1041	581	556	556
q18	7088	6788	6736	6736
q19	1665	1517	1497	1497
q20	617	354	370	354
q21	2859	2357	2463	2357
q22	392	332	321	321
Total cold run time: 45132 ms
Total hot run time: 40040 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5154	5078	5049	5049
q2	351	243	246	243
q3	3374	3294	3312	3294
q4	2192	2057	1987	1987
q5	5944	5930	5933	5930
q6	233	132	131	131
q7	2428	1968	1963	1963
q8	3545	3668	3658	3658
q9	9036	9005	9023	9005
q10	3885	3924	3937	3924
q11	599	471	471	471
q12	815	656	674	656
q13	3916	3177	3193	3177
q14	301	267	264	264
q15	612	547	560	547
q16	561	514	500	500
q17	2043	1793	1784	1784
q18	8764	8316	8342	8316
q19	1772	1714	1701	1701
q20	2436	2004	1993	1993
q21	5750	5344	5263	5263
q22	555	484	480	480
Total cold run time: 64266 ms
Total hot run time: 60336 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.58 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 27.8 seconds inserted 10000000 Rows, about 359K ops/s
storage size: 17184031950 Bytes

@hust-hhb
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 461a256f5760a90e67b2586da4676ade9024e627, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5463	5198	5250	5198
q2	406	171	159	159
q3	1482	1183	1116	1116
q4	1099	825	777	777
q5	3139	3101	3109	3101
q6	237	148	142	142
q7	1015	586	542	542
q8	2166	2229	2252	2229
q9	6728	6672	6635	6635
q10	3195	3155	3124	3124
q11	338	215	223	215
q12	392	245	252	245
q13	4444	3616	3625	3616
q14	266	216	240	216
q15	633	557	584	557
q16	454	421	414	414
q17	1063	566	571	566
q18	7090	6766	7848	6766
q19	1661	1546	1580	1546
q20	624	389	347	347
q21	2920	2486	2504	2486
q22	402	316	332	316
Total cold run time: 45217 ms
Total hot run time: 40313 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5181	5142	5144	5142
q2	340	263	271	263
q3	3383	3303	3314	3303
q4	2172	2029	1981	1981
q5	5938	5949	5916	5916
q6	236	138	133	133
q7	2399	1930	1891	1891
q8	3587	3668	3695	3668
q9	9048	8992	8908	8908
q10	3892	3908	3926	3908
q11	591	492	490	490
q12	816	641	666	641
q13	3901	3204	3181	3181
q14	315	269	265	265
q15	621	566	556	556
q16	583	516	508	508
q17	2056	1845	1852	1845
q18	8757	8380	8428	8380
q19	1790	1733	1720	1720
q20	2280	2003	1989	1989
q21	5798	5399	5354	5354
q22	582	509	511	509
Total cold run time: 64266 ms
Total hot run time: 60551 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 48.25 seconds
stream load tsv: 567 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17184378314 Bytes

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 36.59% (8596/23490)
Line Coverage: 28.65% (69876/243876)
Region Coverage: 27.65% (36165/130780)
Branch Coverage: 24.37% (18485/75850)
Coverage Report: http://coverage.selectdb-in.cc/coverage/461a256f5760a90e67b2586da4676ade9024e627_461a256f5760a90e67b2586da4676ade9024e627/report/index.html

@hust-hhb
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 10e8240c73fd5ea73d1df99a5c5733775a032719, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5412	5137	5136	5136
q2	397	161	158	158
q3	1458	1202	1237	1202
q4	1100	862	828	828
q5	3144	3119	3075	3075
q6	232	147	141	141
q7	958	566	549	549
q8	2161	2233	2178	2178
q9	6695	6667	6706	6667
q10	3186	3207	3173	3173
q11	357	237	226	226
q12	424	250	239	239
q13	4408	3622	3641	3622
q14	267	219	221	219
q15	602	576	562	562
q16	462	401	402	401
q17	1047	651	556	556
q18	7152	6775	6719	6719
q19	1639	1535	1559	1535
q20	618	399	400	399
q21	2931	2460	2520	2460
q22	397	318	332	318
Total cold run time: 45047 ms
Total hot run time: 40363 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5145	5138	5064	5064
q2	343	263	268	263
q3	3383	3347	3329	3329
q4	2156	1985	2018	1985
q5	5943	5915	5944	5915
q6	233	132	138	132
q7	2391	1968	1960	1960
q8	3604	3702	3672	3672
q9	9108	8975	8964	8964
q10	3880	3921	3939	3921
q11	578	475	477	475
q12	827	697	643	643
q13	3858	3177	3183	3177
q14	311	270	277	270
q15	623	566	580	566
q16	544	520	496	496
q17	2049	1827	1771	1771
q18	8839	8340	8371	8340
q19	1761	1719	1719	1719
q20	2282	2029	1983	1983
q21	5732	5287	5371	5287
q22	606	474	510	474
Total cold run time: 64196 ms
Total hot run time: 60406 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.51 seconds
stream load tsv: 562 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.2 seconds inserted 10000000 Rows, about 354K ops/s
storage size: 17183679906 Bytes

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 36.62% (8611/23514)
Line Coverage: 28.67% (69949/244013)
Region Coverage: 27.66% (36188/130839)
Branch Coverage: 24.38% (18503/75886)
Coverage Report: http://coverage.selectdb-in.cc/coverage/10e8240c73fd5ea73d1df99a5c5733775a032719_10e8240c73fd5ea73d1df99a5c5733775a032719/report/index.html

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 29, 2023
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

@dataroaring dataroaring merged commit 03901b9 into apache:master Dec 30, 2023
16 of 18 checks passed
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. meta-change reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants