Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature](function) Support for aggregate function foreach combiner #31526

Merged
merged 7 commits into from
Mar 6, 2024

Conversation

Mryange
Copy link
Contributor

@Mryange Mryange commented Feb 28, 2024

Proposed changes

mysql [test]>select * from table_with_null;
+------+-----------+---------------+
| id   | a         | s             |
+------+-----------+---------------+
|    1 | [1, 2, 3] | ["ab", "123"] |
|    2 | [20]      | ["cd"]        |
|    3 | [100]     | ["efg"]       |
|    4 | NULL      | NULL          |
|    5 | [null, 2] | [null, "c"]   |
+------+-----------+---------------+


mysql [test]>select sum_foreach(a) ,  count_foreach(a) , group_concat_foreach(s) from table_with_null;
+------------------+--------------------+---------------------------+
| sum_foreach(`a`) | count_foreach(`a`) | group_concat_foreach(`s`) |
+------------------+--------------------+---------------------------+
| [121, 4, 3]      | [3, 2, 1]          | ["ab,cd,efg", "123,c"]    |
+------------------+--------------------+---------------------------+

Currently not supported on Nereids, further follow-up is needed.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

@Mryange
Copy link
Contributor Author

Mryange commented Feb 28, 2024

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

const size_t nested_size_of_data;
const size_t num_arguments;

AggregateFunctionForEachData& ensureAggregateData(AggregateDataPtr __restrict place,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
AggregateFunctionForEachData& ensureAggregateData(AggregateDataPtr __restrict place,
AggregateFunctionForEachData& ensureAggregateData(const AggregateDataPtr __restrict place,

"Aggregate function {} require at least one argument", get_name());
}
}
void set_version(const int version_) override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'set_version' can be made static [readability-convert-member-functions-to-static]

Suggested change
void set_version(const int version_) override {
static void set_version(const int version_) override {

return std::make_shared<DataTypeArray>(make_nullable(nested_function->get_return_type()));
}

void destroy(AggregateDataPtr __restrict place) const noexcept override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
void destroy(AggregateDataPtr __restrict place) const noexcept override {
void destroy(const AggregateDataPtr __restrict place) const noexcept override {

@@ -78,6 +86,21 @@ class AggregateFunctionSimpleFactory {
}
}

void register_foreach_function_combinator(const Creator& creator, const std::string& suffix,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'register_foreach_function_combinator' can be made static [readability-convert-member-functions-to-static]

Suggested change
void register_foreach_function_combinator(const Creator& creator, const std::string& suffix,
static void register_foreach_function_combinator(const Creator& creator, const std::string& suffix,

@doris-robot
Copy link

TPC-H: Total hot run time: 37760 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3e2cc053792157045fe914555d7aed73430261b0, data reload: false

------ Round 1 ----------------------------------
q1	17617	4124	4074	4074
q2	2051	140	133	133
q3	10580	954	940	940
q4	4662	950	959	950
q5	7638	2930	3031	2930
q6	181	121	119	119
q7	1238	771	760	760
q8	9382	1959	2035	1959
q9	7192	6302	6331	6302
q10	8206	2506	2512	2506
q11	423	216	194	194
q12	827	310	319	310
q13	17942	3220	3184	3184
q14	277	254	257	254
q15	532	497	489	489
q16	479	398	411	398
q17	937	915	866	866
q18	6732	5871	6154	5871
q19	1575	1515	1506	1506
q20	538	275	276	275
q21	6279	3461	3555	3461
q22	804	294	279	279
Total cold run time: 106092 ms
Total hot run time: 37760 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4018	4028	4008	4008
q2	310	215	222	215
q3	3011	2910	2960	2910
q4	1824	1820	1835	1820
q5	5157	5131	5150	5131
q6	199	114	117	114
q7	2181	1729	1724	1724
q8	3149	3220	3211	3211
q9	8319	8276	8256	8256
q10	6153	3649	3662	3649
q11	508	445	429	429
q12	696	542	537	537
q13	4768	3088	3094	3088
q14	277	252	266	252
q15	548	494	489	489
q16	500	466	476	466
q17	1691	1684	1687	1684
q18	7815	7504	7476	7476
q19	1655	1643	1628	1628
q20	2104	1900	1891	1891
q21	4727	4586	4621	4586
q22	510	433	432	432
Total cold run time: 60120 ms
Total hot run time: 53996 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 169564 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3e2cc053792157045fe914555d7aed73430261b0, data reload: false

query1	948	361	339	339
query2	6511	1904	1929	1904
query3	6713	207	206	206
query4	23303	20576	20499	20499
query5	4330	368	364	364
query6	269	168	173	168
query7	4601	299	296	296
query8	257	188	193	188
query9	8393	2258	2249	2249
query10	420	219	216	216
query11	14563	14108	14202	14108
query12	135	82	83	82
query13	1640	422	440	422
query14	8478	6456	6256	6256
query15	220	170	174	170
query16	7087	275	256	256
query17	1023	508	503	503
query18	1896	280	270	270
query19	188	148	149	148
query20	86	83	84	83
query21	201	124	116	116
query22	4576	4607	4657	4607
query23	30491	29934	29960	29934
query24	12032	3116	3138	3116
query25	664	356	345	345
query26	1855	156	165	156
query27	3069	311	326	311
query28	7021	1820	1809	1809
query29	1192	566	569	566
query30	276	132	141	132
query31	855	684	694	684
query32	93	58	54	54
query33	723	219	231	219
query34	1065	479	499	479
query35	852	727	752	727
query36	956	863	822	822
query37	140	61	63	61
query38	3094	3010	3023	3010
query39	1312	1252	1248	1248
query40	282	99	99	99
query41	36	39	36	36
query42	100	99	94	94
query43	465	433	441	433
query44	1054	702	732	702
query45	188	177	173	173
query46	1072	806	793	793
query47	1523	1418	1385	1385
query48	433	346	350	346
query49	1127	307	324	307
query50	775	380	374	374
query51	4442	4298	4302	4298
query52	93	91	85	85
query53	337	271	263	263
query54	295	223	237	223
query55	89	82	71	71
query56	217	205	198	198
query57	936	875	877	875
query58	218	196	192	192
query59	2157	2182	2278	2182
query60	237	215	222	215
query61	86	84	89	84
query62	617	385	367	367
query63	282	256	258	256
query64	6560	3005	3107	3005
query65	3243	3200	3203	3200
query66	1450	327	311	311
query67	14419	14203	14180	14180
query68	4959	565	587	565
query69	540	382	369	369
query70	1166	1210	1177	1177
query71	317	243	245	243
query72	6308	2825	2673	2673
query73	726	320	316	316
query74	6313	6020	6017	6017
query75	3054	2409	2434	2409
query76	2761	1034	1151	1034
query77	343	238	226	226
query78	8966	8436	8366	8366
query79	1688	519	519	519
query80	1108	350	344	344
query81	479	192	196	192
query82	950	87	84	84
query83	231	127	122	122
query84	279	80	83	80
query85	1333	356	348	348
query86	372	307	300	300
query87	3245	3096	3124	3096
query88	2799	2335	2326	2326
query89	390	330	320	320
query90	1895	165	174	165
query91	156	125	132	125
query92	54	48	50	48
query93	1424	533	516	516
query94	1330	184	178	178
query95	445	345	345	345
query96	583	263	266	263
query97	4247	4107	4187	4107
query98	220	210	191	191
query99	1027	632	630	630
Total cold run time: 263969 ms
Total hot run time: 169564 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 32.49 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3e2cc053792157045fe914555d7aed73430261b0, data reload: false

query1	0.04	0.03	0.03
query2	0.06	0.02	0.02
query3	0.24	0.06	0.06
query4	1.66	0.10	0.10
query5	0.52	0.51	0.52
query6	1.27	0.67	0.68
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.56	0.50	0.51
query10	0.55	0.55	0.56
query11	0.13	0.10	0.10
query12	0.12	0.10	0.10
query13	0.62	0.61	0.62
query14	0.80	0.80	0.80
query15	0.84	0.82	0.81
query16	0.37	0.38	0.38
query17	1.00	0.94	0.99
query18	0.27	0.25	0.25
query19	1.78	1.69	1.71
query20	0.02	0.01	0.01
query21	15.39	0.63	0.64
query22	2.53	4.28	2.80
query23	17.41	1.05	1.00
query24	1.98	0.76	0.37
query25	0.46	0.14	0.05
query26	0.18	0.14	0.16
query27	0.05	0.05	0.05
query28	11.93	0.83	0.82
query29	12.60	3.31	3.15
query30	0.58	0.52	0.57
query31	2.79	0.35	0.35
query32	3.35	0.48	0.48
query33	3.21	3.16	3.22
query34	15.36	4.76	4.74
query35	4.78	4.77	4.76
query36	1.13	1.07	1.05
query37	0.06	0.05	0.05
query38	0.04	0.03	0.03
query39	0.02	0.01	0.02
query40	0.15	0.12	0.13
query41	0.07	0.02	0.01
query42	0.03	0.01	0.02
query43	0.02	0.02	0.02
Total cold run time: 105.03 s
Total hot run time: 32.49 s

@doris-robot
Copy link

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 3e2cc053792157045fe914555d7aed73430261b0 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       18.1 seconds inserted 10000000 Rows, about 552K ops/s

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.67% (8550/23971)
Line Coverage: 27.48% (69424/252619)
Region Coverage: 26.64% (35985/135095)
Branch Coverage: 23.44% (18392/78450)
Coverage Report: http://coverage.selectdb-in.cc/coverage/3e2cc053792157045fe914555d7aed73430261b0_3e2cc053792157045fe914555d7aed73430261b0/report/index.html

@Mryange
Copy link
Contributor Author

Mryange commented Feb 28, 2024

run buildall

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this class is not only process agg state function, u should rename it



qt_sql """
select sum_foreach(a) , count_foreach(a) , group_concat_foreach(s) from table_with_null;
Copy link
Contributor

@morrySnow morrySnow Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if all agg function could have foreach suffix. u should add all agg function with foreach suffix tests. and each function should cover its all signatures. refer to: regression-test/suites/nereids_function_p0/agg_function

@Mryange
Copy link
Contributor Author

Mryange commented Feb 28, 2024

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.67% (8550/23971)
Line Coverage: 27.48% (69430/252621)
Region Coverage: 26.64% (35993/135098)
Branch Coverage: 23.45% (18400/78452)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2872a97f6410173596e55bb10306c95548781194_2872a97f6410173596e55bb10306c95548781194/report/index.html

@doris-robot
Copy link

TPC-H: Total hot run time: 37999 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2872a97f6410173596e55bb10306c95548781194, data reload: false

------ Round 1 ----------------------------------
q1	17603	4025	4043	4025
q2	2041	140	135	135
q3	10588	962	949	949
q4	4783	978	958	958
q5	7962	2942	3080	2942
q6	185	125	127	125
q7	1275	780	774	774
q8	9566	2064	2059	2059
q9	7398	6270	6237	6237
q10	8185	2481	2493	2481
q11	420	200	211	200
q12	780	313	311	311
q13	17940	3217	3210	3210
q14	277	247	248	247
q15	532	496	499	496
q16	475	418	421	418
q17	931	865	880	865
q18	6815	5958	5995	5958
q19	1572	1522	1492	1492
q20	563	280	284	280
q21	6650	3560	3576	3560
q22	801	277	299	277
Total cold run time: 107342 ms
Total hot run time: 37999 ms

----- Round 2, with runtime_filter_mode=off -----
q1	3999	3980	3993	3980
q2	306	212	213	212
q3	3000	2932	2921	2921
q4	1838	1830	1850	1830
q5	5114	5111	5077	5077
q6	200	114	116	114
q7	2136	1720	1759	1720
q8	3109	3184	3194	3184
q9	8238	8235	8215	8215
q10	6097	3654	3633	3633
q11	507	440	444	440
q12	680	524	535	524
q13	8293	3049	3056	3049
q14	274	247	255	247
q15	546	490	493	490
q16	475	448	471	448
q17	1705	1688	1680	1680
q18	7805	7458	7488	7458
q19	1618	1624	1618	1618
q20	2090	1909	1900	1900
q21	4852	4608	4698	4608
q22	532	449	449	449
Total cold run time: 63414 ms
Total hot run time: 53797 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 169278 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2872a97f6410173596e55bb10306c95548781194, data reload: false

query1	930	339	333	333
query2	6540	1805	1745	1745
query3	6695	213	203	203
query4	23620	20611	20647	20611
query5	4274	367	361	361
query6	261	171	169	169
query7	4616	301	302	301
query8	256	189	198	189
query9	8441	2304	2300	2300
query10	417	235	213	213
query11	14801	14105	14101	14101
query12	135	85	86	85
query13	1651	431	437	431
query14	8918	6398	6738	6398
query15	209	173	173	173
query16	7094	259	260	259
query17	1037	517	498	498
query18	1899	266	271	266
query19	200	148	149	148
query20	89	82	79	79
query21	202	133	117	117
query22	4806	4588	4560	4560
query23	30635	29899	29888	29888
query24	10904	3094	3091	3091
query25	632	359	373	359
query26	1617	156	167	156
query27	2974	318	324	318
query28	6925	1843	1833	1833
query29	1109	566	571	566
query30	280	132	142	132
query31	874	697	696	696
query32	99	57	57	57
query33	721	220	235	220
query34	1053	489	496	489
query35	852	762	743	743
query36	963	827	894	827
query37	126	59	62	59
query38	3102	2967	2987	2967
query39	1315	1251	1256	1251
query40	276	103	103	103
query41	38	39	37	37
query42	103	98	96	96
query43	450	437	419	419
query44	1081	698	710	698
query45	188	184	173	173
query46	1038	813	793	793
query47	1536	1472	1443	1443
query48	424	348	348	348
query49	1191	294	321	294
query50	766	377	378	377
query51	4344	4219	4310	4219
query52	103	90	93	90
query53	343	266	264	264
query54	298	223	234	223
query55	85	85	87	85
query56	220	203	198	198
query57	989	891	875	875
query58	222	196	192	192
query59	2151	2021	1976	1976
query60	234	230	220	220
query61	92	88	85	85
query62	590	378	364	364
query63	289	252	272	252
query64	6640	3013	3086	3013
query65	3239	3230	3202	3202
query66	1439	319	310	310
query67	14369	13982	14064	13982
query68	5020	561	568	561
query69	528	382	379	379
query70	1278	1230	1220	1220
query71	340	260	269	260
query72	6268	2800	2652	2652
query73	714	316	324	316
query74	6308	5972	5942	5942
query75	3047	2465	2407	2407
query76	2806	1041	1161	1041
query77	346	240	234	234
query78	9020	8583	8422	8422
query79	1004	519	542	519
query80	687	361	354	354
query81	445	202	198	198
query82	1286	81	88	81
query83	238	124	120	120
query84	279	80	81	80
query85	1168	356	343	343
query86	312	311	305	305
query87	3230	3089	3084	3084
query88	2766	2348	2318	2318
query89	383	357	338	338
query90	1992	162	166	162
query91	152	125	119	119
query92	53	52	48	48
query93	1053	543	512	512
query94	1097	180	183	180
query95	441	339	356	339
query96	576	265	264	264
query97	4256	4125	4157	4125
query98	232	216	206	206
query99	1043	626	689	626
Total cold run time: 262378 ms
Total hot run time: 169278 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2872a97f6410173596e55bb10306c95548781194, data reload: false

query1	0.04	0.03	0.04
query2	0.06	0.02	0.02
query3	0.23	0.06	0.07
query4	1.67	0.10	0.10
query5	0.52	0.51	0.52
query6	1.31	0.67	0.67
query7	0.02	0.02	0.01
query8	0.04	0.02	0.02
query9	0.55	0.51	0.53
query10	0.58	0.57	0.55
query11	0.12	0.10	0.09
query12	0.12	0.10	0.10
query13	0.63	0.62	0.61
query14	0.78	0.79	0.80
query15	0.84	0.83	0.82
query16	0.36	0.38	0.37
query17	0.97	0.96	0.97
query18	0.27	0.25	0.23
query19	1.78	1.70	1.77
query20	0.02	0.01	0.01
query21	15.42	0.63	0.56
query22	3.34	5.06	1.98
query23	17.34	1.08	1.02
query24	2.04	0.20	0.30
query25	0.46	0.08	0.07
query26	0.17	0.14	0.13
query27	0.06	0.04	0.04
query28	12.57	0.87	0.83
query29	12.70	3.28	3.17
query30	0.59	0.53	0.54
query31	2.79	0.35	0.36
query32	3.36	0.48	0.48
query33	3.16	3.24	3.21
query34	15.36	4.74	4.70
query35	4.72	4.72	4.71
query36	1.12	1.05	1.06
query37	0.07	0.05	0.04
query38	0.05	0.03	0.02
query39	0.02	0.01	0.02
query40	0.17	0.13	0.13
query41	0.07	0.02	0.02
query42	0.02	0.01	0.02
query43	0.02	0.02	0.02
Total cold run time: 106.53 s
Total hot run time: 31.44 s

@doris-robot
Copy link

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 2872a97f6410173596e55bb10306c95548781194 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       17.5 seconds inserted 10000000 Rows, about 571K ops/s

@Mryange
Copy link
Contributor Author

Mryange commented Feb 28, 2024

run feut


@Override
public boolean nullable() {
return true;
Copy link
Contributor

@924060929 924060929 Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why implement AlwaysNotNullable but override nullable method like AlwaysNullable

@Mryange
Copy link
Contributor Author

Mryange commented Mar 1, 2024

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 38868 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 758a158b482524341d54921ab22ac6e2ea088192, data reload: false

------ Round 1 ----------------------------------
q1	18002	4900	4123	4123
q2	2060	155	142	142
q3	11560	955	981	955
q4	4684	952	963	952
q5	8629	2923	3063	2923
q6	181	126	122	122
q7	1310	838	846	838
q8	9630	2095	2112	2095
q9	7449	6567	6577	6567
q10	8293	2649	2613	2613
q11	435	225	240	225
q12	762	311	315	311
q13	18017	2994	3120	2994
q14	296	277	266	266
q15	483	457	439	439
q16	488	405	413	405
q17	945	874	838	838
q18	9717	6397	6364	6364
q19	2910	1396	1542	1396
q20	586	302	300	300
q21	7656	3702	3704	3702
q22	805	298	303	298
Total cold run time: 114898 ms
Total hot run time: 38868 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4061	4076	4119	4076
q2	319	221	221	221
q3	2979	2938	2941	2938
q4	1854	1822	1810	1810
q5	5273	5261	5276	5261
q6	201	117	117	117
q7	2261	1855	1877	1855
q8	3241	3299	3302	3299
q9	8599	8619	8607	8607
q10	6146	3790	3920	3790
q11	552	451	470	451
q12	735	587	616	587
q13	11251	2988	2967	2967
q14	292	284	295	284
q15	485	445	447	445
q16	490	439	419	419
q17	1708	1659	1695	1659
q18	11556	7660	7454	7454
q19	1638	1624	1628	1624
q20	1991	1795	1749	1749
q21	4928	4781	4791	4781
q22	539	474	470	470
Total cold run time: 71099 ms
Total hot run time: 54864 ms

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.62% (8551/24008)
Line Coverage: 27.44% (69426/253020)
Region Coverage: 26.60% (36006/135356)
Branch Coverage: 23.42% (18397/78554)
Coverage Report: http://coverage.selectdb-in.cc/coverage/758a158b482524341d54921ab22ac6e2ea088192_758a158b482524341d54921ab22ac6e2ea088192/report/index.html

void register_aggregate_function_combinator_foreach(AggregateFunctionSimpleFactory& factory) {
AggregateFunctionCreator creator = [&](const std::string& name, const DataTypes& types,
const bool result_is_nullable) -> AggregateFunctionPtr {
const std::string suffix = AggregateFunctionForEach::AGG_FOREACH_SUFFIX;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use & reference

const bool result_is_nullable) -> AggregateFunctionPtr {
const std::string suffix = AggregateFunctionForEach::AGG_FOREACH_SUFFIX;
DataTypes transform_arguments;
for (auto t : types) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto&

factory.get(nested_function_name, transform_arguments, result_is_nullable);
if (!nested_function) {
throw Exception(ErrorCode::INTERNAL_ERROR,
"The combiner did not find a corresponding function. nested function "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each combiner

const size_t nested_size_of_data;
const size_t num_arguments;

AggregateFunctionForEachData& ensureAggregateData(AggregateDataPtr __restrict place,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure_aggregate_data(

@Mryange Mryange force-pushed the agg-foreach branch 2 times, most recently from 3c99923 to 7cdc52e Compare March 2, 2024 11:17
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

const size_t nested_size_of_data;
const size_t num_arguments;

AggregateFunctionForEachData& ensure_aggregate_data(AggregateDataPtr __restrict place,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
AggregateFunctionForEachData& ensure_aggregate_data(AggregateDataPtr __restrict place,
AggregateFunctionForEachData& ensure_aggregate_data(const AggregateDataPtr __restrict place,

}

for (i = 0; i < old_size; ++i) {
nested_function->merge(&new_state[i * nested_size_of_data],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destory the old_state mem

const auto& first_array_column = assert_cast<const ColumnArray&>(*columns[0]);
const auto& offsets = first_array_column.get_offsets();

size_t begin = offsets[row_num - 1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happen if row_num == 0? maybe overflow access mem offset

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index of offsets can be -1.
// [[2,1,5,9,1], [1,2,4]] --> data column [2,1,5,9,1,1,2,4], offset[-1] = 0, offset[0] = 5, offset[1] = 8

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning !!!row_num is size_t -1 = 18446744073709551615

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my mistake.

@Mryange
Copy link
Contributor Author

Mryange commented Mar 3, 2024

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.41% (8556/24165)
Line Coverage: 27.19% (69456/255442)
Region Coverage: 26.41% (36024/136426)
Branch Coverage: 23.25% (18400/79154)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2691518bca3f088f149da414bb5753caec39ba32_2691518bca3f088f149da414bb5753caec39ba32/report/index.html

@Mryange
Copy link
Contributor Author

Mryange commented Mar 4, 2024

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.41% (8556/24165)
Line Coverage: 27.19% (69453/255442)
Region Coverage: 26.40% (36022/136426)
Branch Coverage: 23.25% (18401/79154)
Coverage Report: http://coverage.selectdb-in.cc/coverage/8ed6354799269d0495707ac16ee3a159025d7094_8ed6354799269d0495707ac16ee3a159025d7094/report/index.html

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -65,7 +65,7 @@ class AggregateFunctionCount final

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr __restrict place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'add' can be made static [readability-convert-member-functions-to-static]

Suggested change
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override {
static void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) override {

@@ -65,7 +65,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr __restrict place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override {
void add(const AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override {

Comment on lines +197 to 198
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'add' can be made static [readability-convert-member-functions-to-static]

Suggested change
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) const override {
static void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) override {

@@ -194,7 +194,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr __restrict place, const IColumn** columns, size_t row_num,
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
void add(const AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,

Comment on lines +298 to 299
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'add' can be made static [readability-convert-member-functions-to-static]

Suggested change
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) const override {
static void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) override {

@@ -65,7 +65,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
void add(const AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {

@@ -103,7 +103,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'add' can be made static [readability-convert-member-functions-to-static]

Suggested change
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
static void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) override {

@@ -103,7 +103,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
void add(const AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {

@@ -148,7 +148,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'add' can be made static [readability-convert-member-functions-to-static]

Suggested change
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
static void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) override {

@@ -148,7 +148,7 @@

DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); }

void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override {
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]

Suggested change
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {
void add(const AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override {

@Mryange
Copy link
Contributor Author

Mryange commented Mar 4, 2024

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.40% (8554/24165)
Line Coverage: 27.18% (69428/255442)
Region Coverage: 26.40% (36010/136426)
Branch Coverage: 23.24% (18392/79154)
Coverage Report: http://coverage.selectdb-in.cc/coverage/db82be0360be7349ff47cd06421cba917cbbde52_db82be0360be7349ff47cd06421cba917cbbde52/report/index.html

Copy link
Contributor

@HappenLee HappenLee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 5, 2024
Copy link
Contributor

github-actions bot commented Mar 5, 2024

PR approved by at least one committer and no changes requested.

Copy link
Contributor

github-actions bot commented Mar 5, 2024

PR approved by anyone and no changes requested.

Copy link
Contributor

@zclllyybb zclllyybb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should have a doc in sql-functions/combinators/

@Mryange
Copy link
Contributor Author

Mryange commented Mar 5, 2024

should have a doc in sql-functions/combinators/

Due to error with some functions' self-implemented features, there are currently some that cannot be used. Documentation will be added once all issues are addressed.

Copy link
Contributor

@zclllyybb zclllyybb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, LGTM then

@HappenLee HappenLee merged commit 3c39734 into apache:master Mar 6, 2024
26 of 32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants