-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature](function) Support for aggregate function foreach combiner #31526
Conversation
Thank you for your contribution to Apache Doris. |
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
const size_t nested_size_of_data; | ||
const size_t num_arguments; | ||
|
||
AggregateFunctionForEachData& ensureAggregateData(AggregateDataPtr __restrict place, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
AggregateFunctionForEachData& ensureAggregateData(AggregateDataPtr __restrict place, | |
AggregateFunctionForEachData& ensureAggregateData(const AggregateDataPtr __restrict place, |
"Aggregate function {} require at least one argument", get_name()); | ||
} | ||
} | ||
void set_version(const int version_) override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'set_version' can be made static [readability-convert-member-functions-to-static]
void set_version(const int version_) override { | |
static void set_version(const int version_) override { |
return std::make_shared<DataTypeArray>(make_nullable(nested_function->get_return_type())); | ||
} | ||
|
||
void destroy(AggregateDataPtr __restrict place) const noexcept override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
void destroy(AggregateDataPtr __restrict place) const noexcept override { | |
void destroy(const AggregateDataPtr __restrict place) const noexcept override { |
@@ -78,6 +86,21 @@ class AggregateFunctionSimpleFactory { | |||
} | |||
} | |||
|
|||
void register_foreach_function_combinator(const Creator& creator, const std::string& suffix, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'register_foreach_function_combinator' can be made static [readability-convert-member-functions-to-static]
void register_foreach_function_combinator(const Creator& creator, const std::string& suffix, | |
static void register_foreach_function_combinator(const Creator& creator, const std::string& suffix, |
TPC-H: Total hot run time: 37760 ms
|
TPC-DS: Total hot run time: 169564 ms
|
ClickBench: Total hot run time: 32.49 s
|
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
|
TeamCity be ut coverage result: |
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this class is not only process agg state function, u should rename it
|
||
|
||
qt_sql """ | ||
select sum_foreach(a) , count_foreach(a) , group_concat_foreach(s) from table_with_null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if all agg function could have foreach suffix. u should add all agg function with foreach suffix tests. and each function should cover its all signatures. refer to: regression-test/suites/nereids_function_p0/agg_function
run buildall |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 37999 ms
|
TPC-DS: Total hot run time: 169278 ms
|
ClickBench: Total hot run time: 31.44 s
|
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
|
run feut |
|
||
@Override | ||
public boolean nullable() { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why implement AlwaysNotNullable
but override nullable method like AlwaysNullable
run buildall |
TPC-H: Total hot run time: 38868 ms
|
TeamCity be ut coverage result: |
void register_aggregate_function_combinator_foreach(AggregateFunctionSimpleFactory& factory) { | ||
AggregateFunctionCreator creator = [&](const std::string& name, const DataTypes& types, | ||
const bool result_is_nullable) -> AggregateFunctionPtr { | ||
const std::string suffix = AggregateFunctionForEach::AGG_FOREACH_SUFFIX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use & reference
const bool result_is_nullable) -> AggregateFunctionPtr { | ||
const std::string suffix = AggregateFunctionForEach::AGG_FOREACH_SUFFIX; | ||
DataTypes transform_arguments; | ||
for (auto t : types) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto&
factory.get(nested_function_name, transform_arguments, result_is_nullable); | ||
if (!nested_function) { | ||
throw Exception(ErrorCode::INTERNAL_ERROR, | ||
"The combiner did not find a corresponding function. nested function " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For each combiner
const size_t nested_size_of_data; | ||
const size_t num_arguments; | ||
|
||
AggregateFunctionForEachData& ensureAggregateData(AggregateDataPtr __restrict place, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensure_aggregate_data(
3c99923
to
7cdc52e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
const size_t nested_size_of_data; | ||
const size_t num_arguments; | ||
|
||
AggregateFunctionForEachData& ensure_aggregate_data(AggregateDataPtr __restrict place, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
AggregateFunctionForEachData& ensure_aggregate_data(AggregateDataPtr __restrict place, | |
AggregateFunctionForEachData& ensure_aggregate_data(const AggregateDataPtr __restrict place, |
} | ||
|
||
for (i = 0; i < old_size; ++i) { | ||
nested_function->merge(&new_state[i * nested_size_of_data], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
destory the old_state mem
const auto& first_array_column = assert_cast<const ColumnArray&>(*columns[0]); | ||
const auto& offsets = first_array_column.get_offsets(); | ||
|
||
size_t begin = offsets[row_num - 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happen if row_num == 0? maybe overflow access mem offset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The index of offsets can be -1.
// [[2,1,5,9,1], [1,2,4]] --> data column [2,1,5,9,1,1,2,4], offset[-1] = 0, offset[0] = 5, offset[1] = 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warning !!!row_num is size_t
-1 = 18446744073709551615
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my mistake.
run buildall |
TeamCity be ut coverage result: |
run buildall |
TeamCity be ut coverage result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
@@ -65,7 +65,7 @@ class AggregateFunctionCount final | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr __restrict place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'add' can be made static [readability-convert-member-functions-to-static]
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override { | |
static void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) override { |
@@ -65,7 +65,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr __restrict place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
void add(AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override { | |
void add(const AggregateDataPtr __restrict place, const IColumn**, ssize_t, Arena*) const override { |
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | ||
Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'add' can be made static [readability-convert-member-functions-to-static]
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | |
Arena*) const override { | |
static void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | |
Arena*) override { |
@@ -194,7 +194,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr __restrict place, const IColumn** columns, size_t row_num, | |||
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | |
void add(const AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, |
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | ||
Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'add' can be made static [readability-convert-member-functions-to-static]
void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | |
Arena*) const override { | |
static void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num, | |
Arena*) override { |
@@ -65,7 +65,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { | |
void add(const AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
@@ -103,7 +103,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'add' can be made static [readability-convert-member-functions-to-static]
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { | |
static void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) override { |
@@ -103,7 +103,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { | |
void add(const AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
@@ -148,7 +148,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: method 'add' can be made static [readability-convert-member-functions-to-static]
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { | |
static void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) override { |
@@ -148,7 +148,7 @@ | |||
|
|||
DataTypePtr get_return_type() const override { return std::make_shared<DataTypeInt64>(); } | |||
|
|||
void add(AggregateDataPtr place, const IColumn**, size_t, Arena*) const override { | |||
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: pointer parameter 'place' can be pointer to const [readability-non-const-parameter]
void add(AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { | |
void add(const AggregateDataPtr place, const IColumn**, ssize_t, Arena*) const override { |
run buildall |
TeamCity be ut coverage result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should have a doc in sql-functions/combinators/
Due to error with some functions' self-implemented features, there are currently some that cannot be used. Documentation will be added once all issues are addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, LGTM then
Proposed changes
Currently not supported on Nereids, further follow-up is needed.
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...