Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Agg_BitOr function push down. #5293

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions dbms/src/AggregateFunctions/AggregateFunctionNull.h
Original file line number Diff line number Diff line change
Expand Up @@ -434,6 +434,12 @@ class AggregateFunctionNullUnary final : public AggregateFunctionNullBase<result
const IColumn * nested_column = &column->getNestedColumn();
this->nested_function->add(this->nestedPlace(place), &nested_column, row_num, arena);
}
else if (const String & nested_func_name = this->nested_function->getName(); nested_func_name == "groupBitOr" || nested_func_name == "groupBitAnd" || nested_func_name == "groupBitXor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use if constexpr (result_is_nullable)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it is right for non-bitwise functions.

{
// If nested_function is bitwise aggregation function,
// the result is always not null.
this->setFlag(place);
}
}
else
{
Expand Down Expand Up @@ -466,6 +472,13 @@ class AggregateFunctionNullUnary final : public AggregateFunctionNullBase<result
arena,
if_argument_pos);

if (const String & nested_func_name = this->nested_function->getName(); nested_func_name == "groupBitOr" || nested_func_name == "groupBitAnd" || nested_func_name == "groupBitXor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use if constexpr (result_is_nullable)?

{
// If nested_function is bitwise aggregation function,
// the result is always not null.
this->setFlag(place);
}

if constexpr (result_is_nullable)
if (!mem_utils::memoryIsByte(null_map, batch_size, std::byte{1}))
this->setFlag(place);
Expand Down
41 changes: 40 additions & 1 deletion dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -211,6 +211,36 @@ void DAGExpressionAnalyzer::fillArgumentDetail(
arg_collators.push_back(removeNullable(arg_types.back())->isString() ? getCollatorFromExpr(arg) : nullptr);
}

void DAGExpressionAnalyzer::fillArgumentDetailForAggFuncBitwise(
const ExpressionActionsPtr & actions,
const tipb::Expr & arg,
Names & arg_names,
DataTypes & arg_types,
TiDB::TiDBCollators & arg_collators)
{
DataTypePtr uint64_type = std::make_shared<DataTypeUInt64>();
const Block & sample_block = actions->getSampleBlock();
String name = getActions(arg, actions);
DataTypePtr orig_type = sample_block.getByName(name).type;
// Bitwise operations can only be applied to integer types.
// So we need to cast the argument to UInt64.
Comment on lines +225 to +226
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need to add cast here?
Didn't tidb add a cast here?

Copy link
Author

@RinChanNOWWW RinChanNOWWW Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, TiDB didn't. We compute the result in TiFlash, so we should cast in TiFlash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems tidb does type cast during operator execution phase.
https://github.com/pingcap/tidb/blob/bce6901705cde161451c0bd707a07ecfac0c7578/expression/aggregation/bit_or.go#L37-L56. This behavior is kind of similar to non-agg bitwise function. I think we might have two choice

  1. also do type cast during operator execution phase instead of appending cast beforehand
  2. do the same as non-agg bitwise function, i.e. appending cast in advance

String DAGExpressionAnalyzerHelper::buildBitwiseFunction(
DAGExpressionAnalyzer * analyzer,
const tipb::Expr & expr,
const ExpressionActionsPtr & actions)
{
const String & func_name = getFunctionName(expr);
Names argument_names;
// We should convert arguments to UInt64.
// See https://github.com/pingcap/tics/issues/1756
DataTypePtr uint64_type = std::make_shared<DataTypeUInt64>();
const Block & sample_block = actions->getSampleBlock();
for (const auto & child : expr.children())
{
String name = analyzer->getActions(child, actions);
DataTypePtr orig_type = sample_block.getByName(name).type;
// Bump argument type
if (!removeNullable(orig_type)->equals(*uint64_type))
{
if (orig_type->isNullable())
{
name = analyzer->appendCast(makeNullable(uint64_type), actions, name);
}
else
{
name = analyzer->appendCast(uint64_type, actions, name);
}
}
argument_names.push_back(name);
}
return analyzer->applyFunction(func_name, argument_names, actions, nullptr);
}

However, I think current comment is not accurate. TiDB always pushes integer types to tiflash. The reason why we append a cast, or do type cast in execution phase, is because mysql (prior to 8.0) handle only unsigned 64-bit integer argument and result values.

ref: https://dev.mysql.com/doc/refman/8.0/en/bit-functions.html

if (!removeNullable(orig_type)->equals(*uint64_type))
{
if (orig_type->isNullable())
{
name = appendCast(makeNullable(uint64_type), actions, name);
}
else
{
name = appendCast(uint64_type, actions, name);
}
}

arg_names.push_back(name);
arg_types.push_back(orig_type->isNullable() ? makeNullable(uint64_type) : uint64_type);
arg_collators.push_back(removeNullable(arg_types.back())->isString() ? getCollatorFromExpr(arg) : nullptr);
}

void DAGExpressionAnalyzer::buildGroupConcat(
const tipb::Expr & expr,
const ExpressionActionsPtr & actions,
Expand Down Expand Up @@ -330,14 +360,23 @@ void DAGExpressionAnalyzer::buildCommonAggFunc(
NamesAndTypes & aggregated_columns,
bool empty_input_as_null)
{
tipb::ExprType tp = expr.tp();
bool is_bitwise = tp == tipb::ExprType::Agg_BitAnd || tp == tipb::ExprType::Agg_BitOr || tp == tipb::ExprType::Agg_BitXor;
auto child_size = expr.children_size();
Names arg_names;
DataTypes arg_types;
TiDB::TiDBCollators arg_collators;

for (Int32 i = 0; i < child_size; ++i)
{
fillArgumentDetail(actions, expr.children(i), arg_names, arg_types, arg_collators);
if (is_bitwise)
{
fillArgumentDetailForAggFuncBitwise(actions, expr.children(i), arg_names, arg_types, arg_collators);
}
else
{
fillArgumentDetail(actions, expr.children(i), arg_names, arg_types, arg_collators);
}
}
// For count(not null column), we can transform it to count() to avoid the cost of convertToFullColumn.
if (expr.tp() == tipb::ExprType::Count && !expr.has_distinct() && child_size == 1 && !arg_types[0]->isNullable())
Expand Down
7 changes: 7 additions & 0 deletions dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.h
Original file line number Diff line number Diff line change
Expand Up @@ -249,6 +249,13 @@ class DAGExpressionAnalyzer : private boost::noncopyable
DataTypes & arg_types,
TiDB::TiDBCollators & arg_collators);

void fillArgumentDetailForAggFuncBitwise(
const ExpressionActionsPtr & actions,
const tipb::Expr & arg,
Names & arg_names,
DataTypes & arg_types,
TiDB::TiDBCollators & arg_collators);

void makeExplicitSet(
const tipb::Expr & expr,
const Block & sample_block,
Expand Down
2 changes: 1 addition & 1 deletion dbms/src/Flash/Coprocessor/DAGUtils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ const std::unordered_map<tipb::ExprType, String> agg_func_map({
{tipb::ExprType::GroupConcat, "groupArray"},
//{tipb::ExprType::Avg, ""},
//{tipb::ExprType::Agg_BitAnd, ""},
//{tipb::ExprType::Agg_BitOr, ""},
{tipb::ExprType::Agg_BitOr, "groupBitOr"},
//{tipb::ExprType::Agg_BitXor, ""},
//{tipb::ExprType::Std, ""},
//{tipb::ExprType::Stddev, ""},
Expand Down
76 changes: 76 additions & 0 deletions tests/fullstack-test/expr/agg_bit_or_pushdown.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Copyright 2022 PingCAP, Ltd.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

mysql> drop table if exists test.all_null
mysql> create table test.all_null (a int, b int)
mysql> alter table test.all_null set tiflash replica 1
mysql> insert into test.all_null values(1, NULL), (1, NULL), (2, NULL), (2, NULL), (3, NULL);

mysql> drop table if exists test.not_all_null
mysql> create table test.not_all_null (a int, b int)
RinChanNOWWW marked this conversation as resolved.
Show resolved Hide resolved
mysql> alter table test.not_all_null set tiflash replica 1
mysql> insert into test.not_all_null values(1, 1), (1, 2), (2, 2), (2, 4), (3, NULL), (3, 4);

mysql> drop table if exists test.char
mysql> create table test.char (a int, b char(10))
mysql> alter table test.char set tiflash replica 1
mysql> insert into test.char values(1, '1a'), (1, '2a2'), (2, '2bbb'), (2, '4ccc11'), (3, 'adasda'), (3, 'ASD');

mysql> set @@tidb_isolation_read_engines='tiflash'

mysql> select bit_or(b) from test.all_null;
+-----------+
| bit_or(b) |
+-----------+
| 0 |
+-----------+
mysql> select a, bit_or(b) from test.all_null group by a;
+------+-----------+
| a | bit_or(b) |
+------+-----------+
| 1 | 0 |
| 2 | 0 |
| 3 | 0 |
+------+-----------+


mysql> select bit_or(b) from test.not_all_null;
+-----------+
| bit_or(b) |
+-----------+
| 7 |
+-----------+
mysql> select a, bit_or(b) from test.not_all_null group by a;
+------+-----------+
| a | bit_or(b) |
+------+-----------+
| 1 | 3 |
| 2 | 6 |
| 3 | 4 |
+------+-----------+

mysql> select bit_or(b) from test.char;
+-----------+
| bit_or(b) |
+-----------+
| 7 |
+-----------+
mysql> select a, bit_or(b) from test.char group by a;
+------+-----------+
| a | bit_or(b) |
+------+-----------+
| 1 | 3 |
| 2 | 6 |
| 3 | 0 |
+------+-----------+