Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Agg_BitOr function push down. #5293

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

RinChanNOWWW
Copy link

Signed-off-by: hezheyu rinchannow@bupt.edu.cn

What problem does this PR solve?

Issue Number: close #5118

Problem Summary:

What is changed and how it works?

Apply CAST function to the bitwise aggregation functions' children exprs to make ordinary Clickhouse functions be able to work.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Release note

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Jul 5, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • Lloyd-Pottiger

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels Jul 5, 2022
@sre-bot
Copy link
Collaborator

sre-bot commented Jul 5, 2022

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 5, 2022
@RinChanNOWWW
Copy link
Author

When the rows are all NULL, TiFlash will throw an exception: "cannot convert null value to non-null type". I don't know how to solve this problem. Please help me to reach the solution.

P.S. If there is at least one non-null row, it will work well.

@ti-chi-bot
Copy link
Member

Welcome @RinChanNOWWW!

It looks like this is your first PR to pingcap/tiflash 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/tiflash. 😃

@RinChanNOWWW
Copy link
Author

cc @XuHuaiyu

@RinChanNOWWW
Copy link
Author

RinChanNOWWW commented Jul 6, 2022

When the rows are all NULL, TiFlash will throw an exception: "cannot convert null value to non-null type". I don't know how to solve this problem. Please help me to reach the solution.

P.S. If there is at least one non-null row, it will work well.

Solved with AggregateFunctionCombinatorNull::transformAggregateFunction and class AggregateFunctionNullUnary.

@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 6, 2022
@RinChanNOWWW RinChanNOWWW force-pushed the PUSHDOWN-BITOR-TO-TIFLASH branch 2 times, most recently from e96b302 to b9b79ca Compare July 6, 2022 07:40
@RinChanNOWWW RinChanNOWWW marked this pull request as ready for review July 6, 2022 07:41
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 6, 2022
@Willendless
Copy link
Contributor

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Jul 6, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AggregateFunctions/AggregateFunctionNull.h          170               159     6.47%          49                42    14.29%         325               302     7.08%          58                58     0.00%
Flash/Coprocessor/DAGExpressionAnalyzer.cpp         402               154    61.69%          51                 8    84.31%         932               339    63.63%         314               156    50.32%
Flash/Coprocessor/DAGExpressionAnalyzer.h             2                 1    50.00%           2                 1    50.00%           2                 1    50.00%           0                 0         -
Flash/Coprocessor/DAGUtils.cpp                      347               155    55.33%          45                15    66.67%         638               294    53.92%         412               197    52.18%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               921               469    49.08%         147                66    55.10%        1897               936    50.66%         784               411    47.58%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18412      9643             47.63%    207258  96576        53.40%

full coverage report (for internal network access only)

@RinChanNOWWW
Copy link
Author

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Jul 7, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AggregateFunctions/AggregateFunctionNull.h          170               159     6.47%          49                42    14.29%         323               300     7.12%          58                58     0.00%
Flash/Coprocessor/DAGExpressionAnalyzer.cpp         402               154    61.69%          51                 8    84.31%         932               339    63.63%         314               156    50.32%
Flash/Coprocessor/DAGExpressionAnalyzer.h             2                 1    50.00%           2                 1    50.00%           2                 1    50.00%           0                 0         -
Flash/Coprocessor/DAGUtils.cpp                      347               155    55.33%          45                15    66.67%         638               294    53.92%         412               197    52.18%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               921               469    49.08%         147                66    55.10%        1895               934    50.71%         784               411    47.58%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18424      9626             47.75%    207353  96536        53.44%

full coverage report (for internal network access only)

Copy link
Contributor

@Lloyd-Pottiger Lloyd-Pottiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 13, 2022
@RinChanNOWWW
Copy link
Author

/run-all-tests

1 similar comment
@Lloyd-Pottiger
Copy link
Contributor

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Jul 14, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AggregateFunctions/AggregateFunctionNull.h          170               159     6.47%          49                42    14.29%         323               300     7.12%          58                58     0.00%
Flash/Coprocessor/DAGExpressionAnalyzer.cpp         403               150    62.78%          52                 8    84.62%         935               330    64.71%         314               151    51.91%
Flash/Coprocessor/DAGExpressionAnalyzer.h             2                 1    50.00%           2                 1    50.00%           2                 1    50.00%           0                 0         -
Flash/Coprocessor/DAGUtils.cpp                      347               142    59.08%          45                11    75.56%         638               272    57.37%         412               192    53.40%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               922               452    50.98%         148                62    58.11%        1898               903    52.42%         784               401    48.85%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18529      9574             48.33%    208830  96396        53.84%

full coverage report (for internal network access only)

@ywqzzy
Copy link
Contributor

ywqzzy commented Jul 14, 2022

/rebuild

Signed-off-by: RinChanNOWWW <rinchannow@bupt.edu.cn>
Signed-off-by: RinChanNOWWW <rinchannow@bupt.edu.cn>
Signed-off-by: RinChanNOWWW <rinchannow@bupt.edu.cn>
Signed-off-by: RinChanNOWWW <rinchannow@bupt.edu.cn>
Signed-off-by: RinChanNOWWW <rinchannow@bupt.edu.cn>
Signed-off-by: RinChanNOWWW <rinchannow@bupt.edu.cn>
@SeaRise
Copy link
Contributor

SeaRise commented Jul 25, 2022

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Jul 25, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AggregateFunctions/AggregateFunctionNull.h          170               109    35.88%          49                30    38.78%         323               201    37.77%          58                44    24.14%
Flash/Coprocessor/DAGExpressionAnalyzer.cpp         403               144    64.27%          52                 8    84.62%         935               322    65.56%         314               141    55.10%
Flash/Coprocessor/DAGExpressionAnalyzer.h             2                 1    50.00%           2                 1    50.00%           2                 1    50.00%           0                 0         -
Flash/Coprocessor/DAGUtils.cpp                      347               141    59.37%          45                11    75.56%         638               271    57.52%         412               190    53.88%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               922               395    57.16%         148                50    66.22%        1898               795    58.11%         784               375    52.17%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18800      9454             49.71%    214210  95541        55.40%

full coverage report (for internal network access only)

@@ -467,6 +474,13 @@ class AggregateFunctionNullUnary final : public AggregateFunctionNullBase<result
arena,
if_argument_pos);

if (const String & nested_func_name = this->nested_function->getName(); nested_func_name == "groupBitOr" || nested_func_name == "groupBitAnd" || nested_func_name == "groupBitXor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use if constexpr (result_is_nullable)?

if (!column->isNullAt(row_num))
{
this->setFlag(place);
const IColumn * nested_column = &column->getNestedColumn();
this->nested_function->add(this->nestedPlace(place), &nested_column, row_num, arena);
}
else if (const String & nested_func_name = this->nested_function->getName(); nested_func_name == "groupBitOr" || nested_func_name == "groupBitAnd" || nested_func_name == "groupBitXor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use if constexpr (result_is_nullable)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it is right for non-bitwise functions.

Comment on lines +172 to +173
// Bitwise operations can only be applied to integer types.
// So we need to cast the argument to UInt64.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need to add cast here?
Didn't tidb add a cast here?

Copy link
Author

@RinChanNOWWW RinChanNOWWW Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, TiDB didn't. We compute the result in TiFlash, so we should cast in TiFlash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems tidb does type cast during operator execution phase.
https://github.com/pingcap/tidb/blob/bce6901705cde161451c0bd707a07ecfac0c7578/expression/aggregation/bit_or.go#L37-L56. This behavior is kind of similar to non-agg bitwise function. I think we might have two choice

  1. also do type cast during operator execution phase instead of appending cast beforehand
  2. do the same as non-agg bitwise function, i.e. appending cast in advance

String DAGExpressionAnalyzerHelper::buildBitwiseFunction(
DAGExpressionAnalyzer * analyzer,
const tipb::Expr & expr,
const ExpressionActionsPtr & actions)
{
const String & func_name = getFunctionName(expr);
Names argument_names;
// We should convert arguments to UInt64.
// See https://github.com/pingcap/tics/issues/1756
DataTypePtr uint64_type = std::make_shared<DataTypeUInt64>();
const Block & sample_block = actions->getSampleBlock();
for (const auto & child : expr.children())
{
String name = analyzer->getActions(child, actions);
DataTypePtr orig_type = sample_block.getByName(name).type;
// Bump argument type
if (!removeNullable(orig_type)->equals(*uint64_type))
{
if (orig_type->isNullable())
{
name = analyzer->appendCast(makeNullable(uint64_type), actions, name);
}
else
{
name = analyzer->appendCast(uint64_type, actions, name);
}
}
argument_names.push_back(name);
}
return analyzer->applyFunction(func_name, argument_names, actions, nullptr);
}

However, I think current comment is not accurate. TiDB always pushes integer types to tiflash. The reason why we append a cast, or do type cast in execution phase, is because mysql (prior to 8.0) handle only unsigned 64-bit integer argument and result values.

ref: https://dev.mysql.com/doc/refman/8.0/en/bit-functions.html

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 25, 2022
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/LGT1 Indicates that a PR has LGTM 1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Agg_BitOr function push down
7 participants