planner: fix agg elimination logic after agg pushed down through a join #44941

Merged
merged 10 commits into from
Jun 30, 2023

Conversation

AilinKid
Contributor

@AilinKid AilinKid commented Jun 26, 2023

What problem does this PR solve?

Issue Number: close #44795

Problem Summary:

What is changed and how it works?

The code in aggregationPushDownSolver on the p.SCtx().GetSessionVars().AllowAggPushDown path is too old to be used as is.

Briefly: the problem appears when we push an agg down through a join (which is what aggregationPushDownSolver does) and then try to combine that with aggregation elimination (for example, when the group-by items cover a unique key, the aggregation itself can be eliminated):

buildKeyInfo(join)
// agg is the old pointer here; its child was already changed by the logic above
proj := a.tryToEliminateAggregation(agg, opt)
if proj != nil {
	p = proj
}

The commented call is there to attempt elimination again after the push-down: since agg's children have changed, a new unique key may now be detected that lets the aggregation be eliminated. Put differently, this old agg elimination logic is quite different from normal agg elimination (see the code comments for more detail).
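As a rough illustration of the "group items covering a unique key" condition mentioned above, here is a minimal sketch. The function name, the string-based column representation, and the example keys are all hypothetical; this is not the planner's actual key-info code, only the idea that a group-by covering a unique key yields at most one row per group.

```go
package main

import "fmt"

// coversUniqueKey reports whether the GROUP BY columns contain every column
// of at least one unique key. Hypothetical helper: columns are plain strings
// here, not planner expressions.
func coversUniqueKey(groupBy []string, uniqueKeys [][]string) bool {
	inGroupBy := make(map[string]bool, len(groupBy))
	for _, col := range groupBy {
		inGroupBy[col] = true
	}
	for _, key := range uniqueKeys {
		if len(key) == 0 {
			continue // an empty key says nothing about uniqueness
		}
		covered := true
		for _, col := range key {
			if !inGroupBy[col] {
				covered = false
				break
			}
		}
		if covered {
			return true
		}
	}
	return false
}

func main() {
	keys := [][]string{{"c_custkey"}}
	// Grouping by the unique key: each group holds at most one row.
	fmt.Println(coversUniqueKey([]string{"c_custkey"}, keys))
	// Grouping by a non-key column: groups may hold many rows.
	fmt.Println(coversUniqueKey([]string{"c_name"}, keys))
}
```

When the check holds, each group has a single row, which is why an aggregation over it can in principle be replaced by a projection.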


The old agg rewriting still uses the ifnull(col#19, 0, 1) logic, which treats every row as a count of 1. After the agg has been pushed down, however, col#19 is already the final aggregation result rather than a value still in the process of being aggregated. Keeping the real value is the correct path; for now, we simply ban this kind of old agg elimination.

So the handling logic is a mess.
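To make the difference concrete, here is a small sketch under stated assumptions: `oldRewrite` and `keepValue` are hypothetical names, and a nil pointer stands in for SQL NULL. It only models the two behaviours described above (count each non-null row as 1 versus pass the already-aggregated count through); it is not the planner's rewrite code.

```go
package main

import "fmt"

// oldRewrite models the per-row count rewrite: any non-NULL input contributes
// 1 and NULL contributes 0. nil stands in for SQL NULL.
func oldRewrite(partial *int64) int64 {
	if partial == nil {
		return 0
	}
	return 1
}

// keepValue passes an already-aggregated count through unchanged and maps
// NULL to 0, which is what a count over no matching rows should yield.
func keepValue(partial *int64) int64 {
	if partial == nil {
		return 0
	}
	return *partial
}

func main() {
	three := int64(3) // a pushed-down agg already counted 3 rows for this group
	fmt.Println(oldRewrite(&three), keepValue(&three)) // prints "1 3"
	fmt.Println(oldRewrite(nil), keepValue(nil))       // prints "0 0"
}
```

The two functions agree only when the input is NULL or the partial count is exactly 1, which is why the old rewrite is safe on raw rows but not on a column that is already an aggregation result.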

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    scripts:

sql = $below

SELECT c_count, count(*) as custdist
from ( SELECT c_custkey, count(o_orderkey)  as  c_count
       from customer left join orders on c_custkey = o_custkey and o_comment not like '%special%requests%'
       group by c_custkey ) c_orders
group by c_count
order by custdist desc, c_count desc;

execute $sql + "into outfile 'tai1.txt'" under set tidb_opt_agg_push_down=ON;
execute $sql + "into outfile 'tai2.txt'" under set tidb_opt_agg_push_down=OFF;

diff tai1.txt tai2.txt should report no differences

  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

planner: fix agg elimination logic after agg pushed down to join

@ti-chi-bot ti-chi-bot bot added needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. release-note Denotes a PR that will be considered when it comes time to generate release notes. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 26, 2023
@tiprow

tiprow bot commented Jun 26, 2023

Hi @AilinKid. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@AilinKid AilinKid changed the title planner: fix agg elimination logic after agg pushed down to join planner: fix agg elimination logic after agg pushed down through join Jun 26, 2023
@AilinKid AilinKid changed the title planner: fix agg elimination logic after agg pushed down through join planner: fix agg elimination logic after agg pushed down through a join Jun 26, 2023
@AilinKid
Contributor Author

/test all

@tiprow

tiprow bot commented Jun 26, 2023

@AilinKid: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test all


@hawkingrei
Member

/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Jun 26, 2023
@hawkingrei
Member

/retest

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 26, 2023
planner/core/rule_aggregation_elimination.go Outdated Show resolved Hide resolved
Comment on lines 77 to 78
if a.oldAggEliminationCheck {
if !CheckCanConvertAggToProj(agg) {
Member


Could this check always take effect, rather than only for the pushed-down agg?

Contributor Author


No.
agg1 -> join -> agg2
In this case agg1's work has been transferred to agg2, so agg1 could simply be removed.
However, we do that removal through aggEliminator, which rewrites some agg functions into a new projection, and that rewrite is what causes the problem.

The ideal approach is to make agg1's parent shift its args to whatever the join produces once agg2 has been generated; otherwise a can't-find-column error occurs. (That is currently too complicated for a quick fix.)

So we just keep the old agg1 in place, even though its work is actually done by agg2. The old agg1 then acts as a projection from the schema the join produces to the schema agg1's parent expected before.
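A toy model of the agg1 -> join -> agg2 shape may help here. Everything below is hypothetical (the `order` type, both function names, and the in-memory data layout); it assumes customer keys are unique and only illustrates that once agg2 has done the counting, the retained agg1 merely forwards one value per group, with NULL mapped to 0.

```go
package main

import "fmt"

// order is a hypothetical row of the inner (orders) side.
type order struct {
	custKey  int
	orderKey int
}

// countDirect models the plan without push-down: left join customers to
// orders, then count matching orders per customer.
func countDirect(customers []int, orders []order) map[int]int64 {
	out := make(map[int]int64, len(customers))
	for _, c := range customers {
		out[c] = 0
		for _, o := range orders {
			if o.custKey == c {
				out[c]++
			}
		}
	}
	return out
}

// countPushedDown models agg1 -> join -> agg2: agg2 pre-aggregates orders,
// the left join attaches at most one partial count to each customer, and the
// retained agg1 forwards that value (a missing match, i.e. NULL, becomes 0).
func countPushedDown(customers []int, orders []order) map[int]int64 {
	partial := make(map[int]int64) // agg2: count per o_custkey
	for _, o := range orders {
		partial[o.custKey]++
	}
	out := make(map[int]int64, len(customers))
	for _, c := range customers {
		if v, ok := partial[c]; ok {
			out[c] = v // agg1 keeps the real value
		} else {
			out[c] = 0 // no matching orders: NULL from the left join
		}
	}
	return out
}

func main() {
	customers := []int{1, 2, 3}
	orders := []order{{1, 10}, {1, 11}, {3, 12}}
	fmt.Println(countDirect(customers, orders))
	fmt.Println(countPushedDown(customers, orders))
}
```

Both functions should produce the same per-customer counts, which mirrors the manual check in the PR description of comparing results with tidb_opt_agg_push_down ON and OFF.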

@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2023
Signed-off-by: AilinKid <314806019@qq.com>
Signed-off-by: AilinKid <314806019@qq.com>
Signed-off-by: AilinKid <314806019@qq.com>
Signed-off-by: AilinKid <314806019@qq.com>
.
Signed-off-by: AilinKid <314806019@qq.com>
Signed-off-by: AilinKid <314806019@qq.com>
Signed-off-by: AilinKid <314806019@qq.com>
@ti-chi-bot ti-chi-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2023
AilinKid and others added 2 commits June 30, 2023 14:58
Co-authored-by: Elsa <111482174+elsa0520@users.noreply.github.com>
Co-authored-by: Elsa <111482174+elsa0520@users.noreply.github.com>
@ti-chi-bot ti-chi-bot bot deleted a comment from ti-chi-bot Jun 30, 2023
@ti-chi-bot ti-chi-bot bot deleted a comment from ti-chi-bot Jun 30, 2023
Contributor

@fixdb fixdb left a comment


+1

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jun 30, 2023
Signed-off-by: AilinKid <314806019@qq.com>
Contributor

@elsa0520 elsa0520 left a comment


LGTM

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jun 30, 2023
@ti-chi-bot

ti-chi-bot bot commented Jun 30, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-06-30 08:13:26.573002096 +0000 UTC m=+73289.947579946: ☑️ agreed by fixdb.
  • 2023-06-30 09:45:47.358807771 +0000 UTC m=+78830.733385621: ☑️ agreed by elsa0520.

@ti-chi-bot

ti-chi-bot bot commented Jun 30, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elsa0520, fixdb, wshwsh12

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Jun 30, 2023
@ti-chi-bot ti-chi-bot bot deleted a comment from ti-chi-bot Jun 30, 2023
@ti-chi-bot ti-chi-bot bot merged commit 9885d1b into pingcap:master Jun 30, 2023
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.1: #45095.

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Jun 30, 2023
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #45096.

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Jun 30, 2023
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #45097.

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Jun 30, 2023
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Labels
approved lgtm needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

got wrong result when enable tidb_opt_agg_push_down
7 participants