[Colocate plan][Step1] Colocate join covers more situations #5521

Merged
6 commits merged on Apr 11, 2021

Conversation

EmmyMiao87
Contributor

@EmmyMiao87 EmmyMiao87 commented Mar 15, 2021

Proposed changes

The colocate plan has four steps: join, aggregation, sort, and set operation.
This PR is the first step.

The old colocate join only covers the case where the child is hash or scan.
In fact, as long as the child's data distribution meets the requirements,
a colocate join can be performed regardless of which plan node the child is.
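
To illustrate the idea, here is a minimal sketch of a colocate check that looks only at the children's data distribution and never at the plan node type. `Distribution`, `ColocateCheck`, and `canColocate` are hypothetical names used for illustration, not the classes touched by this PR, and the rule modeled here (same colocate group, every pair of distribution columns matched by a join equality predicate) is an assumed simplification of the real requirements.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a child's output distribution (not a Doris class).
final class Distribution {
    final String colocateGroup;      // null when the data is not in a colocate group
    final List<String> hashColumns;  // ordered distribution columns

    Distribution(String colocateGroup, String... hashColumns) {
        this.colocateGroup = colocateGroup;
        this.hashColumns = Arrays.asList(hashColumns);
    }
}

final class ColocateCheck {
    // eqPairs: each element is {leftColumn, rightColumn} taken from one
    // equality predicate of the join. The node type of the children is
    // deliberately not an input: only their distributions matter.
    static boolean canColocate(Distribution left, Distribution right, List<String[]> eqPairs) {
        if (left.colocateGroup == null || !left.colocateGroup.equals(right.colocateGroup)) {
            return false;
        }
        if (left.hashColumns.size() != right.hashColumns.size()) {
            return false;
        }
        // Every pair of distribution columns must be joined by an equality predicate.
        for (int i = 0; i < left.hashColumns.size(); i++) {
            String l = left.hashColumns.get(i);
            String r = right.hashColumns.get(i);
            boolean matched = false;
            for (String[] pair : eqPairs) {
                if (pair[0].equals(l) && pair[1].equals(r)) {
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Distribution orders = new Distribution("g1", "user_id");
        Distribution users = new Distribution("g1", "id");
        List<String[]> eq = Collections.singletonList(new String[] {"user_id", "id"});
        System.out.println(canColocate(orders, users, eq));
    }
}
```

Under these assumptions, a child that is an aggregation or any other node qualifies as long as its output keeps the distribution of the underlying colocated table.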

Types of changes

What types of changes does your code introduce to Doris?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)
  • Code refactor (Modify the code structure, format the code, etc...)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have created an issue on (Fix Colocate plan #5589) and described the bug/feature there in detail. Later
  • Compiling and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • If these changes need document changes, I have updated the document
  • Any dependent changes have been merged

Further comments

Documentation is coming.

@EmmyMiao87 EmmyMiao87 added kind/improvement area/planner Issues or PRs related to the query planner area/colocated Issues or PRs related to colocated tables labels Mar 15, 2021
@EmmyMiao87 EmmyMiao87 linked an issue Mar 15, 2021 that may be closed by this pull request
@EmmyMiao87
Contributor Author

Why change the import order of LogManager?

The import order should comply with http://doris.apache.org/master/zh-CN/developer-guide/fe-eclipse-dev.html#%E4%BB%A3%E7%A0%81%E6%9B%B4%E6%96%B0

@EmmyMiao87
Contributor Author

There seem to be too many return null statements; the following code may be better:

        if (this instanceof ScanNode && tupleIds.contains(tupleId)) {
            return (ScanNode) this;
        } else if (!(this instanceof ExchangeNode)) {
            for (PlanNode planNode : children) {
                ScanNode scanNode = planNode.getScanNodeInOneFragmentByTupleId(tupleId);
                if (scanNode != null) {
                    return scanNode;
                }
            }
        }
        return null;

changed
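
The suggested traversal can be tried out in isolation with the sketch below. The `PlanNode`, `ScanNode`, and `ExchangeNode` classes here are simplified, hypothetical stand-ins for the planner classes (tuple IDs are plain integers and the constructors are invented for the demo); only the body of `getScanNodeInOneFragmentByTupleId` follows the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for a plan node; NOT the real Doris class.
class PlanNode {
    protected final List<PlanNode> children = new ArrayList<>();
    protected final Set<Integer> tupleIds;

    PlanNode(Set<Integer> tupleIds, PlanNode... kids) {
        this.tupleIds = tupleIds;
        this.children.addAll(Arrays.asList(kids));
    }

    // Finds the scan node producing tupleId without crossing a fragment
    // boundary: an ExchangeNode marks the edge of the current fragment.
    ScanNode getScanNodeInOneFragmentByTupleId(int tupleId) {
        if (this instanceof ScanNode && tupleIds.contains(tupleId)) {
            return (ScanNode) this;
        } else if (!(this instanceof ExchangeNode)) {
            for (PlanNode planNode : children) {
                ScanNode scanNode = planNode.getScanNodeInOneFragmentByTupleId(tupleId);
                if (scanNode != null) {
                    return scanNode;
                }
            }
        }
        return null;
    }
}

class ScanNode extends PlanNode {
    ScanNode(int tupleId) {
        super(Collections.singleton(tupleId));
    }
}

class ExchangeNode extends PlanNode {
    ExchangeNode(PlanNode child) {
        super(child.tupleIds, child);
    }
}

class TraversalDemo {
    public static void main(String[] args) {
        ScanNode left = new ScanNode(1);
        ScanNode right = new ScanNode(2);
        // A generic node sitting on top of the left scan, e.g. an aggregation.
        PlanNode agg = new PlanNode(left.tupleIds, left);
        // The right scan lives in another fragment, behind an exchange.
        PlanNode join = new PlanNode(new HashSet<>(Arrays.asList(1, 2)), agg, new ExchangeNode(right));

        System.out.println(join.getScanNodeInOneFragmentByTupleId(1) == left);
        System.out.println(join.getScanNodeInOneFragmentByTupleId(2) == null);
    }
}
```

In this demo the scan for tuple 1 is found through the intermediate node, while the scan for tuple 2 is not returned because it sits behind an exchange, which is the single fall-through `return null` case.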

@EmmyMiao87
Contributor Author

This comment is the same as the one on line 570; keeping one is enough.

removed

HappenLee
HappenLee previously approved these changes Mar 23, 2021
Contributor

@HappenLee HappenLee left a comment

LGTM

Collections.shuffle(list);
backends.add(list.get(0));
if (FeConstants.runningUnitTest) {
backends.addAll(list);
Contributor

Why add all?

Contributor Author

When running unit tests, add all the backends that belong to this IP.

@VariableMgr.VarAttr(name = DISABLE_COLOCATE_JOIN)
public boolean disableColocateJoin = false;
@VariableMgr.VarAttr(name = DISABLE_COLOCATE_PLAN)
public boolean disableColocatePlan = false;
Contributor

Why change the name?

Contributor Author

Because the previous configuration was named after colocate join and only controlled colocation for joins.
The new configuration uniformly controls colocation for all plan nodes,
so naming it after colocate join alone would be too narrow.

Contributor

@morningman morningman left a comment

LGTM

@morningman morningman added the approved Indicates a PR has been approved by one committer. label Apr 8, 2021
@morningman morningman merged commit a25e3af into apache:master Apr 11, 2021
EmmyMiao87 added a commit to EmmyMiao87/incubator-doris that referenced this pull request May 14, 2021
)

The old colocate join can only cover the case where the child is hash or scan.
In fact, as long as the child's data distribution meets the requirements,
no matter what the plan node on the child node is, a colocate join can be performed.

Change-Id: I07fc736087501d897ea26dd6b0f030e72cfddc9c
Successfully merging this pull request may close these issues.

Colocate plan: colocate join does not take effect when the query contains functions
3 participants