Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner: fix update panic when update in prepare and execute #26759

Merged
merged 35 commits into from
Aug 5, 2021

Conversation

AilinKid
Copy link
Contributor

@AilinKid AilinKid commented Jul 30, 2021

Signed-off-by: ailinkid 314806019@qq.com

What problem does this PR solve?

Issue Number: close #25997

What is changed and how it works?

What's Changed:

Background:
prepare and execute may have the difference schema to build the plan and run the statement. When we are in buildExecute, once we found the real statement of Execute is update read, we will substitute information schema with the lasted one. Latter, we use the latested schema to rebuild the plan, and run it.

Actually, `IS` is read only for all the travel path around optimize and exec. The `IS` and `TxnCtn.IS`
should always be the same, otherwise, it will cause some check fails. For example. `checkUpdateList`
panics in the plan building phase.

...
optimize(is) -> OptimizePreparedPlan(is) -> ... `is` changed
runstmt(is)
...

The modification of `is` is in the deep call chain. when function-call stack frame go back to the outer, since
the `is` struct is value copy model, the outer `is` is still the old one before calling the runstmt.

while the session TxnCtx hold the new `IS`, we can substitute the outer `is` once we found the schema
version difference before calling the runstmt

This pr solved several bad point.
1: do not convert point-table-scan to point-get if there are non-public column exists.
2: change schema in the deep call chain is not necessary and it's risky, since the outer schema is still old.
3: Binary-Protocol Execute

  • since it has been fetched in the entrance of optimize.(we remove these schema change code in the deep call chain, otherwise, the new plan built with new schema will encounter check error in execute phase with old schema)

4: Text-Protocol Execute

  • we need to fetch the latest schema as soon find it's execute stmtType with forUpdateRead flag as possible. here we choose to do this at the same level of optimize & runstmt.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
use the test script in the issue.

before this pr, it will panic
after this, it can stably run

Release note

planner: fix update panic when update in prepare and executenner: fix update panic when update in prepare and execute

Signed-off-by: ailinkid <314806019@qq.com>
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Jul 30, 2021

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • xhebox
  • you06

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 30, 2021
@AilinKid AilinKid requested a review from you06 July 30, 2021 10:18
@AilinKid AilinKid assigned xhebox and unassigned xhebox Jul 30, 2021
@AilinKid AilinKid requested a review from xhebox July 30, 2021 10:18
@AilinKid AilinKid requested a review from sticnarf July 30, 2021 10:21
Signed-off-by: ailinkid <314806019@qq.com>
@github-actions github-actions bot added sig/execution SIG execution sig/planner SIG: Planner sig/sql-infra SIG: SQL Infra labels Jul 30, 2021
Copy link
Contributor

@xhebox xhebox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is kinda hacky. Maybe we can workaround the panic in another low-impact way.

planner/core/logical_plan_builder.go Outdated Show resolved Hide resolved
is = domain.GetDomain(sctx).InfoSchema()
sctx.GetSessionVars().TxnCtx.InfoSchema = is
Copy link
Contributor

@xhebox xhebox Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetInfoSchema() is not equavalent to sctx.GetSessionVars().TxnCtx.InfoSchema, e.g. snapshot and stale infoschema. And it is intended that the schema of session is not always the global latest, it should be session-latest except some special cases like forUpdate. In fact, I found 2pc relied on this behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found 2pc relied on this behavior. I don't find this? where is it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only find it works on the resolveAccessPaths

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. these two information can be different! may be for this sort of forUpdateRead case, they should be the same.

Copy link
Contributor

@xhebox xhebox Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2pc relied on schema checker and amender. Changing infoschema between optimize and execute may bypass the checker and fail the correctness of amender(for most cases? I don't know why forUpdateRead will ask for the latest exactly, seems it will cause inconsistency frequently if it is not the latest in some parallel cases)

You reminds me that I've suggested to remove these lines, because I have moved the logic of choosing correct infoschema to the outside https://github.com/pingcap/tidb/pull/26759/files#r679868151: these lines should be stub, the correct infoschema has been chosen. But got refused for risks, because the transaction reviewer seems can not track the whole execution path, too.

I've looked more closely, server/conn will dispatch prepare and execute into the one in session, so yes, the correct infoschema will be passed down from session down to here. And chasing a even newer infoschema between two closely coupled optimize and execute is non-sense for me.

Copy link
Contributor

@xhebox xhebox Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a whole new proposal to define the behavior of concurrent DDL, because sync the latest = use the global latest. So all sessions will share the same infoschema as long as possible. It is been a gray area for a long time. I believe a talk to the transaction and runtime team is must.

session/session.go Outdated Show resolved Hide resolved
session/session.go Outdated Show resolved Hide resolved
executor/prepared.go Outdated Show resolved Hide resolved
executor/executor.go Outdated Show resolved Hide resolved
@zimulala zimulala self-requested a review August 2, 2021 07:50
Signed-off-by: ailinkid <314806019@qq.com>
Signed-off-by: ailinkid <314806019@qq.com>
Signed-off-by: ailinkid <314806019@qq.com>
Signed-off-by: ailinkid <314806019@qq.com>
Signed-off-by: ailinkid <314806019@qq.com>
Signed-off-by: ailinkid <314806019@qq.com>
Signed-off-by: ailinkid <314806019@qq.com>
@morgo morgo self-requested a review August 5, 2021 03:07
@@ -745,7 +755,7 @@ func (ds *DataSource) findBestTask(prop *property.PhysicalProperty, planCounter
p: dual,
}, cntPlan, nil
}
canConvertPointGet := len(path.Ranges) > 0 && path.StoreType == kv.TiKV
canConvertPointGet := len(path.Ranges) > 0 && path.StoreType == kv.TiKV && ds.isPointGetConvertableSchema()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if executing normal statements without prepare-execute model? It affects those statements now.

Copy link
Contributor Author

@AilinKid AilinKid Aug 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's ok, it's is a normal limitation.

@@ -157,6 +170,7 @@ type preprocessor struct {

// values that may be returned
*PreprocessorReturn
*PreprocessExecuteISUpdate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not name it Preprocessor like other members?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is PreprocessExecuteISUpdate necessary? How about just call IsExecuteForUpdateRead in ensureInfoSchema?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import circle problem, pass function poninter here is trying to do this

@@ -77,6 +77,24 @@ func IsReadOnly(node ast.Node, vars *variable.SessionVars) bool {
return ast.IsReadOnly(node)
}

// IsExecuteForUpdateRead is used to check whether the statement is `execute` and target statement has a forUpdateRead flag.
// If so, we will return the latest information schema.
func IsExecuteForUpdateRead(node ast.Node, sctx sessionctx.Context) infoschema.InfoSchema {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name sounds like a function returning a bool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

Signed-off-by: ailinkid <314806019@qq.com>
@you06
Copy link
Contributor

you06 commented Aug 5, 2021

/lgtm

@ti-chi-bot
Copy link
Member

@you06: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@AilinKid AilinKid added sig/transaction SIG:Transaction status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 5, 2021
@ti-chi-bot ti-chi-bot added status/LGT1 Indicates that a PR has LGTM 1. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Aug 5, 2021
@you06
Copy link
Contributor

you06 commented Aug 5, 2021

/approve

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 5, 2021
@AilinKid
Copy link
Contributor Author

AilinKid commented Aug 5, 2021

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: da070d7

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Aug 5, 2021
@zhouqiang-cl
Copy link
Contributor

/run-unit-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/execution SIG execution sig/planner SIG: Planner sig/sql-infra SIG: SQL Infra sig/transaction SIG:Transaction size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TiDB panics at CheckUpdateList after change columns
6 participants