domain, session: Add new sysvarcache to replace global values cache (#24359) #26030

Closed

@ti-srebot ti-srebot commented Jul 8, 2021

cherry-pick #24359 to release-4.0
You can switch your code base to this Pull Request by using git-extras:

# In tidb repo:
git pr https://github.com/pingcap/tidb/pull/26030

After applying your modifications, you can push your changes to this PR via:

git push git@github.com:ti-srebot/tidb.git pr/26030:release-4.0-0f10bef470f4

What problem does this PR solve?

Issue Number: close #24326

Problem Summary:

The existing global vars cache holds values for only 2 seconds and does not notify other servers when the cache becomes invalid. This PR changes the design to be basically the same as the privileges system cache.

It fixes two immediate bugs:

  • Read-after-write consistency when setting global vars (on the initiating instance only).
  • Running SHOW VARIABLES for the first time in a session previously took 1 second in some cases(!). It should now read the values from memory.

What is changed and how it works?

What's Changed:

  • The global vars cache is replaced with a new cache.
  • All operations check the cache first rather than reading from mysql.global_variables.
  • Updating a global variable sends a notification to other servers via etcd.
  • Updating a global variable is read-after-write consistent on a single server.
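
The behavior listed above can be illustrated with a minimal sketch. This is not the code from the PR: the types `store` and `sysVarCache`, the function `setGlobal`, and the `notify` callback are hypothetical names, the in-memory `store` stands in for mysql.global_variables, and the callback stands in for the etcd notification (which would be asynchronous in a real cluster). The variable name used in `main` is only an example.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the mysql.global_variables table.
type store struct {
	mu   sync.Mutex
	rows map[string]string
}

// readAll returns a copy of every persisted variable.
func (s *store) readAll() map[string]string {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string]string, len(s.rows))
	for k, v := range s.rows {
		out[k] = v
	}
	return out
}

// write persists a single variable.
func (s *store) write(name, val string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[name] = val
}

// sysVarCache is a hypothetical per-server cache of global variables.
type sysVarCache struct {
	mu     sync.RWMutex
	global map[string]string
	st     *store
}

func newSysVarCache(st *store) *sysVarCache {
	c := &sysVarCache{st: st}
	c.rebuild()
	return c
}

// rebuild reloads all values from the backing store and swaps them in.
func (c *sysVarCache) rebuild() {
	fresh := c.st.readAll()
	c.mu.Lock()
	c.global = fresh
	c.mu.Unlock()
}

// get serves reads from memory only; it never touches the store.
func (c *sysVarCache) get(name string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.global[name]
	return v, ok
}

// setGlobal persists the value, rebuilds the local cache before returning
// (read-after-write on this server), then notifies the other servers.
func setGlobal(st *store, local *sysVarCache, notify func(), name, val string) {
	st.write(name, val)
	local.rebuild()
	notify()
}

func main() {
	st := &store{rows: map[string]string{"max_connections": "151"}}
	serverA, serverB := newSysVarCache(st), newSysVarCache(st)

	// In this sketch the "notification" simply rebuilds the peer's cache.
	notify := func() { serverB.rebuild() }

	setGlobal(st, serverA, notify, "max_connections", "500")
	va, _ := serverA.get("max_connections")
	vb, _ := serverB.get("max_connections")
	fmt.Println(va, vb)
}
```

Because the local rebuild happens before `setGlobal` returns, the initiating server always sees its own write. Other servers see it only once their notification has been handled.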

This opens the door to fixing some remaining issues, but does not fix them yet: the session cache should be populated with a copy of all session vars when the session starts. Currently there is a lazy-loading mechanism, which is not MySQL compatible. It also caches some global variables in the session systems[], which is incorrect. In a follow-up PR I hope to remove the array builtinGlobalVariable and simplify loadCommonGlobalVariablesIfNeeded to just copy session vars.

Related changes

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)

I manually verified that the behavior is correct with multiple servers receiving notifications from etcd.

Side effects

  • There is a small upgrade issue with this PR: because the cache now lives for a long time, it relies on a notification from etcd that the cache is stale and needs to be refreshed. If the cluster is running with mixed versions and the SET GLOBAL statement is run on the older version, no etcd notification is sent to tell newer servers that their cache is stale. I think this behavior is acceptable, since the cache will be refreshed within a few minutes, but it should be made clear in the release notes.

Release note

  • The system variables cache has been replaced to follow a design similar to the privileges cache, where all TiDB servers are immediately notified of changes to variables. This lifts a previous limitation where changes to global variables may take 2 seconds to take effect. To take advantage of this feature, all TiDB servers need to be upgraded so that they are able to both publish and subscribe to system variable changes. Operating with mixed versions might result in an issue where system variable changes take longer (30 seconds) to propagate to the cluster.

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-chi-bot

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot

@ti-srebot: This cherry pick PR is for a release branch and has not yet been approved by release team.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick, it must first be approved by the collaborators.

AFTER it has been approved by collaborators, please ping the release team in a comment to request a cherry pick review.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-srebot

/run-all-tests

@ti-srebot ti-srebot added the sig/execution, sig/sql-infra, size/XXL, and type/4.0-cherry-pick labels Jul 8, 2021
@ti-srebot ti-srebot added this to the v4.0.13 milestone Jul 8, 2021
@morgo

morgo commented Jul 8, 2021

Hello! I'm sorry, I discussed cherry picking this before with my reviewers. We agreed that it looks like it is too risky unfortunately :(

I am going to close this issue. But please feel free to reach out to me if you have evidence to the contrary.

@morgo morgo closed this Jul 8, 2021