-
Notifications
You must be signed in to change notification settings - Fork 720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: pd can support bucket steam and save them. #4670
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
Codecov Report
@@ Coverage Diff @@
## master #4670 +/- ##
==========================================
- Coverage 75.42% 75.00% -0.42%
==========================================
Files 294 294
Lines 28334 28463 +129
==========================================
- Hits 21371 21349 -22
- Misses 5078 5213 +135
- Partials 1885 1901 +16
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
cf7f152
to
cdc56d9
Compare
06236c9
to
ea4c1d7
Compare
36d68c7
to
f9140d0
Compare
94751cb
to
2e22f1f
Compare
/open |
server/cluster/cluster_worker.go
Outdated
|
||
// HandleBucketHeartbeat processes RegionInfo reports from client | ||
func (c *RaftCluster) HandleBucketHeartbeat(buckets *metapb.Buckets) error { | ||
return c.processBucketHeartbeat(buckets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this abstruction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the first, HandleBucketHeartbeat should handle the change of the bucket meta(keys and version), but in the second , it will handle the statistics.
The implementation is almost the same as region heartbeat, can we abstract the common part to reduce code? |
7d81749
to
b20dd37
Compare
server/cluster/cluster.go
Outdated
} | ||
|
||
// region should not update if the version of the buckets is less than the old one. | ||
if old := region.GetBuckets(); old != nil && old.Version >= buckets.Version { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be old.Version > buckets.Version
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
region will not update if version is not change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the region heartbeat only checks <
here. As for a region, the version only changes when splitting or merging happens, so it's still necessary to update other statistical information while the version remains the same. I assume a bucket should also work like that, right?
pd/server/core/basic_cluster.go
Line 362 in f605a2c
if region.GetRegionEpoch().GetVersion() < item.GetRegionEpoch().GetVersion() && !isRegionRecreated(region) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a possibility that we have an ABA problem here? Consider the following case:
There are two requests coming at the same time. Both request A and request B have passed the version check, A is a little bit earlier than B. But B which has a larger version is finished earlier than A. After A is finished, B is lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the region heartbeat only checks
<
here. As for a region, the version only changes when splitting or merging happens, so it's still necessary to update other statistical information while the version remains the same. I assume a bucket should also work like that, right?pd/server/core/basic_cluster.go
Line 362 in f605a2c
if region.GetRegionEpoch().GetVersion() < item.GetRegionEpoch().GetVersion() && !isRegionRecreated(region) {
the version of the bucket maybe happened at cron job or wirte skew, so it should drop the buckets info if the version is same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a possibility that we have an ABA problem here? Consider the following case: There are two requests coming at the same time. Both request A and request B have passed the version check, A is a little bit earlier than B. But B which has a larger version is finished earlier than A. After A is finished, B is lost.
yes, the conor will be happened , I will use CAS to avoid the latest buckets to be replaced.
if r.approximateSize != 0 { | ||
return | ||
func (r *RegionInfo) Inherit(origin *RegionInfo) { | ||
// regionSize should not be zero if region is not empty. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can check whether the origin
is nil at the beginning of this function and return directly if it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
region size will be 1 if origin is nil and the size of the new region is zero.
23eb75a
to
7ef9147
Compare
Signed-off-by: bufferflies <1045931706@qq.com>
7ef9147
to
a78e632
Compare
a3acd39
to
93976a9
Compare
93976a9
to
d34547c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM. And we need a concurrency test for this PR.
0ac9ced
to
f2d0db6
Compare
Signed-off-by: bufferflies <1045931706@qq.com>
f2d0db6
to
9d481c8
Compare
I have added one concurrency test for ABA and one benchmark test for region update. |
/merge |
@bufferflies: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
This pull request has been accepted and is ready to merge. Commit hash: 23dbb43
|
What problem does this PR solve?
Issue Number: Close #4669
What is changed and how it works?
Check List
Tests
Code changes
Side effects
Related changes
Release note