
schedule: improve the leader distribution after region scatter #2659

Merged
merged 4 commits on Jul 28, 2020

Conversation

@nolouch (Contributor) commented Jul 16, 2020

Signed-off-by: nolouch <nolouch@gmail.com>

What problem does this PR solve?

Fix #2655

What is changed and how it works?

  • add a leader store counter
  • pick the store with the minimal leader count
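The two changes above can be sketched as a per-store counter plus a minimum-count picker. This is an illustrative standalone sketch with hypothetical names (`leaderCounter`, `pickMin`), not the actual PD implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// leaderCounter tracks how many scattered regions have placed their
// leader on each store. Hypothetical sketch, not PD's actual type.
type leaderCounter struct {
	mu     sync.Mutex
	counts map[uint64]uint64
}

func newLeaderCounter() *leaderCounter {
	return &leaderCounter{counts: make(map[uint64]uint64)}
}

// put records that a leader was placed on the given store.
func (c *leaderCounter) put(storeID uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[storeID]++
}

// pickMin returns the candidate store with the fewest leaders so far.
// Assumes at least one candidate.
func (c *leaderCounter) pickMin(candidates []uint64) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	best, bestCount := candidates[0], c.counts[candidates[0]]
	for _, id := range candidates[1:] {
		if c.counts[id] < bestCount {
			best, bestCount = id, c.counts[id]
		}
	}
	return best
}

func main() {
	c := newLeaderCounter()
	c.put(1)
	c.put(1)
	c.put(2)
	// Store 3 has received no leaders yet, so it is picked.
	fmt.Println(c.pickMin([]uint64{1, 2, 3})) // prints 3
}
```

In these terms, the scatterer would call `put` each time it selects a target leader store, so later regions are steered toward stores that have received fewer leaders so far.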

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)
    The new result with this PR:
MySQL [test]> select count(s.region_id) cnt, s.index_name, p.store_id from INFORMATION_SCHEMA.TIKV_REGION_STATUS s join INFORMATION_SCHEMA.tikv_region_peers p on s.region_id = p.region_id where s.table_name = 'ss' group by index_name, p.store_id order by index_name,cnt desc;
+-----+------------+----------+
| cnt | index_name | store_id |
+-----+------------+----------+
| 172 | NULL       |        6 |
| 172 | NULL       |       11 |
| 172 | NULL       |        1 |
| 170 | NULL       |       93 |
| 170 | NULL       |  3670905 |
| 170 | NULL       |       10 |
| 170 | NULL       |        4 |
| 170 | NULL       |        5 |
| 170 | NULL       |        8 |
|   1 | idx1       |       11 |
|   1 | idx1       |        1 |
|   1 | idx1       |        6 |
+-----+------------+----------+
12 rows in set (0.43 sec)

MySQL [test]> select count(s.region_id) cnt, s.index_name, p.store_id from INFORMATION_SCHEMA.TIKV_REGION_STATUS s join INFORMATION_SCHEMA.tikv_region_peers p on s.region_id = p.region_id where s.table_name = 'ss' and p.is_leader = 1 group by index_name, p.store_id order by index_name,cnt desc;
+-----+------------+----------+
| cnt | index_name | store_id |
+-----+------------+----------+
|  59 | NULL       |        1 |
|  58 | NULL       |        6 |
|  57 | NULL       |       93 |
|  57 | NULL       |       10 |
|  57 | NULL       |        8 |
|  57 | NULL       |        4 |
|  56 | NULL       |        5 |
|  56 | NULL       |  3670905 |
|  55 | NULL       |       11 |
|   1 | idx1       |        6 |
+-----+------------+----------+
10 rows in set (0.47 sec)

Release note

  • improve the leader distribution after region scatter

Signed-off-by: nolouch <nolouch@gmail.com>
@nolouch nolouch requested review from lhy1024 and disksing July 16, 2020 13:42
@Yisaer Yisaer self-requested a review July 17, 2020 07:22
@@ -153,6 +179,8 @@ func (r *RegionScatterer) scatterRegion(region *core.RegionInfo) *operator.Opera
}

scatterWithSameEngine(ordinaryPeers, r.ordinaryEngine)
// FIXME: target leader only consider the ordinary engine.
Contributor:

What about creating an issue to track this? (Ignore me if one already exists.)

Contributor Author (nolouch):

I added more comments.

server/schedule/region_scatterer.go: two outdated review threads (resolved)
@@ -153,6 +179,8 @@ func (r *RegionScatterer) scatterRegion(region *core.RegionInfo) *operator.Opera
}

scatterWithSameEngine(ordinaryPeers, r.ordinaryEngine)
// FIXME: target leader only consider the ordinary engine.
targetLeader := r.collectAvailableLeaderStores(targetPeers, r.ordinaryEngine)
Contributor:

Will the leader store be collected again after collectAvailableLeaderStores in the scatterWithSameEngine function?

Contributor Author (nolouch):

For now, only the ordinaryEngine is considered.

Comment on lines +38 to +42
func (s *selectedLeaderStores) put(id uint64) {
s.mu.Lock()
defer s.mu.Unlock()
s.stores[id] = s.stores[id] + 1
}
Contributor:

If the leader is transferred manually, will the old store's count be decremented and the new store's count incremented?

Contributor Author (nolouch):
Yes, you are right. This scheduler assumes that the leader and region will not change significantly after scheduling. Maybe we need to discuss optimization in another issue.
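A minimal standalone version of the counter quoted above makes the behavior under discussion concrete: counts only ever grow, so a manual leader transfer is not reflected. The surrounding type is reconstructed here as an assumption (a `get` accessor is added for illustration and is not part of the quoted diff):

```go
package main

import (
	"fmt"
	"sync"
)

// Reconstructed context for the diff hunk above; the field names are
// assumptions based on the snippet, not the exact PD source.
type selectedLeaderStores struct {
	mu     sync.Mutex
	stores map[uint64]uint64
}

// put increments the leader count for a store (as in the quoted diff).
func (s *selectedLeaderStores) put(id uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stores[id] = s.stores[id] + 1
}

// get is a hypothetical accessor, added here only for demonstration.
func (s *selectedLeaderStores) get(id uint64) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.stores[id]
}

func main() {
	s := &selectedLeaderStores{stores: make(map[uint64]uint64)}
	s.put(6)
	s.put(6)
	// Counts are monotone: there is no decrement path, so a manual
	// leader transfer away from store 6 would leave this count at 2.
	fmt.Println(s.get(6)) // prints 2
}
```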

@Yisaer (Contributor) commented Jul 27, 2020:

I think we could directly build a mechanism (a syncer or tracker) to record the correct leader count distribution.

Contributor Author (nolouch):

In fact, doing this would be more complicated. For example, on truncate table we would need to remove the regions from the tracker, and what about recovering a table? The current approach is at least feasible in general: there will be no more operators after the cluster is balanced, especially in a big cluster.

Contributor Author (nolouch):

But the tracker you mentioned is a good idea; I even want to use it to report why a given operator was produced.
cc @Yisaer

@lhy1024 (Contributor) left a comment:

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 20, 2020
@nolouch nolouch added the needs-cherry-pick-release-4.0 The PR needs to cherry pick to release-4.0 branch. label Jul 24, 2020
@rleungx (Member) left a comment:

LGTM

@ti-srebot ti-srebot removed the status/LGT1 Indicates that a PR has LGTM 1. label Jul 28, 2020
@ti-srebot ti-srebot added the status/LGT2 Indicates that a PR has LGTM 2. label Jul 28, 2020
@nolouch (Contributor Author) commented Jul 28, 2020:

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 28, 2020
@ti-srebot (Contributor):

/run-all-tests

@ti-srebot ti-srebot merged commit b1f967b into tikv:master Jul 28, 2020
ti-srebot pushed a commit to ti-srebot/pd that referenced this pull request Jul 28, 2020
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot (Contributor):

cherry pick to release-4.0 in PR #2684

@nolouch nolouch deleted the fix-scatter-leader branch July 28, 2020 03:22
nolouch added a commit to ti-srebot/pd that referenced this pull request Aug 3, 2020
ti-srebot added a commit that referenced this pull request Aug 3, 2020
Labels

  • component/schedule: Scheduling logic.
  • needs-cherry-pick-release-4.0: The PR needs to cherry pick to release-4.0 branch.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the scatter leader of the region behavior in the presplit
5 participants