[Enhancement] Optimize Tablet Report #54848
Conversation
@@ -34,6 +34,7 @@ struct TMasterInfo {
     11: optional list<string> disabled_disks
     12: optional list<string> decommissioned_disks
     13: optional bool encrypted;
+    14: optional bool stop_regular_tablet_report;
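For context, here is a minimal sketch of how the FE leader might propagate this flag through the heartbeat it sends to each BE. `TMasterInfo` and `stop_regular_tablet_report` come from the diff above; the setter name assumes the standard Thrift-generated Java bindings, and `HeartbeatSender`, `buildMasterInfo`, and `pullModeEnabled` are hypothetical names, not the PR's actual code:

```java
// Assumed import of the Thrift-generated class for the struct in the diff.
import com.starrocks.thrift.TMasterInfo;

public class HeartbeatSender {
    private final boolean pullModeEnabled;

    public HeartbeatSender(boolean pullModeEnabled) {
        this.pullModeEnabled = pullModeEnabled;
    }

    public TMasterInfo buildMasterInfo() {
        TMasterInfo info = new TMasterInfo();
        // Tell the BE to skip its periodic full tablet report; the leader
        // pulls tablets itself. BEs may still push emergency reports.
        info.setStop_regular_tablet_report(pullModeEnabled);
        return info;
    }
}
```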
Please add a comment explaining what this flag means, and note in which version it can be deprecated.
+1
[Java-Extensions Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass : 71 / 84 (84.52%) file detail
[BE Incremental Coverage Report] ✅ pass : 28 / 30 (93.33%) file detail
Why I'm doing:
In StarRocks, the FE periodically diffs the tablets on each BE against the tablets recorded in metadata, then repairs any inconsistencies. In the current implementation, every BE pushes its full set of tablets to the FE Leader on a fixed schedule (every 1 minute by default); the Leader keeps these reports in a queue and processes one BE's tablets at a time on a single thread. In a large cluster, the FE usually cannot process reports as fast as the BEs produce them, so the queue can end up holding the tablets of every BE at once, wasting memory. This optimization switches to an active pull mode on the Leader, which bounds the reporting queue to roughly one BE's worth of tablets.
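To make the memory issue concrete, here is a hedged sketch of the pre-optimization push model: every BE pushes a full report into one shared queue drained by a single thread, so with N BEs reporting every minute and slow processing, up to N full reports can sit in memory at once. All names here are illustrative, not StarRocks code:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PushModeQueue {
    record TabletReport(long backendId, List<Long> tabletIds) {}

    // Unbounded: nothing prevents it from holding every BE's tablets at once.
    private final BlockingQueue<TabletReport> queue = new LinkedBlockingQueue<>();

    // Called by the report RPC handler whenever any BE pushes (default: once a minute).
    public void onReport(TabletReport report) {
        queue.offer(report);
    }

    // Single consumer thread: processes one BE's report at a time.
    public void processLoop() throws InterruptedException {
        while (true) {
            TabletReport report = queue.take();
            // ... diff report.tabletIds() against FE metadata, repair mismatches
        }
    }
}
```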
What I'm doing:
After this optimization, a new TabletController daemon periodically pulls the full set of tablets from each Backend; a sketch of this loop follows below. The pull condition is
The BE still retains the ability to push tablet reports to the FE Leader, but only for emergencies, such as a corrupted disk whose replicas must be removed from FE metadata immediately.
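Here is a minimal sketch of the pull loop described above: a TabletController daemon on the FE Leader walks the Backends and pulls and processes one full tablet report at a time, so the in-memory backlog never exceeds roughly one BE's tablets. Apart from the name `TabletController`, which the PR introduces, all names are assumptions for illustration:

```java
import java.util.List;

public class TabletController extends Thread {
    interface BackendClient {
        List<Long> pullAllTablets(long backendId); // assumed RPC to one BE
    }

    private final List<Long> backendIds;
    private final BackendClient client;
    private final long intervalMs;

    public TabletController(List<Long> backendIds, BackendClient client, long intervalMs) {
        this.backendIds = backendIds;
        this.client = client;
        this.intervalMs = intervalMs;
    }

    @Override
    public void run() {
        while (!isInterrupted()) {
            for (long beId : backendIds) {
                // Pull and fully process one BE before touching the next,
                // bounding memory to a single BE's report.
                List<Long> tablets = client.pullAllTablets(beId);
                // ... diff tablets against FE metadata and repair mismatches
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```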
Test (a cluster with 5 million tablets)
After optimization: [screenshot: FE GC time]
Before optimization: [screenshot: FE GC time]
We can see that the GC time has become smoother.