Support THP in tiup cluster check #964

Merged 8 commits on Dec 9, 2020

@anywhy anywhy commented Dec 3, 2020

What problem does this PR solve?

Try to implement #798.

What is changed and how it works?

Check List

Tests

  • Integration test

Release notes:

NONE

@ti-chi-bot ti-chi-bot requested review from lonng and lucklove December 3, 2020 01:37
@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 3, 2020

codecov-io commented Dec 3, 2020

Codecov Report

Merging #964 (331114f) into master (a9370cd) will decrease coverage by 3.82%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #964      +/-   ##
==========================================
- Coverage   55.44%   51.61%   -3.83%     
==========================================
  Files         263      263              
  Lines       19833    19854      +21     
==========================================
- Hits        10996    10248     -748     
- Misses       7118     7972     +854     
+ Partials     1719     1634      -85     
Flag         Coverage Δ
cluster      38.35% <0.00%> (-5.26%) ⬇️
dm           24.18% <0.00%> (-0.09%) ⬇️
integrate    46.38% <0.00%> (-3.88%) ⬇️
playground   20.56% <ø> (ø)
tiup         16.79% <ø> (ø)
unittest     22.94% <0.00%> (-0.02%) ⬇️

Impacted Files                          Coverage Δ
components/cluster/command/check.go     6.27% <0.00%> (-72.93%) ⬇️
pkg/cluster/operation/check.go          0.00% <0.00%> (-51.95%) ⬇️
pkg/cluster/task/check.go               0.00% <0.00%> (-24.44%) ⬇️
pkg/cluster/task/limits.go              0.00% <0.00%> (-68.75%) ⬇️
pkg/cluster/task/sysctl.go              0.00% <0.00%> (-66.67%) ⬇️
components/cluster/command/audit.go     27.27% <0.00%> (-54.55%) ⬇️
pkg/cluster/task/rmdir.go               0.00% <0.00%> (-50.00%) ⬇️
pkg/cluster/operation/operation.go      34.78% <0.00%> (-43.48%) ⬇️
... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -58,6 +58,7 @@ var (
CheckNameSELinux = "selinux"
CheckNameCommand = "command"
CheckNameFio = "fio"
CheckNameTHP = "THP"
Contributor

To stay consistent with the existing code style, use lowercase.

Contributor Author

thx, got it

@@ -33,6 +33,7 @@ var (
CheckTypePackage = "package"
CheckTypePartitions = "partitions"
CheckTypeFIO = "fio"
CheckTypeTHP = "THP"
Contributor

IMO, THP should be part of CheckTypeSystemConfig

@@ -651,3 +652,20 @@ func CheckFIOResult(rr, rw, lat []byte) []*CheckResult {

return results
}

// CheckTHPEnabled checks THP in /sys/kernel/mm/transparent_hugepage/enabled,/sys/kernel/mm/transparent_hugepage/defrag
func CheckTHPEnabled(opt *CheckOptions, l []byte) []*CheckResult {
Contributor

s/CheckTHPEnabled/CheckTHP/

var results []*CheckResult

for _, line := range strings.Split(string(l), "\n") {
if !strings.Contains(line, "[never]") {
Contributor

IMO, if THP is disabled, we should also add a Msg result
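
The suggestion above can be sketched as follows. This is only an illustration, not the code that was merged: `CheckResult` here is a simplified, hypothetical stand-in for the PR's result type, and `checkTHP`, the result name `"thp"`, and the message texts are assumptions. The function returns a result for every non-empty line, with a `Msg` when THP is disabled and an `Err` otherwise.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// CheckResult is a simplified, hypothetical stand-in for the result type
// used in the PR; the real struct may carry different fields.
type CheckResult struct {
	Name string
	Err  error
	Msg  string
}

// checkTHP inspects the combined contents of the THP "enabled" and "defrag"
// files and returns one result per non-empty line: a Msg when the line
// selects "[never]", an Err otherwise.
func checkTHP(output []byte) []*CheckResult {
	var results []*CheckResult
	for _, line := range strings.Split(string(output), "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue // skip blank lines, e.g. the one after a trailing newline
		}
		if strings.Contains(line, "[never]") {
			results = append(results, &CheckResult{Name: "thp", Msg: "THP is disabled"})
			continue
		}
		results = append(results, &CheckResult{
			Name: "thp",
			Err:  errors.New("THP is enabled, please disable it for best performance"),
		})
	}
	return results
}

func main() {
	// Sample input: first file reports THP disabled, second reports it enabled.
	out := []byte("always madvise [never]\n[always] madvise never\n")
	for _, r := range checkTHP(out) {
		if r.Err != nil {
			fmt.Println("Fail:", r.Err)
		} else {
			fmt.Println("Pass:", r.Msg)
		}
	}
}
```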

@@ -31,6 +31,7 @@ function cmd_subtest() {
echo $check_result | grep "os-version"
echo $check_result | grep "selinux"
echo $check_result | grep "service"
echo $check_result | grep "THP"
Contributor

ditto

).
Shell(
inst.GetHost(),
"cat /sys/kernel/mm/transparent_hugepage/defrag",
Contributor

  1. Maybe you can merge the two commands into one:
cat /sys/kernel/mm/transparent_hugepage/{enabled,defrag}
  2. IMO, maybe you can implement the THP check like CheckSELinux()
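
If the two files are read with a single command as suggested, the output has one line per file, and in each line the active setting is the token in square brackets. A minimal sketch of parsing that output, assuming the `enabled` line comes first and the `defrag` line second; `activeMode` and `parseTHP` are hypothetical helper names, not functions from tiup. Note that the `{enabled,defrag}` brace expansion is a shell feature that not every `/bin/sh` supports.

```go
package main

import (
	"fmt"
	"strings"
)

// activeMode returns the bracketed token of a THP sysfs line such as
// "always madvise [never]", or "" when no token is bracketed.
func activeMode(line string) string {
	for _, field := range strings.Fields(line) {
		if strings.HasPrefix(field, "[") && strings.HasSuffix(field, "]") {
			return strings.Trim(field, "[]")
		}
	}
	return ""
}

// parseTHP splits the combined output into the active "enabled" and
// "defrag" modes. It assumes the enabled line is printed first.
func parseTHP(output string) (enabled, defrag string) {
	lines := strings.Split(strings.TrimSpace(output), "\n")
	if len(lines) > 0 {
		enabled = activeMode(lines[0])
	}
	if len(lines) > 1 {
		defrag = activeMode(lines[1])
	}
	return enabled, defrag
}

func main() {
	out := "always madvise [never]\nalways defer defer+madvise [madvise] never\n"
	enabled, defrag := parseTHP(out)
	fmt.Printf("enabled=%s defrag=%s\n", enabled, defrag)
}
```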

@@ -442,6 +442,11 @@ func fixFailedChecks(ctx *task.Context, host string, res *operator.CheckResult,
),
true)
msg = fmt.Sprintf("will try to %s, reboot might be needed", color.HiBlueString("disable SELinux"))
case operator.CheckNameTHP:
t.Shell(host,
"echo never > /sys/kernel/mm/transparent_hugepage/enabled && echo never > /sys/kernel/mm/transparent_hugepage/defrag",
Contributor

@9547 9547 Dec 5, 2020

This disables THP only at runtime. If the system is rebooted, THP may be enabled again. Should we disable it permanently?

  • If it needs to be disabled permanently: it seems we would need to change grub.cfg, but this file may differ between Linux distributions;
  • If it doesn't: we should tell the user that this fix only lasts until the next reboot

I prefer the latter.

Member

Since the original issue refers to tidb-ansible, keeping the same behavior as tidb-ansible is probably enough.

Contributor

It seems this config does not have an entry in sysctl, so the only way to make it persist after a reboot would be to add a kernel parameter at boot. Ref: https://www.kernel.org/doc/html/v5.4/admin-guide/mm/transhuge.html#boot-parameter

Member

It's impractical to add a kernel parameter; just remind the user to run tiup cluster check again after reboot.

Contributor

@9547 9547 left a comment

/lgtm

Member

lucklove commented Dec 9, 2020

/merge

@ti-chi-bot
Member

@lucklove: adding 'status/can-merge' to this PR must have 1 LGTMs

In response to this:

/merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the tidb-community-bots/ti-community-prow repository.

Member

lucklove commented Dec 9, 2020

/lgtm

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 9, 2020
Member

lucklove commented Dec 9, 2020

/merge

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Dec 9, 2020
@ti-chi-bot
Member

Can merge label has been added.

Git tree hash: 9725709

@ti-chi-bot
Member

@anywhy: Your PR was out of date and I have automatically updated it for you.

@ti-chi-bot ti-chi-bot merged commit 250c47b into pingcap:master Dec 9, 2020
@anywhy anywhy deleted the disable_THP branch December 19, 2020 11:50
lucklove added a commit to lucklove/tiup that referenced this pull request Dec 22, 2020
Introduced by pingcap#964

The output of the cat command ends with a newline,
so the '[never]' check fails on the resulting empty line.
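
The failure described in this commit message can be reproduced in isolation. The sketch below is an illustration only: `naiveAllDisabled` mirrors the split-and-check loop from the original diff, and `allDisabled` shows one possible remedy (trimming the output before splitting). Both function names are hypothetical, and the actual follow-up commit may have fixed the problem differently.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveAllDisabled mirrors the loop from the original diff: every element
// produced by splitting on "\n" must contain "[never]". A trailing newline
// produces a final empty element, which fails the check.
func naiveAllDisabled(output string) bool {
	for _, line := range strings.Split(output, "\n") {
		if !strings.Contains(line, "[never]") {
			return false
		}
	}
	return true
}

// allDisabled trims surrounding whitespace before splitting, so the
// trailing newline no longer yields an empty element.
func allDisabled(output string) bool {
	for _, line := range strings.Split(strings.TrimSpace(output), "\n") {
		if !strings.Contains(line, "[never]") {
			return false
		}
	}
	return true
}

func main() {
	out := "always madvise [never]\nalways madvise [never]\n"
	fmt.Println(naiveAllDisabled(out)) // false: the empty last element fails
	fmt.Println(allDisabled(out))      // true
}
```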
ti-chi-bot pushed a commit that referenced this pull request Dec 24, 2020
lucklove added a commit that referenced this pull request Dec 31, 2020
Labels
size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT1 Indicates that a PR has LGTM 1.