cgroups: make sure cgroup still exists after task restart #12875
Conversation
Repro example:

job "restarts" {
  datacenters = ["dc1"]
  type        = "service"

  group "g1" {
    restart {
      attempts = 5
      mode     = "delay"
      delay    = "1s"
      interval = "5s"
    }

    task "t1-raw_exec" {
      driver = "raw_exec"

      config {
        command = "/usr/bin/bash"
        args    = ["-c", "sleep 10"]
      }
    }

    task "t2-exec" {
      driver = "exec"

      config {
        command = "/usr/bin/bash"
        args    = ["-c", "sleep 10"]
      }
    }
  }
}

with fix:
Force-pushed from 3f0b766 to e5e3e99
Running relevant tests locally on a cgroups v2 machine:
test-cgutil.log
This PR modifies raw_exec and exec to ensure the cgroup for a task they are driving still exists during a task restart. These drivers have the same bug but with different root causes.

For raw_exec, we were removing the cgroup in two places: the cpuset manager and the unix containment implementation (the thing that uses the freezer cgroup to clean house). During a task restart, the containment would remove the cgroup, and when the task went to start again the task runner hooks would block waiting for the cgroup to exist, which would never happen, because the cgroup gets created by the cpuset manager, which only runs as an alloc pre-start hook. The fix here is to simply not delete the cgroup in the containment implementation; killing the PIDs is enough, and the removal happens in the cpuset manager later anyway.

For exec, it's the same idea, except that DestroyTask is called on task failure, which in turn calls into libcontainer, which in turn deletes the cgroup. In this case we do not have control over the deletion of the cgroup, so instead we hack the cgroup back into life after the call to DestroyTask.

All of this only applies to cgroups v2.

Fixes #12863

No CL because cgroupsv2 hasn't shipped yet.
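For illustration, here is a minimal Go sketch of the containment-side idea: kill every process listed in a cgroups v2 cgroup's cgroup.procs file while leaving the cgroup directory itself in place for a later cleanup stage. The paths and function names are hypothetical assumptions for the example, not Nomad's actual implementation.

// Sketch only: kill the processes in a cgroups v2 cgroup without
// removing the cgroup directory, so a restarting task runner that
// waits on the cgroup's existence is not left blocked forever.
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"syscall"
)

// killCgroupProcs SIGKILLs each PID listed in cgroup.procs but
// deliberately does not rmdir the cgroup; in the scheme described
// above, removal is left to the cpuset manager.
func killCgroupProcs(cgroupDir string) error {
	f, err := os.Open(filepath.Join(cgroupDir, "cgroup.procs"))
	if err != nil {
		return fmt.Errorf("open cgroup.procs: %w", err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		pid, err := strconv.Atoi(scanner.Text())
		if err != nil {
			continue // skip malformed lines
		}
		// Best effort: the process may already have exited.
		_ = syscall.Kill(pid, syscall.SIGKILL)
	}
	return scanner.Err()
}

func main() {
	// Hypothetical task cgroup path, for demonstration only.
	if err := killCgroupProcs("/sys/fs/cgroup/nomad.slice/example.scope"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}

On kernels 5.14 and newer, writing "1" to the cgroup's cgroup.kill file achieves the same bulk kill in one step; the per-PID loop above works on older cgroups v2 kernels as well.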
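And a hedged sketch of the exec-side workaround: since the cgroup is deleted inside a destroy path the caller does not control (libcontainer, via DestroyTask), the caller simply recreates the directory afterwards so restart hooks that wait on its existence can proceed. Again, the path and helper name are illustrative assumptions, not the PR's actual code.

// Sketch only: after an uncontrollable destroy step has removed the
// task's cgroup, recreate the empty cgroups v2 directory so that
// restart hooks waiting for it to exist can make progress.
package main

import (
	"fmt"
	"os"
)

// ensureCgroupExists recreates the cgroup directory if the destroy
// step removed it. MkdirAll is a no-op when the directory survived.
func ensureCgroupExists(cgroupDir string) error {
	if err := os.MkdirAll(cgroupDir, 0o755); err != nil {
		return fmt.Errorf("recreate cgroup %s: %w", cgroupDir, err)
	}
	return nil
}

func main() {
	const dir = "/sys/fs/cgroup/nomad.slice/example.scope" // hypothetical path
	// ... the destroy call that may delete dir would run here ...
	if err := ensureCgroupExists(dir); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}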
Force-pushed from e5e3e99 to 37ffd2f
Will follow up with CI changes to run on ubuntu-22.04 in GHA
LGTM
Does the most recent commit cover #12877 as well, do you think, or is that going to need to be a separate investigation?
Let's keep it open; so far I haven't been able to reproduce it, but my laptop is on an older kernel.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.