kubelet: change image-gc-high-threshold below docker dm.min_free_space #40432
Conversation
The change makes sense to me, but @dashpole, could you please take another look at this? |
@sjenning do you think this is the cause of this issue? I noticed many people were using devicemapper, and some were wondering why image garbage collection hadn't kicked in. The change makes sense to me as well. Would it also make sense to decrease the ImageGCLowThresholdPercent? I am not actually sure what kubemark is, but I found a default for ImageGCHighThresholdPercent there as well. Should we change that value to match? |
@dashpole could be. The issue seems to have a number of different cases in it. The case this PR handles is when Docker refuses to pull/start a container due to low dm thin pool space, yet the node doesn't report disk pressure and doesn't do image GC, effectively wedging the node. |
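Concretely: with docker's default dm.min_free_space of 10%, pulls start failing at 90% thin pool usage, which was exactly the old GC trigger point. A minimal sketch of the operator-side workaround, assuming the devicemapper storage driver (the flags are real kubelet flags; the specific values are illustrative and simply keep the high threshold below (100 - dm.min_free_space)%):

```console
# Trigger periodic image GC before docker's 10% thin pool reserve is hit,
# so the kubelet reclaims space while pulls can still succeed.
kubelet --image-gc-high-threshold=85 --image-gc-low-threshold=80 ...
```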
/approve I approve this change, but expect @dashpole to review the code more. Thanks! |
This is /lgtm |
@dchen1107 this lgtm as well |
if and when we look to define a default for imagefs based eviction thresholds, we need to keep this in mind as well. |
@derekwaynecarr, why is that? The image-gc-high-threshold modified in this PR only affects periodic garbage collection. Eviction bypasses this check and directly triggers deletion of all unused images IIRC. |
I wonder whether it's possible to change the docker setting to 8%, for example? Given that disk reclamation is slow, I suspect that even with this PR we will see image pull failures. |
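For reference, the docker-side knob being discussed is a devicemapper storage option; a sketch of lowering it, assuming the devicemapper storage driver (the 8% figure is just the value floated above, not a recommendation):

```console
# dm.min_free_space is the reserve below which docker fails new thin pool
# device creation (image pulls, container creation).
dockerd --storage-driver=devicemapper --storage-opt dm.min_free_space=8%
```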
I don't see why we have to change kube defaults based on the behavior of a single docker storage driver. |
/approve cancel |
[APPROVALNOTIFIER] This PR is APPROVED The following people have approved this PR: dchen1107, sjenning Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing /approve in a comment. |
@vishh -- shouldn't the default value kube selects work reasonably well across the popular set of container runtime storage drivers out of the box? if operators wanted to be more aggressive for a particular storage driver, they could raise the default. it seems anti-user to ship a default that doesn't work well on popular options? |
to phrase it another way, we should select a default for the OSS version of k8s that works for the broadest number of users and causes the least amount of noise, in the form of related issue creation, when things just do not work. |
I'd like to understand the rationale behind docker's 10% limit before having kube *just* embrace that default. |
@vishh -- prior to sending this PR, @sjenning and I spoke with @mrunalp and @rhvgoyal to understand why the 10% value for dm.min_free_space was chosen; it was chosen because they felt it was a sensible default, and reducing it below 10% felt too low in their opinion. They can weigh in w/ more details if they feel I am misrepresenting our discussion, but it seemed fair. My prior point remains: I think OSS k8s defaults should work with the broadest set of container runtime storage driver options out of the box, just to reduce the amount of issues/noise/support in the broader community when something does not work. Are we more likely to get issues that we are not utilizing an extra 5% of disk, or that something just doesn't work on CentOS/Fedora/etc when using POSIX-compliant storage drivers? We are not going to get the existing versions of docker out in the wild to all change. |
@derekwaynecarr k8s has a lot of workarounds to integrate with docker. I would like to understand if this is one such workaround, in which case it would be better to have a plan to try and fix docker in the future, or if it is a necessary permanent change. Just curious: could docker's default be 5%, for example, in which case k8s would prevent the node from ever reaching that low threshold? I agree that we want kube to work by default across most distros. |
Ping. This PR is still not merged. |
@vishh - for the moment, we are carrying a reduced default in openshift. for k8s, i am still not sure how to proceed. as a default, i think 85% in OSS is not the worst. ideally, we would have a way to ask a container runtime if it was doing any behind-the-scenes image management behavior like this, and dynamically reduce our settings or warn in the kubelet in response if the one-size-fits-all default is not portable. |
Instead of attempting to understand each storage driver, can we instead treat the capacity of a devicemapper partition to be 90% of its actual capacity? |
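A minimal sketch of what this capacity-discounting idea could look like, purely illustrative rather than code from this PR (the function name, and hard-coding docker's 10% dm.min_free_space default, are assumptions):

```go
package main

import "fmt"

// Docker's default devicemapper reserve; assumed here, not queried.
const dmMinFreeSpacePercent = 10

// effectiveCapacity discounts a thin pool's capacity by the slice docker
// reserves, so usage percentages are measured against space docker will
// actually let the kubelet consume.
func effectiveCapacity(actualBytes uint64) uint64 {
	return actualBytes * (100 - dmMinFreeSpacePercent) / 100
}

func main() {
	var thinPool uint64 = 100 << 30 // hypothetical 100 GiB thin pool
	fmt.Printf("advertised capacity: %d GiB\n", effectiveCapacity(thinPool)>>30)
}
```

With capacity discounted this way, the existing 90% high threshold would correspond to 81% of the physical thin pool, comfortably below docker's 90% cutoff.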
Automatic merge from submit-queue |
docker dm.min_free_space defaults to 10%, which "specifies the min free space percent in a thin pool required for new device creation to succeed. ... Whenever a new thin pool device is created (during docker pull or during container creation), the Engine checks if the minimum free space is available. If sufficient space is unavailable, then device creation fails and any relevant docker operation fails." [1]
This setting prevents storage usage from crossing the 90% limit. However, image GC is expected to kick in only beyond image-gc-high-threshold, which has a default value of 90%, so GC never triggers. If image-gc-high-threshold is set to a value lower than (100 - dm.min_free_space)%, GC triggers.
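Putting numbers to that, assuming docker's defaults (a worked example, not output from any tool; 85 matches the value discussed in the thread):

```console
# dm.min_free_space = 10%  =>  docker fails new device creation at 90% usage
# old default: image-gc-high-threshold = 90  =>  90 is not < (100 - 10); GC never fires
# this PR:     image-gc-high-threshold = 85  =>  85 < (100 - 10); GC fires at 85% usage
```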
xref https://bugzilla.redhat.com/show_bug.cgi?id=1408309
@derekwaynecarr @sdodson @rhvgoyal