Set NSS_SDB_USE_CACHE=no to avoid memory growth #1716

jordansissel · 2019-09-11T22:10:11Z

On affected systems[1], each readinessProbe invocation costs about 5 MB of memory (file system cache), which in some cases[2] is observed to never be reclaimed. This is a bug in the kernel and/or Kubernetes, but in the meantime, this env var can help prevent the readinessProbe from causing memory usage growth and eventually causing Kubernetes to evict pods.

[1] Certain versions of CentOS and GKE are known to be affected.
[2] It is observed in production that for pods without a strict memory limit, the container's file system cache (dentry) is not reclaimed appropriately, causing the kubelet to believe the pod is using far more memory than is actually being used by processes in the pod.

References: #1635

This PR is a proposal to work around a bug in the kernel/kubernetes. The reason I propose it is that this issue took 5 working days to investigate[3] and eventually find a workaround. I would spare others this same fate by having the workaround enabled by default ;)

[3] The investigation traveled the gradient from elasticsearch, to kubernetes, to the linux kernel, and all things in between. It was an interesting but ultimately unenjoyable adventure given I really didn't want to be spending a week digging into the kernel.

jordansissel · 2019-09-11T22:17:06Z

We have this deployed by setting env vars in one production cluster (ECK 0.9.0):

        containers: 
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: -Xms5g -Xmx5g
          # Curl in the readinessProbe triggers a kernel memory accounting bug (feature?),
          # so we disable that part of curl.
          - name: NSS_SDB_USE_CACHE
            value: "no"
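For reference, enabling the workaround by default amounts to adding the variable to the container's env unless the user already set it. The following is only a minimal sketch of that idea, not the code in this PR: `EnvVar` is a local stand-in for the Kubernetes `corev1.EnvVar` type, and `withDefaultEnv` is a hypothetical helper name.

```go
package main

import "fmt"

// EnvVar is a local stand-in for corev1.EnvVar so this sketch has no
// external dependencies (assumption: only Name and Value matter here).
type EnvVar struct {
	Name  string
	Value string
}

// withDefaultEnv (hypothetical helper) appends def to env only when no
// variable with the same name is already present, so a value set
// explicitly by the user always wins over the default.
func withDefaultEnv(env []EnvVar, def EnvVar) []EnvVar {
	for _, e := range env {
		if e.Name == def.Name {
			return env
		}
	}
	return append(env, def)
}

func main() {
	// User-provided env, as in the YAML above.
	env := []EnvVar{{Name: "ES_JAVA_OPTS", Value: "-Xms5g -Xmx5g"}}

	// Apply the workaround as a default.
	env = withDefaultEnv(env, EnvVar{Name: "NSS_SDB_USE_CACHE", Value: "no"})

	for _, e := range env {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

Checking for an existing entry first keeps the default overridable, which matters if someone needs to re-enable the cache on a system that is not affected.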

The impact of this change is observed in the following charts:

Memory usage over the one-week period in which I was engaged with this exciting issue. Note the massive memory growth (caused by readinessProbe/curl) and high plateau (far above the 'memory request'). The right side of the chart shows the beginning of my test, where I set the NSS_SDB_USE_CACHE=no env var.

If we zoom into a 24-hour period just before I enabled this env var, we can see that memory usage is quite stable:

jordansissel · 2019-09-11T22:57:19Z

If there's 👍 on this proposed implementation, I can write tests. Let me know whatcha like :)

sebgl · 2019-09-12T08:23:37Z

Awesome ❤️
We probably need to update unit tests accordingly - something the team can do if you're busy @jordansissel.

jordansissel · 2019-09-12T18:06:38Z

> need to update unit tests accordingly

Ahh, env vars need to be sorted (I see `WithEnv` sorts them, but the `expected := ...` in the test suite isn't sorted). I pushed a new commit which should fix this particular failure (it passes locally). Waiting on CI now.

On affected systems[1], the readinessProbe costs 5mb of memory per invocation which is observed in some cases[2] to never be reclaimed. This is a bug in the kernel and/or kubernetes, but in the meantime, this env var can help prevent the readinessProbe from causing memory usage growth and eventually causing Kubernetes to evict pods.

[1] Certain versions of CentOS and GKE are known affected.
[2] It is observed in production that for pods without a strict memory `limit`, the container's file system cache (dentry) is not reclaimed appropriately, causing the kubelet to believe the pod is using far more memory than is actually being used by processes in the pod.

Fix unit tests to sort env vars so that comparisons will succeed.

References: #1635

anyasabo · 2019-09-12T19:48:08Z

pkg/controller/elasticsearch/nodespec/podspec_test.go

@@ -180,6 +180,13 @@ func TestBuildPodTemplateSpec(t *testing.T) {
 		},
 	}

+	// pods built with BuildPodTemplateSpec sort env vars, so our expected result must be sorted as well.


👍 My initial idea was to use go-spew (which might also make the error output below easier to read), but that will only sort map keys.

anyasabo

Thanks for adding and including the issue links in the comments

jordansissel · 2019-09-12T20:02:34Z

The elasticsearch-ci/docs check is stalled (or dead?) due to problems on elasticsearch-ci. The infra team is aware and working on the problem.

jordansissel · 2019-09-12T23:24:41Z

CI finally finished (<3 to our infra team who've been working on stabilizing Jenkins resources much of today)

jordansissel · 2019-09-16T15:35:22Z

Copying other references here for posterity:

Allow ECK to specify a custom readiness check for Elasticsearch and Kibana #1581 (comment)
https://issuetracker.google.com/issues/140577001
kubelet counts active page cache against memory.available (maybe it shouldn't?) kubernetes/kubernetes#43916

jordansissel force-pushed the issue/disable-curl-cache branch from 708a864 to b1366d6 on September 11, 2019 22:28

sebgl mentioned this pull request Sep 12, 2019

Cluster goes into CrashLoopBackOff after 5-10 minutes of running #1076

Closed

jordansissel force-pushed the issue/disable-curl-cache branch from 95a6ee9 to c25a794 on September 12, 2019 19:03

anyasabo reviewed Sep 12, 2019

anyasabo approved these changes Sep 12, 2019

jordansissel merged commit d083077 into master Sep 12, 2019

jordansissel mentioned this pull request Sep 12, 2019

Allow ECK to specify a custom readiness check for Elasticsearch and Kibana #1581

Closed

jordansissel deleted the issue/disable-curl-cache branch September 12, 2019 23:44

jordansissel mentioned this pull request Sep 19, 2019

Set default resource limits for ES/Kibana/APM #1454

Closed

jomeier mentioned this pull request Apr 13, 2020

[vSphere] extreme high, rising memory consumption of etcd-quorum-guards in machine-config-operator okd-project/okd#146

Closed

jmlrt mentioned this pull request Sep 23, 2020

[elasticsearch][kibana] disable nss dentry cache elastic/helm-charts#818

Merged
