
Choose JVM options ergonomically #30684


Merged 3 commits into elastic:master on Jun 20, 2018

Conversation

danielmitterdorfer (Member)

With this commit we add the possibility to define further JVM options
(and system properties) based on the current environment. As a proof of
concept, it chooses Netty's allocator ergonomically based on the maximum
defined heap size.

This PR is work in progress and just up for the curious. Especially, we need to
run experiments to decide at which heap size to switch the allocator from
pooled to unpooled.
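To make the proof of concept concrete, here is a minimal sketch of the ergonomic choice described above, assuming the 1GB threshold that the discussion below eventually settled on; the class and method names are illustrative, not the PR's actual API.

```java
// Illustrative sketch of the ergonomic allocator choice described in this PR.
// Class and method names are hypothetical; the 1GB threshold is the value
// settled on later in this discussion.
public class NettyAllocatorErgonomics {

    // Heaps at or below this size get the unpooled allocator.
    static final long ONE_GB = 1L << 30;

    /** Returns the system property to append to the JVM options. */
    static String chooseAllocator(long maxHeapBytes) {
        String type = maxHeapBytes <= ONE_GB ? "unpooled" : "pooled";
        return "-Dio.netty.allocator.type=" + type;
    }

    public static void main(String[] args) {
        System.out.println(chooseAllocator(512L * 1024 * 1024));      // unpooled
        System.out.println(chooseAllocator(8L * 1024 * 1024 * 1024)); // pooled
    }
}
```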

danielmitterdorfer added the >enhancement, :Delivery/Packaging (RPM and deb packaging, tar and zip archives, shell and batch scripts), WIP, and v7.0.0 labels on May 17, 2018
elasticmachine (Collaborator)

Pinging @elastic/es-core-infra

nik9000 (Member) left a comment

I think it is pretty cute and it could be quite useful down the road.

I expect it'd be easier to get the properties if we started the launcher with the user-defined JVM arguments, but I expect that could cause funny issues down the road. Also, it'd be silly to start the launcher with a huge heap. So I think your approach here is probably better than starting the launcher with the user-defined JVM arguments, though it is nice to think about.

s1monw (Contributor) left a comment

LGTM

```java
            }
        }
    }
    return null;
```
Member
Would it be possible to return what the default JVM heap is here instead of null? Seems like we still may want to make changes to JVM options for an unset heap.

Either that, or since we explicitly set it in our jvm.options, emit a warning or error that no heap has been specified (someone must have removed the option)

Member

The JVM chooses the heap size ergonomically when the heap size is not specified.

Member

The JVM chooses the heap size ergonomically when the heap size is not specified.

Do we have the ability to see what it would choose? If running on a small machine, for instance, we'd still want to disable Netty's pooled allocator if the JVM is going to automatically choose a 400MB heap.

Member

It can be quite tricky because the ergonomics these days depend on other flags on the JVM (e.g., -XX:+UseCGroupMemoryLimitForHeap).

Member

We could try starting a JVM with all the flags specified and -XX:+PrintFlagsFinal -version and scrape the output for MaxHeapSize.
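As a rough illustration of that idea (the class name, method name, and regular expression here are assumptions, and the PrintFlagsFinal output format varies across JDK versions), the scraping step could look like:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of scraping MaxHeapSize from `java <flags> -XX:+PrintFlagsFinal -version`
// output. Only the parsing step is shown; actually spawning the JVM (e.g. via
// ProcessBuilder) is omitted. The sample line mimics JDK 8 output; the exact
// format is JDK-version dependent.
public class MaxHeapScraper {

    private static final Pattern MAX_HEAP = Pattern.compile("MaxHeapSize\\s*:?=\\s*(\\d+)");

    /** Returns MaxHeapSize in bytes, or -1 if the flag is not found. */
    static long parseMaxHeapSize(String printFlagsFinalOutput) {
        Matcher m = MAX_HEAP.matcher(printFlagsFinalOutput);
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String sample = "    uintx MaxHeapSize                    := 8589934592    {product}";
        System.out.println(parseMaxHeapSize(sample)); // 8589934592
    }
}
```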

danielmitterdorfer (Member, Author) commented May 18, 2018

One aspect that I like about that approach is that it is pretty lightweight (it just checks the provided arguments). Yes, it will not choose settings ergonomically if the user has manually removed the heap size setting from jvm.options. However, I think this is (a) not the common case, as this is probably one of the most important JVM-related settings in Elasticsearch, and (b) we'd only lose the ability to provide an ergonomic choice; otherwise everything works as is.

Jason has mentioned a good approach to get around this limitation. I think it would work but I see the following potential risks:

  1. We start a java process with the very same settings as Elasticsearch, so this will negatively affect our startup time to a small degree (I ran time java -Xms8G -Xmx8G -XX:+UnlockDiagnosticVMOptions -XX:+AlwaysPreTouch -XX:+PrintFlagsFinal -Xlog:gc\*,gc+age=trace,safepoint:file=/tmp/gc.log:utctime,pid,tags:filecount=32,filesize=64m -version on several machines and measured between 0.5 and 3 seconds of wall-clock time). We'd also output several GC log files (every JVM startup creates a new one), which might be confusing.
  2. We use an -XX flag (-XX:+PrintFlagsFinal) and parse its output. In theory, -XX flags may disappear without prior notice, though personally I doubt that would happen with -XX:+PrintFlagsFinal.
  3. We would need to parse output that is not under our control.

In the end both approaches have their advantages and disadvantages but if we contrast this with the requirement that the user specifies -Xmx in order for us to provide ergonomic choices I think the current approach is a reasonable compromise? I think though that it would be nice to emit a warning that ergonomic choices are turned off in case we cannot detect a heap size?
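The lightweight approach (just inspecting the provided arguments) could be sketched roughly like this; the class and method names are hypothetical, and the sketch assumes well-formed -Xmx values with a k/m/g suffix or plain bytes:

```java
import java.util.List;

// Sketch of the lightweight approach: derive the max heap size by inspecting the
// user-provided JVM options instead of spawning a second JVM. Assumes well-formed
// -Xmx values (plain bytes or a k/m/g suffix); names are illustrative.
public class HeapFromOptions {

    /** Returns the last -Xmx value in bytes, or -1 if the user did not set one. */
    static long maxHeapFromOptions(List<String> jvmOptions) {
        long result = -1;
        for (String option : jvmOptions) {
            if (option.startsWith("-Xmx") && option.length() > 4) {
                String value = option.substring(4);
                long multiplier = 1;
                char last = Character.toLowerCase(value.charAt(value.length() - 1));
                if (last == 'k') {
                    multiplier = 1L << 10;
                } else if (last == 'm') {
                    multiplier = 1L << 20;
                } else if (last == 'g') {
                    multiplier = 1L << 30;
                }
                if (multiplier > 1) {
                    value = value.substring(0, value.length() - 1);
                }
                result = Long.parseLong(value) * multiplier; // last occurrence wins
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(maxHeapFromOptions(List.of("-Xms1g", "-Xmx1g"))); // 1073741824
        System.out.println(maxHeapFromOptions(List.of("-XX:+AlwaysPreTouch"))); // -1
    }
}
```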

Member

@danielmitterdorfer The -version avoids actually starting a JVM so we can use -XX:+AlwaysPreTouch with no impact.

danielmitterdorfer (Member, Author)

Thanks for the correction. I've corrected my comment accordingly.

Member

I think though that it would be nice to emit a warning that ergonomic choices are turned off in case we cannot detect a heap size?

I think this is a reasonable solution, right now since we ship a jvm.options it would require a user to go in and remove the settings that we have, so I don't think it will be a common case.

We don't remove the options and use -XX:+UseCGroupMemoryLimitForHeap for our docker image(s) do we?

Member

We don't remove the options and use -XX:+UseCGroupMemoryLimitForHeap for our docker image(s) do we?

We do not do that.

With this commit we switch to the unpooled allocator at 1GB heap size
(value determined experimentally; see PR for more details).
We are also explicit about the choice of the allocator in either case.

danielmitterdorfer commented Jun 18, 2018

Benchmark Scenarios

I ran benchmarks for the following data sets to determine at which point we should switch the allocator (click on the links below for a thorough description of the data set and an example document):

I ran those for one node, three nodes (one replica), and one node with X-Pack Security enabled. Heap sizes were 512MB, 768MB, and 1GB. All benchmarks are based on revision 37f67d9.

Results

Raw results can be downloaded from raw_results.zip. Here I just want to highlight some of the results.

GC metrics

Overall, old GC time decreases (sometimes significantly) at the expense of a slight increase in young GC times. This is expected because there is no pool taking up space (in the old generation); instead we allocate new objects for every request, most of which are short-lived.

This is a rather typical case from geonames:

One node:

[chart: gc_times]

Three nodes, one replica:

[chart: gc_times]

For other benchmarks (full-text, i.e. PMC) we are able to use lower heap sizes when we switch to the unpooled allocator:

[chart: gc_times]

Note that the benchmark did not finish successfully for a 512MB heap in either case but with the unpooled allocator we were able to finish with 768MB.

Indexing Throughput

Throughput increases in almost all cases except when we have rather small documents (i.e. geopoints). Here are several examples (the bars show the median; black lines indicate minimum and maximum):

For small documents (one property with one geopoint per document) we see a slight decrease in indexing throughput for heap sizes > 512MB:

[chart: throughput_index-append]

In other cases we typically see an increase in indexing throughput and less variation (i.e. the black line gets smaller).

For HTTP logs the difference gets more pronounced the less heap we have available (one node):

[chart: throughput_index-append]

geonames (one node):

[chart: throughput_index-append]

geonames (three nodes, one replica):

[chart: throughput_index-append]

For PMC the difference is quite pronounced already for 1GB heaps:

one node:

[chart: throughput_index-append]

Again, we were not able to finish this benchmark for 512MB heap size but succeeded for 768MB heap size with the unpooled allocator.

three nodes:

[chart: throughput_index-append]

Query Time

In almost all cases service time is roughly identical (within run-to-run variation):

This is an ideal case showing the match-all query for geonames on one Elasticsearch node:

[chart: service_time_default]

For scrolls we see a little bit higher variation in some cases (this is geonames with X-Pack):

[chart: service_time_scroll]

Times also tend to be a little bit worse with the unpooled allocator at higher heap sizes (see the scroll picture above).

Conclusion

Based on the data there is no hard and clear cutoff for where we should switch, but for the most part the situation has already improved in the 1GB case. The smaller the heap gets, the more the unpooled allocator makes sense. I'd switch to the unpooled allocator at 1GB heap size (which is already reflected in the last commit), but if we want to be more conservative we could also switch at 768MB.

@dakrone, @nik9000, @s1monw, @jasontedor are you still ok with my changes given those results? Also picking up the previous discussion on what to do when the user did not specify a heap size (i.e. -Xms / -Xmx is manually removed from jvm.options): I think this is quite an edge case and would ignore it for now. We can always refine later if needed.


nik9000 commented Jun 18, 2018

I'm in.

Seriously, though, you've proven to me that ergonomically picking some JVM options based on other provided options is a great idea. I'm fine with kicking this optimization in at 1GB because it looks like it mostly helps.

I wonder:

  1. If a user specifies a value for an option that we'd pick ergonomically what should we do? I figure we can either accept the user's choice or refuse to start. I don't think we should overwrite their choice with ours.
  2. If a user doesn't specify the heap size, should we try to scrape it by running the JVM with -XX:+PrintFlagsFinal -version? This got me wondering: if the user doesn't specify the heap size, is it valid for us to:
    2a. Refuse to start?
    2b. Ergonomically pick a heap size based on the ram on the machine/container?


danielmitterdorfer commented Jun 19, 2018

Thanks for your feedback. To your thoughts:

If a user specifies a value for an option that we'd pick ergonomically what should we do?

If a user sets some value explicitly, this effectively disables our ergonomic choice. I think this makes sense because it allows us to provide sensible defaults but still enables users to tune things on their own.
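That "explicit user setting wins" rule could be sketched like this (class and method names are hypothetical; only the override semantics from the paragraph above are modeled):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the override semantics: an explicit user setting for
// io.netty.allocator.type disables the ergonomic default. Names are illustrative.
public class ErgonomicOverride {

    static boolean userSetAllocator(List<String> jvmOptions) {
        return jvmOptions.stream().anyMatch(o -> o.startsWith("-Dio.netty.allocator.type="));
    }

    /** Appends the ergonomic choice only when the user has not made an explicit one. */
    static List<String> withErgonomicAllocator(List<String> jvmOptions, String ergonomicChoice) {
        if (userSetAllocator(jvmOptions)) {
            return jvmOptions; // respect the user's explicit choice
        }
        List<String> result = new ArrayList<>(jvmOptions);
        result.add(ergonomicChoice);
        return result;
    }

    public static void main(String[] args) {
        // No explicit choice: the ergonomic default is appended.
        System.out.println(withErgonomicAllocator(List.of("-Xmx512m"), "-Dio.netty.allocator.type=unpooled"));
        // Explicit choice: the options are left untouched.
        System.out.println(withErgonomicAllocator(List.of("-Dio.netty.allocator.type=pooled"), "-Dio.netty.allocator.type=unpooled"));
    }
}
```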

(paraphrasing you below)

If the user doesn't specify the heap size is it valid for us to refuse to start or ergonomically pick a heap size [...]?

IMHO apart from the choice of the garbage collector, the heap size is the single most important choice that you can make in the JVM configuration. For a small tool it might be ok to not specify the heap size but for server software like Elasticsearch I don't think it is a good idea to leave the heap size unspecified. Why?

  • It makes the configuration less predictable: if you deploy to machines with different configurations you might end up with different heap sizes, i.e. it is one more thing to be aware of if you don't set it explicitly.
  • It allows the minimum and maximum heap size to differ, so the JVM might not allocate all heap up-front and may pay the cost of resizing at runtime. That may or may not be acceptable depending on the use case. Also, at the time the JVM wants to resize the heap, it can happen that you don't get enough memory from the OS (depending on OS configuration).

So if we want to change behavior I'd tend to refuse to start rather than picking something for the user. If we intend to do this, I think we should do this in a separate PR though?


nik9000 commented Jun 19, 2018

If we intend to do this, I think we should do this in a separate PR though?

Right! But thinking about this informs how we should handle the other ergonomic choices when the user doesn't specify the heap size. If we don't plan to start when the user doesn't specify the heap size, then we don't have to handle it at all. Given that we plan to talk more about heap choices in the future, I think your choice not to set any ergonomic options if the heap size isn't specified is fine for this PR. I think we should get this in.

@danielmitterdorfer danielmitterdorfer merged commit 2aefb72 into elastic:master Jun 20, 2018
danielmitterdorfer (Member, Author)

Thanks all for your feedback. I've merged this now to master and will check how our benchmarks behave over the next few days as well.

Labels
:Delivery/Packaging, >enhancement, Team:Delivery, v7.0.0-beta1
8 participants