Description
`Netty4HttpServerTransport` uses the settings `http.netty.receive_predictor_min` and `http.netty.receive_predictor_max` to provide a properly configured `RecvByteBufAllocator` implementation to Netty. Their default value is controlled by the setting `transport.netty.receive_predictor_size`, which varies between 64 kB and 512 kB (per allocated buffer).
The aforementioned allocator is responsible for allocating memory buffers when handling incoming network packets, and Netty allocates one buffer per network packet.
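For illustration, here is a minimal sketch of how such predictor bounds could be wired into Netty. The class and helper names below are assumptions made for the example, not the actual `Netty4HttpServerTransport` code, but `FixedRecvByteBufAllocator`, `AdaptiveRecvByteBufAllocator` and `ChannelOption.RCVBUF_ALLOCATOR` are standard Netty 4 API:

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.AdaptiveRecvByteBufAllocator;
import io.netty.channel.ChannelOption;
import io.netty.channel.FixedRecvByteBufAllocator;
import io.netty.channel.RecvByteBufAllocator;

class ReceivePredictorExample {

    // Translates configured predictor bounds into a Netty allocator:
    // a fixed-size allocator when min == max, otherwise an adaptive one
    // that guesses the next read size within [min, max].
    static RecvByteBufAllocator receivePredictor(int minBytes, int maxBytes) {
        if (minBytes == maxBytes) {
            return new FixedRecvByteBufAllocator(maxBytes);
        }
        return new AdaptiveRecvByteBufAllocator(minBytes, minBytes, maxBytes);
    }

    // Every channel accepted by the server bootstrap allocates one buffer per
    // read (i.e. per incoming network packet), sized by this allocator.
    static void configure(ServerBootstrap bootstrap, int minBytes, int maxBytes) {
        bootstrap.childOption(ChannelOption.RCVBUF_ALLOCATOR, receivePredictor(minBytes, maxBytes));
    }
}
```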
We have run comparative benchmarks with the nyc_taxis track, once locally (i.e. via loopback) and once distributed (i.e. via Ethernet), and analyzed the allocation behavior of Elasticsearch.
| Network Connection Type | Bytes allocated outside of TLABs on the network layer |
|---|---|
| Loopback | ~ 78 GB |
| Ethernet | ~ 2.13 TB |
Note: On this particular machine `transport.netty.receive_predictor_size` was 512 kB.
The root cause seems to be related to the MTU, which differs greatly between loopback and regular network devices (65536 vs. 1500 bytes). A smaller MTU leads to more network packets while the buffer size stays the same, thus leading to more GC pressure.
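As a back-of-envelope illustration (assuming the default 512 kB predictor size and one buffer allocation per packet; not measured data): 1 GB of request payload arrives in roughly 16,000 packets over loopback (MTU 65536) but in roughly 700,000 packets over Ethernet (MTU 1500), i.e. about 44 times as many oversized buffer allocations for the same payload, which is the same order of magnitude as the ~27x difference measured above.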
In a custom build of Elasticsearch we set `http.netty.receive_predictor_min` to 5 kB and `http.netty.receive_predictor_max` to 64 kB and got comparable allocation behavior between the local and distributed benchmarks.
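In Netty terms, and reusing the hypothetical `receivePredictor` helper from the sketch above, that experiment corresponds to something like `receivePredictor(5 * 1024, 64 * 1024)`, i.e. an adaptive allocator that never hands out more than 64 kB per read instead of a fixed 512 kB.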
Note: Our analysis focused only on `Netty4HttpServerTransport` for a single Elasticsearch node. It is expected that `Netty4Transport` exhibits similar behavior and we should change the buffer size there too.