Open
Description
Seen in a Cloud cluster (with related forum thread):
[instance-0000000016] exception caught on transport layer [Netty4TcpChannel{localAddress=/172.17.0.7:19547, remoteAddress=/10.47.48.237:37520, profile=default}], closing connection
java.lang.IllegalStateException: Pipeline state corrupted by uncaught exception
at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:83) ~[elasticsearch-8.4.3.jar:?]
at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:63) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1372) ~[?:?]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235) ~[?:?]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.lang.IllegalArgumentException: CompositeBytesReference cannot hold more than 2GB
at org.elasticsearch.common.bytes.CompositeBytesReference.ofMultiple(CompositeBytesReference.java:60) ~[elasticsearch-8.4.3.jar:?]
at org.elasticsearch.common.bytes.CompositeBytesReference.of(CompositeBytesReference.java:41) ~[elasticsearch-8.4.3.jar:?]
at org.elasticsearch.transport.InboundAggregator.finishAggregation(InboundAggregator.java:104) ~[elasticsearch-8.4.3.jar:?]
at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:147) ~[elasticsearch-8.4.3.jar:?]
at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:121) ~[elasticsearch-8.4.3.jar:?]
at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:86) ~[elasticsearch-8.4.3.jar:?]
... 33 more
I expect you can get this to happen with e.g. an ingest pipeline that expands each document's size by an order of magnitude or so, combined with transport-level compression. I am guessing that's what the user in question was doing here. Maybe we can split up the shard bulk in this case, but in general we should just refuse to send such a large message rather than allowing the receiver to explode like this.
Relates #75334 too, we also ought to be able to at least skip over such oversized messages without tearing down the connection.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment