Serialization exceptions should be handled in OutboundHandler #94839

Open
@droberts195

While trying to get the X-Pack YAML tests running in a 3-node cluster (they usually run in a single-node cluster), I noticed a problem with serializing geo shapes between nodes. The search hit serialization code expects to be able to serialize the hit contents as “generic values”, but there is no generic value writer that works for geo shapes. The resulting exception crashes a node in the cluster:

java.lang.IllegalArgumentException: can not write type [class org.elasticsearch.xpack.spatial.index.fielddata.GeoShapeValues$GeoShapeValue]
	at org.elasticsearch.common.io.stream.StreamOutput.writeGenericValue(StreamOutput.java:822) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.io.stream.StreamOutput.writeCollection(StreamOutput.java:1045) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.document.DocumentField.writeTo(DocumentField.java:116) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.search.SearchHit.lambda$writeTo$1(SearchHit.java:255) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.io.stream.StreamOutput.writeMap(StreamOutput.java:623) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.search.SearchHit.writeTo(SearchHit.java:255) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.io.stream.StreamOutput.lambda$writeArray$31(StreamOutput.java:939) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.io.stream.StreamOutput.writeArray(StreamOutput.java:916) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.io.stream.StreamOutput.writeArray(StreamOutput.java:939) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.search.SearchHits.writeTo(SearchHits.java:100) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.search.fetch.FetchSearchResult.writeTo(FetchSearchResult.java:53) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.search.fetch.QueryFetchSearchResult.writeTo(QueryFetchSearchResult.java:74) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.OutboundMessage.serialize(OutboundMessage.java:70) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.OutboundHandler.sendMessage(OutboundHandler.java:178) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.OutboundHandler.sendResponse(OutboundHandler.java:138) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.TcpTransportChannel.sendResponse(TcpTransportChannel.java:58) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:44) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.lambda$onResponse$0(ChannelActionListener.java:31) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionListener.run(ActionListener.java:357) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:31) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:19) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:50) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:47) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionRunnable$3.doRun(ActionRunnable.java:72) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]
[2023-03-28T05:26:44,633][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [yamlRestTest-1] fatal error in thread [elasticsearch[yamlRestTest-1][search][T#7]], exiting
java.lang.AssertionError: java.lang.Exception: wrapped[org.elasticsearch.transport.RequestHandlerRegistry$$Lambda$6200/0x0000000801aeec80@703ffb85, org.elasticsearch.tasks.TaskManager$$Lambda$6475/0x0000000801b513a8@16603dfb]
	at org.elasticsearch.core.Releasables$4.assertFirstRun(Releasables.java:150) ~[elasticsearch-core-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.core.Releasables$4.close(Releasables.java:155) ~[elasticsearch-core-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:51) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:37) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:59) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:71) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionListener.run(ActionListener.java:359) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:31) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:19) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:50) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:47) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionRunnable$3.doRun(ActionRunnable.java:72) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.lang.Exception: wrapped[org.elasticsearch.transport.RequestHandlerRegistry$$Lambda$6200/0x0000000801aeec80@703ffb85, org.elasticsearch.tasks.TaskManager$$Lambda$6475/0x0000000801b513a8@16603dfb]
	at org.elasticsearch.core.Releasables$4.assertFirstRun(Releasables.java:149) ~[elasticsearch-core-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.core.Releasables$4.close(Releasables.java:155) ~[elasticsearch-core-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:42) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.lambda$onResponse$0(ChannelActionListener.java:31) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	at org.elasticsearch.action.ActionListener.run(ActionListener.java:357) ~[elasticsearch-8.8.0-SNAPSHOT.jar:?]
	... 12 more

I think this indicates a bug in the transport layer, although one that only surfaces as a consequence of other bugs. It is not safe to let an exception bubble all the way up from OutboundHandler.sendResponse to the ChannelActionListener like this, because of the double-free problem that the AssertionError highlights: the trace shows ChannelActionListener.onFailure calling TaskTransportChannel.sendResponse a second time, which closes the same Releasable again. We need to handle the exception in the OutboundHandler somehow.
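
To make the double-free concrete, here is a minimal, self-contained sketch of the failure mode and of one possible way to contain it. It is a simplified model, not the actual Elasticsearch code: `TaskGuard`, `Sender`, `Channel`, `Payload` and `respond` are hypothetical stand-ins for the once-only Releasable, OutboundHandler, TaskTransportChannel, the response being serialized, and ChannelActionListener respectively, and the `handleSerializationFailure` flag exists only to compare the two behaviours side by side.

```java
import java.util.ArrayList;
import java.util.List;

final class DoubleReleaseSketch {

    /** Hypothetical stand-in for a resource that must be released exactly once. */
    static final class TaskGuard {
        int releases = 0;

        void release() {
            releases++;
            if (releases > 1) {
                throw new IllegalStateException("released " + releases + " times");
            }
        }
    }

    /** Hypothetical response whose serialization may throw. */
    interface Payload {
        byte[] toBytes();
    }

    /** Hypothetical outbound sender; optionally converts serialization failures into error responses. */
    static final class Sender {
        final List<String> wire = new ArrayList<>();
        final boolean handleSerializationFailure;

        Sender(boolean handleSerializationFailure) {
            this.handleSerializationFailure = handleSerializationFailure;
        }

        void sendResponse(Payload payload) {
            byte[] bytes;
            try {
                bytes = payload.toBytes();
            } catch (RuntimeException e) {
                if (!handleSerializationFailure) {
                    throw e; // the exception escapes to the caller, as in the report
                }
                sendError(e); // contained here: the caller never sees the exception
                return;
            }
            wire.add("response:" + bytes.length);
        }

        void sendError(Exception e) {
            wire.add("error:" + e.getMessage());
        }
    }

    /** Hypothetical channel that releases its guard on every send, success or failure. */
    static final class Channel {
        final TaskGuard guard;
        final Sender sender;

        Channel(TaskGuard guard, Sender sender) {
            this.guard = guard;
            this.sender = sender;
        }

        void sendResponse(Payload payload) {
            guard.release();
            sender.sendResponse(payload);
        }

        void sendFailure(Exception e) {
            guard.release();
            sender.sendError(e);
        }
    }

    /** Listener-style caller: falls back to sending a failure if sending the response throws. */
    static void respond(Channel channel, Payload payload) {
        try {
            channel.sendResponse(payload);
        } catch (RuntimeException e) {
            channel.sendFailure(e); // second release if the guard was already released
        }
    }

    public static void main(String[] args) {
        Payload unserializable = () -> {
            throw new IllegalArgumentException("can not write type [GeoShapeValue]");
        };

        // Exception propagates out of the sender: the guard is released twice.
        Channel propagating = new Channel(new TaskGuard(), new Sender(false));
        try {
            respond(propagating, unserializable);
        } catch (IllegalStateException e) {
            System.out.println("propagating sender: " + e.getMessage());
        }

        // Exception handled inside the sender: one release, and an error goes on the wire.
        Channel contained = new Channel(new TaskGuard(), new Sender(true));
        respond(contained, unserializable);
        System.out.println("containing sender: releases=" + contained.guard.releases
            + ", wire=" + contained.sender.wire);
    }
}
```

In this model the fix is purely a question of where the exception is caught: once the sender turns a serialization failure into an error response itself, the caller's failure path is never triggered and the guard is released only once. Whether the real OutboundHandler should send an error response, close the channel, or do something else is left open here.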


    Labels

    :Distributed Coordination/Network: Http and internode communication implementations
    >bug
    Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
