Closed
Description
Build scan:
https://gradle-enterprise.elastic.co/s/sg54toruksniu
Repro line:
./gradlew ':x-pack:plugin:ml:qa:ml-with-security:integTestRunner' --tests "org.elasticsearch.smoketest.MlWithSecurityIT.test {yaml=ml/data_frame_analytics_crud/Test get stats given multiple analytics}" \
-Dtests.seed=9F814CADF1385EAE \
-Dtests.security.manager=true \
-Dtests.locale=es-SV \
-Dtests.timezone=Etc/GMT+1 \
-Druntime.java=8
Reproduces locally?:
No
Applicable branches:
Observed on 7.9, but it almost certainly affects 7.x and master too, given the nature of the failure
Failure history:
Failure excerpt:
This is a case of "all shards failed" on a .ml-stats* search.
The server side logs show this:
[2020-08-10T08:30:10,269][INFO ][o.e.c.m.MetadataCreateIndexService] [integTest-0] [.ml-stats-000001] creating index, cause [api], templates [.ml-stats], shards [1]/[1]
[2020-08-10T08:30:10,269][INFO ][o.e.c.r.a.AllocationService] [integTest-0] updating number_of_replicas to [0] for indices [.ml-stats-000001]
[2020-08-10T08:30:10,270][DEBUG][o.e.c.c.PublicationTransportHandler] [integTest-0] received diff cluster state version [8417] with uuid [cT3AZ6XDT6udgnWX6wdaLg], diff size [1110]
[2020-08-10T08:30:10,319][DEBUG][o.e.c.c.C.CoordinatorPublication] [integTest-0] publication ended successfully: Publication{term=1, version=8417}
[2020-08-10T08:30:10,325][INFO ][o.e.x.i.IndexLifecycleTransition] [integTest-0] moving index [.ml-stats-000001] from [null] to [{"phase":"new","action":"complete","name":"complete"}] in policy [ml-size-based-ilm-policy]
[2020-08-10T08:30:10,326][DEBUG][o.e.c.c.PublicationTransportHandler] [integTest-0] received diff cluster state version [8418] with uuid [pGnODAmLRkayZvZQvpCk0A], diff size [509]
[2020-08-10T08:30:10,358][ERROR][o.e.x.m.a.TransportGetDataFrameAnalyticsStatsAction] [integTest-0] [foo-2] Item failure encountered during multi search for request [indices=[.ml-stats-000001, .ml-stats-write], source={"size":1,"query":{"bool":{"filter":[{"term":{"job_id":{"value":"foo-2","boost":1.0}}},{"term":{"type":{"value":"analytics_data_counts","boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},"sort":[{"timestamp":{"order":"desc","unmapped_type":"long"}}]}]: all shards failed
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:551) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:309) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:582) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:393) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:68) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:245) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:403) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$6.handleException(TransportService.java:638) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1172) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1281) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1255) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:61) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportChannel.sendErrorResponse(TransportChannel.java:56) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:51) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.lambda$runAsync$0(SearchService.java:414) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:710) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_241]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_241]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_241]
There is also a secondary problem, which is that we seem to be trying to respond after already sending a failure response:
[2020-08-10T08:30:10,362][ERROR][o.e.r.a.RestResponseListener] [integTest-0] failed to send failure response
java.lang.IllegalStateException: Channel is already closed
at org.elasticsearch.rest.RestController$ResourceHandlingHttpChannel.close(RestController.java:511) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.rest.RestController$ResourceHandlingHttpChannel.sendResponse(RestController.java:504) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.rest.action.RestActionListener.onFailure(RestActionListener.java:58) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:98) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:71) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:71) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.xpack.ml.action.TransportGetDataFrameAnalyticsStatsAction.lambda$searchStats$7(TransportGetDataFrameAnalyticsStatsAction.java:220) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:89) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.finish(TransportMultiSearchAction.java:178) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.handleResponse(TransportMultiSearchAction.java:164) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.onFailure(TransportMultiSearchAction.java:157) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:98) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.raisePhaseFailure(AbstractSearchAsyncAction.java:573) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:551) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:309) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:582) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:393) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:68) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:245) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:403) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$6.handleException(TransportService.java:638) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1172) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1281) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1255) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:61) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportChannel.sendErrorResponse(TransportChannel.java:56) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:51) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.lambda$runAsync$0(SearchService.java:414) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:710) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_241]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_241]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_241]
Suppressed: org.elasticsearch.ElasticsearchException: all shards failed
at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:55) ~[?:?]
at org.elasticsearch.xpack.ml.action.TransportGetDataFrameAnalyticsStatsAction.lambda$searchStats$7(TransportGetDataFrameAnalyticsStatsAction.java:220) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:89) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.finish(TransportMultiSearchAction.java:178) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.handleResponse(TransportMultiSearchAction.java:164) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.TransportMultiSearchAction$1.onFailure(TransportMultiSearchAction.java:157) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:98) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.raisePhaseFailure(AbstractSearchAsyncAction.java:573) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:551) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:309) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:582) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:393) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:68) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:245) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:403) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$6.handleException(TransportService.java:638) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1172) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1281) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1255) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:61) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.transport.TransportChannel.sendErrorResponse(TransportChannel.java:56) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:51) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.lambda$runAsync$0(SearchService.java:414) ~[elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:710) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.9.0-SNAPSHOT.jar:7.9.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_241]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_241]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_241]
Caused by: org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
... 23 more
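One way to guard against responding twice, as the trace above suggests is happening, is to wrap the listener so that only the first completion reaches the channel. The following is a minimal standalone sketch, not the Elasticsearch code: the `Listener` interface, the `notifyOnce` helper and the `simulateItemFailures` method are hypothetical names introduced purely for illustration, and the assumption that several failed items each report to the same listener is a guess about the cause rather than something the logs confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class NotifyOnceSketch {

    /** Hypothetical stand-in for an asynchronous response listener. */
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    /** Wraps a listener so only the first completion (success or failure) is delivered. */
    static <T> Listener<T> notifyOnce(final Listener<T> delegate) {
        final AtomicBoolean done = new AtomicBoolean(false);
        return new Listener<T>() {
            @Override
            public void onResponse(T response) {
                // compareAndSet succeeds for exactly one caller
                if (done.compareAndSet(false, true)) {
                    delegate.onResponse(response);
                }
            }

            @Override
            public void onFailure(Exception e) {
                if (done.compareAndSet(false, true)) {
                    delegate.onFailure(e);
                }
            }
        };
    }

    /** Simulates several failed items reporting to one listener; returns what was actually delivered. */
    static List<String> simulateItemFailures(int failedItems) {
        final List<String> delivered = new ArrayList<>();
        Listener<String> guarded = notifyOnce(new Listener<String>() {
            @Override
            public void onResponse(String response) {
                delivered.add("response: " + response);
            }

            @Override
            public void onFailure(Exception e) {
                delivered.add("failure: " + e.getMessage());
            }
        });
        for (int i = 0; i < failedItems; i++) {
            guarded.onFailure(new IllegalStateException("all shards failed (item " + i + ")"));
        }
        return delivered;
    }

    public static void main(String[] args) {
        // Only the first failure is delivered; later ones are dropped by the guard.
        System.out.println(simulateItemFailures(3));
    }
}
```

An alternative with the same effect would be to return from the handling code as soon as the first item failure has been reported, so that later failures never reach the listener at all.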
There are two possibilities here:
- The .ml-stats* index is getting created as a direct effect of something the test does, in which case we need to wait for yellow status in the code that creates it
- The .ml-stats* index is getting created by some background service, in which case this issue is a variation on [CI] Deleting data frame analytics jobs can throw all shards failed #60462