
test fails with: can't resolve processed cluster state with uuid #32767

Description

Example build failure

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=amazon/2608/console

Reproduction line (does not reproduce locally)

./gradlew :server:integTest \
  -Dtests.seed=CFEB21B624409D08 \
  -Dtests.class=org.elasticsearch.cluster.ack.AckIT \
  -Dtests.method="testIndicesAliasesAcknowledgement" \
  -Dtests.security.manager=true \
  -Dtests.locale=ar-EG \
  -Dtests.timezone=Europe/Busingen

Example relevant log

12:05:51   1> [2018-08-09T14:05:49,772][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] before test
12:05:51   1> [2018-08-09T14:05:49,772][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] [AckIT#testIndicesAliasesAcknowledgement]: setting up test
12:05:51   1> [2018-08-09T14:05:49,773][WARN ][o.e.d.z.PublishClusterStateAction] [node_sm0] timed out waiting for all nodes to process published state [98] (timeout [0s], pending nodes: [{node_sd1}{SUmmm71tS4S8Rjjmu7RRzQ}{KMqfr68_QR2_o6qIfel-pw}{127.0.0.1}{127.0.0.1:37251}, {node_sc3}{D3fZVHArSOm2z7CSJQjLLQ}{pdC1isitQaSDqccMh9K1ZQ}{127.0.0.1}{127.0.0.1:36167}, {node_sd2}{jqkfT8R_Q_ihxv7ORnNGcQ}{mRSClYlNTSu04uSSA50YbQ}{127.0.0.1}{127.0.0.1:39169}])
12:05:51   1> [2018-08-09T14:05:49,786][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_sm0] adding template [random_index_template] for index patterns [*]
12:05:51   1> [2018-08-09T14:05:49,787][WARN ][o.e.d.z.PublishClusterStateAction] [node_sm0] timed out waiting for all nodes to process published state [99] (timeout [0s], pending nodes: [{node_sd1}{SUmmm71tS4S8Rjjmu7RRzQ}{KMqfr68_QR2_o6qIfel-pw}{127.0.0.1}{127.0.0.1:37251}, {node_sc3}{D3fZVHArSOm2z7CSJQjLLQ}{pdC1isitQaSDqccMh9K1ZQ}{127.0.0.1}{127.0.0.1:36167}, {node_sd2}{jqkfT8R_Q_ihxv7ORnNGcQ}{mRSClYlNTSu04uSSA50YbQ}{127.0.0.1}{127.0.0.1:39169}])
12:05:51   1> [2018-08-09T14:05:49,790][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] [AckIT#testIndicesAliasesAcknowledgement]: all set up test
12:05:51   1> [2018-08-09T14:05:49,790][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] using custom data_path for index: [bZFgSZDbso]
12:05:51   1> [2018-08-09T14:05:49,802][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm0] [test] creating index, cause [api], templates [random_index_template], shards [1]/[0], mappings []
12:05:51   1> [2018-08-09T14:05:49,803][WARN ][o.e.d.z.PublishClusterStateAction] [node_sm0] timed out waiting for all nodes to process published state [100] (timeout [0s], pending nodes: [{node_sd1}{SUmmm71tS4S8Rjjmu7RRzQ}{KMqfr68_QR2_o6qIfel-pw}{127.0.0.1}{127.0.0.1:37251}, {node_sc3}{D3fZVHArSOm2z7CSJQjLLQ}{pdC1isitQaSDqccMh9K1ZQ}{127.0.0.1}{127.0.0.1:36167}, {node_sd2}{jqkfT8R_Q_ihxv7ORnNGcQ}{mRSClYlNTSu04uSSA50YbQ}{127.0.0.1}{127.0.0.1:39169}])
12:05:51   1> [2018-08-09T14:05:49,813][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] [AckIT#testIndicesAliasesAcknowledgement]: cleaning up after test
12:05:51   1> [2018-08-09T14:05:49,815][ERROR][o.e.t.d.TestZenDiscovery ] [node_sd2] unexpected failure applying [master {node_sm0}{hGe5mTwjSEeYCIoxAs1RNQ}{AIWCpEcCT2-pAsyJJvAk6w}{127.0.0.1}{127.0.0.1:36719} committed version [100]]
12:05:51   1> java.lang.IllegalStateException: can't resolve processed cluster state with uuid [8KrcUpcjR4ulxE1zmnc2Xg], version [100]
12:05:51   1> 	at org.elasticsearch.discovery.zen.PendingClusterStatesQueue.markAsProcessed(PendingClusterStatesQueue.java:147) ~[main/:?]
12:05:51   1> 	at org.elasticsearch.discovery.zen.ZenDiscovery$3.onSuccess(ZenDiscovery.java:795) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService$SafeClusterApplyListener.onSuccess(ClusterApplierService.java:533) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:427) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:157) [main/:?]
12:05:51   1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [main/:?]
12:05:51   1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) [main/:?]
12:05:51   1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) [main/:?]
12:05:51   1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
12:05:51   1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
12:05:51   1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
12:05:51   1> [2018-08-09T14:05:49,815][ERROR][o.e.t.d.TestZenDiscovery ] [node_sd2] unexpected exception while failing [master {node_sm0}{hGe5mTwjSEeYCIoxAs1RNQ}{AIWCpEcCT2-pAsyJJvAk6w}{127.0.0.1}{127.0.0.1:36719} committed version [100]]
12:05:51   1> java.lang.IllegalArgumentException: can't resolve failed cluster state with uuid [8KrcUpcjR4ulxE1zmnc2Xg], version [100]
12:05:51   1> 	at org.elasticsearch.discovery.zen.PendingClusterStatesQueue.markAsFailed(PendingClusterStatesQueue.java:110) ~[main/:?]
12:05:51   1> 	at org.elasticsearch.discovery.zen.ZenDiscovery$3.onFailure(ZenDiscovery.java:807) [main/:?]
12:05:51   1> 	at org.elasticsearch.discovery.zen.ZenDiscovery$3.onSuccess(ZenDiscovery.java:797) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService$SafeClusterApplyListener.onSuccess(ClusterApplierService.java:533) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:427) [main/:?]
12:05:51   1> 	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:157) [main/:?]
12:05:51   2> REPRODUCE WITH: ./gradlew :server:integTest -Dtests.seed=CFEB21B624409D08 -Dtests.class=org.elasticsearch.cluster.ack.AckIT -Dtests.method="testIndicesAliasesAcknowledgement" -Dtests.security.manager=true -Dtests.locale=ar-EG -Dtests.timezone=Europe/Busingen
12:05:51   1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [main/:?]
12:05:51   1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) [main/:?]
12:05:51   1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) [main/:?]
12:05:51   1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
12:05:51   1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
12:05:51   1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
12:05:51   1> 	Suppressed: java.lang.IllegalStateException: can't resolve processed cluster state with uuid [8KrcUpcjR4ulxE1zmnc2Xg], version [100]
12:05:51   1> 		at org.elasticsearch.discovery.zen.PendingClusterStatesQueue.markAsProcessed(PendingClusterStatesQueue.java:147) ~[main/:?]
12:05:51   1> 		at org.elasticsearch.discovery.zen.ZenDiscovery$3.onSuccess(ZenDiscovery.java:795) [main/:?]
12:05:51   1> 		at org.elasticsearch.cluster.service.ClusterApplierService$SafeClusterApplyListener.onSuccess(ClusterApplierService.java:533) [main/:?]
12:05:51   1> 		at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485) [main/:?]
12:05:51   1> 		at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:427) [main/:?]
12:05:51   1> 		at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:157) [main/:?]
12:05:51   1> 		at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [main/:?]
12:05:51   1> 		at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) [main/:?]
12:05:51   1> 		at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) [main/:?]
12:05:51   1> 		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
12:05:51   1> 		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
12:05:51   1> 		at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
12:05:51   1> [2018-08-09T14:05:49,819][INFO ][o.e.c.r.a.AllocationService] [node_sm0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test][0]] ...]).
12:05:51   1> [2018-08-09T14:05:49,820][WARN ][o.e.d.z.PublishClusterStateAction] [node_sm0] timed out waiting for all nodes to process published state [101] (timeout [0s], pending nodes: [{node_sd1}{SUmmm71tS4S8Rjjmu7RRzQ}{KMqfr68_QR2_o6qIfel-pw}{127.0.0.1}{127.0.0.1:37251}, {node_sc3}{D3fZVHArSOm2z7CSJQjLLQ}{pdC1isitQaSDqccMh9K1ZQ}{127.0.0.1}{127.0.0.1:36167}, {node_sd2}{jqkfT8R_Q_ihxv7ORnNGcQ}{mRSClYlNTSu04uSSA50YbQ}{127.0.0.1}{127.0.0.1:39169}])
12:05:51   1> [2018-08-09T14:05:49,823][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm0] [test/TdvnaRhdSu290Bl-3Dj4qg] deleting index
12:05:51   1> [2018-08-09T14:05:49,823][WARN ][o.e.d.z.PublishClusterStateAction] [node_sm0] timed out waiting for all nodes to process published state [102] (timeout [0s], pending nodes: [{node_sd1}{SUmmm71tS4S8Rjjmu7RRzQ}{KMqfr68_QR2_o6qIfel-pw}{127.0.0.1}{127.0.0.1:37251}, {node_sc3}{D3fZVHArSOm2z7CSJQjLLQ}{pdC1isitQaSDqccMh9K1ZQ}{127.0.0.1}{127.0.0.1:36167}, {node_sd2}{jqkfT8R_Q_ihxv7ORnNGcQ}{mRSClYlNTSu04uSSA50YbQ}{127.0.0.1}{127.0.0.1:39169}])
12:05:51   1> [2018-08-09T14:05:49,833][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_sm0] removing template [random_index_template]
12:05:51   1> [2018-08-09T14:05:49,833][WARN ][o.e.d.z.PublishClusterStateAction] [node_sm0] timed out waiting for all nodes to process published state [103] (timeout [0s], pending nodes: [{node_sd1}{SUmmm71tS4S8Rjjmu7RRzQ}{KMqfr68_QR2_o6qIfel-pw}{127.0.0.1}{127.0.0.1:37251}, {node_sc3}{D3fZVHArSOm2z7CSJQjLLQ}{pdC1isitQaSDqccMh9K1ZQ}{127.0.0.1}{127.0.0.1:36167}, {node_sd2}{jqkfT8R_Q_ihxv7ORnNGcQ}{mRSClYlNTSu04uSSA50YbQ}{127.0.0.1}{127.0.0.1:39169}])
12:05:51   1> [2018-08-09T14:05:49,836][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] [AckIT#testIndicesAliasesAcknowledgement]: cleaned up after test
12:05:51   1> [2018-08-09T14:05:49,836][INFO ][o.e.c.a.AckIT            ] [testIndicesAliasesAcknowledgement] after test
12:05:51 FAILURE 0.12s J2 | AckIT.testIndicesAliasesAcknowledgement <<< FAILURES!
12:05:51    > Throwable #1: java.lang.AssertionError: CreateIndexResponse failed - not acked
12:05:51    > Expected: <true>
12:05:51    >      but: was <false>
12:05:51    > 	at __randomizedtesting.SeedInfo.seed([CFEB21B624409D08:4BF121D23848E9F9]:0)
12:05:51    > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
12:05:51    > 	at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:108)
12:05:51    > 	at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked(ElasticsearchAssertions.java:96)
12:05:51    > 	at org.elasticsearch.test.ESIntegTestCase.createIndex(ESIntegTestCase.java:741)
12:05:51    > 	at org.elasticsearch.cluster.ack.AckIT.testIndicesAliasesAcknowledgement(AckIT.java:214)
12:05:51    > 	at java.lang.Thread.run(Thread.java:748)
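
For context on the stack trace above: the exception is thrown by PendingClusterStatesQueue.markAsProcessed when it cannot find the just-applied cluster state (looked up by its uuid) in its queue of pending states. Below is a minimal, self-contained sketch of that failure mode. The class PendingStatesQueueSketch and its trimming logic are simplified stand-ins for illustration only, not the real org.elasticsearch.discovery.zen.PendingClusterStatesQueue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/**
 * Simplified sketch of a pending-cluster-states queue, only to illustrate why
 * markAsProcessed() can fail with
 * "can't resolve processed cluster state with uuid [...]".
 */
public class PendingStatesQueueSketch {

    /** Stand-in for a published cluster state: just a uuid and a version. */
    static final class State {
        final String uuid;
        final long version;

        State(String uuid, long version) {
            this.uuid = uuid;
            this.version = version;
        }
    }

    private final List<State> pending = new ArrayList<>();

    /** A newly published (committed) state is appended to the queue. */
    synchronized void addCommitted(State state) {
        pending.add(state);
    }

    /**
     * Called after the node has applied a state. In this sketch, processing a
     * state removes it and trims every older version from the queue.
     */
    synchronized void markAsProcessed(State state) {
        boolean found = pending.removeIf(s -> Objects.equals(s.uuid, state.uuid));
        if (found == false) {
            // The failure mode from the log: the state was already purged
            // (for example because a newer committed state was processed
            // first), so its uuid can no longer be resolved.
            throw new IllegalStateException("can't resolve processed cluster state with uuid ["
                    + state.uuid + "], version [" + state.version + "]");
        }
        pending.removeIf(s -> s.version < state.version); // trim older states
    }

    public static void main(String[] args) {
        PendingStatesQueueSketch queue = new PendingStatesQueueSketch();
        State v100 = new State("8KrcUpcjR4ulxE1zmnc2Xg", 100);
        State v101 = new State("newerStateUuid", 101);

        queue.addCommitted(v100);
        queue.addCommitted(v101);

        // Applying the newer state first trims version 100 out of the queue...
        queue.markAsProcessed(v101);
        // ...so trying to resolve version 100 afterwards fails, as in the log.
        queue.markAsProcessed(v100); // throws IllegalStateException
    }
}
```

The real queue does considerably more bookkeeping, but the shape of the failure is the same: once the uuid has been trimmed away, a later markAsProcessed call for that state has nothing to resolve, and the follow-up markAsFailed attempt trips over the same missing uuid (hence the second "unexpected exception while failing" error with the suppressed IllegalStateException).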

Frequency

This particular test has not failed before; however, the test logs show that "can't resolve processed cluster state with uuid" failures started happening on 2018-06-24 and have recurred almost every week since.
