MixedClusterClientYamlTestSuiteIT failure due to trying to delete indices being snapshotted/creating indices that already exist #39721

Closed
@gwbrown

This hit a 6.7 intake build on one of my commits. I'm pretty sure it's not related to the changes in the commit, as they pertain mostly to Watcher.

CI Link: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.7+intake/306/console

Many tests in MixedClusterClientYamlTestSuiteIT failed, but none of them reproduce locally. Sample reproduce line:

./gradlew :qa:mixed-cluster:v5.6.16#mixedClusterTestRunner \
  -Dtests.seed=AFFFC69E10895A69 \
  -Dtests.class=org.elasticsearch.backwards.MixedClusterClientYamlTestSuiteIT \
  -Dtests.method="test {p0=cat.snapshots/10_basic/Test cat snapshots output}" \
  -Dtests.security.manager=true \
  -Dtests.locale=he-IL \
  -Dtests.timezone=America/Grenada \
  -Dcompiler.java=11 \
  -Druntime.java=8

There are two kinds of exceptions that keep popping up in the logs and look like they might be related.

One is a failure to delete some indices that are currently being snapshotted:

org.elasticsearch.client.ResponseException: method [DELETE], host [http://[::1]:33913], URI [*], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[node-0][127.0.0.1:44482][indices:admin/delete]"}],"type":"illegal_argument_exception","reason":"Cannot delete indices that are being snapshotted: [[index2/vK05jwCyTIiYeTcVLqKQpw], [index1/TvYH0C5FSHSG4l_PhEQMHA]]. Try again after snapshot finishes or cancel the currently running snapshot."},"status":400}
	at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:936)
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:233)
	at org.elasticsearch.test.rest.ESRestTestCase.wipeCluster(ESRestTestCase.java:455)
	at org.elasticsearch.test.rest.ESRestTestCase.cleanUpCluster(ESRestTestCase.java:273)
	at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.client.ResponseException: method [DELETE], host [http://[::1]:33913], URI [*], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[node-0][127.0.0.1:44482][indices:admin/delete]"}],"type":"illegal_argument_exception","reason":"Cannot delete indices that are being snapshotted: [[index2/vK05jwCyTIiYeTcVLqKQpw], [index1/TvYH0C5FSHSG4l_PhEQMHA]]. Try again after snapshot finishes or cancel the currently running snapshot."},"status":400}
	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:552)
	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:537)
	at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119)
	at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)
	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)
	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)
	at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
	at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
	at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
	at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
	... 1 more

The other is an attempt to create an index that already exists, and that index happens to be one of the indices that couldn't be deleted because it was being snapshotted. It's a bit harder to get a clean stack trace for this one, so here's a snippet of the returned JSON:

"error" : {
  1>         "root_cause" : [
  1>           {
  1>             "type" : "index_already_exists_exception",
  1>             "reason" : "index [index1/TvYH0C5FSHSG4l_PhEQMHA] already exists",
  1>             "index_uuid" : "TvYH0C5FSHSG4l_PhEQMHA",
  1>             "index" : "index1",
  1>             "stack_trace" : "[index1/TvYH0C5FSHSG4l_PhEQMHA] ResourceAlreadyExistsException[index [index1/TvYH0C5FSHSG4l_PhEQMHA] already exists]
  1> 	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validateIndexName(MetaDataCreateIndexService.java:147)
  1> 	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validate(MetaDataCreateIndexService.java:512)
  1> 	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.access$000(MetaDataCreateIndexService.java:106)
  1> 	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.java:239)
  1> 	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45)
  1> 	at org.elasticsearch.cluster.service.ClusterService.executeTasks(ClusterService.java:634)
  1> 	at org.elasticsearch.cluster.service.ClusterService.calculateTaskOutputs(ClusterService.java:612)
  1> 	at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:571)
  1> 	at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263)
  1> 	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)
  1> 	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:576)
  1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247)
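
To illustrate how the two errors above could be connected, here is a toy model in Java. It is only a sketch under assumptions: `FakeCluster`, `wipeWithRetry`, and the in-memory sets are hypothetical stand-ins, not the Elasticsearch API or the actual `ESRestTestCase` code, and the retry loop is just one possible mitigation, not a confirmed fix. It shows a cleanup that fails while a snapshot is running, the leftover index then breaking the next create, and a wipe that succeeds once the snapshot is finished.

```java
import java.util.HashSet;
import java.util.Set;

final class SnapshotCleanupSketch {

    /** Hypothetical in-memory stand-in for a cluster; not the Elasticsearch API. */
    static class FakeCluster {
        final Set<String> indices = new HashSet<>();
        final Set<String> snapshotting = new HashSet<>();

        void createIndex(String name) {
            // Set.add returns false when the name is already present.
            if (!indices.add(name)) {
                throw new IllegalStateException("index [" + name + "] already exists");
            }
        }

        void deleteAllIndices() {
            // Mirrors the error in the logs: the whole delete is rejected
            // while any index is part of a running snapshot.
            if (!snapshotting.isEmpty()) {
                throw new IllegalArgumentException(
                    "Cannot delete indices that are being snapshotted: " + snapshotting);
            }
            indices.clear();
        }

        void finishSnapshot() {
            snapshotting.clear();
        }
    }

    /** Tries the wipe up to maxAttempts times; returns true if it succeeded. */
    static boolean wipeWithRetry(FakeCluster cluster, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                cluster.deleteAllIndices();
                return true;
            } catch (IllegalArgumentException e) {
                // A real test would wait for or cancel the snapshot here;
                // this sketch simply simulates the snapshot completing.
                cluster.finishSnapshot();
            }
        }
        return false;
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        cluster.createIndex("index1");
        cluster.snapshotting.add("index1");

        try {
            cluster.deleteAllIndices(); // cleanup after the first test
        } catch (IllegalArgumentException e) {
            System.out.println("cleanup failed: " + e.getMessage());
        }

        try {
            cluster.createIndex("index1"); // setup of the next test
        } catch (IllegalStateException e) {
            System.out.println("next test failed: " + e.getMessage());
        }

        System.out.println("wipe succeeded after retry: " + wipeWithRetry(cluster, 3));
    }
}
```

If the real failure follows this pattern, the create-index errors would be a downstream symptom of the failed cleanup rather than a separate problem.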

Logs, for posterity: consoleText.txt.zip

[edit: pasted the wrong second snippet the first time]
