[CI] DedicatedClusterSnapshotRestoreIT testSnapshotDeleteRelocatingPrimaryIndex failing #89927

Closed
@original-brownbear

Description

Build scan:
https://gradle-enterprise.elastic.co/s/4647z2olruffa/tests/:server:internalClusterTest/org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT/testSnapshotDeleteRelocatingPrimaryIndex

Reproduction line:
./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT.testSnapshotDeleteRelocatingPrimaryIndex" -Dtests.seed=7CB99D2060E62358 -Dtests.locale=ar-KW -Dtests.timezone=ECT -Druntime.java=17

Applicable branches:
main

Reproduces locally?:
Yes

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT&tests.test=testSnapshotDeleteRelocatingPrimaryIndex

Failure excerpt:

java.util.concurrent.TimeoutException: Timeout waiting for task.

  at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:228)
  at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:53)
  at org.elasticsearch.test.ClusterServiceUtils.awaitClusterState(ClusterServiceUtils.java:226)
  at org.elasticsearch.snapshots.AbstractSnapshotIntegTestCase.awaitClusterState(AbstractSnapshotIntegTestCase.java:577)
  at org.elasticsearch.snapshots.AbstractSnapshotIntegTestCase.awaitNoMoreRunningOperations(AbstractSnapshotIntegTestCase.java:560)
  at org.elasticsearch.snapshots.AbstractSnapshotIntegTestCase.awaitNoMoreRunningOperations(AbstractSnapshotIntegTestCase.java:555)
  at org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT.testSnapshotDeleteRelocatingPrimaryIndex(DedicatedClusterSnapshotRestoreIT.java:1140)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:568)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:833)

As discussed with @pxsalehi, this is a real bug: due to an oversight in #88209, aborted or failed shard snapshots can end up never completing. It could in theory also affect other tests that abort or fail shard snapshots. A fix is known and in progress.

Metadata

Labels

:Distributed Coordination/Snapshot/Restore: Anything directly related to the `_snapshot/*` APIs
>test-failure: Triaged test failures from CI
Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
