[CI] SharedClusterSnapshotRestoreIT suite times out #61541

@nik9000

Build scan:
https://gradle-enterprise.elastic.co/s/pa53kwessobpw/

Repro line:

Reproduces locally?:
No.

Applicable branches:
At least back to 7.6.

Failure history:
https://build-stats.elastic.co/goto/f724f2915cd7dde4087f8326e924f8db

Happens maybe once a week.

Failure excerpt:

  2> WARNING: Suite execution timed out: org.elasticsearch.snapshots.SharedClusterSnapshotRestoreIT
  2> ==== jstack at approximately timeout time ====
...
  2> "TEST-SharedClusterSnapshotRestoreIT.testDeleteRepositoryWhileSnapshotting-seed#[9732112406BB2DF8]" ID=2374 WAITING on java.util.concurrent.CountDownLatch$Sync@60bd29c5
  2> 	at java.base@11.0.2/jdk.internal.misc.Unsafe.park(Native Method)
  2> 	- waiting on java.util.concurrent.CountDownLatch$Sync@60bd29c5
  2> 	at java.base@11.0.2/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
  2> 	at java.base@11.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
  2> 	at java.base@11.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
  2> 	at java.base@11.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
  2> 	at java.base@11.0.2/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
  2> 	at app//org.elasticsearch.test.ESIntegTestCase.indexRandom(ESIntegTestCase.java:1450)
  2> 	at app//org.elasticsearch.test.ESIntegTestCase.indexRandom(ESIntegTestCase.java:1377)
  2> 	at app//org.elasticsearch.test.ESIntegTestCase.indexRandom(ESIntegTestCase.java:1360)
  2> 	at app//org.elasticsearch.test.ESIntegTestCase.indexRandom(ESIntegTestCase.java:1336)
  2> 	at app//org.elasticsearch.snapshots.AbstractSnapshotIntegTestCase.indexRandomDocs(AbstractSnapshotIntegTestCase.java:383)
  2> 	at app//org.elasticsearch.snapshots.SharedClusterSnapshotRestoreIT.testDeleteRepositoryWhileSnapshotting(SharedClusterSnapshotRestoreIT.java:1691)
  2> 	at java.base@11.0.2/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  2> 	at java.base@11.0.2/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  2> 	at java.base@11.0.2/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  2> 	at java.base@11.0.2/java.lang.reflect.Method.invoke(Method.java:566)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2> 	at app//org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
  2> 	at app//org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  2> 	at app//org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  2> 	at app//org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  2> 	at app//org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2> 	at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  2> 	at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)
  2> 	at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  2> 	at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  2> 	at app//org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2> 	at app//org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2> 	at app//org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  2> 	at app//org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  2> 	at app//org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  2> 	at app//org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
  2> 	at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2> 	at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  2> 	at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
  2> 	at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$$Lambda$145/0x000000080029c040.run(Unknown Source)
  2> 	at java.base@11.0.2/java.lang.Thread.run(Thread.java:834)

...

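The suite timed out while `testDeleteRepositoryWhileSnapshotting` was indexing 100 random docs: the test thread is parked in `CountDownLatch.await`, called from `ESIntegTestCase.indexRandom`. Indexing that few docs shouldn't come anywhere near the suite timeout, I guess.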
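For anyone reading the trace: the frame at `CountDownLatch.await(CountDownLatch.java:232)` is the untimed overload, which parks until the count reaches zero. One possible reading (an assumption, not something the build scan confirms) is that at least one expected `countDown()` never happened. The sketch below is not the `ESIntegTestCase` code; `LatchAwaitSketch` and `awaitBounded` are hypothetical names used only to show how a bounded wait reports outstanding callbacks instead of hanging until the suite timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchAwaitSketch {

    /**
     * Hypothetical helper: waits for the latch to reach zero, but gives up
     * after the timeout. Returns the number of callbacks still outstanding
     * (0 means every expected countDown() arrived in time).
     */
    public static long awaitBounded(CountDownLatch latch, long timeout, TimeUnit unit)
            throws InterruptedException {
        // Unlike latch.await(), this overload returns false on timeout
        // instead of parking the thread indefinitely.
        boolean done = latch.await(timeout, unit);
        return done ? 0 : latch.getCount();
    }

    public static void main(String[] args) throws InterruptedException {
        int docs = 100;
        CountDownLatch latch = new CountDownLatch(docs);

        // Simulate 99 of 100 responses arriving; one callback is lost.
        for (int i = 0; i < docs - 1; i++) {
            latch.countDown();
        }

        long missing = awaitBounded(latch, 200, TimeUnit.MILLISECONDS);
        System.out.println("outstanding callbacks: " + missing);
    }
}
```

With an untimed `await()` in place of `awaitBounded`, the same program would never finish, which matches the `WAITING` state in the jstack output above. A bounded wait would at least turn the suite timeout into a test failure that says how many responses were missing.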

Metadata

Labels

:Distributed Coordination/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs)
>test-failure (Triaged test failures from CI)
Team:Distributed (Obsolete) (Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.)
