Several tests fail with "failed to process cluster event" due to timeout

I noticed several tests failing with related error messages:

* `failed to process cluster event (delete_repository [*]) within 30s`
* `failed to process cluster event (put-mapping [idx/iv7FVCFrT7K3iGZ0c-6X2w]) within 13s`
* `failed to process cluster event (delete-index [[test-idx/TdYM8OZkRkOHagTQzNiCFQ]]) within 30s`

From initial analysis it's hard to tell whether these failures have different root causes or share one underlying cause. However, the problem does not appear to be specific to any individual test case, so I have raised a single issue. Please feel free to split it into separate issues if further analysis shows that this makes sense.
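To help check whether the failures cluster around one kind of cluster state task, the messages above can be grouped by task name. The following is only a rough sketch: the function names `parse_event_timeout` and `count_by_task` are hypothetical helpers and not part of any Elasticsearch tooling, and the regular expression assumes the message format quoted in this issue.

```python
import re
from collections import Counter

# Matches the message format quoted in this issue (an assumption, not an
# official log format specification).
EVENT_RE = re.compile(
    r"failed to process cluster event \((?P<source>.+)\) within (?P<timeout>\d+\w+)"
)


def parse_event_timeout(line):
    """Return (task_name, timeout) if the line contains the message, else None."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    # The task name is the first token of the source, before the bracketed detail.
    task_name = match.group("source").split(" ", 1)[0]
    return task_name, match.group("timeout")


def count_by_task(lines):
    """Count matching log lines per task name, ignoring lines that do not match."""
    parsed = (parse_event_timeout(line) for line in lines)
    return Counter(p[0] for p in parsed if p is not None)
```

Feeding the failing builds' log lines to `count_by_task` gives a per-task tally, which may hint at whether one task type dominates.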

**Build scan**:

* https://gradle-enterprise.elastic.co/s/hig7ipu7fpa4m
* https://gradle-enterprise.elastic.co/s/7ulfjotymwkpe
* https://gradle-enterprise.elastic.co/s/xdv5j7sg3z3j2
* https://gradle-enterprise.elastic.co/s/rpm3r3o7nztss

**Repro line**:

```
# taken from the most recent failure in https://gradle-enterprise.elastic.co/s/hig7ipu7fpa4m
./gradlew ':x-pack:plugin:security:internalClusterTest' --tests "org.elasticsearch.integration.FieldLevelSecurityTests.testParentChild" -Dtests.seed=3E1FD9F52E1FDC19 -Dtests.security.manager=true -Dtests.locale=en-GB -Dtests.timezone=Pacific/Gambier -Druntime.java=11
```

**Reproduces locally?**:

No

**Applicable branches**:

* master
* 7.9

**Failure history**:

According to [build stats](https://build-stats.elastic.co/app/kibana#/discover?_g=(refreshInterval:(pause:!t,value:0),time:(from:now-6M,mode:quick,to:now))&_a=(columns:!(sys.os,branch),index:e58bf320-7efd-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:'%22failed%20to%20process%20cluster%20event%22%20AND%20NOT%20branch:%22pull-request%22'),sort:!(time,desc))), this failure has occurred four times in the last 30 days and 12 times in the last 6 months (excluding pull request builds).

**Failure excerpt**:

```
05:03:42   2> org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (delete_repository [*]) within 30s
05:03:42         at __randomizedtesting.SeedInfo.seed([3E1FD9F52E1FDC19:27DEB90468F17E06]:0)
05:03:42         at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:143)
05:03:42         at java.util.ArrayList.forEach(ArrayList.java:1540)
05:03:42         at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:142)
05:03:42         at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:674)
05:03:42         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
05:03:42         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
05:03:42         at java.lang.Thread.run(Thread.java:834)
```

Several tests fail with "failed to process cluster event" due to timeout #62853
