
Enforce higher priority for RepositoriesService ClusterStateApplier #59040


Merged

Conversation

@fcofdez (Contributor) commented Jul 5, 2020

This avoids shard allocation failures when the repository instance comes in the same ClusterState update as the shard allocation.

Backport of #58808
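
For context, here is a minimal toy model (illustrative Java, not the Elasticsearch source) of why applier priority matters: appliers registered in the high-priority tier observe a new cluster state before normal-priority appliers do, so anything they set up, such as a repository instance, is already visible to later appliers within the same state update.

import java.util.ArrayList;
import java.util.List;

// Toy model of prioritised cluster-state appliers (illustration only, all
// names hypothetical). The fix boils down to running the repositories
// applier in the high-priority tier, so that the applier which allocates
// shards sees the repository created from the very same ClusterState update.
interface StateApplier {
    void applyClusterState(String newState);
}

final class ApplierServiceSketch {
    private final List<StateApplier> highPriority = new ArrayList<>();
    private final List<StateApplier> normalPriority = new ArrayList<>();

    void addHighPriorityApplier(StateApplier applier) {
        highPriority.add(applier);
    }

    void addStateApplier(StateApplier applier) {
        normalPriority.add(applier);
    }

    void apply(String newState) {
        // High-priority tier runs first; within a tier, registration order wins.
        for (StateApplier applier : highPriority) {
            applier.applyClusterState(newState);
        }
        for (StateApplier applier : normalPriority) {
            applier.applyClusterState(newState);
        }
    }
}

In these terms, the change registers the RepositoriesService applier with higher priority than before, guaranteeing it runs ahead of the applier that starts shard allocation.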

@fcofdez added the >enhancement, :Distributed Coordination/Cluster Coordination, backport, Team:Distributed (Obsolete), and v7.9.0 labels on Jul 5, 2020
@elasticmachine (Collaborator) commented

Pinging @elastic/es-distributed (:Distributed/Cluster Coordination)

@fcofdez force-pushed the repositories-cluster-state-applier-priority-7.x branch from 08c239a to 60ff993 on July 5, 2020 at 11:47
@original-brownbear (Contributor) commented

@fcofdez it looks like the relevant test that this PR adds failed on this PR's CI. Could this be a possible failure mode in master as well?

@fcofdez (Contributor, Author) commented Jul 6, 2020

It seems like the change introduced in 3b71a31 wasn't ported to 7.x; in particular, the failure is triggered by trying to modify DISCOVERY_ZEN_MINIMUM_MASTER_NODES_SETTING in InternalTestCluster.java while the master is still blocked waiting for a third data node to join. Does that setting still apply? @ywelsch @original-brownbear
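
(Background, not from this thread: discovery.zen.minimum_master_nodes is meant to track a quorum of the master-eligible nodes, which is why InternalTestCluster rewrites it as nodes join and leave. The standard quorum rule, as a hypothetical helper:)

// Background illustration only; this helper is hypothetical, not framework code.
// Zen1 guidance: discovery.zen.minimum_master_nodes should be a quorum.
final class Zen1Quorum {
    static int minimumMasterNodes(int masterEligibleNodes) {
        return (masterEligibleNodes / 2) + 1; // a majority of the master-eligible nodes
    }
}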

@DaveCTurner (Contributor) commented

Indeed #39466 was a master-only change since we must still support Zen1 in 7.x for the purposes of rolling upgrades from 6.x. It seems strange that the cluster hasn't recovered by this point, however; I'll dig deeper.

@fcofdez (Contributor, Author) commented Jul 6, 2020

Thanks for the clarification @DaveCTurner. In the test I force recovery to be held until a third data node joins the cluster; the reason is to avoid race conditions around the registration of the cluster state listener that collects the data for the assertions. So I think the behavior is somewhat expected, but I'm not sure whether we have a different way to avoid that possible race condition in 7.x without postponing recovery until an additional node joins.
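
(A hypothetical sketch of the ordering the test needs, not the actual test code; it assumes the usual ESIntegTestCase context, with imports omitted:)

// Register the observer before any state update that could allocate the
// restored shards; otherwise the listener can miss the update under test.
ClusterService clusterService = internalCluster().getInstance(ClusterService.class);
clusterService.addListener(event -> {
    // record which cluster states were seen, for the later assertions
});
// Only now release recovery (in the test: let the third data node join).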

@DaveCTurner (Contributor) commented

I see; sorry for the delay, it's taken me over an hour to shave all the yaks needed to run this darn test (it required an IntelliJ upgrade). I understand now.

I think you can avoid these checks with autoManageMasterNodes:

diff --git a/x-pack/plugin/searchable-snapshots/src/test/java/org/elasticsearch/xpack/searchablesnapshots/ClusterStateApplierOrderingTests.java b/x-pack/plugin/searchable-snapshots/src/test/java/org/elasticsearch/xpack/searchablesnapshots/ClusterStateApplierOrderingTests.java
index 8277714a3ec..3a21761fcfb 100644
--- a/x-pack/plugin/searchable-snapshots/src/test/java/org/elasticsearch/xpack/searchablesnapshots/ClusterStateApplierOrderingTests.java
+++ b/x-pack/plugin/searchable-snapshots/src/test/java/org/elasticsearch/xpack/searchablesnapshots/ClusterStateApplierOrderingTests.java
@@ -35,10 +35,13 @@ import static org.hamcrest.Matchers.equalTo;
 import static org.hamcrest.Matchers.greaterThan;
 import static org.hamcrest.Matchers.is;

-@ESIntegTestCase.ClusterScope(scope = TEST, numDataNodes = 2)
+@ESIntegTestCase.ClusterScope(scope = TEST, numDataNodes = 0, autoManageMasterNodes = false)
 public class ClusterStateApplierOrderingTests extends BaseSearchableSnapshotsIntegTestCase {

     public void testRepositoriesServiceClusterStateApplierIsCalledBeforeIndicesClusterStateService() throws Exception {
+        internalCluster().setBootstrapMasterNodeIndex(0);
+        internalCluster().startNodes(2);
+
         final String fsRepoName = "fsrepo";
         final String indexName = "test-index";
         final String restoredIndexName = "restored-index";
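
(A note on this diff, inferred from the test-framework behaviour rather than stated above: with autoManageMasterNodes = false, InternalTestCluster no longer manages master-node bookkeeping, including the Zen1 minimum_master_nodes updates that tripped the test, so the test bootstraps the master itself via setBootstrapMasterNodeIndex(0) and starts its two nodes explicitly.)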

@fcofdez (Contributor, Author) commented Jul 6, 2020

retest this please

2 similar comments:

@fcofdez (Contributor, Author) commented Jul 6, 2020

retest this please

@fcofdez (Contributor, Author) commented Jul 6, 2020

retest this please

@DaveCTurner (Contributor) left a review comment

LGTM

@fcofdez merged commit 0752a86 into elastic:7.x on Jul 7, 2020