Skip to content

Allow election of nodes outside voting config #43243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

Today we suppress election attempts on master-eligible nodes that are not in
the voting configuration. In fact this restriction is not necessary: any
master-eligible node can safely become master as long as it has a fresh enough
cluster state and can gather a quorum of votes. Moreover, this restriction is
sometimes undesirable: there may be a reason why we do not want any of the
nodes in the voting configuration to become master.

The reason for this restriction is as follows. If you want to shut the master
down then you might first exclude it from the voting configuration. When this
exclusion succeeds you might reasonably expect that a new master has been
elected, since the voting config exclusion is almost always a step towards
shutting the node down. If we allow nodes outside the voting configuration to
be the master then the excluded node will continue to be master, which is
confusing.

This commit adjusts the logic to allow master-eligible nodes to attempt an
election even if they are not in the voting configuration. If such a master is
successfully elected then it adds itself to the voting configuration. This
commit also adjusts the logic that causes master nodes to abdicate when they
are excluded from the voting configuration, to avoid the confusion described
above.

Relates #37712, #37802.

Today we suppress election attempts on master-eligible nodes that are not in
the voting configuration. In fact this restriction is not necessary: any
master-eligible node can safely become master as long as it has a fresh enough
cluster state and can gather a quorum of votes. Moreover, this restriction is
sometimes undesirable: there may be a reason why we do not want any of the
nodes in the voting configuration to become master.

The reason for this restriction is as follows. If you want to shut the master
down then you might first exclude it from the voting configuration. When this
exclusion succeeds you might reasonably expect that a new master has been
elected, since the voting config exclusion is almost always a step towards
shutting the node down. If we allow nodes outside the voting configuration to
be the master then the excluded node will continue to be master, which is
confusing.

This commit adjusts the logic to allow master-eligible nodes to attempt an
election even if they are not in the voting configuration. If such a master is
successfully elected then it adds itself to the voting configuration. This
commit also adjusts the logic that causes master nodes to abdicate when they
are excluded from the voting configuration, to avoid the confusion described
above.

Relates elastic#37712, elastic#37802.
@DaveCTurner DaveCTurner added >non-issue :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.0.0 v7.3.0 labels Jun 14, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@@ -189,7 +190,8 @@ public Coordinator(String nodeName, Settings settings, ClusterSettings clusterSe

private ClusterFormationState getClusterFormationState() {
return new ClusterFormationState(settings, getStateForMasterService(), peerFinder.getLastResolvedAddresses(),
StreamSupport.stream(peerFinder.getFoundPeers().spliterator(), false).collect(Collectors.toList()), getCurrentTerm());
Stream.concat(Stream.of(getLocalNode()), StreamSupport.stream(peerFinder.getFoundPeers().spliterator(), false))
.collect(Collectors.toList()), getCurrentTerm());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes unrelated bug in which the ClusterFormationFailureHelper wasn't claiming to have discovered the local node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test for this?

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up. I've left 2 nits.

@@ -189,7 +190,8 @@ public Coordinator(String nodeName, Settings settings, ClusterSettings clusterSe

private ClusterFormationState getClusterFormationState() {
return new ClusterFormationState(settings, getStateForMasterService(), peerFinder.getLastResolvedAddresses(),
StreamSupport.stream(peerFinder.getFoundPeers().spliterator(), false).collect(Collectors.toList()), getCurrentTerm());
Stream.concat(Stream.of(getLocalNode()), StreamSupport.stream(peerFinder.getFoundPeers().spliterator(), false))
.collect(Collectors.toList()), getCurrentTerm());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test for this?

@DaveCTurner
Copy link
Contributor Author

I backed out the change to the ClusterFormationFailureHelper in c8d8caf, moved it to #43316 and added a test.

I addressed the other comment in 9636d55.

@DaveCTurner DaveCTurner merged commit d95b53d into elastic:master Jun 18, 2019
@DaveCTurner DaveCTurner deleted the 2019-06-14-elect-nodes-outside-voting-configuration branch June 18, 2019 11:10
DaveCTurner added a commit that referenced this pull request Jun 18, 2019
Today we suppress election attempts on master-eligible nodes that are not in
the voting configuration. In fact this restriction is not necessary: any
master-eligible node can safely become master as long as it has a fresh enough
cluster state and can gather a quorum of votes. Moreover, this restriction is
sometimes undesirable: there may be a reason why we do not want any of the
nodes in the voting configuration to become master.

The reason for this restriction is as follows. If you want to shut the master
down then you might first exclude it from the voting configuration. When this
exclusion succeeds you might reasonably expect that a new master has been
elected, since the voting config exclusion is almost always a step towards
shutting the node down. If we allow nodes outside the voting configuration to
be the master then the excluded node will continue to be master, which is
confusing.

This commit adjusts the logic to allow master-eligible nodes to attempt an
election even if they are not in the voting configuration. If such a master is
successfully elected then it adds itself to the voting configuration. This
commit also adjusts the logic that causes master nodes to abdicate when they
are excluded from the voting configuration, to avoid the confusion described
above.

Relates #37712, #37802.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >non-issue v7.3.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants