[Zen2] Introduce PreVoteCollector #32847

DaveCTurner · 2018-08-14T14:24:29Z

An election requires a node to select a term that is higher than all
previously-seen terms. If a node is too enthusiastic about starting elections
then it can effectively exclude itself from the cluster until the leader can
bump to a still-higher term, and if this process repeats then a single faulty
node can prevent the cluster from making useful progress.

The solution is to start the election with a pre-voting round to ensure that
there is at least a quorum of nodes who believe there to be no leader.
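
As a rough illustration of the idea, here is a minimal sketch of a pre-voting round. It is not the code in this PR: the class `PreVoteRoundSketch`, its string node ids, and the simple-majority quorum rule are all assumptions made for the example. The real election is started only once a majority of the voting nodes have reported that they see no leader.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a pre-voting round (not the PR's PreVoteCollector).
final class PreVoteRoundSketch {
    private final Set<String> votingNodes;                 // ids of the nodes whose votes count
    private final Set<String> preVotesReceived = new HashSet<>();
    private final Runnable startElection;                  // invoked at most once
    private boolean electionStarted = false;

    PreVoteRoundSketch(Set<String> votingNodes, Runnable startElection) {
        this.votingNodes = votingNodes;
        this.startElection = startElection;
    }

    // Record a pre-vote from a node that believes there is no leader.
    synchronized void handlePreVote(String nodeId) {
        if (electionStarted || !votingNodes.contains(nodeId)) {
            return; // round already succeeded, or the sender is not a voting node
        }
        preVotesReceived.add(nodeId);
        // Assumed quorum rule: a strict majority of the voting nodes.
        if (preVotesReceived.size() * 2 > votingNodes.size()) {
            electionStarted = true;
            startElection.run();
        }
    }
}
```

Because no term is incremented until the quorum is reached, an over-eager node that cannot gather enough pre-votes never disturbs the existing leader.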

ywelsch

I've left some initial feedback. I'm missing a test that checks that a subsequent election (simulated on CoordinationState) will succeed given a successful pre-voting round.

ywelsch · 2018-08-14T16:26:52Z

server/src/main/java/org/elasticsearch/cluster/coordination/PreVoteCollectorFactory.java

+
+            final long currentMaxTermSeen = updateMaxTermSeen(response.getCurrentTerm());
+
+            final PreVoteResponse currentPreVoteResponse = preVoteResponse;


I wonder if we should capture this in the PreVotingRound constructor. This should very rarely be changing (similarly to cluster state) if there is an ongoing prevoting round. This makes this instance also internally more consistent. Finally, I would like to see if we can make this inner class static, so that we know exactly about all the dependencies it has.

We can use the current (i.e. last-accepted) cluster state for this, I think.

The inner class uses these from the outer class:

* logger
* transportService
* startElection.run()
* toString()
* preVoteResponse.getCurrentTerm() (in the constructor)

I got rid of the getCurrentTerm() call in the constructor, but I would prefer to use the others as-is rather than making it static.

ywelsch · 2018-08-14T16:31:56Z

server/src/main/java/org/elasticsearch/cluster/coordination/PreVoteCollectorFactory.java

+
+    public static final String REQUEST_PRE_VOTE_ACTION_NAME = "internal:cluster/request_pre_vote";
+
+    private final AtomicLong maxTermSeen = new AtomicLong(0);


I wonder if we should leave this out of the PR for now. I'm expecting this "maxTermSeen" thingy to appear in more components, and would like to treat it uniformly (maybe just a callback).
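
For context, a minimal sketch of the two shapes being discussed. Neither class is taken from the PR: `MaxTermTrackerSketch` assumes that `updateMaxTermSeen` atomically raises the stored maximum and returns the result, and `TermObserverSketch` is a hypothetical component that reports terms through a callback instead of owning the counter.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

// Hypothetical: the counter kept inside one component, as in the diff above.
final class MaxTermTrackerSketch {
    private final AtomicLong maxTermSeen = new AtomicLong(0);

    // Atomically raise the stored maximum to at least `term` and return the new maximum.
    long updateMaxTermSeen(long term) {
        return maxTermSeen.accumulateAndGet(term, Math::max);
    }
}

// Hypothetical: a component that only reports the terms it sees via a callback,
// so that the tracking can live in one place and be shared uniformly.
final class TermObserverSketch {
    private final LongConsumer onTermSeen;

    TermObserverSketch(LongConsumer onTermSeen) {
        this.onTermSeen = onTermSeen;
    }

    void handleResponse(long responseTerm) {
        onTermSeen.accept(responseTerm);
    }
}
```

With the callback form, several components could be constructed with the same `tracker::updateMaxTermSeen` reference and none of them would need its own `AtomicLong`.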

ywelsch · 2018-08-14T16:34:47Z

server/src/main/java/org/elasticsearch/cluster/coordination/PreVoteCollectorFactory.java

+
+        void start(final Iterable<DiscoveryNode> broadcastNodes) {
+
+            final boolean isRunningChanged = isRunning.compareAndSet(false, true);


this becomes obsolete by changing to isClosed
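
The thread does not show what the `isClosed` version looks like, so the following is only a guess at its shape. It assumes a round that is usable as soon as it is constructed and that only ever transitions to closed, which is why no `compareAndSet(false, true)` is needed on start. `ClosableRoundSketch` and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical one-shot round: open from construction, closed at most once.
final class ClosableRoundSketch {
    private final AtomicBoolean isClosed = new AtomicBoolean(false);
    private int responsesHandled = 0;

    // Returns true only for the caller that actually closed the round.
    boolean close() {
        return isClosed.compareAndSet(false, true);
    }

    // Responses that arrive after the round was closed are ignored.
    synchronized void handleResponse() {
        if (isClosed.get()) {
            return;
        }
        responsesHandled++;
    }

    synchronized int responsesHandled() {
        return responsesHandled;
    }
}
```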

ywelsch · 2018-08-14T16:37:09Z

server/src/main/java/org/elasticsearch/cluster/coordination/PreVoteCollectorFactory.java

+            final long currentMaxTermSeen = updateMaxTermSeen(response.getCurrentTerm());
+
+            final PreVoteResponse currentPreVoteResponse = preVoteResponse;
+            if (response.getLastAcceptedTerm() > currentPreVoteResponse.getLastAcceptedTerm()


I wonder if we can do something similar to the isElectionQuorum trick here, where we share the implementation of these checks with CoordinationState.

I'd prefer not to do so. The machinery to share these few lines of code obscures what they do, and the extra abstraction in CoordinationState takes it further away from the formal model.
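
The condition in the diff is cut off after its first clause. As a sketch of the kind of check in question, and assuming that the elided remainder breaks ties on the last-accepted version when the terms are equal, the comparison could be written as below. `FreshnessCheckSketch` and `isFresher` are hypothetical names, not part of the PR or of CoordinationState.

```java
// Hypothetical helper; the tie-break on version is an assumption, since the
// diff above shows only the comparison of last-accepted terms.
final class FreshnessCheckSketch {
    // True if a responder whose last-accepted state is (responseTerm, responseVersion)
    // is strictly fresher than the local state (localTerm, localVersion).
    static boolean isFresher(long responseTerm, long responseVersion,
                             long localTerm, long localVersion) {
        return responseTerm > localTerm
            || (responseTerm == localTerm && responseVersion > localVersion);
    }
}
```

A pre-vote from a node with fresher state would not be counted, since a candidate with staler state could not win that node's vote in the real election.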

ywelsch · 2018-08-14T16:38:35Z

server/src/main/java/org/elasticsearch/cluster/coordination/PreVoteCollectorFactory.java

+import static org.elasticsearch.cluster.coordination.CoordinationState.isElectionQuorum;
+import static org.elasticsearch.common.util.concurrent.ConcurrentCollections.newConcurrentSet;
+
+public class PreVoteCollectorFactory extends AbstractComponent {


maybe omit the "Factory" here.

ywelsch

A few nits, looks good otherwise.

ywelsch · 2018-08-17T13:38:51Z