@cjen1-msft
Contributor

@cjen1-msft cjen1-msft commented Dec 8, 2025

Closes #7402

This brings us closer to parity with what etcd does with proposeRequestVotes, which helps to reduce downtime when hosts are killed.

The idea is that when we receive a SIGTERM while ignore_first_sigterm is enabled, and we are the leader, we should nominate a successor rather than waiting for an election timeout.
This can also prevent repeated elections: the nominated successor has the highest MatchIndex, so it is likely to become a valid candidate before any other valid candidate's election timeout triggers.
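
A minimal sketch of the selection step described above, assuming the leader keeps a per-node match index. The names `pick_successor`, `on_stop_notice` and `match_idx` are hypothetical and are not taken from the CCF implementation in raft.h. Tie-breaking by node id is an assumption made only so the result is deterministic.

```python
from __future__ import annotations

from typing import Optional


def pick_successor(match_idx: dict[str, int], self_id: str) -> Optional[str]:
    """Return the follower with the highest match index, or None if there is none.

    Hypothetical helper: ties go to the lexicographically smallest node id.
    """
    candidates = {n: idx for n, idx in match_idx.items() if n != self_id}
    if not candidates:
        return None
    # Sort key: highest match index first, then smallest node id.
    return min(candidates, key=lambda n: (-candidates[n], n))


def on_stop_notice(
    is_leader: bool, match_idx: dict[str, int], self_id: str
) -> Optional[tuple[str, str]]:
    """Return the (message, target) a leader would send on a stop notice, else None."""
    if not is_leader:
        return None  # followers have nothing to hand over
    successor = pick_successor(match_idx, self_id)
    if successor is None:
        return None  # single-node network: nobody to nominate
    return ("propose_request_vote", successor)
```

For example, a leader `n0` with followers `n1` at index 7 and `n2` at index 9 would nominate `n2`; a follower receiving the same stop notice would send nothing.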

One limitation of the current implementation is that the ProposeRequestVote message is not guaranteed to send before the process terminates. I'm not quite sure what the best approach would be for this.

@achamayou
Member

> One limitation of the current implementation is that the ProposeRequestVote message is not guaranteed to send before the process terminates. I'm not quite sure what the best approach would be for this.

I don't think we can ever guarantee that, this is a best effort behaviour. Once we disintermediate sending (i.e. remove the ringbuffer), we can try to prioritise this (and other consensus traffic) higher than the rest, but that's still best effort, not a guarantee.
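
To illustrate what prioritising consensus traffic could look like, here is a rough sketch of an outbound queue that drains consensus messages before everything else. The class `OutboundQueue` and the two priority classes are hypothetical; nothing here reflects how CCF sends messages today or how it would after the ringbuffer is removed.

```python
import heapq
import itertools

# Hypothetical priority classes: a lower value is sent first.
CONSENSUS, OTHER = 0, 1


class OutboundQueue:
    """Priority queue that keeps FIFO order within each priority class."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # monotonically increasing tie-breaker

    def push(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


q = OutboundQueue()
q.push(OTHER, "app response 1")
q.push(OTHER, "app response 2")
q.push(CONSENSUS, "propose_request_vote")
first = q.pop()  # the consensus message jumps ahead of the earlier pushes
```

Even with such a queue, the behaviour stays best effort: if the process exits before the queue is drained, the message is still lost.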

@cjen1-msft
Contributor Author

> I don't think we can ever guarantee that, this is a best effort behaviour. Once we disintermediate sending (i.e. remove the ringbuffer), we can try to prioritise this (and other consensus traffic) higher than the rest, but that's still best effort, not a guarantee.

To be specific, we could go further than this PR and do this on every shutdown of the enclave, delaying shutdown until the relevant message is sent. But yes, especially with the ringbuffer, this is difficult to do, and it is even more difficult to get any actual guarantees about it.

@achamayou
Member

@cjen1-msft do you mean every shutdown while we are primary?

@cjen1-msft
Contributor Author

cjen1-msft commented Dec 8, 2025

> @cjen1-msft do you mean every shutdown while we are primary?

@achamayou Yep. Every shutdown of a primary should nominate a successor (not just those where ignore_first_sigterm is set), and the key constraint would be waiting long enough for the message to be sent before we shut down.
But doing that is currently difficult without falling back to an arbitrary timeout (e.g. delaying shutdown until 1s after a SIGTERM).
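
The timeout fallback mentioned above could be sketched as a bounded wait: start the send, then block until either a confirmation arrives or the deadline passes. The function `shutdown_after_send` and the `sent` event are hypothetical, and the sketch assumes some lower layer can signal when the message has actually left the process, which is exactly the part that is hard with the ringbuffer.

```python
import threading


def shutdown_after_send(send, sent: threading.Event, timeout_s: float = 1.0) -> bool:
    """Start sending, then wait until `sent` is set or `timeout_s` elapses.

    Returns True if the send was confirmed before the deadline, and False if
    the caller is about to shut down without confirmation.
    """
    send()
    return sent.wait(timeout_s)


# Simulate a transport that confirms the send shortly after it is requested.
sent = threading.Event()
confirmed = shutdown_after_send(
    lambda: threading.Timer(0.01, sent.set).start(), sent, timeout_s=1.0
)

# Simulate a transport that never confirms: the wait gives up at the deadline.
never = threading.Event()
gave_up = not shutdown_after_send(lambda: None, never, timeout_s=0.05)
```

The timeout bounds how long shutdown can be delayed, but its value remains arbitrary, as noted in the comment.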

@cjen1-msft cjen1-msft marked this pull request as ready for review December 8, 2025 16:59
@cjen1-msft cjen1-msft requested a review from a team as a code owner December 8, 2025 16:59
Copilot AI review requested due to automatic review settings December 8, 2025 16:59
Contributor

Copilot AI left a comment

Pull request overview

This PR adds support for proposing a successor vote when a leader receives a SIGTERM signal, bringing CCF closer to parity with etcd's proposeRequestVotes behavior. This reduces downtime when a leader must be suddenly retired by allowing it to nominate the most up-to-date node as a successor rather than waiting for an election timeout.

Key changes:

  • Refactored successor nomination logic into a reusable send_propose_request_vote() method
  • Added nominate_successor() interface method that is called on SIGTERM via stop_notice()
  • Extended TLA+ specification with SigTermProposeVote action for formal verification
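
The call flow summarised in the list above can be modelled with a toy node. This is a sketch under stated assumptions: the method names `stop_notice` and `nominate_successor` mirror the PR summary, but the class `Node`, its fields, and the exact handling of the first and second SIGTERM are illustrative guesses rather than CCF's actual code.

```python
class Node:
    """Toy model of a node reacting to SIGTERM (assumed logic, not CCF's)."""

    def __init__(self, is_leader, ignore_first_sigterm, nominate_successor):
        self.is_leader = is_leader
        self.ignore_first_sigterm = ignore_first_sigterm
        self._nominate_successor = nominate_successor  # callback into consensus
        self.sigterm_count = 0
        self.stop_notice_set = False
        self.stopped = False

    def stop_notice(self):
        # Record the stop notice and, if leading, try to hand over leadership.
        self.stop_notice_set = True
        if self.is_leader:
            self._nominate_successor()

    def on_sigterm(self):
        self.sigterm_count += 1
        if self.ignore_first_sigterm and self.sigterm_count == 1:
            self.stop_notice()  # first SIGTERM: keep running, give notice
        else:
            self.stopped = True  # otherwise: shut down


calls = []
leader = Node(True, True, lambda: calls.append("nominate"))
leader.on_sigterm()  # first SIGTERM triggers the nomination only
leader.on_sigterm()  # second SIGTERM stops the node
```

In a real process, `on_sigterm` would be registered as a signal handler; it is kept as a plain method here so the flow can be exercised directly.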

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tla/consensus/ccfraft.tla | Added SigTermProposeVote action to model leader proposing successor on termination |
| tla/consensus/Traceccfraft.tla | Added trace validation for the new step_down_and_nominate_successor event |
| tla/consensus/SIMccfraft.tla | Added simulation support for SigTermProposeVote |
| tla/consensus/SIMccfraft.cfg | Configured simulation to use SIMSigTermProposeVote |
| src/node/node_state.h | Added call to nominate_successor() in stop_notice() handler |
| src/kv/kv_types.h | Added virtual nominate_successor() method to consensus interface |
| src/consensus/aft/raft.h | Refactored successor nomination logic into send_propose_request_vote() and implemented nominate_successor() |
| src/consensus/aft/test/driver.h | Added nominate_successor command support for test scenarios |
| src/consensus/aft/test/driver.cpp | Implemented parsing for nominate_successor command |
| tests/raft_scenarios/nominate_successor | New test scenario validating successor selection based on match_idx |
| tests/raft_scenarios_runner.py | Modified command tracking to persist across entries for proper trace validation |
| tests/e2e_operations.py | Added E2E test verifying propose_request_vote behavior on SIGTERM |
| doc/architecture/consensus/index.rst | Documented the new ProposeRequestVote behavior on termination signal |

cjen1-msft and others added 3 commits December 8, 2025 17:10
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@cjen1-msft cjen1-msft added the run-long-verification Run Long Verification jobs label Dec 9, 2025
@achamayou
Member

> This brings us closer to parity with what etcd does with proposeRequestVotes.

This is true, and in this case we think it is beneficial to make this change, but it's worth clarifying that parity with etcd behaviour is not generally a goal.

cjen1-msft and others added 4 commits December 10, 2025 10:02
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Amaury Chamayou <amaury@xargs.fr>
@cjen1-msft cjen1-msft enabled auto-merge (squash) December 10, 2025 10:07
@cjen1-msft cjen1-msft disabled auto-merge December 10, 2025 10:08
@achamayou achamayou enabled auto-merge (squash) December 10, 2025 15:51
@achamayou achamayou merged commit 025afae into microsoft:main Dec 10, 2025
23 checks passed
@cjen1-msft cjen1-msft deleted the sigterm-propose-vote branch December 10, 2025 17:05
Development

Successfully merging this pull request may close these issues.

ProposeRequestVote on sigterm

2 participants