Description
In Zen, `ClusterState` is stored through `GatewayMetaState`. There are the following issues with the current approach:
- Since `GatewayMetaState` implements `ClusterStateApplier`, if there is an `IOException` while storing the state, the `ClusterState` change will still be applied to the in-memory node state.
- `GatewayMetaState` stores global metadata and index metadata for each index in separate files. See `MetaDataStateFormat`. `MetaDataStateFormat` ensures that each global/index metadata file is stored atomically. However, if there is a change to both the global metadata and index metadata, or to the metadata of several indices, the state update could be partial.
Zen2 needs a reliable mechanism to store `ClusterState` without these drawbacks. Two alternative approaches were discussed:
1. Instead of storing metadata in separate files, create a translog for `ClusterState` diffs. Once a `ClusterState.Diff` is received, persist it to the translog. There could be a background merging process that merges multiple `ClusterState` diffs together.
2. Enhance the existing solution by adding a manifest file that will contain pointers to the global state/index metadata files and will ensure atomicity.
While the 1st approach is preferable in the long run, for now, we decided to go with the 2nd approach.
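
To illustrate the manifest idea, here is a minimal sketch, not the actual Elasticsearch implementation. The class `ManifestSketch`, its methods, the `.st` file naming and the `key=generation` manifest format are all hypothetical. It assumes that an atomic rename is available on the underlying file system and leaves out fsync and cleanup of old generations.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: metadata files are written per generation, and a single
// manifest file decides which generation of each file is the current one.
public class ManifestSketch {
    private final Path dir;

    public ManifestSketch(Path dir) {
        this.dir = dir;
    }

    private Path fileFor(String key, long generation) {
        return dir.resolve(key + "-" + generation + ".st");
    }

    // Step 1: write every metadata blob to a new generation file.
    // Readers still follow the old manifest, so nothing changes for them yet.
    public Map<String, Long> writeDataFiles(long generation, Map<String, String> state) throws IOException {
        Map<String, Long> pointers = new TreeMap<>();
        for (Map.Entry<String, String> e : state.entrySet()) {
            Files.write(fileFor(e.getKey(), generation), e.getValue().getBytes(StandardCharsets.UTF_8));
            pointers.put(e.getKey(), generation);
        }
        return pointers;
    }

    // Step 2: publish all pointers at once by writing a temporary manifest
    // and renaming it over the old one (assumed to be atomic here).
    public void commitManifest(Map<String, Long> pointers) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : pointers.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        Path tmp = dir.resolve("manifest.tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        Files.move(tmp, dir.resolve("manifest"), StandardCopyOption.ATOMIC_MOVE);
    }

    // Load only the files the manifest points to; uncommitted files are ignored.
    public Map<String, String> load() throws IOException {
        Map<String, String> state = new TreeMap<>();
        Path manifest = dir.resolve("manifest");
        if (!Files.exists(manifest)) {
            return state;
        }
        for (String line : Files.readAllLines(manifest, StandardCharsets.UTF_8)) {
            int eq = line.lastIndexOf('=');
            String key = line.substring(0, eq);
            long generation = Long.parseLong(line.substring(eq + 1));
            byte[] bytes = Files.readAllBytes(fileFor(key, generation));
            state.put(key, new String(bytes, StandardCharsets.UTF_8));
        }
        return state;
    }
}
```

In this sketch, a failure after `writeDataFiles` but before `commitManifest` leaves the old manifest in place, so `load` keeps returning the previous, complete state even when several files were changed.
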
Below is the list of things that should be done:
1. Migrate `MetaDataStateFormat` to Lucene directory abstraction for easier failure testing. (Switch MetaDataStateFormat to Lucene directory abstraction #33989)
2. Change `MetaDataStateFormat.write` semantics to clearly distinguish two failure cases: the write has failed and `loadLatestState` must return the old state, or the write has failed and `loadLatestState` may return either the old or the new state. ([Zen2] Change MetaDataStateFormat write semantics #34709)
3. Add manifest file support. ([Zen2] Write manifest file #35049)
4. Move the `ClusterState` fields to be persisted to the `MetaData` field (except `version`, which is updated very often and will go directly to Manifest). Namely, `term`, `lastCommitedConfiguration`, `lastAcceptedConfiguration` and `votingTombstones`. ([Zen2] Move ClusterState fields to be persisted to ClusterState.MetaData #35625)
5. Implement the `PersistedState` interface for Zen2. Note that `term != currentTerm`: `term` goes to `MetaData`, `currentTerm` goes to `Manifest`. ([Zen2] PersistedState interface implementation #35819)
6. Properly handle `WriteStateException` thrown by `GatewayMetaState`.
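
The two failure cases from the `MetaDataStateFormat.write` item can be sketched as below. This is only an illustration under assumptions: the exception here is a stand-in defined inside the sketch, not the real Elasticsearch class, and the `dirty` flag, the `Step` interface, the `Outcome` enum and the mapping of "temp write failed" to clean and "rename failed" to dirty are all hypothetical choices.

```java
import java.io.IOException;

// Hypothetical sketch of distinguishing "old state is certainly intact"
// from "on-disk state may be old or new" when a write fails.
public class WriteFailureSketch {

    // Stand-in exception; 'dirty' means the loader may see either old or new state.
    public static class WriteStateException extends IOException {
        private final boolean dirty;

        public WriteStateException(boolean dirty, String message, Throwable cause) {
            super(message, cause);
            this.dirty = dirty;
        }

        public boolean isDirty() {
            return dirty;
        }
    }

    // A single I/O step that may fail.
    public interface Step {
        void run() throws IOException;
    }

    public enum Outcome { APPLIED, REJECTED_OLD_STATE_INTACT, ON_DISK_STATE_UNKNOWN }

    // Assumption: failing before the new state is made visible is "clean",
    // failing while making it visible is "dirty".
    public static void write(Step writeTemp, Step makeVisible) throws WriteStateException {
        try {
            writeTemp.run();
        } catch (IOException e) {
            throw new WriteStateException(false, "failed before the new state became visible", e);
        }
        try {
            makeVisible.run();
        } catch (IOException e) {
            throw new WriteStateException(true, "failed while making the new state visible", e);
        }
    }

    // A caller can react differently to the two cases instead of applying
    // the in-memory change unconditionally.
    public static Outcome persist(Step writeTemp, Step makeVisible) {
        try {
            write(writeTemp, makeVisible);
            return Outcome.APPLIED;
        } catch (WriteStateException e) {
            return e.isDirty() ? Outcome.ON_DISK_STATE_UNKNOWN : Outcome.REJECTED_OLD_STATE_INTACT;
        }
    }
}
```

With a distinction like this, a clean failure lets the caller reject the update and keep the in-memory state consistent with disk, while a dirty failure signals that the on-disk state can no longer be assumed.
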
Although points 1-3 are also relevant for Zen, we decided to make the changes on the Zen2 branch.
Relates to #32006