Omit writing index metadata for non-replicated closed indices on data-only node #47285


Merged

Conversation


@ywelsch ywelsch commented Sep 30, 2019

Fixes a bug in how replicated closed indices (introduced in 7.2) interact with the index metadata storage mechanism, which has special handling for closed indices but handles replicated closed indices incorrectly. On non-master-eligible data nodes, the node's manifest file (which tracks the metadata state that the node should persist) can become out of sync with what is actually stored on disk. This inconsistency is then detected at startup and prevents the node from starting up.

The solution used here is to remove the code that treats closed indices specially. This code has not aged well, and its use is dubious at best.

Closes #47276
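The failure mode can be illustrated with a minimal, self-contained sketch (hypothetical stand-in types and method names, not the real Manifest/MetaStateService code): the manifest records which index metadata generation the node expects to find on disk, and a stale entry makes the startup consistency check fail with the error seen in #47276.

```java
import java.util.Map;

// Hypothetical sketch of the manifest consistency check, not Elasticsearch code.
// The manifest maps index name -> expected metadata generation on disk; if the
// manifest references a generation that was never written, startup must fail.
class ManifestCheck {
    static void verify(Map<String, Long> manifestGenerations,
                       Map<String, Long> onDiskGenerations) {
        for (Map.Entry<String, Long> e : manifestGenerations.entrySet()) {
            Long onDisk = onDiskGenerations.get(e.getKey());
            if (onDisk == null || !onDisk.equals(e.getValue())) {
                throw new IllegalStateException(
                    "Failed to find metadata for index " + e.getKey());
            }
        }
    }

    public static void main(String[] args) {
        // Consistent manifest and disk state: the node starts normally.
        verify(Map.of("test-index", 3L), Map.of("test-index", 3L));
        System.out.println("consistent: ok");

        // Stale manifest entry whose metadata was never written to disk
        // (the bug's symptom): the node refuses to start.
        try {
            verify(Map.of("closed-index", 5L), Map.of());
        } catch (IllegalStateException ex) {
            System.out.println("startup failed: " + ex.getMessage());
        }
    }
}
```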

@ywelsch ywelsch added >bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.0.0 v7.5.0 v7.4.1 labels Sep 30, 2019
@ywelsch ywelsch requested review from tlrx and DaveCTurner September 30, 2019 09:52
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed

@ywelsch ywelsch changed the title Omit writing index metadata for closed indices on data-only node Omit writing index metadata for non-replicated closed indices on data-only node Sep 30, 2019
@DaveCTurner DaveCTurner left a comment (Contributor)

LGTM - I left an optional suggestion.

@@ -244,10 +229,10 @@ private long writeGlobalState(AtomicClusterStateWriter writer, MetaData newMetaD
     }

     // exposed for tests
-    static Set<Index> getRelevantIndices(ClusterState state, ClusterState previousState, Set<Index> previouslyWrittenIndices) {
+    static Set<Index> getRelevantIndices(ClusterState state) {
         Set<Index> relevantIndices;
         if (isDataOnlyNode(state)) {
Contributor:

Seeing as we're touching this, if we checked state.nodes().getLocalNode().isMasterNode() first then we could say merely else if (state.nodes().getLocalNode().isDataNode()) and ditch the isDataOnlyNode method.

And there's no need for the intermediate relevantIndices, just return the thing already :)
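The suggested restructuring could look roughly like this self-contained sketch (LocalNode and the sets are hypothetical stand-ins, not the real DiscoveryNode/ClusterState API): check master-eligibility first, then data-node status, and return directly instead of going through an intermediate variable.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-ins to illustrate the suggested control flow; the real
// code works with ClusterState, DiscoveryNode, and Index.
class Sketch {
    static class LocalNode {
        final boolean master, data;
        LocalNode(boolean master, boolean data) { this.master = master; this.data = data; }
        boolean isMasterNode() { return master; }
        boolean isDataNode() { return data; }
    }

    // Check master-eligibility first, then data-node status; return directly.
    static Set<String> getRelevantIndices(LocalNode node, Set<String> allIndices,
                                          Set<String> assignedIndices) {
        if (node.isMasterNode()) {
            return allIndices;             // master-eligible: persist all index metadata
        } else if (node.isDataNode()) {
            return assignedIndices;        // data-only: persist locally assigned indices
        } else {
            return Collections.emptySet(); // coordinating-only: persist none
        }
    }

    public static void main(String[] args) {
        Set<String> all = Set.of("a", "b", "c");
        Set<String> assigned = Set.of("a");
        System.out.println(getRelevantIndices(new LocalNode(true, true), all, assigned).size());   // 3
        System.out.println(getRelevantIndices(new LocalNode(false, true), all, assigned).size());  // 1
        System.out.println(getRelevantIndices(new LocalNode(false, false), all, assigned).size()); // 0
    }
}
```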

Contributor (Author):

++

@@ -207,8 +207,7 @@ private long writeGlobalState(AtomicClusterStateWriter writer, MetaData newMetaD
         return actions;
     }

-    private static Set<Index> getRelevantIndicesOnDataOnlyNode(ClusterState state, ClusterState previousState, Set<Index> previouslyWrittenIndices) {
+    private static Set<Index> getRelevantIndicesOnDataOnlyNode(ClusterState state) {
Contributor:
💥

@tlrx tlrx left a comment (Member)

LGTM - I left some comments, but feel free to address them or not.

-            clusterStateWithJustOpenedIndex(indexMetaData, false),
-            clusterStateWithClosedIndex(indexMetaData, false),
-            Collections.emptySet());
+            clusterStateWithJustOpenedIndex(indexMetaData, false));
         assertThat(indices.size(), equalTo(0));
     }

public void testGetRelevantIndicesForWasClosedPrevWrittenIndexOnDataOnlyNode() {
IndexMetaData indexMetaData = createIndexMetaData("test");
Member:
This test is now a duplicate of the previous one, and I think that PrevNotWritten/PrevWritten becomes misleading now that getRelevantIndicesOnDataOnlyNode has changed. I'd prefer to have testGetRelevantIndicesForClosedIndexOnDataOnlyNode test a closed index that is replicated/not yet replicated.

Contributor (Author):
makes sense, updated

.build());
indexRandom(randomBoolean(), randomBoolean(), randomBoolean(), IntStream.range(0, randomIntBetween(0, 50))
.mapToObj(n -> client().prepareIndex(indexName, "_doc").setSource("num", n)).collect(toList()));
assertAcked(client().admin().indices().prepareClose(indexName));
Member:
Don't forget the waitForActiveShards when backporting :) (I always forget it)

Contributor (Author):
haha, good one :)

@DaveCTurner (Contributor)

For posterity, we discussed the various implications of this change. It means that we stop updating the on-disk index metadata for shards of closed non-replicated indices on master-ineligible data nodes. (Master-eligible nodes persist the metadata of all indices in the cluster state, and replicated closed indices appear in the routing table).

The index metadata stored on the disk of a master-ineligible data node is read when the node starts up and also when importing a dangling index. The metadata is read at startup so we can fail the node immediately if it has some invalid/unreadable index metadata but otherwise this has no effect since it is superseded by fresher metadata from the master at join time. It's pretty weak as a corruption check (it doesn't detect non-metadata corruption) and doesn't help much in terms of ensuring the node is compatible with all the indices currently in the cluster, so we're ok with losing that.

When importing a dangling index we already have no freshness guarantees. Furthermore, the shard data of a closed index cannot change while that index is closed or unassigned, so we do not need to keep the metadata up to date in order to be sure we will be able to import it as a dangling index in the future. With this change there will be a greater chance that a dangling index import sees older mappings or settings, but they will never be so old that the shard is unreadable.

@ywelsch ywelsch merged commit 38f0221 into elastic:master Sep 30, 2019
ywelsch added a commit that referenced this pull request Sep 30, 2019
…-only node (#47285)

Fixes a bug related to how "closed replicated indices" (introduced in 7.2) interact with the index
metadata storage mechanism, which has special handling for closed indices (but incorrectly
handles replicated closed indices). On non-master-eligible data nodes, it's possible for the
node's manifest file (which tracks the relevant metadata state that the node should persist) to
become out of sync with what's actually stored on disk, leading to an inconsistency that is then
detected at startup, refusing for the node to start up.

Closes #47276
Labels
>bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v7.4.1 v7.5.0 v8.0.0-alpha1

Successfully merging this pull request may close these issues.

Elasticsearch fails to start with error: "Failed to find metadata for index" on every restart
5 participants