Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index creation waits for write consistency shards #18985

Conversation

abeyad
Copy link

@abeyad abeyad commented Jun 20, 2016

Before returning, index creation now waits for the write consistency
number of shards to be available. An index can not take any indexing or
other replication operations without the write consistency level of
shards being available anyway, so waiting on the index creation response
in order for this condition to be met makes sense, and allows API users
to not depend on cluster health checks before attempting indexing
operations on the newly created index.

@@ -77,6 +79,8 @@

private boolean updateAllTypes = false;

private WriteConsistencyLevel waitOnShards = WriteConsistencyLevel.DEFAULT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd name. Maybe call it waitForConsistency?

@nik9000
Copy link
Member

nik9000 commented Jun 20, 2016

Wonderful news! I don't know a ton about the cluster state stuff but it seems sane and after applying it this gist passes:
https://gist.github.com/nik9000/3d65d5d883c7fd918a0835947c6549eb

@@ -83,15 +94,34 @@ protected void masterOperation(final CreateIndexRequest request, final ClusterSt

@Override
public void onResponse(ClusterStateUpdateResponse response) {
listener.onResponse(new CreateIndexResponse(response.isAcknowledged()));
// the cluster state that includes the newly created index has been acknowledged,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should put this into MetaDataCreateIndexService instead (so we don't need to duplicate it into the two transport actions).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ywelsch I actually did this in a new commit which I will push up soon.

@abeyad abeyad force-pushed the index-creation-wait-for-shards branch 2 times, most recently from a99a910 to b511a5c Compare June 24, 2016 06:10
@@ -440,6 +443,28 @@ public CreateIndexRequest updateAllTypes(boolean updateAllTypes) {
return this;
}

public int waitOnActiveShards() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a superclass ReplicationRequest where we should put these methods (as they apply to all the replication requests).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateIndexRequest does not extend ReplicationRequest, nor does UpdateRequest, etc.

.put(IndexMetaData.INDEX_NUMBER_OF_REPLICAS_SETTING.getKey(), 1)).get();
ensureGreen("test");
.put(IndexMetaData.INDEX_NUMBER_OF_REPLICAS_SETTING.getKey(), 1))
.setWaitForActiveShards(ActiveShardCount.ALL).get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure green has the nice reporting if we fail to get to green. Can we have an equivalent utility method that takes a create index response and check for shard acked, and log things? maybe a variant of assertAcked

@bleskes
Copy link
Contributor

bleskes commented Jul 13, 2016

This looks great. I left some very minor comments. I do think we miss the change to TransportIndexAction and TransportBulkAction where we set the waitForActive shards to 0.

Of course, @ywelsch should also LGTM this as he is the main reviewer

@abeyad
Copy link
Author

abeyad commented Jul 13, 2016

@bleskes Thank you for the review. 6c2e48a addresses your code review comments.

@@ -39,25 +39,31 @@
private static final String DRY_RUN = "dry_run";
private static final String ROLLED_OVER = "rolled_over";
private static final String CONDITIONS = "conditions";
private static final String CREATE_INDEX_ACKNOWLEDGED = "create_index_acknowledged";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm.. i'm not sure we want to expose the complexity of the internal two steps. For example if the alias is acked and the index isn't, it doesn't really mean that the index isn't known by the nodes - having processed the alias change guarantees it. I think we can keep a simple acknowledge like in other responses, which means all nodes know about the new index and the new alias? it maybe then that with rollover the shards are acked but not the request. If we want to avoid that , I will only wait for shards active after the alias change is done.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it maybe then that with rollover the shards are acked but not the request. If we want to avoid that , I will only wait for shards active after the alias change is done.

That's kind of why its a bit nebulous. The semantics of acknowledged on ClusterStateUpdateResponse should be the same, but the trick here is that we have two different cluster state updates - one for index creation and one for the alias. We can move the shard waiting to after the alias swap, but then we also need to change the MetaDataCreateIndexService to make onlyCreateIndex public, and probably change the names.... createIndex only creates the index, createIndexAndWait includes the waiting... if they are to be public methods.. thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can explicitly set the the waitForShards of the index creation to NONE and use the setting of the request to wait after the alias was acked. All variants of this are not ideal, so I would go for what you find the simplest - I tend to agree that consistency is the best here and thus leaning towards what I just described.

@abeyad
Copy link
Author

abeyad commented Jul 14, 2016

@bleskes fcf60ee addresses the latest comments

case ALL_ACTIVE_SHARDS:
return ALL;
case 0:
return NONE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we "special case" NONE here but not ONE, maybe it's simpler just to remove this method as well as the validateValue one and use new ActiveShardCount(...) in the two places it's currently used (and also ad

if (value < -2) {
  throw new IllegalArgumentException(...)
}

to the constructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I you decide to go this route you should also remember to replace the reference equality checks (this == ActiveShardCount.NONE) by equals checks or by looking at value (this.value == 0).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I preferred the current way is that we don't have to recreate new objects for what will most likely be the same values re-used over and over again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok sounds good. maybe add the case 1: then.

@ywelsch
Copy link
Contributor

ywelsch commented Jul 15, 2016

I left minor comments. Once addressed, feel free to push. LGTM! Thanks @abeyad.

Before returning, index creation now waits for the configured number
of shard copies to be started. In the past, a client would create an
index and then potentially have to check the cluster health to wait
to execute write operations. With the cluster health semantics changing
so that index creation does not cause the cluster health to go RED,
this change enables waiting for the desired number of active shards
to be active before returning from index creation.

Relates elastic#9126
@abeyad abeyad force-pushed the index-creation-wait-for-shards branch from fcf60ee to 6184ba3 Compare July 15, 2016 15:17
@abeyad
Copy link
Author

abeyad commented Jul 15, 2016

@ywelsch @bleskes Thank you very much for all the valuable feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >enhancement resiliency v5.0.0-alpha5
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants