Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removes write consistency level across replication action APIs in favor of wait_for_active_shards #19454

Merged
merged 5 commits into from
Aug 2, 2016

Conversation

abeyad
Copy link

@abeyad abeyad commented Jul 15, 2016

Removes the notion of write consistency level across replication action APIs
(index, update, delete, re-index, etc) in favor of waiting for active shard copy
count (wait_for_active_shards) that was introduced in #18985 for index creation.

The reason for the removal of write consistency for replication operations is that
it led to a false notion of what guarantees were made by checking the "write
consistency" level. The way write consistency checks worked before was to check
the number of active shard copies as a replication request (e.g. indexing) arrived. If
the required number of active shards were present (defaulted to "quorum"),
then the replication operation would proceed. However, it is entirely possible that
replication would still fail on a majority or even all replicas after the write consistency
check, but the operation still succeeds on the primary. Also, our use of the terms
"write consistency" and "quorum" in relation to replication actions did not convey
the same semantics as found in the literature, as we don't have an equivalent
notion of read consistency and quorum typically relates to a guarantee in the
consensus model which we do not have for replication actions. Hence, we have
changed the terminology to use wait_for_active_shards as it better conveys the fact
that this check is a best attempt at maintaining the desired level of replication and
thereby resiliency.

This PR also removes the index.write_consistency setting in favor of a new setting: index.write.wait_for_active_shards. index.write.wait_for_active_shards is an
index-level setting that is dynamically updatable, and can be set to either all or
any non-negative value up to the total number of shard copies for a shard in an
index (number of replicas + 1). The default value for this setting is 1 (meaning only
wait for the primary shard).

@abeyad
Copy link
Author

abeyad commented Jul 15, 2016

@bleskes @ywelsch FYI, I marked it as a WIP in case we want to change the name from wait_for_active_shards to something else. Otherwise its ready for review.

@clintongormley clintongormley added the :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. label Jul 17, 2016
return false;
}
return true;
public EvalResult enoughShardsActive(final IndexShardRoutingTable shardRoutingTable, final IndexMetaData indexMetaData) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we try to avoid passing in the indexMetaData? I think the shard routing table should have all we need. The resolve method can be changed, or even removed as we can special case the testing for all (check for unassigned) and none (allways succeed) here.

Also, it would be great if we can remove the EvalResult class. We currently create it for each operation. Instead I would suggest having the boolean as we had before, and people can either call a describe method to get a string description or construct a string themselves. All the info they need is easily available, with the exception of the resolve() value, but that's clear from the name of the ActiveShardCount.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleskes the primary motivation for the resolve method was to abstract the value within the class. you are correct in that we could just pass in the number of replicas instead of the entire IndexMetaData, but I believe the resolve method has utility, otherwise we have to expose a getValue method on ActiveShardCount, which seems easily avoided if we keep the resolve method. Thoughts?

@bleskes
Copy link
Contributor

bleskes commented Jul 18, 2016

Thx @abeyad . Left some comments. I also miss the (migration) doc changes. Are those coming in another PR (and please, let's not forget :))

@abeyad abeyad force-pushed the remove-write-consistency-level branch from 683fc5e to 762301f Compare July 19, 2016 21:13
@abeyad
Copy link
Author

abeyad commented Jul 19, 2016

@bleskes
c23e9da addresses the code review in general
536a2db removes the checkWriteConsistency() method from transport replication actions in favor of ActiveShardCount
762301f makes the index.write.wait_for_active_shards setting updatable

I have a couple things to discuss with you regarding documentation, then I will push that up in this PR as well.

@abeyad abeyad removed the WIP label Jul 19, 2016
@abeyad abeyad changed the title [WIP] Removes the notion of write consistency level across all APIs Removes the notion of write consistency level across all APIs Jul 19, 2016
@abeyad abeyad added :Data Management/Indices APIs APIs to create and manage indices and templates and removed :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. labels Jul 21, 2016
@abeyad abeyad changed the title Removes the notion of write consistency level across all APIs Removes write consistency level across all APIs in favor of wait_for_active_shards Jul 21, 2016
@abeyad abeyad force-pushed the remove-write-consistency-level branch from a3ad82a to 48c1d9d Compare July 21, 2016 21:04
@abeyad
Copy link
Author

abeyad commented Jul 21, 2016

@bleskes @clintongormley 48c1d9d contains the documentation updates. I was not sure if I should add something to the resiliency page or remove the entry in the resiliency page that talks about us adding write consistency level: https://github.com/elastic/elasticsearch/blob/master/docs/resiliency/index.asciidoc#validate-quorum-before-accepting-a-write-request-status-done-v140

@abeyad abeyad changed the title Removes write consistency level across all APIs in favor of wait_for_active_shards Removes write consistency level across replication action APIs in favor of wait_for_active_shards Jul 25, 2016
@abeyad abeyad force-pushed the remove-write-consistency-level branch from fbb4ac8 to bcfd58f Compare July 25, 2016 15:52

// active shard count not met, should throw exception
client().admin().indices().prepareUpdateSettings(indexName).setSettings(
Settings.builder().put(WAIT_FOR_ACTIVE_SHARDS_SETTING.getKey(), Integer.toString(internalCluster().numDataNodes() + 1)).build()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sometimes use an illegal negative value?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

On Fri, Jul 29, 2016 at 10:10 AM Boaz Leskes notifications@github.com
wrote:

In core/src/test/java/org/elasticsearch/indexing/IndexActionIT.java
#19454 (comment)
:

  • /**
  • \* This test ensures that indexing operations adhere to the {@link IndexSettings#WAIT_FOR_ACTIVE_SHARDS_SETTING}.
    
  • */
    
  • public void testChangeWaitForActiveShardsS

etting() throws Exception {

  •    final String indexName = "test";
    
  •    internalCluster().ensureAtLeastNumDataNodes(2);
    
  •    final int numReplicas = internalCluster().numDataNodes() + randomIntBetween(0, 3);
    
  •    final Settings originalSettings = Settings.builder()
    
  •                                          .put(IndexMetaData.INDEX_NUMBER_OF_SHARDS_SETTING.getKey(), 1)
    
  •                                          .put(IndexMetaData.INDEX_NUMBER_OF_REPLICAS_SETTING.getKey(), numReplicas)
    
  •                                          .build();
    
  •    assertAcked(client().admin().indices().prepareCreate(indexName).setSettings(originalSettings).get());
    
  •    // active shard count not met, should throw exception
    
  •    client().admin().indices().prepareUpdateSettings(indexName).setSettings(
    
  •        Settings.builder().put(WAIT_FOR_ACTIVE_SHARDS_SETTING.getKey(), Integer.toString(internalCluster().numDataNodes() + 1)).build()
    

sometimes use an illegal negative value?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/elastic/elasticsearch/pull/19454/files/bcfd58f1bbac5e0dc7a65fa9365821ef0b39c052#r72797211,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABjkQVlW2m0rrYxBAqkBROOhsNqwqe2bks5qagm5gaJpZM4JNuiL
.

@bleskes
Copy link
Contributor

bleskes commented Jul 29, 2016

Thx @abeyad . Left some very minor comments.

I was not sure if I should add something to the resiliency page or remove the entry in the resiliency page that talks about us adding write consistency level: https://github.com/elastic/elasticsearch/blob/master/docs/resiliency/index.asciidoc#validate-quorum-before-accepting-a-write-request-status-done-v140

I would just removed that paragraph. It never made it into official code - I think it was done on the improve_zen feature branch back in 1.4 but removed before the merge into master.

@abeyad
Copy link
Author

abeyad commented Jul 30, 2016

@bleskes I pushed 8f7f4bd that addresses your code review comments

/**
* The number of active shard copies to check for before proceeding with a write operation.
*/
public static final Setting<ActiveShardCount> WAIT_FOR_ACTIVE_SHARDS_SETTING =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit of mess already - so there are many conventions here - but let's try to follow of the majority here and name this SETTING_WAIT_FOR_ACTIVE_SHARDS ? also it will be good to open a follow up to discuss location of index settings and their naming conventions. It's all over the place now

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this, I was thoroughly confused myself as to what goes in IndexMetaData vs IndexSettings, as well as the conventions. Also, some specialized classes hold their own index level settings. I will make the change to SETTING_WAIT_FOR_ACTIVE_SHARDS and open a discussion issue

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #19723 to discuss the issue of index-level setting names and locations

@bleskes
Copy link
Contributor

bleskes commented Aug 1, 2016

@abeyad thx. I left some minor comments.

@abeyad
Copy link
Author

abeyad commented Aug 1, 2016

@bleskes I pushed 8164041

Ali Beyad added 4 commits August 1, 2016 13:35
favor of waiting for active shard copy count (wait_for_active_shards).
…ency()

method to determine if a write consistency check should be performed
before proceeding with the action.  This commit removes this method from
the transport replication actions in favor of setting the ActiveShardCount
on the request, with setting the value to ActiveShardCount.NONE if the
transport action's checkWriteConsistency() method returned false.
dynamically updatable for both index creation and write operations.
/**
* This test ensures that index creation adheres to the {@link IndexMetaData#SETTING_WAIT_FOR_ACTIVE_SHARDS}.
*/
public void testChangeWaitForActiveShardsSetting() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this called change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a great name, I'll change it to something more descriptive like testDefaultWaitForActiveShardsUsesIndexSetting and for the other one.

@bleskes
Copy link
Contributor

bleskes commented Aug 1, 2016

thx @abeyad . Almost there. Left some very minor comments.

@abeyad abeyad force-pushed the remove-write-consistency-level branch from 8164041 to 4923da9 Compare August 1, 2016 23:15
@abeyad
Copy link
Author

abeyad commented Aug 1, 2016

@bleskes i pushed 4923da9

@bleskes
Copy link
Contributor

bleskes commented Aug 2, 2016

LGTM. Thx @abeyad

@abeyad
Copy link
Author

abeyad commented Aug 2, 2016

Thanks for your review and feedback @bleskes !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants