Skip to content

Change format how settings represent lists / array #26723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

s1monw
Copy link
Contributor

@s1monw s1monw commented Sep 20, 2017

Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs.

This change moves away from the current internal representation towards a single decoded string representation. A list is encoded into a JSON list internally in a fully backwards compatible way. A list is then a single key value pair in settings and all prefix based settings are internally converted to the new representation in the settings builder. This change also forbids to add a settings that ends with a .0 which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted.

Today we represent each value of a list setting with it's own dedicated key that ends
with the index of the value in the list. Aside of the obvious weirdness this has several
issues especially if lists are massive since it causes massive runtime penatlies when validating
settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn
massive slowdown on all subsequent validations runs. This change moves away from the current internal
representation towards a single decoded string representation. A list is encoded into a JSON list internally
in a fully backwards compatible way. A list is then a single key value pair in settings and all prefix based
settings are internally converted to the new representation in the settings builder.
This change also forbids to add a settings that ends with a `.0` which was internally used to detect a list setting.
Once this has been rolled out for an entire major version all the internal `.0` handling can be removed since
all settings will be converted.
@s1monw s1monw added :Core/Infra/Settings Settings infrastructure and APIs >bug v7.0.0 labels Sep 20, 2017
@s1monw s1monw changed the title Chance format how settings represent lists / array Change format how settings represent lists / array Sep 20, 2017
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at it and thought about it some. I've left what I thought. I'll come back and have another look soon.

}

private static boolean isInternalList(String value) {
return value != null
&& value.length() >= 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be 2. 4 is the minimum number of bytes but length is in chars.

}
}
return builder;
}

private static List<String> maybeGetList( String value) throws IOException {
if (value != null && isInternalList(value)) { // we try to write it as a list if it is a list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "write" is correct here.

@@ -739,6 +745,14 @@ public Builder put(Object... settings) {
* @return The builder
*/
public Builder put(String key, String value) {
if (key.endsWith(".0")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me uncomfortable.... I see that we used to do it as well. Can we get away with not doing it anymore now that we have putArray?

@@ -407,21 +407,21 @@ public void testDiff() throws IOException {
ClusterSettings settings = new ClusterSettings(Settings.EMPTY, new HashSet<>(Arrays.asList(fooBar, fooBarBaz, foorBarQuux,
someGroup, someAffix)));
Settings diff = settings.diff(Settings.builder().put("foo.bar", 5).build(), Settings.EMPTY);
assertEquals(4, diff.size()); // 4 since foo.bar.quux has 3 values essentially
assertEquals(2, diff.size()); // 4 since foo.bar.quux has 3 values essentially
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the comment anymore.

assertThat(diff.getAsInt("foo.bar.baz", null), equalTo(17));
assertArrayEquals(diff.getAsArray("foo.bar.quux", null), new String[] {"d", "e", "f"});

diff = settings.diff(
Settings.builder().put("some.group.foo", 5).build(),
Settings.builder().put("some.group.foobar", 17, "some.group.foo", 25).build());
assertEquals(6, diff.size()); // 6 since foo.bar.quux has 3 values essentially
assertEquals(4, diff.size()); // 6 since foo.bar.quux has 3 values essentially
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. These comments were helpful but are not incorrect.

assertThat(diff.getAsInt("foo.bar.baz", null), equalTo(1));
assertArrayEquals(diff.getAsArray("foo.bar.quux", null), new String[] {"a", "b", "c"});

diff = settings.diff(
Settings.builder().put("foo.bar", 5).build(),
Settings.builder().put("foo.bar.baz", 17).putArray("foo.bar.quux", "d", "e", "f").build());
assertEquals(4, diff.size()); // 4 since foo.bar.quux has 3 values essentially
assertEquals(2, diff.size()); // 4 since foo.bar.quux has 3 values essentially
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too.

@@ -432,7 +432,7 @@ public void testDiff() throws IOException {
Settings.builder().put("some.prefix.foo.somekey", 5).build(),
Settings.builder().put("some.prefix.foobar.somekey", 17,
"some.prefix.foo.somekey", 18).build());
assertEquals(6, diff.size()); // 6 since foo.bar.quux has 3 values essentially
assertEquals(4, diff.size()); // 6 since foo.bar.quux has 3 values essentially
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

@rjernst
Copy link
Member

rjernst commented Sep 26, 2017

When discussing this many weeks ago, I suggested having the value of the internal map be a discriminated union. This could either simply be Object with the discrimination done by instanceof checks, or a simple POJO with a boolean+Object. I'm apprehensive about this PR because it means every iteration of a list setting requires creation of new strings (by parsing the "list" string). Why are these special marker strings to denote whether the value is a list any better than actually having a List and using instanceof? In either case, we have to ensure access to the internal map is removed, but the latter is much more transparent and easier to not mess up.

@s1monw
Copy link
Contributor Author

s1monw commented Sep 27, 2017

I am closing this one since the encapsulation of Settings is so broken today I first have to clean up all the cruft around it to make it work. I think ultimately we can just have a String,Object map internally but before that we have to get rid of things like getAsMap on settings.

@s1monw s1monw closed this Sep 27, 2017
@s1monw s1monw deleted the fix_list_settings branch October 4, 2017 07:24
s1monw added a commit to s1monw/elasticsearch that referenced this pull request Oct 4, 2017
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.

With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting.  Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.

Relates to elastic#26723
s1monw added a commit that referenced this pull request Oct 5, 2017
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.

With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting.  Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.

Relates to #26723
s1monw added a commit that referenced this pull request Oct 5, 2017
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.

With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting.  Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.

Relates to #26723
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Settings Settings infrastructure and APIs v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants