-
Notifications
You must be signed in to change notification settings - Fork 25.4k
Change format how settings represent lists / array #26723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penatlies when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs. This change moves away from the current internal representation towards a single decoded string representation. A list is encoded into a JSON list internally in a fully backwards compatible way. A list is then a single key value pair in settings and all prefix based settings are internally converted to the new representation in the settings builder. This change also forbids to add a settings that ends with a `.0` which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal `.0` handling can be removed since all settings will be converted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at it and thought about it some. I've left what I thought. I'll come back and have another look soon.
} | ||
|
||
private static boolean isInternalList(String value) { | ||
return value != null | ||
&& value.length() >= 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be 2. 4 is the minimum number of bytes but length is in chars.
} | ||
} | ||
return builder; | ||
} | ||
|
||
private static List<String> maybeGetList( String value) throws IOException { | ||
if (value != null && isInternalList(value)) { // we try to write it as a list if it is a list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think "write" is correct here.
@@ -739,6 +745,14 @@ public Builder put(Object... settings) { | |||
* @return The builder | |||
*/ | |||
public Builder put(String key, String value) { | |||
if (key.endsWith(".0")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me uncomfortable.... I see that we used to do it as well. Can we get away with not doing it anymore now that we have putArray
?
@@ -407,21 +407,21 @@ public void testDiff() throws IOException { | |||
ClusterSettings settings = new ClusterSettings(Settings.EMPTY, new HashSet<>(Arrays.asList(fooBar, fooBarBaz, foorBarQuux, | |||
someGroup, someAffix))); | |||
Settings diff = settings.diff(Settings.builder().put("foo.bar", 5).build(), Settings.EMPTY); | |||
assertEquals(4, diff.size()); // 4 since foo.bar.quux has 3 values essentially | |||
assertEquals(2, diff.size()); // 4 since foo.bar.quux has 3 values essentially |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for the comment anymore.
assertThat(diff.getAsInt("foo.bar.baz", null), equalTo(17)); | ||
assertArrayEquals(diff.getAsArray("foo.bar.quux", null), new String[] {"d", "e", "f"}); | ||
|
||
diff = settings.diff( | ||
Settings.builder().put("some.group.foo", 5).build(), | ||
Settings.builder().put("some.group.foobar", 17, "some.group.foo", 25).build()); | ||
assertEquals(6, diff.size()); // 6 since foo.bar.quux has 3 values essentially | ||
assertEquals(4, diff.size()); // 6 since foo.bar.quux has 3 values essentially |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same. These comments were helpful but are not incorrect.
assertThat(diff.getAsInt("foo.bar.baz", null), equalTo(1)); | ||
assertArrayEquals(diff.getAsArray("foo.bar.quux", null), new String[] {"a", "b", "c"}); | ||
|
||
diff = settings.diff( | ||
Settings.builder().put("foo.bar", 5).build(), | ||
Settings.builder().put("foo.bar.baz", 17).putArray("foo.bar.quux", "d", "e", "f").build()); | ||
assertEquals(4, diff.size()); // 4 since foo.bar.quux has 3 values essentially | ||
assertEquals(2, diff.size()); // 4 since foo.bar.quux has 3 values essentially |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too.
@@ -432,7 +432,7 @@ public void testDiff() throws IOException { | |||
Settings.builder().put("some.prefix.foo.somekey", 5).build(), | |||
Settings.builder().put("some.prefix.foobar.somekey", 17, | |||
"some.prefix.foo.somekey", 18).build()); | |||
assertEquals(6, diff.size()); // 6 since foo.bar.quux has 3 values essentially | |||
assertEquals(4, diff.size()); // 6 since foo.bar.quux has 3 values essentially |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
When discussing this many weeks ago, I suggested having the value of the internal map be a discriminated union. This could either simply be Object with the discrimination done by instanceof checks, or a simple POJO with a boolean+Object. I'm apprehensive about this PR because it means every iteration of a list setting requires creation of new strings (by parsing the "list" string). Why are these special marker strings to denote whether the value is a list any better than actually having a List and using instanceof? In either case, we have to ensure access to the internal map is removed, but the latter is much more transparent and easier to not mess up. |
I am closing this one since the encapsulation of Settings is so broken today I first have to clean up all the cruft around it to make it work. I think ultimately we can just have a |
Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs. With this change we use a simple string list to represent the list. This change also forbids to add a settings that ends with a .0 which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to elastic#26723
Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs. With this change we use a simple string list to represent the list. This change also forbids to add a settings that ends with a .0 which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to #26723
Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs. With this change we use a simple string list to represent the list. This change also forbids to add a settings that ends with a .0 which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to #26723
Today we represent each value of a list setting with it's own dedicated key that ends with the index of the value in the list. Aside of the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to timeout and in-turn massive slowdown on all subsequent validations runs.
This change moves away from the current internal representation towards a single decoded string representation. A list is encoded into a JSON list internally in a fully backwards compatible way. A list is then a single key value pair in settings and all prefix based settings are internally converted to the new representation in the settings builder. This change also forbids to add a settings that ends with a
.0
which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal.0
handling can be removed since all settings will be converted.