Commit 0b26d22

Merge branch '6.x' into ccr-6.x
* 6.x:
  Share common readFrom/writeTo code in AcknowledgeResponse (#30983)
  [Tests] Muting RatedRequestsTests#testXContentParsingIsNotLenient
  Fix rest test skip version
  Fix docs build.
  Add a doc value format to binary fields. (#30860)
  Only auto-update license signature if all nodes ready (#30859)
  Add BlobContainer.writeBlobAtomic() (#30902)
  Move caching of the size of a directory to `StoreDirectory`. (#30581)
  Clarify docs about boolean operator precedence. (#30808)
  Docs: remove notes on sparsity. (#30905)
  Improve documentation of dynamic mappings. (#30952)
  Decouple MultiValueMode. (#31075)
  Docs: Clarify constraints on scripted similarities. (#31076)
2 parents 51963e5 + f485e8c commit 0b26d22

97 files changed

+1077
-975
lines changed

buildSrc/src/main/resources/checkstyle_suppressions.xml

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -524,8 +524,6 @@
524524
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]cluster[/\\]settings[/\\]ClusterSettingsIT.java" checks="LineLength" />
525525
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]cluster[/\\]shards[/\\]ClusterSearchShardsIT.java" checks="LineLength" />
526526
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]cluster[/\\]structure[/\\]RoutingIteratorTests.java" checks="LineLength" />
527-
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]common[/\\]blobstore[/\\]FsBlobStoreContainerTests.java" checks="LineLength" />
528-
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]common[/\\]blobstore[/\\]FsBlobStoreTests.java" checks="LineLength" />
529527
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]common[/\\]breaker[/\\]MemoryCircuitBreakerTests.java" checks="LineLength" />
530528
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]common[/\\]geo[/\\]ShapeBuilderTests.java" checks="LineLength" />
531529
<suppress files="server[/\\]src[/\\]test[/\\]java[/\\]org[/\\]elasticsearch[/\\]common[/\\]hash[/\\]MessageDigestsTests.java" checks="LineLength" />

docs/reference/how-to/general.asciidoc

Lines changed: 0 additions & 91 deletions
@@ -40,94 +40,3 @@ better. For instance if a user searches for two words `foo` and `bar`, a match
4040
across different chapters is probably very poor, while a match within the same
4141
paragraph is likely good.
4242

43-
[float]
44-
[[sparsity]]
45-
=== Avoid sparsity
46-
47-
The data-structures behind Lucene, which Elasticsearch relies on in order to
48-
index and store data, work best with dense data, ie. when all documents have the
49-
same fields. This is especially true for fields that have norms enabled (which
50-
is the case for `text` fields by default) or doc values enabled (which is the
51-
case for numerics, `date`, `ip` and `keyword` by default).
52-
53-
The reason is that Lucene internally identifies documents with so-called doc
54-
ids, which are integers between 0 and the total number of documents in the
55-
index. These doc ids are used for communication between the internal APIs of
56-
Lucene: for instance searching on a term with a `match` query produces an
57-
iterator of doc ids, and these doc ids are then used to retrieve the value of
58-
the `norm` in order to compute a score for these documents. The way this `norm`
59-
lookup is implemented currently is by reserving one byte for each document.
60-
The `norm` value for a given doc id can then be retrieved by reading the
61-
byte at index `doc_id`. While this is very efficient and helps Lucene quickly
62-
have access to the `norm` values of every document, this has the drawback that
63-
documents that do not have a value will also require one byte of storage.
64-
65-
In practice, this means that if an index has `M` documents, norms will require
66-
`M` bytes of storage *per field*, even for fields that only appear in a small
67-
fraction of the documents of the index. Although slightly more complex with doc
68-
values due to the fact that doc values have multiple ways that they can be
69-
encoded depending on the type of field and on the actual data that the field
70-
stores, the problem is very similar. In case you wonder: `fielddata`, which was
71-
used in Elasticsearch pre-2.0 before being replaced with doc values, also
72-
suffered from this issue, except that the impact was only on the memory
73-
footprint since `fielddata` was not explicitly materialized on disk.
74-
75-
Note that even though the most notable impact of sparsity is on storage
76-
requirements, it also has an impact on indexing speed and search speed since
77-
these bytes for documents that do not have a field still need to be written
78-
at index time and skipped over at search time.
79-
80-
It is totally fine to have a minority of sparse fields in an index. But beware
81-
that if sparsity becomes the rule rather than the exception, then the index
82-
will not be as efficient as it could be.
83-
84-
This section mostly focused on `norms` and `doc values` because those are the
85-
two features that are most affected by sparsity. Sparsity also affect the
86-
efficiency of the inverted index (used to index `text`/`keyword` fields) and
87-
dimensional points (used to index `geo_point` and numerics) but to a lesser
88-
extent.
89-
90-
Here are some recommendations that can help avoid sparsity:
91-
92-
[float]
93-
==== Avoid putting unrelated data in the same index
94-
95-
You should avoid putting documents that have totally different structures into
96-
the same index in order to avoid sparsity. It is often better to put these
97-
documents into different indices, you could also consider giving fewer shards
98-
to these smaller indices since they will contain fewer documents overall.
99-
100-
Note that this advice does not apply in the case that you need to use
101-
parent/child relations between your documents since this feature is only
102-
supported on documents that live in the same index.
103-
104-
[float]
105-
==== Normalize document structures
106-
107-
Even if you really need to put different kinds of documents in the same index,
108-
maybe there are opportunities to reduce sparsity. For instance if all documents
109-
in the index have a timestamp field but some call it `timestamp` and others
110-
call it `creation_date`, it would help to rename it so that all documents have
111-
the same field name for the same data.
112-
113-
[float]
114-
==== Avoid types
115-
116-
Types might sound like a good way to store multiple tenants in a single index.
117-
They are not: given that types store everything in a single index, having
118-
multiple types that have different fields in a single index will also cause
119-
problems due to sparsity as described above. If your types do not have very
120-
similar mappings, you might want to consider moving them to a dedicated index.
121-
122-
[float]
123-
==== Disable `norms` and `doc_values` on sparse fields
124-
125-
If none of the above recommendations apply in your case, you might want to
126-
check whether you actually need `norms` and `doc_values` on your sparse fields.
127-
`norms` can be disabled if producing scores is not necessary on a field, this is
128-
typically true for fields that are only used for filtering. `doc_values` can be
129-
disabled on fields that are neither used for sorting nor for aggregations.
130-
Beware that this decision should not be made lightly since these parameters
131-
cannot be changed on a live index, so you would have to reindex if you realize
132-
that you need `norms` or `doc_values`.
133-

docs/reference/index-modules/similarity.asciidoc

Lines changed: 12 additions & 2 deletions
@@ -326,7 +326,18 @@ Which yields:
326326
// TESTRESPONSE[s/"took": 12/"took" : $body.took/]
327327
// TESTRESPONSE[s/OzrdjxNtQGaqs4DmioFw9A/$body.hits.hits.0._node/]
328328

329-
You might have noticed that a significant part of the script depends on
329+
WARNING: While scripted similarities provide a lot of flexibility, there is
330+
a set of rules that they need to satisfy. Failing to do so could make
331+
Elasticsearch silently return wrong top hits or fail with internal errors at
332+
search time:
333+
334+
- Returned scores must be positive.
335+
- All other variables remaining equal, scores must not decrease when
336+
`doc.freq` increases.
337+
- All other variables remaining equal, scores must not increase when
338+
`doc.length` increases.
339+
340+
You might have noticed that a significant part of the above script depends on
330341
statistics that are the same for every document. It is possible to make the
331342
above slightly more efficient by providing an `weight_script` which will
332343
compute the document-independent part of the score and will be available
@@ -491,7 +502,6 @@ GET /index/_search?explain=true
491502
492503
////////////////////
493504

494-
495505
Type name: `scripted`
496506

497507
[float]
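The three rules in the WARNING added above can be spot-checked numerically before a scripted similarity is deployed. The following is only a sketch under stated assumptions: `tfIdfLike` is a hypothetical stand-in for a scoring script (its constant `idf` replaces the document-independent part), `freq` and `length` play the role of `doc.freq` and `doc.length`, and `satisfiesRules` is an illustrative helper, not an Elasticsearch API.

```java
import java.util.function.DoubleBinaryOperator;

final class SimilarityRuleCheck {

    // Hypothetical TF-IDF-like score; the constant idf stands in for the
    // document-independent part of a real script.
    static double tfIdfLike(double freq, double length) {
        double tf = Math.sqrt(freq);
        double norm = 1.0 / Math.sqrt(length);
        double idf = 1.5;
        return tf * idf * norm;
    }

    // Checks the three rules on a grid of (freq, length) pairs with freq <= length.
    static boolean satisfiesRules(DoubleBinaryOperator score, int maxFreq, int maxLength) {
        for (int freq = 1; freq <= maxFreq; freq++) {
            for (int length = freq; length <= maxLength; length++) {
                double s = score.applyAsDouble(freq, length);
                // Rule 1: returned scores must be positive.
                if (!(s > 0)) {
                    return false;
                }
                // Rule 2: the score must not decrease when freq increases.
                if (freq + 1 <= length && score.applyAsDouble(freq + 1, length) < s) {
                    return false;
                }
                // Rule 3: the score must not increase when length increases.
                if (score.applyAsDouble(freq, length + 1) > s) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesRules(SimilarityRuleCheck::tfIdfLike, 20, 200));
        // A score that shrinks as freq grows violates rule 2.
        System.out.println(satisfiesRules((f, l) -> 1.0 / f, 20, 200));
    }
}
```

A grid check like this cannot prove that a script is valid for every input; it only catches obvious violations early.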

docs/reference/mapping/dynamic/field-mapping.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -135,6 +135,6 @@ PUT my_index/_doc/1
135135
}
136136
--------------------------------------------------
137137
// CONSOLE
138-
<1> The `my_float` field is added as a <<number,`double`>> field.
138+
<1> The `my_float` field is added as a <<number,`float`>> field.
139139
<2> The `my_integer` field is added as a <<number,`long`>> field.
140140

docs/reference/mapping/dynamic/templates.asciidoc

Lines changed: 16 additions & 5 deletions
@@ -46,11 +46,22 @@ name as an existing template, it will replace the old version.
4646
[[match-mapping-type]]
4747
==== `match_mapping_type`
4848

49-
The `match_mapping_type` matches on the datatype detected by
50-
<<dynamic-field-mapping,dynamic field mapping>>, in other words, the datatype
51-
that Elasticsearch thinks the field should have. Only the following datatypes
52-
can be automatically detected: `boolean`, `date`, `double`, `long`, `object`,
53-
`string`. It also accepts `*` to match all datatypes.
49+
The `match_mapping_type` is the datatype detected by the json parser. Since
50+
JSON doesn't allow to distinguish a `long` from an `integer` or a `double` from
51+
a `float`, it will always choose the wider datatype, ie. `long` for integers
52+
and `double` for floating-point numbers.
53+
54+
The following datatypes may be automatically detected:
55+
56+
- `boolean` when `true` or `false` are encountered.
57+
- `date` when <<date-detection,date detection>> is enabled and a string is
58+
found that matches any of the configured date formats.
59+
- `double` for numbers with a decimal part.
60+
- `long` for numbers without a decimal part.
61+
- `object` for objects, also called hashes.
62+
- `string` for character strings.
63+
64+
`*` may also be used in order to match all datatypes.
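The detection rules listed above can be summarized as a small classifier. This is a sketch only, not the Elasticsearch implementation: `detect` is a hypothetical helper that takes an already-parsed JSON value as a plain Java object, and the date formats are passed in as `java.time` formatters purely for illustration.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MatchMappingTypeSketch {

    // Returns the match_mapping_type name for a parsed JSON value (hypothetical helper).
    static String detect(Object value, boolean dateDetection, List<DateTimeFormatter> dateFormats) {
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof Map) {
            return "object";
        }
        // JSON cannot distinguish float from double, so the wider type is chosen.
        if (value instanceof Double || value instanceof Float) {
            return "double";
        }
        // Likewise, every number without a decimal part is reported as long.
        if (value instanceof Long || value instanceof Integer) {
            return "long";
        }
        if (value instanceof String) {
            if (dateDetection) {
                for (DateTimeFormatter format : dateFormats) {
                    try {
                        format.parse((String) value);
                        return "date";
                    } catch (DateTimeParseException ignored) {
                        // Not this format; try the next configured one.
                    }
                }
            }
            return "string";
        }
        throw new IllegalArgumentException("value kind not covered by this sketch: " + value);
    }

    public static void main(String[] args) {
        List<DateTimeFormatter> formats = Collections.singletonList(DateTimeFormatter.ISO_LOCAL_DATE);
        System.out.println(detect(42, true, formats));
        System.out.println(detect(1.5f, true, formats));
        System.out.println(detect("2018-06-05", true, formats));
        System.out.println(detect(new HashMap<String, Object>(), true, formats));
    }
}
```

Note how a value such as `1.5f` is still reported as `double`: the template matches on what the parser detected, not on the type the field is eventually mapped to.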
5465

5566
For example, if we wanted to map all integer fields as `integer` instead of
5667
`long`, and all `string` fields as both `text` and `keyword`, we

docs/reference/query-dsl/query-string-syntax.asciidoc

Lines changed: 4 additions & 21 deletions
@@ -235,26 +235,10 @@ states that:
235235
* `news` must not be present
236236
* `quick` and `brown` are optional -- their presence increases the relevance
237237

238-
The familiar operators `AND`, `OR` and `NOT` (also written `&&`, `||` and `!`)
239-
are also supported. However, the effects of these operators can be more
240-
complicated than is obvious at first glance. `NOT` takes precedence over
241-
`AND`, which takes precedence over `OR`. While the `+` and `-` only affect
242-
the term to the right of the operator, `AND` and `OR` can affect the terms to
243-
the left and right.
244-
245-
****
246-
Rewriting the above query using `AND`, `OR` and `NOT` demonstrates the
247-
complexity:
248-
249-
`quick OR brown AND fox AND NOT news`::
250-
251-
This is incorrect, because `brown` is now a required term.
252-
253-
`(quick OR brown) AND fox AND NOT news`::
254-
255-
This is incorrect because at least one of `quick` or `brown` is now required
256-
and the search for those terms would be scored differently from the original
257-
query.
238+
The familiar boolean operators `AND`, `OR` and `NOT` (also written `&&`, `||`
239+
and `!`) are also supported but beware that they do not honor the usual
240+
precedence rules, so parentheses should be used whenever multiple operators are
241+
used together. For instance the previous query could be rewritten as:
258242

259243
`((quick AND fox) OR (brown AND fox) OR fox) AND NOT news`::
260244

@@ -272,7 +256,6 @@ would look like this:
272256
}
273257
}
274258

275-
****
276259

277260
===== Grouping
278261

modules/aggs-matrix-stats/src/main/java/org/elasticsearch/search/aggregations/support/MultiValuesSource.java

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ public NumericDoubleValues getField(final int ordinal, LeafReaderContext ctx) th
4747
if (ordinal > names.length) {
4848
throw new IndexOutOfBoundsException("ValuesSource array index " + ordinal + " out of bounds");
4949
}
50-
return multiValueMode.select(values[ordinal].doubleValues(ctx), Double.NEGATIVE_INFINITY);
50+
return multiValueMode.select(values[ordinal].doubleValues(ctx));
5151
}
5252
}
5353

modules/lang-expression/src/main/java/org/elasticsearch/script/expression/DateMethodValueSource.java

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ class DateMethodValueSource extends FieldDataValueSource {
5454
public FunctionValues getValues(Map context, LeafReaderContext leaf) throws IOException {
5555
AtomicNumericFieldData leafData = (AtomicNumericFieldData) fieldData.load(leaf);
5656
final Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ROOT);
57-
NumericDoubleValues docValues = multiValueMode.select(leafData.getDoubleValues(), 0d);
57+
NumericDoubleValues docValues = multiValueMode.select(leafData.getDoubleValues());
5858
return new DoubleDocValues(this) {
5959
@Override
6060
public double doubleVal(int docId) throws IOException {

modules/lang-expression/src/main/java/org/elasticsearch/script/expression/DateObjectValueSource.java

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ class DateObjectValueSource extends FieldDataValueSource {
5656
public FunctionValues getValues(Map context, LeafReaderContext leaf) throws IOException {
5757
AtomicNumericFieldData leafData = (AtomicNumericFieldData) fieldData.load(leaf);
5858
MutableDateTime joda = new MutableDateTime(0, DateTimeZone.UTC);
59-
NumericDoubleValues docValues = multiValueMode.select(leafData.getDoubleValues(), 0d);
59+
NumericDoubleValues docValues = multiValueMode.select(leafData.getDoubleValues());
6060
return new DoubleDocValues(this) {
6161
@Override
6262
public double doubleVal(int docId) throws IOException {

modules/lang-expression/src/main/java/org/elasticsearch/script/expression/FieldDataValueSource.java

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ public int hashCode() {
6868
@SuppressWarnings("rawtypes") // ValueSource uses a rawtype
6969
public FunctionValues getValues(Map context, LeafReaderContext leaf) throws IOException {
7070
AtomicNumericFieldData leafData = (AtomicNumericFieldData) fieldData.load(leaf);
71-
NumericDoubleValues docValues = multiValueMode.select(leafData.getDoubleValues(), 0d);
71+
NumericDoubleValues docValues = multiValueMode.select(leafData.getDoubleValues());
7272
return new DoubleDocValues(this) {
7373
@Override
7474
public double doubleVal(int doc) throws IOException {

modules/rank-eval/src/test/java/org/elasticsearch/index/rankeval/RatedRequestsTests.java

Lines changed: 1 addition & 0 deletions
@@ -131,6 +131,7 @@ public void testXContentRoundtrip() throws IOException {
131131
}
132132
}
133133

134+
@AwaitsFix(bugUrl="https://github.com/elastic/elasticsearch/issues/31104")
134135
public void testXContentParsingIsNotLenient() throws IOException {
135136
RatedRequest testItem = createTestItem(randomBoolean());
136137
XContentType xContentType = randomFrom(XContentType.values());

rest-api-spec/src/main/resources/rest-api-spec/test/search/190_index_prefix_search.yml

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ setup:
6666
---
6767
"search index prefixes with span_multi":
6868
- skip:
69-
version: " - 6.2.99"
69+
version: " - 6.3.99"
7070
reason: span_multi throws an exception with prefix fields on < versions
7171

7272
- do:

server/src/main/java/org/elasticsearch/action/admin/cluster/repositories/delete/DeleteRepositoryResponse.java

Lines changed: 0 additions & 17 deletions
@@ -20,14 +20,9 @@
2020
package org.elasticsearch.action.admin.cluster.repositories.delete;
2121

2222
import org.elasticsearch.action.support.master.AcknowledgedResponse;
23-
import org.elasticsearch.common.io.stream.StreamInput;
24-
import org.elasticsearch.common.io.stream.StreamOutput;
2523
import org.elasticsearch.common.xcontent.ConstructingObjectParser;
26-
import org.elasticsearch.common.xcontent.ToXContentObject;
2724
import org.elasticsearch.common.xcontent.XContentParser;
2825

29-
import java.io.IOException;
30-
3126
/**
3227
* Unregister repository response
3328
*/
@@ -47,18 +42,6 @@ public class DeleteRepositoryResponse extends AcknowledgedResponse {
4742
super(acknowledged);
4843
}
4944

50-
@Override
51-
public void readFrom(StreamInput in) throws IOException {
52-
super.readFrom(in);
53-
readAcknowledged(in);
54-
}
55-
56-
@Override
57-
public void writeTo(StreamOutput out) throws IOException {
58-
super.writeTo(out);
59-
writeAcknowledged(out);
60-
}
61-
6245
public static DeleteRepositoryResponse fromXContent(XContentParser parser) {
6346
return PARSER.apply(parser, null);
6447
}

server/src/main/java/org/elasticsearch/action/admin/cluster/repositories/put/PutRepositoryResponse.java

Lines changed: 0 additions & 16 deletions
@@ -20,13 +20,9 @@
2020
package org.elasticsearch.action.admin.cluster.repositories.put;
2121

2222
import org.elasticsearch.action.support.master.AcknowledgedResponse;
23-
import org.elasticsearch.common.io.stream.StreamInput;
24-
import org.elasticsearch.common.io.stream.StreamOutput;
2523
import org.elasticsearch.common.xcontent.ConstructingObjectParser;
2624
import org.elasticsearch.common.xcontent.XContentParser;
2725

28-
import java.io.IOException;
29-
3026
/**
3127
* Register repository response
3228
*/
@@ -46,18 +42,6 @@ public class PutRepositoryResponse extends AcknowledgedResponse {
4642
super(acknowledged);
4743
}
4844

49-
@Override
50-
public void readFrom(StreamInput in) throws IOException {
51-
super.readFrom(in);
52-
readAcknowledged(in);
53-
}
54-
55-
@Override
56-
public void writeTo(StreamOutput out) throws IOException {
57-
super.writeTo(out);
58-
writeAcknowledged(out);
59-
}
60-
6145
public static PutRepositoryResponse fromXContent(XContentParser parser) {
6246
return PARSER.apply(parser, null);
6347
}

server/src/main/java/org/elasticsearch/action/admin/cluster/reroute/ClusterRerouteResponse.java

Lines changed: 19 additions & 9 deletions
@@ -63,22 +63,32 @@ public RoutingExplanations getExplanations() {
6363

6464
@Override
6565
public void readFrom(StreamInput in) throws IOException {
66-
super.readFrom(in);
67-
state = ClusterState.readFrom(in, null);
68-
readAcknowledged(in);
69-
explanations = RoutingExplanations.readFrom(in);
66+
if (in.getVersion().onOrAfter(Version.V_6_4_0)) {
67+
super.readFrom(in);
68+
state = ClusterState.readFrom(in, null);
69+
explanations = RoutingExplanations.readFrom(in);
70+
} else {
71+
state = ClusterState.readFrom(in, null);
72+
acknowledged = in.readBoolean();
73+
explanations = RoutingExplanations.readFrom(in);
74+
}
7075
}
7176

7277
@Override
7378
public void writeTo(StreamOutput out) throws IOException {
74-
super.writeTo(out);
75-
if (out.getVersion().onOrAfter(Version.V_6_3_0)) {
79+
if (out.getVersion().onOrAfter(Version.V_6_4_0)) {
80+
super.writeTo(out);
7681
state.writeTo(out);
82+
RoutingExplanations.writeTo(explanations, out);
7783
} else {
78-
ClusterModule.filterCustomsForPre63Clients(state).writeTo(out);
84+
if (out.getVersion().onOrAfter(Version.V_6_3_0)) {
85+
state.writeTo(out);
86+
} else {
87+
ClusterModule.filterCustomsForPre63Clients(state).writeTo(out);
88+
}
89+
out.writeBoolean(acknowledged);
90+
RoutingExplanations.writeTo(explanations, out);
7991
}
80-
writeAcknowledged(out);
81-
RoutingExplanations.writeTo(explanations, out);
8292
}
8393

8494
@Override
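The version branches in `readFrom`/`writeTo` above keep the old field order on the wire for older peers while newer peers use the shared base-class layout. A minimal, self-contained sketch of that pattern follows, using `java.io` data streams instead of Elasticsearch's `StreamInput`/`StreamOutput`. It assumes that the shared base-class code serializes the `acknowledged` flag first; the integer `V_6_4_0` and the string fields standing in for the cluster state and the routing explanations are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class VersionGatedWireSketch {

    // Hypothetical numeric stand-in for Version.V_6_4_0.
    static final int V_6_4_0 = 60400;

    final boolean acknowledged;
    final String state;        // stands in for the serialized ClusterState
    final String explanations; // stands in for RoutingExplanations

    VersionGatedWireSketch(boolean acknowledged, String state, String explanations) {
        this.acknowledged = acknowledged;
        this.state = state;
        this.explanations = explanations;
    }

    // Writes the fields in the order the peer's version expects.
    byte[] write(int peerVersion) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (peerVersion >= V_6_4_0) {
                out.writeBoolean(acknowledged); // assumed: the base class writes the flag first
                out.writeUTF(state);
                out.writeUTF(explanations);
            } else {
                out.writeUTF(state);
                out.writeBoolean(acknowledged); // legacy position, between the two fields
                out.writeUTF(explanations);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the fields back using the same version-dependent order.
    static VersionGatedWireSketch read(byte[] data, int peerVersion) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            boolean acknowledged;
            String state;
            String explanations;
            if (peerVersion >= V_6_4_0) {
                acknowledged = in.readBoolean();
                state = in.readUTF();
                explanations = in.readUTF();
            } else {
                state = in.readUTF();
                acknowledged = in.readBoolean();
                explanations = in.readUTF();
            }
            return new VersionGatedWireSketch(acknowledged, state, explanations);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        VersionGatedWireSketch original = new VersionGatedWireSketch(true, "state", "explanations");
        VersionGatedWireSketch copy = read(original.write(V_6_4_0 - 1), V_6_4_0 - 1);
        System.out.println(copy.acknowledged + " " + copy.state + " " + copy.explanations);
    }
}
```

The reader and the writer must branch on the same version constant; if only one side changed its field order, the flag would be read from the wrong position in the stream.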
