Introduce 64-bit unsigned long field type #60050
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
[role="xpack"] | ||
[testenv="basic"] | ||
|
||
[[unsigned-long]] | ||
=== Unsigned long data type | ||
Unsigned long is a numeric field type that represents an unsigned 64-bit | ||
integer with a minimum value of 0 and a maximum value of +2^64^-1+ | ||
(from 0 to 18446744073709551615 inclusive). | ||
|
||
[source,console] | ||
-------------------------------------------------- | ||
PUT my_index | ||
{ | ||
"mappings": { | ||
"properties": { | ||
"my_counter": { | ||
"type": "unsigned_long" | ||
} | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
|
||
Unsigned long values can be indexed in numeric or string form | ||
and must be integers in the range [0, 18446744073709551615]. | ||
They can't have a decimal part. | ||
|
||
[source,console] | ||
-------------------------------- | ||
POST /my_index/_bulk?refresh | ||
{"index":{"_id":1}} | ||
{"my_counter": 0} | ||
{"index":{"_id":2}} | ||
{"my_counter": 9223372036854775808} | ||
{"index":{"_id":3}} | ||
{"my_counter": 18446744073709551614} | ||
{"index":{"_id":4}} | ||
{"my_counter": 18446744073709551615} | ||
-------------------------------- | ||
//TEST[continued] | ||
|
||
Term queries accept any number in numeric or string form. | ||
|
||
[source,console] | ||
-------------------------------- | ||
GET /my_index/_search | ||
{ | ||
"query": { | ||
"term" : { | ||
"my_counter" : 18446744073709551615 | ||
} | ||
} | ||
} | ||
-------------------------------- | ||
//TEST[continued] | ||
|
||
Range query terms can contain values with decimal parts. | ||
In this case {es} converts them to integer values: | ||
`gte` and `gt` terms are rounded up to the nearest integer, inclusive, | ||
and `lt` and `lte` terms are rounded down to the nearest integer, inclusive. | ||
|
||
It is recommended to pass ranges as strings to ensure they are parsed | ||
without any loss of precision. | ||
|
||
[source,console] | ||
-------------------------------- | ||
GET /my_index/_search | ||
{ | ||
"query": { | ||
"range" : { | ||
"my_counter" : { | ||
"gte" : "9223372036854775808.5", | ||
"lte" : "18446744073709551615" | ||
} | ||
} | ||
} | ||
} | ||
-------------------------------- | ||
//TEST[continued] | ||
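The rounding rule for decimal range bounds can be illustrated with a small Java sketch. This is not the {es} implementation; the class and method names (`RangeBoundRounding`, `lowerBound`, `upperBound`) are hypothetical, and the sketch only models bounds that carry a decimal part. It also shows why passing bounds as strings is safer than passing them as floating-point numbers.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical illustration only; not taken from the Elasticsearch code base.
class RangeBoundRounding {

    // A `gt`/`gte` bound with a decimal part is rounded up to the next integer.
    static BigInteger lowerBound(String value) {
        return new BigDecimal(value).setScale(0, RoundingMode.CEILING).toBigIntegerExact();
    }

    // An `lt`/`lte` bound with a decimal part is rounded down to the previous integer.
    static BigInteger upperBound(String value) {
        return new BigDecimal(value).setScale(0, RoundingMode.FLOOR).toBigIntegerExact();
    }

    public static void main(String[] args) {
        // Parsed from a string, the bound keeps full precision.
        System.out.println(lowerBound("9223372036854775808.5")); // 9223372036854775809
        System.out.println(upperBound("18446744073709551615"));  // 18446744073709551615

        // Parsed as a double first, the fractional part is already lost.
        double asDouble = Double.parseDouble("9223372036854775808.5");
        System.out.println(new BigDecimal(asDouble).toBigInteger()); // 9223372036854775808
    }
}
```

In the string case the lower bound becomes 9223372036854775809, while the double round trip silently yields 9223372036854775808, which would let one extra document value match.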
|
||
|
||
For queries that sort on an `unsigned_long` field, | ||
{es} returns, for each document, a sort value of type `long` | ||
if the document's value is within the range of long values, | ||
or of type `BigInteger` if the value exceeds this range. | ||
|
||
NOTE: REST clients need to be able to handle big integer values | ||
in JSON to support this field type correctly. | ||
|
||
[source,console] | ||
-------------------------------- | ||
GET /my_index/_search | ||
{ | ||
"query": { | ||
"match_all" : {} | ||
}, | ||
"sort" : {"my_counter" : "desc"} | ||
} | ||
-------------------------------- | ||
//TEST[continued] | ||
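A client consuming these sort values has to cope with both representations. The following Java sketch shows one way to model the "`long` if it fits, otherwise `BigInteger`" rule. The class `UnsignedSortValue` and its method `toSortValue` are hypothetical names chosen for this example and are not part of {es}; the input is assumed to be the raw 64 bits of the unsigned value.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not taken from the Elasticsearch code base.
class UnsignedSortValue {

    // Interprets rawBits as an unsigned 64-bit integer and returns a Long when
    // the value fits in the signed long range, or a BigInteger otherwise.
    static Number toSortValue(long rawBits) {
        if (rawBits >= 0) {
            return rawBits; // boxed to Long
        }
        // The sign bit is set, so the unsigned value is above Long.MAX_VALUE.
        return new BigInteger(Long.toUnsignedString(rawBits));
    }

    public static void main(String[] args) {
        Number small = toSortValue(0L);
        Number large = toSortValue(Long.parseUnsignedLong("18446744073709551615"));
        System.out.println(small + " " + small.getClass().getSimpleName()); // 0 Long
        System.out.println(large + " " + large.getClass().getSimpleName()); // 18446744073709551615 BigInteger
    }
}
```

JSON parsers that map every number to a 64-bit signed integer or to a double would fail on or corrupt the second value, which is the situation the NOTE above warns about.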
|
||
Similarly to sort values, script values of an `unsigned_long` field | ||
return a `Number` representing a `Long` or `BigInteger`. | ||
The same value types, `Long` or `BigInteger`, are used for `terms` aggregations. | ||
|
||
==== Queries with mixed numeric types | ||
|
||
|
||
Searches with mixed numeric types, one of which is `unsigned_long`, are | ||
supported, except for queries with sort. Thus, a sort query across two indexes | ||
where the same field name has an `unsigned_long` type in one index | ||
and a `long` type in another doesn't produce correct results and must | ||
be avoided. If this kind of sorting is needed, script-based sorting | ||
can be used instead. | ||
|
||
Aggregations across several numeric types, one of which is `unsigned_long`, are | ||
supported. In this case, values are converted to the `double` type. |
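One consequence of the conversion to `double` is worth keeping in mind: a `double` has a 53-bit significand, so large `unsigned_long` values cannot all be represented exactly. The Java sketch below is only an illustration of that general floating-point behaviour (the class name `DoubleConversion` is made up for the example) and says nothing about how {es} performs the conversion internally.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; not taken from the Elasticsearch code base.
class DoubleConversion {

    // Converts an unsigned 64-bit value, given as a decimal string, to a double.
    static double toDouble(String unsignedValue) {
        return new BigInteger(unsignedValue).doubleValue();
    }

    public static void main(String[] args) {
        double a = toDouble("18446744073709551614");
        double b = toDouble("18446744073709551615");

        // Two distinct unsigned_long values collapse to the same double.
        System.out.println(a == b); // true

        // The nearest representable double is 2^64, one above the maximum value.
        System.out.println(new BigDecimal(b).toBigInteger()); // 18446744073709551616
    }
}
```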
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -426,6 +426,7 @@ ReducedQueryPhase reducedQueryPhase(Collection<? extends SearchPhaseResult> quer | |
if (queryResults.isEmpty()) { | ||
throw new IllegalStateException(errorMsg); | ||
} | ||
validateMergeSortValueFormats(queryResults); | ||
final QuerySearchResult firstResult = queryResults.stream().findFirst().get().queryResult(); | ||
final boolean hasSuggest = firstResult.suggest() != null; | ||
final boolean hasProfileResults = firstResult.hasProfileResults(); | ||
|
@@ -485,6 +486,36 @@ private static InternalAggregations reduceAggs(InternalAggregation.ReduceContext | |
performFinalReduce ? aggReduceContextBuilder.forFinalReduction() : aggReduceContextBuilder.forPartialReduction()); | ||
} | ||
|
||
/** | ||
* Checks that query results from all shards have consistent unsigned_long format. | ||
* Sort queries on a field that has long type in one index, and unsigned_long in another index | ||
* don't work correctly. Throw an error if this kind of sorting is detected. | ||
* //TODO: instead of throwing error, find a way to sort long and unsigned_long together | ||
Review comment: This TODO makes sense. It's unfortunate that we need to do a special check here, but it feels worth it to me to avoid silently returning incorrect results. |
||
*/ | ||
private static void validateMergeSortValueFormats(Collection<? extends SearchPhaseResult> queryResults) { | ||
boolean[] ulFormats = null; | ||
boolean firstResult = true; | ||
for (SearchPhaseResult entry : queryResults) { | ||
DocValueFormat[] formats = entry.queryResult().sortValueFormats(); | ||
if (formats == null) return; | ||
Review comment: Checking my understanding -- are all shards guaranteed to have the same number of sort formats (even if some sort fields are unmapped)?
Reply: @jtibshirani Indeed, all shards are guaranteed to have the same number of sort formats. If a field is unmapped on a shard, we will get a shard failure, unless we specifically map an unmapped field to something else.
Reply: Also wondering if @jimczi is ok with this final check I've added in
Reply: ++, that makes sense to me. You can maybe rename the function validateMergeSortValueFormats or something along those lines? |
||
if (firstResult) { | ||
firstResult = false; | ||
ulFormats = new boolean[formats.length]; | ||
for (int i = 0; i < formats.length; i++) { | ||
ulFormats[i] = formats[i] == DocValueFormat.UNSIGNED_LONG_SHIFTED ? true : false; | ||
} | ||
} else { | ||
for (int i = 0; i < formats.length; i++) { | ||
// if the format is unsigned_long in one shard, and something different in another shard | ||
if (ulFormats[i] ^ (formats[i] == DocValueFormat.UNSIGNED_LONG_SHIFTED)) { | ||
throw new IllegalArgumentException("Can't do sort across indices, as a field has [unsigned_long] type " + | ||
"in one index, and different type in another index!"); | ||
} | ||
} | ||
} | ||
} | ||
} | ||
|
||
/* | ||
* Returns the size of the requested top documents (from + size) | ||
*/ | ||
|
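To make the consistency check in `validateMergeSortValueFormats` easier to follow outside the {es} code base, here is a self-contained sketch of the same idea. The `Format` enum is a hypothetical stand-in for `DocValueFormat`, and each `Format[]` stands in for the sort value formats reported by one shard; neither is the real API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the check; not the Elasticsearch classes.
class SortFormatCheck {

    // Stand-in for DocValueFormat; only the distinction that matters is modelled.
    enum Format { RAW, UNSIGNED_LONG_SHIFTED }

    // Throws if a sort position is unsigned_long on one shard and not on another.
    static void validate(List<Format[]> perShardFormats) {
        boolean[] ulFormats = null; // per sort position: did the first shard use unsigned_long?
        for (Format[] formats : perShardFormats) {
            if (formats == null) {
                return; // no sort formats, nothing to validate
            }
            if (ulFormats == null) {
                // First shard: record which sort positions are unsigned_long.
                ulFormats = new boolean[formats.length];
                for (int i = 0; i < formats.length; i++) {
                    ulFormats[i] = formats[i] == Format.UNSIGNED_LONG_SHIFTED;
                }
            } else {
                // Later shards: every position must agree with the first shard.
                for (int i = 0; i < formats.length; i++) {
                    if (ulFormats[i] != (formats[i] == Format.UNSIGNED_LONG_SHIFTED)) {
                        throw new IllegalArgumentException(
                            "field has [unsigned_long] type in one index, and different type in another index");
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        Format[] shardA = { Format.UNSIGNED_LONG_SHIFTED };
        Format[] shardB = { Format.RAW };
        validate(Arrays.asList(shardA, shardA)); // consistent, returns normally
        try {
            validate(Arrays.asList(shardA, shardB));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Like the method in the diff, the sketch relies on every shard reporting the same number of sort formats, which is the guarantee discussed in the review thread.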