-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce new 'unsigned_long' numeric field type support #6237
Conversation
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more @@ Coverage Diff @@
## main #6237 +/- ##
============================================
- Coverage 70.52% 70.43% -0.09%
- Complexity 59412 59489 +77
============================================
Files 4862 4872 +10
Lines 285530 286190 +660
Branches 41153 41274 +121
============================================
+ Hits 201357 201579 +222
- Misses 67588 68038 +450
+ Partials 16585 16573 -12
... and 481 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
b630b3e
to
ae66f44
Compare
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
@@ -201,9 +201,15 @@ private static Object convertValueFromSortType(String fieldName, SortField.Type | |||
|
|||
case LONG: | |||
// for unsigned_long field type we want to pass search_after value through formatting | |||
if (value instanceof Number && format != DocValueFormat.UNSIGNED_LONG_SHIFTED) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious , what was the reason we didnt support search_after earlier for unsigned_long and now we can ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unsigned_long
(in some form) was supported when DocValueFormat.UNSIGNED_LONG_SHIFTED
was used, but to be fair do not know how this value was used later on (this is/was x-pack feature, I have not looked into the code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed Indexing & Search->Sort area. That looks good to me, but left few comments and doubts.
Someone has to help here on Search->aggregation code changes area who is more familiar with that code.
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor Comments.
PR/changes looks pretty good to me. The biggest plus point here is, changes are completely different path then existing NumericTypes, so no possibility of regression as far as I see, but yeah, these are big changes, but integ tests are covered nicely.
return h; | ||
} | ||
|
||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We dont need this to override right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we need it since visit
is abstract method in Query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, my bad, I was comapring with SortedLongDocValuesRangeQueries.
visitor.visitLeaf(this); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This too we dont need to override ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly we need it, toString
is abstract method in Query
@Override | ||
public boolean matches() throws IOException { | ||
final long value = singleton.longValue(); | ||
return Long.compareUnsigned(value, lowerValue) >= 0 && Long.compareUnsigned(value, upperValue) <= 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
search_after is also kind of range query, that did not require any change ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw the code, search_after is not supported for unsigned_long as of now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ... it should be supported, I do have a test case for it [1]
I just saw the code, search_after is not supported for unsigned_long as of now.
Could you please point me the place that I missed, the search_after
should be supported I think, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow I overlooked here we were throwing exception in searchafter builder in case of unsigned_log
server/src/main/java/org/opensearch/search/searchafter/SearchAfterBuilder.java
All good ! Thank you for clarifying.
final BigInteger v = Numbers.toUnsignedLongExact(value); | ||
|
||
if (indexed) { | ||
fields.add(new BigIntegerPoint(name, v)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is getting indexed with BigIntegerPoint values, can we use Numeric point based sort optimization here too ? I see in IndexNumenricFieldData, we have custom comparator and dont use sort optimization in this case. May be we can just check, but I think it might not be possilbe since there is no SortType which fits unsigned long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, the SortType
is the show stopper here sadly :( I kept this particular feature (and index sort as well) as a future improvement, all because of the SortType
limitation
@@ -157,6 +157,17 @@ static void register(ValuesSourceRegistry.Builder builder) { | |||
compositeValuesSourceConfig.reverseMul() | |||
); | |||
|
|||
} else if (vs.isBigInteger()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WE might have to work little bit in future in case we add some more NumericType ither then unsigned_long which doesnt fit size of long/double, BigInteger is for all other types but lets see if at all we need to make this generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it at some but we are very much limited by Apache Lucene: the unsigned_long
is as far as we could get right now (we could not use the BigInteger
at full sadly). But sure, if you have suggestion to generalize it now - happy to take them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, it would be not be required at all to have any such more type.
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
CHANGELOG.md
Outdated
@@ -90,6 +90,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), | |||
- [Extensions] Add IdentityPlugin into core to support Extension identities ([#7246](https://github.com/opensearch-project/OpenSearch/pull/7246)) | |||
- Add descending order search optimization through reverse segment read. ([#7244](https://github.com/opensearch-project/OpenSearch/pull/7244)) | |||
- [Extensions] Moving RestActions APIs to protobuf serialization. ([#7302](https://github.com/opensearch-project/OpenSearch/pull/7302)) | |||
- Introduce new 'unsigned_long' numeric field type support ([#6237](https://github.com/opensearch-project/OpenSearch/pull/6237)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should say Add
unsigned_long numeric field type
.
I'm ready to merge otherwise :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Gradle Check (Jenkins) Run Completed with:
|
I decided to merge, we have time to revert if @nknize has strong opinions ;) |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6237-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e2f21d824ac0dc76d826b9bbdc32ec80df7c9bf6
# Push it to GitHub
git push --set-upstream origin backport/backport-6237-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x Then, create a pull request where the |
@reta Looks like it needs a manual backport, hopefully not for the changelog. |
…project#6237) * Introduce new 'unsigned_long' numeric field type support Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Address code review comments Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Addressed code review comments Signed-off-by: Andriy Redko <andriy.redko@aiven.io> --------- Signed-off-by: Andriy Redko <andriy.redko@aiven.io> (cherry picked from commit e2f21d8)
…project#6237) * Introduce new 'unsigned_long' numeric field type support Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Address code review comments Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Addressed code review comments Signed-off-by: Andriy Redko <andriy.redko@aiven.io> --------- Signed-off-by: Andriy Redko <andriy.redko@aiven.io> (cherry picked from commit e2f21d8) Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
* Introduce new 'unsigned_long' numeric field type support * Address code review comments * Addressed code review comments --------- (cherry picked from commit e2f21d8) Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
…project#6237) * Introduce new 'unsigned_long' numeric field type support Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Address code review comments Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Addressed code review comments Signed-off-by: Andriy Redko <andriy.redko@aiven.io> --------- Signed-off-by: Andriy Redko <andriy.redko@aiven.io> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Signed-off-by: Andriy Redko andriy.redko@aiven.io
Description
Introduce new 'unsigned_long' numeric field type support: unsigned 64-bit integer in range of [0, 2^64-1].
Limitations: the
unsigned_long
could not be used as index sort fields (sort.field
index setting). The limitation also applies to the scenario when search is done over multiple indices and sorted by the field that hasunsigned_long
type in (at least) one of the indices but different numeric type(s) in others. The error will be returned in such cases.unsigned_long
( ifstore
is set totrue
, it is stored and returned asString
).support term aggregations
support sorting
range queries
support aggregations
access from scripts
more tests
benchmarks
The benchmarks have been conducted using
opensearch-benchmarking
tool againstgeo_names
dataset which does search and aggregation over numeric fieldpopulation
(the baseline haslong
type, the contended hasunsigned_long
). As expected, the queries involving sort are slow (since right nowunsigned_long
is not wrapped into sorted numeric field).Issues Resolved
Closes #2083, documentation opensearch-project/documentation-website#3585
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.