[SPARK-26382][CORE] prefix comparator should handle -0.0 #23334
    assert(PrefixComparators.DOUBLE.compare(prefix, doubleMaxPrefix) === 1)
  }

  test("double prefix comparator handles other special values properly") {
    val nullValue = 0L
    // See `SortPrefix.nullValue` for how we deal with nulls for float/double type
improve the test coverage a little bit.
From the PR description, it sounds like this change does not address an existing bug but is just a safeguard, right?
Test build #100231 has finished for PR 23334 at commit
Test build #100233 has finished for PR 23334 at commit
@viirya Yes, you are right.
LGTM pending Jenkins.
@@ -69,6 +69,8 @@ public static long computePrefix(byte[] bytes) {
     * details see http://stereopsis.com/radix.html.
     */
    public static long computePrefix(double value) {
      // normalize -0.0 to 0.0, as they should be equal
      value = value == -0.0 ? 0.0 : value;
Technically, `value == -0.0` is `true` for both `0.0` and `-0.0`, but the current form is the simplest way to do this normalization. We don't need to exclude `0.0` here. +1 for the logic.
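To illustrate the point about `==`: in Java, `0.0 == -0.0` evaluates to `true` under IEEE 754 comparison, yet the two zeros have different bit patterns, which is why a bit-based prefix would otherwise tell them apart. The following is a minimal sketch; the class and method names are hypothetical and not part of the patch.

```java
public class NegativeZeroDemo {
    // Same expression as in the patch: maps both 0.0 and -0.0 to 0.0
    // and leaves every other value (including NaN) unchanged.
    public static double normalize(double value) {
        return value == -0.0 ? 0.0 : value;
    }

    public static void main(String[] args) {
        // Numeric comparison treats the two zeros as equal.
        System.out.println(0.0 == -0.0); // true
        // Their raw bit patterns differ only in the sign bit.
        System.out.println(Long.toHexString(Double.doubleToLongBits(0.0)));  // 0
        System.out.println(Long.toHexString(Double.doubleToLongBits(-0.0))); // 8000000000000000
        // After normalization, -0.0 has the same bit pattern as 0.0.
        System.out.println(Long.toHexString(Double.doubleToLongBits(normalize(-0.0)))); // 0
    }
}
```

Because the ternary returns `0.0` whenever the comparison succeeds, it does not matter that the comparison also matches positive zero.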
Shall we use |
+1, LGTM.
Test build #100243 has finished for PR 23334 at commit
Although this is not a bug fix, can we have this in |
@kiszk Good idea, updated #23239. @dongjoon-hyun I'm fine with it. Feel free to merge :)
Thank you, @cloud-fan, @viirya, @HyukjinKwon, and @kiszk.
## What changes were proposed in this pull request?

This is a follow-up of #23239.

`UnsafeProject` normalizes special float/double values (NaN and -0.0), so the sorter doesn't have to handle them. However, for consistency and future-proofing, this PR proposes to also normalize `-0.0` in the prefix comparator, so that its ordering matches the normal ordering. Note that the prefix comparator handles NaN as well.

This is not a bug fix, but a safeguard.

## How was this patch tested?

existing tests

Closes #23334 from cloud-fan/sort.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit befca98)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This is a follow-up of #23239.
`UnsafeProject` normalizes special float/double values (NaN and -0.0), so the sorter doesn't have to handle them. However, for consistency and future-proofing, this PR proposes to also normalize `-0.0` in the prefix comparator, so that its ordering matches the normal ordering. Note that the prefix comparator handles NaN as well.
This is not a bug fix, but a safeguard.
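The effect on the sort prefix can be sketched as follows. This is an illustration under stated assumptions, not the Spark source: it assumes the prefix is built with the sign-flipping bit trick from the radix-sort article linked in the diff, and that prefixes are compared as unsigned longs. The class name, method name, and the `normalizeZero` flag are hypothetical.

```java
public class DoublePrefixSketch {
    // Hypothetical helper: builds an order-preserving 64-bit key for a double,
    // intended to be compared as an unsigned long.
    public static long prefix(double value, boolean normalizeZero) {
        if (normalizeZero) {
            // The normalization proposed in this PR.
            value = value == -0.0 ? 0.0 : value;
        }
        // doubleToLongBits collapses every NaN to one canonical bit pattern.
        long bits = Double.doubleToLongBits(value);
        // Negative values: flip all bits. Non-negative values: flip only the sign bit.
        long mask = -(bits >>> 63) | 0x8000000000000000L;
        return bits ^ mask;
    }

    public static void main(String[] args) {
        // Without normalization, -0.0 gets a strictly smaller prefix than 0.0,
        // so this prints a negative number.
        System.out.println(Long.compareUnsigned(prefix(-0.0, false), prefix(0.0, false)));
        // With normalization, the two prefixes are identical, so this prints 0.
        System.out.println(Long.compareUnsigned(prefix(-0.0, true), prefix(0.0, true)));
    }
}
```

In this sketch, the unnormalized prefixes would order `-0.0` before `0.0` even though the regular double ordering treats them as equal, which is the inconsistency the safeguard removes.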
How was this patch tested?
existing tests