-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-24834][CORE] use java comparison for float and double #21794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-24834][CORE] use java comparison for float and double #21794
Conversation
The methods in `Util`, `nanSafeCompareDoubles` and `nanSafeCompareFloats` had the same semantics as java's `java.lang.Double.compare` and `java.lang.Float.compare` respectively.
Can one of the admins verify this patch? |
cc @JoshRosen who introduced this code originally, I think |
There is a way in which the
If that behaviour is indeed required, I could abandon this PR and instead fix the test cases to capture this. |
I think this is required since SparkSQL does not distinguish 0.0 from -0.0. Am I correct? |
It would be good to add test cases for them since it is not covered now. |
BTW if it becomes necessary to not change the semantics, I think the methods could at least be streamlined a bit:
More tests can't hurt, too. |
it does seem that spark currently does distinguish -0 and 0, at least as far as groupbys go
doubles are hashed via |
|
I think we'd have to close this due to the behavior change, but would merge an optimization of the existing behavior. |
yep fair - the intent I think was clarity rather than necessarily perf: it's misleading to have a method named 'nan safe' which has no special handling of nans. I'll look at opening a different PR which could increase clarity/may have minor perf benefit. |
The methods in
Utils
,nanSafeCompareDoubles
andnanSafeCompareFloats
havethe same semantics as java's
java.lang.Double.compare
andjava.lang.Float.compare
respectively.
What changes were proposed in this pull request?
Remove the two functions in
Utils
, and replace the call sites (only two for each function) with the corresponding java comparison function.How was this patch tested?
Existing spark tests should cover this.