Skip to content

[SPARK-18458][CORE] Fix signed integer overflow problem at an expression in RadixSort.java #15907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

kiszk
Copy link
Member

@kiszk kiszk commented Nov 16, 2016

What changes were proposed in this pull request?

This PR avoids that a result of an expression is negative due to signed integer overflow (e.g. 0x10?????? * 8 < 0). This PR casts each operand to long before executing a calculation. Since the result is interpreted as long, the result of the expression is positive.

How was this patch tested?

Manually executed query82 of TPC-DS with 100TB

@kiszk
Copy link
Member Author

kiszk commented Nov 16, 2016

@hvanhovell I found similar portions in RadixSort.java For example, at line 85.
Should we change these portions preventively while we do not have test suites, too?

@hvanhovell
Copy link
Contributor

cc @ericl

@hvanhovell
Copy link
Contributor

@kiszk yeah lets also address those

@kiszk
Copy link
Member Author

kiszk commented Nov 16, 2016

@hvanhovell as far as I know, I addressed them in RadixSort.java

@SparkQA
Copy link

SparkQA commented Nov 16, 2016

Test build #68728 has finished for PR 15907 at commit 3821a52.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 16, 2016

Test build #68732 has finished for PR 15907 at commit 472c0c3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Nov 16, 2016

Can you please have a more descriptive title? Presumably the pull request is not to add coredump.

@kiszk kiszk changed the title [SPARK-18458][CORE] core dumped running Spark SQL on large data volume (100TB) [SPARK-18458][CORE] Avoid signed integer overflow at an expression in RadixSort.java Nov 17, 2016
@kiszk
Copy link
Member Author

kiszk commented Nov 17, 2016

@rxin Sure, I did.

@ericl
Copy link
Contributor

ericl commented Nov 17, 2016

LGTM. I wonder if it's also worth changing all the int parameters to longs to be extra safe here with conversions, even if we know the values are bounded.

@kiszk kiszk changed the title [SPARK-18458][CORE] Avoid signed integer overflow at an expression in RadixSort.java [SPARK-18458][CORE] Fix signed integer overflow at an expression in RadixSort.java Nov 17, 2016
@kiszk kiszk changed the title [SPARK-18458][CORE] Fix signed integer overflow at an expression in RadixSort.java [SPARK-18458][CORE] Fix signed integer overflow problem at an expression in RadixSort.java Nov 17, 2016
@rxin
Copy link
Contributor

rxin commented Nov 17, 2016

Yea it might be a good idea to just use longs.

@kiszk
Copy link
Member Author

kiszk commented Nov 17, 2016

Ideally, it is a good idea. Do you want to do that change using this PR?

Another question is which scope do we apply such a change?

  1. Only in methods that I applied changes to.
  2. Only in 'RadixSort.java'
  3. Beyond public methods ('sort()' and 'sortKeyPrefixArray()')

Obviously, 3. requires a lot of changes to many files.

What do you think?

@rxin
Copy link
Contributor

rxin commented Nov 17, 2016

I'm thinking 2. What do you think?

@kiszk
Copy link
Member Author

kiszk commented Nov 17, 2016

It would be good.
In that case, is it better to insert explicit cast (from int to long) if a caller gives an int variable? This is why we want to explicitly express a specification of public APIs (sort() and sortKeyPrefixArray()).

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about keeping the very small numbers (0..7, 0..255) as ints, and everything else as a long.

if (numRecords > 0) {
long[][] counts = getCounts(array, numRecords, startByteIndex, endByteIndex);
for (int i = startByteIndex; i <= endByteIndex; i++) {
if (counts[i] != null) {
for (long i = startByteIndex; i <= endByteIndex; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably not be a long, since it ranges from 0..7

@@ -40,28 +40,28 @@
* of always copying the data back to position zero for efficiency.
*/
public static int sort(
LongArray array, int numRecords, int startByteIndex, int endByteIndex,
LongArray array, long numRecords, long startByteIndex, long endByteIndex,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep the byte indices as ints, since they are in 0..7 and we assert these below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I imagined that all of variables should be changed to long anyway.

@SparkQA
Copy link

SparkQA commented Nov 17, 2016

Test build #68793 has finished for PR 15907 at commit ae6820a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 17, 2016

Test build #68798 has finished for PR 15907 at commit ef54e8a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ericl
Copy link
Contributor

ericl commented Nov 18, 2016

Looks good

@@ -80,7 +80,7 @@ class SortBenchmark extends BenchmarkBase {
}
val buf = new LongArray(MemoryBlock.fromLongArray(array))
timer.startTiming()
RadixSort.sort(buf, size, 0, 7, false, false)
RadixSort.sort(buf, size.toLong, 0, 7, false, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need these toLongs? Doesn't scala do the cast implicitly?

Copy link
Member Author

@kiszk kiszk Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I wrote before, I wanted to emphasis the difference of types between caller and callee.
It is OK to rely on implicit cast, done.

var i = 0
val out = new Array[Long](length)
val out = new Array[Long](length.toInt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a checkedCast function similar to Guava's: https://google.github.io/guava/releases/19.0/api/docs/com/google/common/primitives/Ints.html#checkedCast(long)

and use those for all the toInt downcasts. Otherwise it'd be annoying to get mysterious negative integer exceptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! done in these files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to push?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I put this comment before pushed. Now, you can see it

@@ -176,7 +176,7 @@ public ShuffleSorterIterator getSortedIterator() {
int offset = 0;
if (useRadixSort) {
offset = RadixSort.sort(
array, pos,
array, (long)pos,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again doesn't java do the upcast implicitly?

Copy link
Member Author

@kiszk kiszk Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is OK to rely on implicit cast, done.

@SparkQA
Copy link

SparkQA commented Nov 18, 2016

Test build #68819 has finished for PR 15907 at commit 022e5b3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 18, 2016

Test build #68820 has finished for PR 15907 at commit 3f9efdb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 18, 2016

Test build #68821 has finished for PR 15907 at commit f2e2079.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -30,7 +32,7 @@ import org.apache.spark.util.collection.Sorter
import org.apache.spark.util.random.XORShiftRandom

class RadixSortSuite extends SparkFunSuite with Logging {
private val N = 10000 // scale this down for more readable results
private val N = 10000.toLong // scale this down for more readable results
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write 10000L

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will do that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

inIndex = outIndex;
outIndex = tmp;
}
}
}
return inIndex;
return Ints.checkedCast(inIndex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than pull in a library method, how about require which lets you provide an actual error message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @rxin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to add a very simple if check in the beginning of this function. I will do it when I merge this.

@SparkQA
Copy link

SparkQA commented Nov 18, 2016

Test build #68855 has finished for PR 15907 at commit 591626e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Nov 20, 2016

Merging in master/branch-2.1

asfgit pushed a commit that referenced this pull request Nov 20, 2016
…ion in RadixSort.java

## What changes were proposed in this pull request?

This PR avoids that a result of an expression is negative due to signed integer overflow (e.g. 0x10?????? * 8 < 0). This PR casts each operand to `long` before executing a calculation. Since the result is interpreted as long, the result of the expression is positive.

## How was this patch tested?

Manually executed query82 of TPC-DS with 100TB

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes #15907 from kiszk/SPARK-18458.

(cherry picked from commit d93b655)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@asfgit asfgit closed this in d93b655 Nov 20, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…ion in RadixSort.java

## What changes were proposed in this pull request?

This PR avoids that a result of an expression is negative due to signed integer overflow (e.g. 0x10?????? * 8 < 0). This PR casts each operand to `long` before executing a calculation. Since the result is interpreted as long, the result of the expression is positive.

## How was this patch tested?

Manually executed query82 of TPC-DS with 100TB

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes apache#15907 from kiszk/SPARK-18458.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants