Skip to content

Commit 6e2509f

Browse files
committed
[SPARK-4808] Removing minimum number of elements read before spill check
We found that if we only start checking for spilling after reading 1000 elements in a Spillable, we would run out of memory if there are fewer than 1000 elements but each of those elements are very large. There is no real need to only check for spilling after reading 1000 things. It is still necessary, however, to mitigate the cost of entering a synchronized block. It turns out in practice however checking for spilling only on every 32 items is sufficient, without needing the minimum elements threshold.
1 parent 6580929 commit 6e2509f

File tree

1 file changed

+1
-5
lines changed

1 file changed

+1
-5
lines changed

core/src/main/scala/org/apache/spark/util/collection/Spillable.scala

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -42,9 +42,6 @@ private[spark] trait Spillable[C] extends Logging {
4242
// Memory manager that can be used to acquire/release memory
4343
private[this] val shuffleMemoryManager = SparkEnv.get.shuffleMemoryManager
4444

45-
// Threshold for `elementsRead` before we start tracking this collection's memory usage
46-
private[this] val trackMemoryThreshold = 1000
47-
4845
// Initial threshold for the size of a collection before we start tracking its memory usage
4946
// Exposed for testing
5047
private[this] val initialMemoryThreshold: Long =
@@ -72,8 +69,7 @@ private[spark] trait Spillable[C] extends Logging {
7269
* @return true if `collection` was spilled to disk; false otherwise
7370
*/
7471
protected def maybeSpill(collection: C, currentMemory: Long): Boolean = {
75-
if (elementsRead > trackMemoryThreshold && elementsRead % 32 == 0 &&
76-
currentMemory >= myMemoryThreshold) {
72+
if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
7773
// Claim up to double our current memory from the shuffle memory pool
7874
val amountToRequest = 2 * currentMemory - myMemoryThreshold
7975
val granted = shuffleMemoryManager.tryToAcquire(amountToRequest)

0 commit comments

Comments
 (0)