[SPARK-24107][CORE] ChunkedByteBuffer.writeFully method has not reset the limit value #21175
Conversation
Would it be possible to add a unit test?
In class ChunkedByteBufferSuite: test("writeFully() does not affect original buffer's position") {
@Manbuyun you need to add the unit test into ChunkedByteBufferSuite.
Better with a unit test. Please add one.
@@ -56,6 +56,12 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
    assert(chunkedByteBuffer.getChunks().head.position() === 0)
  }

  test("writeFully() does not affect original buffer's position") {
Hi @Manbuyun. You should add a new unit test to support your own change, for example: "writeFully() can write buffer which is larger than bufferWriteChunkSize correctly". And update the test code.
Done. Thanks
test("writeFully() can write buffer which is larger than bufferWriteChunkSize correctly") { | ||
val chunkedByteBuffer = new ChunkedByteBuffer(Array(ByteBuffer.allocate(80*1024*1024))) | ||
chunkedByteBuffer.writeFully(new ByteArrayWritableChannel(chunkedByteBuffer.size.toInt)) | ||
assert(chunkedByteBuffer.getChunks().head.position() === 0) |
This assert is unnecessary for this PR's change. Please replace it with an assert on the channel's length here.
@@ -56,6 +56,12 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
    assert(chunkedByteBuffer.getChunks().head.position() === 0)
  }

  test("writeFully() can write buffer which is larger than bufferWriteChunkSize correctly") {
    val chunkedByteBuffer = new ChunkedByteBuffer(Array(ByteBuffer.allocate(80*1024*1024)))
nit: add spaces around the *.
Done. Thanks
@@ -56,6 +56,12 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
    assert(chunkedByteBuffer.getChunks().head.position() === 0)
  }

  test("writeFully() can write buffer which is larger than bufferWriteChunkSize correctly") {
nit: Would it be possible to add "SPARK-24107: " at the start of the string? It would help us connect a UT with its JIRA entry.
Done. Thanks
test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") { | ||
val chunkedByteBuffer = new ChunkedByteBuffer(Array(ByteBuffer.allocate(80 * 1024 * 1024))) | ||
chunkedByteBuffer.writeFully(new ByteArrayWritableChannel(chunkedByteBuffer.size.toInt)) | ||
assert(chunkedByteBuffer.size === (80L * 1024L * 1024L)) |
ByteArrayWritableChannel's size, not chunkedByteBuffer's size.
My mistake; it has been fixed. Thanks
@@ -63,10 +63,12 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
   */
  def writeFully(channel: WritableByteChannel): Unit = {
    for (bytes <- getChunks()) {
      val limit = bytes.limit()
      while (bytes.remaining() > 0) {
This is not related to this PR though: how about while (bytes.hasRemaining) {?
@@ -56,6 +56,13 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
    assert(chunkedByteBuffer.getChunks().head.position() === 0)
  }

  test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") {
    val chunkedByteBuffer = new ChunkedByteBuffer(Array(ByteBuffer.allocate(80 * 1024 * 1024)))
Can you configure bufferWriteChunkSize explicitly for this test's purpose?
I have modified it. Please check.
@@ -63,10 +63,12 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
   */
  def writeFully(channel: WritableByteChannel): Unit = {
    for (bytes <- getChunks()) {
      val limit = bytes.limit()
How about renaming limit to curChunkLimit?
@@ -56,6 +56,15 @@ class ChunkedByteBufferSuite extends SparkFunSuite {
    assert(chunkedByteBuffer.getChunks().head.position() === 0)
  }

  test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") {
    val bufferWriteChunkSize = Option(SparkEnv.get).map(_.conf.get(config.BUFFER_WRITE_CHUNK_SIZE))
      .getOrElse(config.BUFFER_WRITE_CHUNK_SIZE.defaultValue.get).toInt
How about setting this value via spark.buffer.write.chunkSize? e.g., sc.conf.set("spark.default.parallelism", "4")
OK, I have added it. Please check.
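For reference, a minimal sketch of what the conf-driven test might look like after this round. It assumes the suite mixes in SharedSparkContext so that sc is available; the "32k" chunk size and 40 KB buffer are illustrative values, not taken from this thread:

test("SPARK-24107: writeFully() write buffer which is larger than bufferWriteChunkSize") {
  try {
    // Shrink the chunk size so a small test buffer exceeds it, instead of allocating 80 MB.
    sc.conf.set(config.BUFFER_WRITE_CHUNK_SIZE.key, "32k")
    val chunkedByteBuffer = new ChunkedByteBuffer(Array(ByteBuffer.allocate(40 * 1024)))
    val byteArrayWritableChannel = new ByteArrayWritableChannel(chunkedByteBuffer.size.toInt)
    chunkedByteBuffer.writeFully(byteArrayWritableChannel)
    // Before the fix, only the first bufferWriteChunkSize bytes would reach the channel.
    assert(byteArrayWritableChannel.length() === chunkedByteBuffer.size)
  } finally {
    sc.conf.remove(config.BUFFER_WRITE_CHUNK_SIZE.key)
  }
}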
      val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
      bytes.limit(bytes.position() + ioSize)
      channel.write(bytes)
      bytes.limit(curChunkLimit)
I would rewrite this using:

try {
  val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
  bytes.limit(bytes.position() + ioSize)
  channel.write(bytes)
} finally {
  bytes.limit(curChunkLimit)
}

to be safe.
Right. That covers the case when the channel write throws an IOException.
I have committed this modification.
ok to test
Test build #89928 has finished for PR 21175 at commit
nit for the style check:
@@ -20,12 +20,12 @@ package org.apache.spark.io
 import java.nio.ByteBuffer

 import com.google.common.io.ByteStreams

-import org.apache.spark.SparkFunSuite
+import org.apache.spark.{SparkFunSuite, SharedSparkContext}
move SharedSparkContext before SparkFunSuite
I have fixed it and committed. Thanks
@@ -20,12 +20,12 @@ package org.apache.spark.io
 import java.nio.ByteBuffer

 import com.google.common.io.ByteStreams
add an empty line after line 22 to separate the Spark and third-party import groups.
Test build #90035 has finished for PR 21175 at commit
The R test failure is a known issue. I'm merging into master and 2.3, thanks!
Hi, All.
[SPARK-24107][CORE] ChunkedByteBuffer.writeFully method has not reset the limit value

JIRA Issue: https://issues.apache.org/jira/browse/SPARK-24107?jql=text%20~%20%22ChunkedByteBuffer%22

ChunkedByteBuffer.writeFully did not reset the buffer's limit value. When a chunk is larger than bufferWriteChunkSize, e.g. 80 * 1024 * 1024 bytes against the 64 * 1024 * 1024 default of config.BUFFER_WRITE_CHUNK_SIZE, the while loop runs only once and the remaining 16 * 1024 * 1024 bytes are silently lost.

Author: WangJinhai02 <jinhai.wang02@ele.me>

Closes #21175 from manbuyun/bugfix-ChunkedByteBuffer.

(cherry picked from commit 152eaf6)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
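To make the failure mode concrete, here is a self-contained sketch (not Spark code) of the buggy loop, with a hypothetical 10-byte buffer and a 4-byte cap standing in for bufferWriteChunkSize:

import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.channels.{Channels, WritableByteChannel}

val out = new ByteArrayOutputStream()
val channel: WritableByteChannel = Channels.newChannel(out)
val bytes = ByteBuffer.allocate(10) // one "chunk" of 10 bytes
val cap = 4                         // stand-in for bufferWriteChunkSize

// Buggy loop: the shrunken limit is never restored, so after the first write
// remaining() == limit - position == 0 and the loop exits early.
while (bytes.remaining() > 0) {
  val ioSize = math.min(bytes.remaining(), cap)
  bytes.limit(bytes.position() + ioSize)
  channel.write(bytes)
}
assert(out.size() == 4) // only 4 of the 10 bytes were written; the other 6 were lost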
        val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
        bytes.limit(bytes.position() + ioSize)
        channel.write(bytes)
      } finally {
I don't think we need the try and finally here, because getChunks() returns duplicated ByteBuffers which have their own position and limit.
I think the problem is, bytes.limit(bytes.position() + ioSize) will change the result of bytes.hasRemaining, so we have to restore the limit in each loop iteration.
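A tiny illustration of that interaction, using a plain JDK ByteBuffer with hypothetical numbers:

import java.nio.ByteBuffer

val buf = ByteBuffer.allocate(100) // position = 0, limit = 100, remaining = 100
buf.limit(64)                      // pretend bufferWriteChunkSize is 64
buf.position(64)                   // what a completed channel.write would do
assert(!buf.hasRemaining)          // remaining == 0: without a restore, the loop stops here
buf.limit(100)                     // restoring the limit exposes the unwritten tail
assert(buf.remaining == 36)        // the last 36 bytes become visible again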
I get your point. If there is an exception, there is no next loop iteration, so we don't need to restore the limit; the try/finally is not needed.
    while (bytes.hasRemaining) {
      try {
        val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
        bytes.limit(bytes.position() + ioSize)
The rationale for the limit() isn't super clear, but that was a problem in the original PR which introduced the bug (#18730). I'm commenting here only for cross-reference purposes, for folks who come across this patch in the future. I believe that the original motivation was http://www.evanjones.ca/java-bytebuffer-leak.html
No, I mean that the code here can simply follow the write call as straight-through code. We don't need to guard against exceptions here because the duplicate of the buffer is used only by a single thread, so you can omit the try block and just concatenate the try contents to the finally contents. Minor bit, but I wanted to comment because I was initially confused about when errors could occur and about thread safety / sharing, until I realized that the modified state does not escape this method.
On Mon, May 14, 2018 at 9:03 PM, Wenchen Fan commented on this pull request, quoting the change in core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala (#21175 (comment)):

@@ -63,10 +63,15 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
   */
  def writeFully(channel: WritableByteChannel): Unit = {
    for (bytes <- getChunks()) {
-     while (bytes.remaining() > 0) {
-       val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-       bytes.limit(bytes.position() + ioSize)
-       channel.write(bytes)
+     val curChunkLimit = bytes.limit()
+     while (bytes.hasRemaining) {
+       try {
+         val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
+         bytes.limit(bytes.position() + ioSize)
+         channel.write(bytes)
+       } finally {

Do you mean this is not a real bug that can cause real workload to fail?
… not reset the limit value

## What changes were proposed in this pull request?

According to the discussion in apache#21175, this PR proposes 2 improvements:
1. add comments to explain why we call `limit` to write out `ByteBuffer` with slices.
2. remove the `try ... finally`

## How was this patch tested?

Existing tests.

Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#21327 from cloud-fan/minor.
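A sketch of what writeFully plausibly looks like after that follow-up: try/finally removed, with the rationale captured in a comment (the wording below is paraphrased, not quoted from the merged code):

def writeFully(channel: WritableByteChannel): Unit = {
  for (bytes <- getChunks()) {
    val originalLimit = bytes.limit()
    while (bytes.hasRemaining) {
      // If `bytes` is an on-heap buffer, a NIO write copies it into a per-thread cached
      // direct buffer sized to the write, so each write is capped at bufferWriteChunkSize
      // to keep that cached direct buffer from growing to the full chunk size.
      val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
      bytes.limit(bytes.position() + ioSize)
      channel.write(bytes)
      bytes.limit(originalLimit) // restore so the next remaining() sees the unwritten tail
    }
  }
}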