
Commit ee2bd70

Authored by Davies Liu, committed by Josh Rosen
[SPARK-6667] [PySpark] remove setReuseAddress
Setting SO_REUSEADDR on the server-side socket caused the server to fail to acknowledge incoming connections, so this change removes it. The PR also adds a timeout on the client side and retries once after a timeout.

Author: Davies Liu <davies@databricks.com>

Closes #5324 from davies/collect_hang and squashes the following commits:

e5a51a2 [Davies Liu] remove setReuseAddress
7977c2f [Davies Liu] do retry on client side
b838f35 [Davies Liu] retry after timeout

(cherry picked from commit 0cce545)

Signed-off-by: Josh Rosen <joshrosen@databricks.com>
1 parent 1160cc9 commit ee2bd70

File tree

2 files changed: +1 −1 lines changed


core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala

Lines changed: 0 additions & 1 deletion

@@ -604,7 +604,6 @@ private[spark] object PythonRDD extends Logging {
      */
   private def serveIterator[T](items: Iterator[T], threadName: String): Int = {
     val serverSocket = new ServerSocket(0, 1)
-    serverSocket.setReuseAddress(true)
     // Close the socket if no connection in 3 seconds
     serverSocket.setSoTimeout(3000)
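The server side of this change can be sketched in Python (mirroring the Scala `serveIterator`): the socket binds to a fresh ephemeral port each time, so SO_REUSEADDR serves no purpose, and an accept timeout closes the socket if no client connects. `serve_once` and its payload are hypothetical names for illustration.

```python
import socket
import threading

def serve_once(payload: bytes) -> int:
    # Sketch of a one-shot server like Scala's serveIterator: bind to an
    # ephemeral port (port 0), no setReuseAddress needed since the port is
    # fresh, and close the socket if no connection arrives in 3 seconds.
    server = socket.socket()
    server.bind(("localhost", 0))
    server.listen(1)
    server.settimeout(3)  # analogous to serverSocket.setSoTimeout(3000)

    def _serve():
        try:
            conn, _ = server.accept()
            with conn:
                conn.sendall(payload)
        except socket.timeout:
            pass  # no client showed up within 3 seconds
        finally:
            server.close()

    threading.Thread(target=_serve, daemon=True).start()
    return server.getsockname()[1]
```

Because each serve operation gets its own ephemeral port, there is never a lingering TIME_WAIT binding to reuse, which is why dropping `setReuseAddress(true)` is safe here.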

python/pyspark/rdd.py

Lines changed: 1 addition & 0 deletions

@@ -113,6 +113,7 @@ def _parse_memory(s):

 def _load_from_socket(port, serializer):
     sock = socket.socket()
+    sock.settimeout(3)
     try:
         sock.connect(("localhost", port))
         rf = sock.makefile("rb", 65536)
