Commit 58a4c0c

[SPARK-26080][PYTHON] Skips Python resource limit on Windows in Python worker
## What changes were proposed in this pull request?

The `resource` package is Unix-specific, so the unconditional `import resource` in the Python worker fails on Windows. See https://docs.python.org/2/library/resource.html and https://docs.python.org/3/library/resource.html.

Note that we document Windows support:

> Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS).

This should be backported into branch-2.4 to restore Windows support in Spark 2.4.1.

## How was this patch tested?

Manually, by mocking the changed logic.

Closes #23055 from HyukjinKwon/SPARK-26080.

Lead-authored-by: hyukjinkwon <gurwls223@apache.org>
Co-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 9cda9a8)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 3ec03ec commit 58a4c0c

2 files changed: +14 -7 lines changed

docs/configuration.md

Lines changed: 2 additions & 0 deletions
@@ -190,6 +190,8 @@ of the most common options to set are:
 and it is up to the application to avoid exceeding the overhead memory space
 shared with other non-JVM processes. When PySpark is run in YARN or Kubernetes, this memory
 is added to executor resource requests.
+
+NOTE: Python memory usage may not be limited on platforms that do not support resource limiting, such as Windows.
 </td>
 </tr>
 <tr>
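
The NOTE added above says the limit may be skipped where resource limiting is unavailable. As a rough illustration, a deployment could check for this ahead of time. The helper name `memory_limit_supported` is hypothetical and is not part of PySpark; the check only tests whether the standard `resource` module can be found, which is the condition the patched worker relies on.

```python
import importlib.util


def memory_limit_supported():
    """Return True if the Unix-only 'resource' module can be found here.

    Hypothetical helper for illustration; it is not a PySpark function.
    """
    return importlib.util.find_spec("resource") is not None


if __name__ == "__main__":
    if memory_limit_supported():
        print("resource module found: a Python memory limit can be attempted")
    else:
        print("resource module missing: the Python memory limit is skipped")
```

Finding the module does not guarantee that setting a limit will succeed; the worker still treats a failed `setrlimit` call as a warning rather than an error.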

python/pyspark/worker.py

Lines changed: 12 additions & 7 deletions
@@ -22,7 +22,12 @@
 import os
 import sys
 import time
-import resource
+# 'resource' is a Unix specific module.
+has_resource_module = True
+try:
+    import resource
+except ImportError:
+    has_resource_module = False
 import socket
 import traceback
 
@@ -268,9 +273,9 @@ def main(infile, outfile):
 
         # set up memory limits
         memory_limit_mb = int(os.environ.get('PYSPARK_EXECUTOR_MEMORY_MB', "-1"))
-        total_memory = resource.RLIMIT_AS
-        try:
-            if memory_limit_mb > 0:
+        if memory_limit_mb > 0 and has_resource_module:
+            total_memory = resource.RLIMIT_AS
+            try:
                 (soft_limit, hard_limit) = resource.getrlimit(total_memory)
                 msg = "Current mem limits: {0} of max {1}\n".format(soft_limit, hard_limit)
                 print(msg, file=sys.stderr)
@@ -283,9 +288,9 @@ def main(infile, outfile):
                     print(msg, file=sys.stderr)
                     resource.setrlimit(total_memory, (new_limit, new_limit))
 
-        except (resource.error, OSError, ValueError) as e:
-            # not all systems support resource limits, so warn instead of failing
-            print("WARN: Failed to set memory limit: {0}\n".format(e), file=sys.stderr)
+            except (resource.error, OSError, ValueError) as e:
+                # not all systems support resource limits, so warn instead of failing
+                print("WARN: Failed to set memory limit: {0}\n".format(e), file=sys.stderr)
 
         # initialize global state
         taskContext = None
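
The guarded logic in the hunks above can be sketched as a standalone function. This is a minimal illustration rather than the code in `pyspark/worker.py`: the name `apply_memory_limit` and its boolean return value are hypothetical, and the lines between the two hunks are not shown in the diff, so the MB-to-bytes conversion and the "only tighten the limit" condition below are assumptions.

```python
import sys

# 'resource' is Unix-only, so guard the import the way the patch does.
try:
    import resource
    has_resource_module = True
except ImportError:
    resource = None
    has_resource_module = False


def apply_memory_limit(memory_limit_mb):
    """Try to cap the address space; return True only if a new limit was set.

    Hypothetical helper for illustration, not a function in worker.py.
    """
    # Skip entirely when no limit is requested or the module is unavailable.
    if memory_limit_mb <= 0 or not has_resource_module:
        return False
    try:
        soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_AS)
        new_limit = memory_limit_mb * 1024 * 1024  # assumed: MB to bytes
        # Assumed condition: only lower the limit, never raise it.
        if soft_limit == resource.RLIM_INFINITY or new_limit < soft_limit:
            resource.setrlimit(resource.RLIMIT_AS, (new_limit, new_limit))
            return True
    except (resource.error, OSError, ValueError) as e:
        # Not all systems support resource limits, so warn instead of failing.
        print("WARN: Failed to set memory limit: {0}".format(e), file=sys.stderr)
    return False
```

Because the availability check comes before the `try`, `resource.error` in the `except` clause is never evaluated when the import failed, which is the reason the patch moves the `try` block inside the `if`. Note that calling this with a positive value lowers the hard limit of the current process, which an unprivileged process cannot raise again.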
