Skip to content

Commit 9514b82

Browse files
falakiHyukjinKwon
authored andcommitted
[SPARK-29777][SPARKR] SparkR::cleanClosure aggressively removes a function required by user function
### What changes were proposed in this pull request? The implementation for walking through the user function AST and picking referenced variables and functions, had an optimization to skip a branch if it had already seen it. This runs into an interesting problem in the following example ``` df <- createDataFrame(data.frame(x=1)) f1 <- function(x) x + 1 f2 <- function(x) f1(x) + 2 dapplyCollect(df, function(x) { f1(x); f2(x) }) ``` Results in error: ``` org.apache.spark.SparkException: R computation failed with Error in f1(x) : could not find function "f1" Calls: compute -> computeFunc -> f2 ``` ### Why are the changes needed? Bug fix ### Does this PR introduce any user-facing change? No ### How was this patch tested? Unit tests in `test_utils.R` Closes #26429 from falaki/SPARK-29777. Authored-by: Hossein <hossein@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>
1 parent ea010a2 commit 9514b82

File tree

2 files changed

+15
-2
lines changed

2 files changed

+15
-2
lines changed

R/pkg/R/utils.R

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -543,10 +543,14 @@ processClosure <- function(node, oldEnv, defVars, checkedFuncs, newEnv) {
543543
funcList <- mget(nodeChar, envir = checkedFuncs, inherits = F,
544544
ifnotfound = list(list(NULL)))[[1]]
545545
found <- sapply(funcList, function(func) {
546-
ifelse(identical(func, obj), TRUE, FALSE)
546+
ifelse(
547+
identical(func, obj) &&
548+
# Also check if the parent environment is identical to current parent
549+
identical(parent.env(environment(func)), func.env),
550+
TRUE, FALSE)
547551
})
548552
if (sum(found) > 0) {
549-
# If function has been examined, ignore.
553+
# If function has been examined ignore
550554
break
551555
}
552556
# Function has not been examined, record it and recursively clean its closure.

R/pkg/tests/fulltests/test_utils.R

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -110,6 +110,15 @@ test_that("cleanClosure on R functions", {
110110
actual <- get("y", envir = env, inherits = FALSE)
111111
expect_equal(actual, y)
112112

113+
# Test for combination for nested and sequenctial functions in a closure
114+
f1 <- function(x) x + 1
115+
f2 <- function(x) f1(x) + 2
116+
userFunc <- function(x) { f1(x); f2(x) }
117+
cUserFuncEnv <- environment(cleanClosure(userFunc))
118+
expect_equal(length(cUserFuncEnv), 2)
119+
innerCUserFuncEnv <- environment(cUserFuncEnv$f2)
120+
expect_equal(length(innerCUserFuncEnv), 1)
121+
113122
# Test for function (and variable) definitions.
114123
f <- function(x) {
115124
g <- function(y) { y * 2 }

0 commit comments

Comments
 (0)