Commit 82e461a

[SPARK-35381][R] Fix lambda variable name issues in nested higher order functions at R APIs
This PR fixes the same issue as #32424.

```r
df <- sql("SELECT array(1, 2, 3) as numbers, array('a', 'b', 'c') as letters")
collect(select(
  df,
  array_transform("numbers", function(number) {
    array_transform("letters", function(latter) {
      struct(alias(number, "n"), alias(latter, "l"))
    })
  })
))
```

**Before:**

```
... a, a, b, b, c, c, a, a, b, b, c, c, a, a, b, b, c, c
```

**After:**

```
... 1, a, 1, b, 1, c, 2, a, 2, b, 2, c, 3, a, 3, b, 3, c
```

To produce the correct results.

Yes, it fixes the results to be correct as mentioned above.

Manually tested as above, and unit test was added.

Closes #32517 from HyukjinKwon/SPARK-35381.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit ecb48cc)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
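The commit message shows the symptom but not the mechanism. A minimal sketch in plain R (no Spark needed), assuming the cause is that the outer and inner lambda variables ended up with the same internal name, so the inner binding shadowed the outer one. The helper `bind`, both `pair_with_*` functions, and the names `"x"`, `"x_0"`, `"x_1"` are hypothetical and only model the behaviour:

```r
# Toy model: lambda variables are looked up by name in a scope (a named list).
# Binding a name that already exists overwrites the earlier value.
bind <- function(scope, name, value) {
  scope[[name]] <- value
  scope
}

# Assumed pre-fix behaviour: both lambdas use the same internal name "x".
pair_with_shared_name <- function(number, letter) {
  scope <- bind(list(), "x", number)  # outer lambda variable
  scope <- bind(scope, "x", letter)   # inner lambda variable shadows it
  c(n = scope[["x"]], l = scope[["x"]])
}

# Assumed post-fix behaviour: each lambda variable has a distinct name.
pair_with_fresh_names <- function(number, letter) {
  scope <- bind(list(), "x_0", number)
  scope <- bind(scope, "x_1", letter)
  c(n = scope[["x_0"]], l = scope[["x_1"]])
}

before <- pair_with_shared_name("1", "a")  # both fields see the letter
after <- pair_with_fresh_names("1", "a")   # n sees the number, l the letter
```

In this model `before` pairs the letter with itself, which matches the `a, a` pattern in the **Before** output, while `after` pairs `1` with `a` as in the **After** output.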
1 parent f88a522 commit 82e461a

2 files changed, +20 -1 lines changed


R/pkg/R/functions.R

Lines changed: 6 additions & 1 deletion
@@ -3578,7 +3578,12 @@ unresolved_named_lambda_var <- function(...) {
     "org.apache.spark.sql.Column",
     newJObject(
       "org.apache.spark.sql.catalyst.expressions.UnresolvedNamedLambdaVariable",
-      list(...)
+      lapply(list(...), function(x) {
+        handledCallJStatic(
+          "org.apache.spark.sql.catalyst.expressions.UnresolvedNamedLambdaVariable",
+          "freshVarName",
+          x)
+      })
     )
   )
   column(jc)
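The hunk above routes every name part through the JVM-side `freshVarName` before the lambda variable is constructed. A rough stand-in in plain R, assuming `freshVarName` makes names unique by appending an incrementing counter (the exact suffix format is an assumption, and `make_fresh_var_name` is a hypothetical helper, not part of SparkR):

```r
# Hypothetical stand-in for freshVarName: a closure that holds a counter
# and appends it to the requested name, so no two results are equal.
make_fresh_var_name <- function() {
  next_id <- 0L
  function(name) {
    fresh <- paste0(name, "_", next_id)
    next_id <<- next_id + 1L
    fresh
  }
}

fresh_var_name <- make_fresh_var_name()

# Same shape as the patched code: lapply over the name parts, passing each
# one through the generator instead of using list(...) unchanged.
outer_parts <- lapply(list("x"), fresh_var_name)
inner_parts <- lapply(list("x"), fresh_var_name)
```

Even though both calls ask for the name `"x"`, the two results differ, which is what lets a nested lambda refer to the outer variable without being shadowed.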

R/pkg/tests/fulltests/test_sparkSQL.R

Lines changed: 14 additions & 0 deletions
@@ -2153,6 +2153,20 @@ test_that("higher order functions", {
   expect_error(array_transform("xs", function(...) 42))
 })
 
+test_that("SPARK-34794: lambda vars must be resolved properly in nested higher order functions", {
+  df <- sql("SELECT array(1, 2, 3) as numbers, array('a', 'b', 'c') as letters")
+  ret <- first(select(
+    df,
+    array_transform("numbers", function(number) {
+      array_transform("letters", function(latter) {
+        struct(alias(number, "n"), alias(latter, "l"))
+      })
+    })
+  ))
+
+  expect_equal(1, ret[[1]][[1]][[1]][[1]]$n)
+})
+
 test_that("group by, agg functions", {
   df <- read.json(jsonPath)
   df1 <- agg(df, name = "max", age = "sum")
