Commit 24fe7cc

[SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR
## What changes were proposed in this pull request?

This PR revives the `stringsAsFactors` option in the `collect` API, which was mistakenly removed in 71a138c. During primitive type conversion, it casts a `character` vector to a `factor` when the condition `stringsAsFactors && is.character(vec)` holds.

## How was this patch tested?

Unit test in `R/pkg/tests/fulltests/test_sparkSQL.R`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19551 from HyukjinKwon/SPARK-17902.

(cherry picked from commit a83d8d5)
Signed-off-by: hyukjinkwon <gurwls223@gmail.com>
1 parent d2dc175 commit 24fe7cc

2 files changed: 9 additions, 0 deletions

R/pkg/R/DataFrame.R

Lines changed: 3 additions & 0 deletions
@@ -1174,6 +1174,9 @@ setMethod("collect",
                   vec <- do.call(c, col)
                   stopifnot(class(vec) != "list")
                   class(vec) <- PRIMITIVE_TYPES[[colType]]
+                  if (is.character(vec) && stringsAsFactors) {
+                    vec <- as.factor(vec)
+                  }
                   df[[colIndex]] <- vec
                 } else {
                   df[[colIndex]] <- col
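
The per-column conversion in this hunk can be tried outside Spark with plain R. The following is a minimal sketch, not SparkR code: `convert_column` is a hypothetical helper, and `primitive_types` is an assumed two-entry stand-in for SparkR's internal `PRIMITIVE_TYPES` mapping. It shows that only character columns become factors, and only when `stringsAsFactors` is `TRUE`.

```r
# Assumed stand-in for SparkR's internal PRIMITIVE_TYPES mapping (subset only).
primitive_types <- list(string = "character", double = "numeric")

# Hypothetical helper mirroring the per-column step shown in the hunk.
convert_column <- function(col, colType, stringsAsFactors = FALSE) {
  vec <- do.call(c, col)                    # flatten the list of cells
  stopifnot(!is.list(vec))                  # cells must have collapsed to a vector
  class(vec) <- primitive_types[[colType]]  # coerce to the mapped R type
  if (is.character(vec) && stringsAsFactors) {
    vec <- as.factor(vec)                   # the step this commit restores
  }
  vec
}

names_col  <- list("b", "a", "b")
scores_col <- list(1.5, 2.5, 3.5)

str(convert_column(names_col, "string"))                            # stays character
str(convert_column(names_col, "string", stringsAsFactors = TRUE))   # becomes a factor
str(convert_column(scores_col, "double", stringsAsFactors = TRUE))  # numeric is untouched
```

Checking `is.character(vec)` after the class assignment means the flag never affects numeric or other non-string columns.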

R/pkg/tests/fulltests/test_sparkSQL.R

Lines changed: 6 additions & 0 deletions
@@ -483,6 +483,12 @@ test_that("create DataFrame with different data types", {
   expect_equal(collect(df), data.frame(l, stringsAsFactors = FALSE))
 })
 
+test_that("SPARK-17902: collect() with stringsAsFactors enabled", {
+  df <- suppressWarnings(collect(createDataFrame(iris), stringsAsFactors = TRUE))
+  expect_equal(class(iris$Species), class(df$Species))
+  expect_equal(iris$Species, df$Species)
+})
+
 test_that("SPARK-17811: can create DataFrame containing NA as date and time", {
   df <- data.frame(
     id = 1:2,
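
The new test expects `df$Species` to equal `iris$Species` exactly, levels included. A possible reading, assuming the column travels through Spark as plain strings, is that the comparison succeeds because `as.factor` sorts levels and the `iris` species levels are already in sorted order. The base R sketch below simulates that round trip without Spark; the `sizes` factor is a made-up counter-example showing that a custom level order would not survive it.

```r
# Simulate the round trip: factor -> character (as assumed for Spark) -> factor.
species_chr <- as.character(iris$Species)
round_trip  <- as.factor(species_chr)

# Levels are rebuilt in sorted order, which matches iris$Species.
print(identical(levels(round_trip), levels(iris$Species)))
print(isTRUE(all.equal(iris$Species, round_trip)))

# Made-up counter-example: a factor with a custom, non-sorted level order.
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"))
sizes_back <- as.factor(as.character(sizes))

# The values survive, but the level order is re-sorted.
print(levels(sizes))
print(levels(sizes_back))
```

In other words, `stringsAsFactors = TRUE` recreates factors from strings; it does not restore whatever level ordering the original local data frame carried.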
