Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-41952: [R] Turn S3 and ZSTD on by default for macOS #42210

Merged
merged 6 commits into from
Jun 23, 2024
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion r/R/arrow-info.R
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ some_features_are_off <- function(features) {
# `features` is a named logical vector (as in arrow_info()$capabilities)
# Let's exclude some less relevant ones
# jemalloc is only included because it is sometimes disabled in our build process
blocklist <- c("lzo", "bz2", "brotli", "substrait", "jemalloc")
blocklist <- c("lzo", "bz2", "brotli", "substrait", "jemalloc", "gcs")
# Return TRUE if any of the other features are FALSE
!all(features[setdiff(names(features), blocklist)])
}
Expand Down
26 changes: 21 additions & 5 deletions r/tools/nixlibs.R
Original file line number Diff line number Diff line change
Expand Up @@ -536,7 +536,7 @@ build_libarrow <- function(src_dir, dst_dir) {
}
cleanup(build_dir)

env_var_list <- c(
env_var_list <- list(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't technically necessary, but it was the easiest way to do what I wanted to in is_feature_requested(). Further, it's a bit strange that this thing called X_X_list is not a list.

SOURCE_DIR = src_dir,
BUILD_DIR = build_dir,
DEST_DIR = dst_dir,
Expand Down Expand Up @@ -574,6 +574,14 @@ build_libarrow <- function(src_dir, dst_dir) {
env_var_list <- c(env_var_list, setNames("BUNDLED", env_var))
}
}
# We also _do_ want to enable S3 and ZSTD by default
# so that binaries built on CRAN from source are fully featured
# but defer to the env vars if those are set
env_var_list <- c(
env_var_list,
ARROW_S3 = Sys.getenv("ARROW_S3", "ON"),
ARROW_WITH_ZSTD = Sys.getenv("ARROW_WITH_ZSTD", "ON")
)
}

env_var_list <- with_cloud_support(env_var_list)
Expand Down Expand Up @@ -814,8 +822,16 @@ set_thirdparty_urls <- function(env_var_list) {
env_var_list
}

is_feature_requested <- function(env_varname, default = env_is("LIBARROW_MINIMAL", "false")) {
env_value <- tolower(Sys.getenv(env_varname))
# this is generally about features that people asked for via environment variables, but
# for some cases (like S3 when we override it in this script) we might find those in
# env_var_list
is_feature_requested <- function(env_varname, env_var_list, default = env_is("LIBARROW_MINIMAL", "false")) {
# look in the environment first, but then use the env_var_list if nothing is found
env_var_list_value <- env_var_list[[env_varname]]
if (is.null(env_var_list_value)) {
env_var_list_value <- ""
}
env_value <- tolower(Sys.getenv(env_varname, env_var_list_value))
Comment on lines +830 to +834
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this equivalent?

Suggested change
env_var_list_value <- env_var_list[[env_varname]]
if (is.null(env_var_list_value)) {
env_var_list_value <- ""
}
env_value <- tolower(Sys.getenv(env_varname, env_var_list_value))
env_value <- tolower(Sys.getenv(env_varname, env_var_list[[env_varname]]))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I feel slightly better having thought the exact same thing. But turns out: no! env_var_list[[env_varname]] is NULL if it doesn't exist (which is what we want) but:

Sys.getenv("env_varname", NULL)
Error in Sys.getenv("env_varname", NULL) : wrong type for argument

Though I only found that out when I pushed (+ got a little too eager and triggered all of crossbow r) and then saw lots of red.

This would be a perfect place for %||% or the native equivalent, but I didn't want to require rlang here (I mean we require it for the package, so maybe that's ok?)

if (identical(env_value, "off")) {
# If e.g. ARROW_MIMALLOC=OFF explicitly, override default
requested <- FALSE
Expand All @@ -828,8 +844,8 @@ is_feature_requested <- function(env_varname, default = env_is("LIBARROW_MINIMAL
}

with_cloud_support <- function(env_var_list) {
arrow_s3 <- is_feature_requested("ARROW_S3")
arrow_gcs <- is_feature_requested("ARROW_GCS")
arrow_s3 <- is_feature_requested("ARROW_S3", env_var_list)
arrow_gcs <- is_feature_requested("ARROW_GCS", env_var_list)

if (arrow_s3 || arrow_gcs) {
# User wants S3 or GCS support.
Expand Down
Loading