Commit

Merge pull request #1704 from paul-buerkner/log-lik-zombies
Zombie fix attempts
paul-buerkner authored Nov 14, 2024
2 parents 8ed8036 + 2b0bab2 commit ed43494
Showing 6 changed files with 28 additions and 13 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -2,8 +2,8 @@ Package: brms
Encoding: UTF-8
Type: Package
Title: Bayesian Regression Models using 'Stan'
-Version: 2.22.5
-Date: 2024-11-08
+Version: 2.22.6
+Date: 2024-11-14
Authors@R:
c(person("Paul-Christian", "Bürkner", email = "paul.buerkner@gmail.com",
role = c("aut", "cre")),
10 changes: 10 additions & 0 deletions NEWS.md
@@ -5,6 +5,16 @@
* Fit extended-support Beta models via family `xbeta`
thanks to Ioannis Kosmidis. (#1698)

+### Bug Fixes
+
+* Avoid the creation of zombie workers when executing `log_lik`
+in parallel thanks to Aki Vehtari and Noa Kallioinen.
+For now, `log_lik` will use PSOCK clusters if run
+in parallel even on Unix systems. To avoid potential speed loss for small
+models, `log_lik` will not use `options(mc.cores)` anymore.
+These changes may be reverted once the underlying causes of this
+issue have been fixed. (#1658)
+
### Other Changes

* Improve sampling efficiency of `beta_binomial` models. (#1703)
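
In practice, the NEWS entry above means that `options(mc.cores)` is no longer consulted by `log_lik`; parallel evaluation now has to be requested explicitly via the `cores` argument. A minimal usage sketch, using the standard `epilepsy` example model that ships with brms purely for illustration:

```r
library(brms)

# any fitted brmsfit object works; this is just the standard epilepsy example
fit <- brm(count ~ zAge + zBase * Trt + (1 | patient),
           data = epilepsy, family = poisson())

# options(mc.cores = 4) no longer triggers parallel log_lik evaluation
ll_serial   <- log_lik(fit)             # single core (the new default)
ll_parallel <- log_lik(fit, cores = 4)  # explicit request: 4 PSOCK workers
```

On all operating systems the parallel path now goes through a PSOCK cluster, trading some startup overhead for robustness against the zombie workers described in #1658.
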
6 changes: 4 additions & 2 deletions R/brmsfit-helpers.R
@@ -845,11 +845,13 @@ arg_names <- function(method) {
}

# validate 'cores' argument for use in post-processing functions
-validate_cores_post_processing <- function(cores) {
+validate_cores_post_processing <- function(cores, use_mc_cores = FALSE) {
  if (is.null(cores)) {
-    if (os_is_windows()) {
+    if (os_is_windows() || !use_mc_cores) {
      # multi cores often leads to a slowdown on windows
      # in post-processing functions as discussed in #1129
+      # multi cores may also lead to zombie workers
+      # on unix systems as discussed in #1658
      cores <- 1L
    } else {
      cores <- getOption("mc.cores", 1L)
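
To make the effect of the new `use_mc_cores` argument concrete, here is a self-contained sketch of the logic above. `os_is_windows()` is an internal brms helper, so a stand-in based on `.Platform$OS.type` is used here, and the real function may perform additional validation:

```r
# stand-in for the internal brms helper os_is_windows()
os_is_windows <- function() {
  identical(.Platform$OS.type, "windows")
}

validate_cores_post_processing <- function(cores, use_mc_cores = FALSE) {
  if (is.null(cores)) {
    if (os_is_windows() || !use_mc_cores) {
      # default to one core: avoids the Windows slowdown (#1129) and the
      # zombie workers seen on Unix systems (#1658)
      cores <- 1L
    } else {
      cores <- getOption("mc.cores", 1L)
    }
  }
  cores
}

options(mc.cores = 4)
validate_cores_post_processing(NULL)                       # 1: mc.cores is ignored
validate_cores_post_processing(NULL, use_mc_cores = TRUE)  # 4 on Unix, 1 on Windows
validate_cores_post_processing(2)                          # 2: an explicit value wins
```
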
2 changes: 1 addition & 1 deletion R/log_lik.R
@@ -124,7 +124,7 @@ log_lik.brmsprep <- function(object, cores = NULL, ...) {
    object$dpars[[dp]] <- get_dpar(object, dpar = dp)
  }
  N <- choose_N(object)
-  out <- plapply(seq_len(N), log_lik_fun, cores = cores, prep = object)
+  out <- plapply(seq_len(N), log_lik_fun, .cores = cores, prep = object)
  out <- do_call(cbind, out)
  colnames(out) <- NULL
  old_order <- object$old_order
17 changes: 10 additions & 7 deletions R/misc.R
@@ -421,22 +421,25 @@ cblapply <- function(X, FUN, ...) {
}

# parallel lapply sensitive to the operating system
-plapply <- function(X, FUN, cores = 1, ...) {
-  if (cores == 1) {
+# args:
+#   .psock: use a PSOCK cluster? Default is TRUE until
+#.  the zombie worker issue #1658 has been fully resolved
+plapply <- function(X, FUN, .cores = 1, .psock = TRUE, ...) {
+  if (.cores == 1) {
    out <- lapply(X, FUN, ...)
  } else {
-    if (!os_is_windows()) {
-      out <- parallel::mclapply(X = X, FUN = FUN, mc.cores = cores, ...)
+    if (!os_is_windows() && !.psock) {
+      out <- parallel::mclapply(X = X, FUN = FUN, mc.cores = .cores, ...)
    } else {
-      cl <- parallel::makePSOCKcluster(cores)
+      cl <- parallel::makePSOCKcluster(.cores)
      on.exit(parallel::stopCluster(cl))
      out <- parallel::parLapply(cl = cl, X = X, fun = FUN, ...)
    }
-    # The version below hopefully prevents the spawning of zombies
+    # The version below was suggested to prevent the spawning of zombies
    # but it does not always succeed in that. It also seems to cause
    # other issues as discussed in #1658, so commented out for now.
    # cl_type <- ifelse(os_is_windows(), "PSOCK", "FORK")
-    # cl <- parallel::makeCluster(cores, type = cl_type)
+    # cl <- parallel::makeCluster(.cores, type = cl_type)
    # # Register a cleanup for the cluster in case the function fails
    # # Need to wrap in a tryCatch to avoid error if cluster is already stopped
    # on.exit(tryCatch(
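
The PSOCK branch that `plapply` now takes by default is the standard `parallel` pattern: create the cluster, register cleanup with `on.exit()`, and dispatch the work with `parLapply()`. A minimal self-contained sketch of that pattern (the wrapper name `psock_lapply` is illustrative, not a brms internal):

```r
library(parallel)

# PSOCK workers are separate R processes connected over sockets, so stopping
# the cluster shuts them down cleanly instead of leaving defunct forked children
psock_lapply <- function(X, FUN, cores = 2, ...) {
  cl <- makePSOCKcluster(cores)
  # stop the workers even if FUN throws an error
  on.exit(stopCluster(cl))
  parLapply(cl = cl, X = X, fun = FUN, ...)
}

psock_lapply(1:10, function(x) x^2)  # squares computed on two workers
```

Compared to `mclapply()`, PSOCK clusters have a higher startup cost and must ship data to the workers, which is the speed loss for small models mentioned in the NEWS entry.
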
2 changes: 1 addition & 1 deletion R/posterior_predict.R
@@ -136,7 +136,7 @@ posterior_predict.brmsprep <- function(object, transform = NULL, sort = FALSE,
  pp_fun <- paste0("posterior_predict_", object$family$fun)
  pp_fun <- get(pp_fun, asNamespace("brms"))
  N <- choose_N(object)
-  out <- plapply(seq_len(N), pp_fun, cores = cores, prep = object, ...)
+  out <- plapply(seq_len(N), pp_fun, .cores = cores, prep = object, ...)
  if (grepl("_mv$", object$family$fun)) {
    out <- do_call(abind, c(out, along = 3))
    out <- aperm(out, perm = c(1, 3, 2))
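
As the diff above shows, `posterior_predict` routes its `cores` argument through the same `plapply` machinery, so parallel prediction also has to be requested explicitly. A brief usage sketch, reusing the illustrative `fit` object from the `log_lik` example earlier:

```r
yrep <- posterior_predict(fit, cores = 2)  # predictions computed on 2 PSOCK workers
dim(yrep)                                  # draws x observations
```
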
