Description
The documentation indicates that you don't need .multicombine=T when using foreach with .combine=rbind.
This is incorrect: returning an array without .multicombine=T makes gathering the results absurdly slow.
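library(doMC)  # also attaches foreach
registerDoMC(cores = 8)
# Time how long foreach takes to gather n ten-element vectors with rbind.
testFun <- function(multicomb, n = 64000) {
  out <- foreach(com = 1:n, .combine = rbind, .multicombine = multicomb) %dopar% {
    Sys.sleep(8 / n)  # simulated work: 8 s of sleep in total, summed over all tasks
    if (com == n) {
      print(paste("preparing to return last value at", strftime(Sys.time(), format = "%H:%M:%S")))
    }
    rnorm(10)  # the last expression is the task's result
  }
  print(paste("finished gathering my ", n, "arrays at", strftime(Sys.time(), format = "%H:%M:%S")))
  nrow(out)
}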
testFun(F)
[1] "preparing to return last value at 14:49:18"
[1] "finished gathering my 64000 arrays at 14:50:27"
[1] 64000
testFun(T)
[1] "preparing to return last value at 14:47:10"
[1] "finished gathering my 64000 arrays at 14:47:14"
[1] 64000
Personally, I think the result is bad regardless of the .multicombine setting: 4 seconds to stick 64000 rows together is absurd, even on a Raspberry Pi. But it gets horrendously bad without .multicombine: 69 seconds in the test above. In fact, for a similar problem (prop trading stuff instead of Sys.sleep), I clock 7 minutes to cons the 64000 rows into a report with .multicombine=F, while the actual task only takes 3 minutes. With .multicombine=T the same task still takes 19 seconds to cons together the 64000 rows; acceptable for my uses but still nuts. It's a Threadripper, not a 6809.
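For illustration, here is a minimal base-R sketch of why two-argument combining can be so much more expensive than multi-argument combining. It does not use foreach at all. `pairwise` and `batched` are hypothetical helper names, and the batch size of 100 mirrors the documented default of foreach's `.maxcombine` argument. Whether this copying cost accounts for the whole gap measured above is an assumption, not something this report confirms.

```r
# Build 2000 rows of 10 numbers, mimicking the per-task results above.
set.seed(1)
rows <- replicate(2000, rnorm(10), simplify = FALSE)

# Pairwise: every call copies the whole matrix accumulated so far,
# which is what a two-argument combine amounts to.
pairwise <- function(rows) Reduce(rbind, rows)

# Batched: bind up to `batch` rows per call, then bind the batch results.
batched <- function(rows, batch = 100) {
  groups <- split(rows, ceiling(seq_along(rows) / batch))
  do.call(rbind, lapply(groups, function(g) do.call(rbind, g)))
}

t_pair  <- system.time(m_pair  <- pairwise(rows))[["elapsed"]]
t_batch <- system.time(m_batch <- batched(rows))[["elapsed"]]

# Both strategies yield the same 2000 x 10 values; only the cost differs.
stopifnot(identical(unname(m_pair), unname(m_batch)))
cat("pairwise:", t_pair, "s  batched:", t_batch, "s\n")
```

The pairwise version does work proportional to the square of the row count, so the difference grows quickly as the number of rows approaches 64000.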
FWIW, the same thing happens when you omit .combine and .multicombine and return the results as a list. Are you guys doing some giant garbage collection before you return? If so, that would make sense on fork-based multicore doodads.
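A possible workaround, sketched under the assumption that the cost scales with the number of results handed back rather than with their total size: give each task a chunk of iterations and return one matrix per chunk, so only a handful of objects need combining. The chunk count of 8, the reduced `n`, and the `rnorm(10)` stand-in for the real work are all illustrative choices. The fallback to the sequential backend is only there so the sketch runs without doMC.

```r
library(foreach)

# Assumption: doMC is installed and forking is available; otherwise fall
# back to the sequential backend so the sketch still runs.
if (requireNamespace("doMC", quietly = TRUE)) {
  doMC::registerDoMC(cores = 2)
} else {
  registerDoSEQ()
}

n <- 6400                               # smaller than 64000 to keep the demo quick
chunks <- parallel::splitIndices(n, 8)  # 8 index vectors that together cover 1..n

out <- foreach(idx = chunks, .combine = rbind) %dopar% {
  # One matrix per chunk: length(idx) rows of 10 values each.
  t(vapply(idx, function(i) rnorm(10), numeric(10)))
}

stopifnot(nrow(out) == n, ncol(out) == 10)
```

With this layout foreach combines 8 matrices instead of thousands of single rows, which sidesteps the gathering step in question rather than fixing it.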
version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          4
minor          4.1
year           2024
month          06
day            14
svn rev        86737
language       R
version.string R version 4.4.1 (2024-06-14)
nickname       Race for Your Life