logSumExp

hb edited this page Mar 3, 2015 · 2 revisions

matrixStats: Benchmark report


logSumExp() benchmarks

This report benchmarks the performance of logSumExp() against alternative methods.

Alternative methods

  • logSumExp_R()

where

> logSumExp_R <- function(lx, ...) {
+     iMax <- which.max(lx)
+     log1p(sum(exp(lx[-iMax] - lx[iMax]))) + lx[iMax]
+ }
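
The reference implementation above subtracts the largest element before exponentiating, so every exp() argument is at most zero and cannot overflow. The following minimal sketch illustrates why that matters. The helper logSumExp_naive() and the input values are illustrative assumptions, not part of the benchmark.

```r
# Same definition as in the report, repeated so this block is self-contained
logSumExp_R <- function(lx, ...) {
  iMax <- which.max(lx)
  log1p(sum(exp(lx[-iMax] - lx[iMax]))) + lx[iMax]
}

# Hypothetical naive version, for comparison only
logSumExp_naive <- function(lx) log(sum(exp(lx)))

# Large log-values: exp(1000) overflows to Inf in double precision
lx_big <- c(1000, 1000.5, 999)
naive_big  <- logSumExp_naive(lx_big)  # Inf
stable_big <- logSumExp_R(lx_big)      # finite, slightly above max(lx_big)

# Moderate log-values: both approaches agree
lx_small <- c(-1, 0, 2)
naive_small  <- logSumExp_naive(lx_small)
stable_small <- logSumExp_R(lx_small)
```

For inputs of moderate magnitude the two versions return the same value up to floating-point error, so the shift costs nothing in accuracy.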

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (naProb > 0) 
+         x[sample(n, size = naProb * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = "double")
> data <- data[1:4]
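
Before timing, it can be worth confirming that the competing implementations return the same value on data like the above. The following is a sketch of such a check, not part of the original report. It draws 1000 values directly with runif() over the same [-100, 100] range instead of calling rvector(), and it compares against a plain log(sum(exp(x))), which is safe here only because exp(100) is still representable as a double. The matrixStats comparison runs only if the package is installed.

```r
# Same definition as in the report, repeated so this block is self-contained
logSumExp_R <- function(lx, ...) {
  iMax <- which.max(lx)
  log1p(sum(exp(lx[-iMax] - lx[iMax]))) + lx[iMax]
}

set.seed(1)
x <- runif(1000, min = -100, max = 100)  # assumed stand-in for rvector(n = 1000)

# Direct evaluation; does not overflow for values in [-100, 100]
ref <- log(sum(exp(x)))
val_R <- logSumExp_R(x)

# Agreement up to all.equal()'s default numerical tolerance
agree_R <- isTRUE(all.equal(val_R, ref))

# Optional: compare the native implementation when matrixStats is available
agree_native <- NA
if (requireNamespace("matrixStats", quietly = TRUE)) {
  agree_native <- isTRUE(all.equal(matrixStats::logSumExp(x), ref))
}
```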

Results

n=1000 vector

> x <- data[["n=1000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1754114  93.7    2637877 140.9  2637877 140.9
Vcells 13889754 106.0   42812957 326.7 68120027 519.8
> stats <- microbenchmark(logSumExp = logSumExp(x), logSumExp_R = logSumExp_R(x), unit = "ms")

Table: Benchmarking of logSumExp() and logSumExp_R() on n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times in milliseconds:

expr         min     lq      mean    median  uq      max
logSumExp    0.0489  0.0497  0.0551  0.0504  0.0560  0.0974
logSumExp_R  0.0693  0.0708  0.0784  0.0716  0.0791  0.1574

Relative times:

expr         min    lq     mean   median  uq     max
logSumExp    1.000  1.000  1.000  1.00    1.000  1.000
logSumExp_R  1.417  1.426  1.423  1.42    1.412  1.617

Figure: Benchmarking of logSumExp() and logSumExp_R() on n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=10000 vector

> x <- data[["n=10000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1754134  93.7    2637877 140.9  2637877 140.9
Vcells 13890025 106.0   42812957 326.7 68120027 519.8
> stats <- microbenchmark(logSumExp = logSumExp(x), logSumExp_R = logSumExp_R(x), unit = "ms")

Table: Benchmarking of logSumExp() and logSumExp_R() on n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times in milliseconds:

expr         min     lq      mean    median  uq      max
logSumExp    0.4785  0.4793  0.5161  0.4816  0.4885  0.7934
logSumExp_R  0.6560  0.6610  0.7260  0.6648  0.7997  1.0228

Relative times:

expr         min    lq     mean   median  uq     max
logSumExp    1.000  1.000  1.000  1.000   1.000  1.000
logSumExp_R  1.371  1.379  1.407  1.381   1.637  1.289

Figure: Benchmarking of logSumExp() and logSumExp_R() on n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=100000 vector

> x <- data[["n=100000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1754146  93.7    2637877 140.9  2637877 140.9
Vcells 13890033 106.0   42812957 326.7 68120027 519.8
> stats <- microbenchmark(logSumExp = logSumExp(x), logSumExp_R = logSumExp_R(x), unit = "ms")

Table: Benchmarking of logSumExp() and logSumExp_R() on n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times in milliseconds:

expr         min    lq     mean   median  uq     max
logSumExp    4.785  5.428  6.241  6.292   6.859  9.988
logSumExp_R  6.639  7.613  8.485  8.536   9.023  13.641

Relative times:

expr         min    lq     mean  median  uq     max
logSumExp    1.000  1.000  1.00  1.000   1.000  1.000
logSumExp_R  1.387  1.403  1.36  1.357   1.315  1.366

Figure: Benchmarking of logSumExp() and logSumExp_R() on n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=1000000 vector

> x <- data[["n=1000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  1754158  93.7    2637877 140.9  2637877 140.9
Vcells 13890553 106.0   42812957 326.7 68120027 519.8
> stats <- microbenchmark(logSumExp = logSumExp(x), logSumExp_R = logSumExp_R(x), unit = "ms")

Table: Benchmarking of logSumExp() and logSumExp_R() on n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times in milliseconds:

expr         min    lq     mean   median  uq     max
logSumExp    44.88  48.65  53.91  52.14   58.00  81.04
logSumExp_R  65.27  72.58  81.72  76.92   84.36  328.78

Relative times:

expr         min    lq     mean   median  uq     max
logSumExp    1.000  1.000  1.000  1.000   1.000  1.000
logSumExp_R  1.454  1.492  1.516  1.475   1.454  4.057

Figure: Benchmarking of logSumExp() and logSumExp_R() on n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.
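
The relative-time panels can be recomputed from the absolute timings. As a small sketch, the snippet below copies the reported median times (in milliseconds) from the four tables above and divides the logSumExp_R() column by the logSumExp() column. The data frame and its column names are assumptions made for this illustration. Because the inputs are the rounded values printed in the report, the ratios match the reported ones only to about two decimals.

```r
# Median times in milliseconds, copied from the tables above
median_ms <- data.frame(
  n           = c(1e3, 1e4, 1e5, 1e6),
  logSumExp   = c(0.0504, 0.4816, 6.292, 52.14),
  logSumExp_R = c(0.0716, 0.6648, 8.536, 76.92)
)

# Relative median time of the plain-R version versus logSumExp()
median_ms$ratio <- median_ms$logSumExp_R / median_ms$logSumExp

# Relative medians as printed in the report, for comparison
reported <- c(1.42, 1.381, 1.357, 1.475)
max_abs_diff <- max(abs(median_ms$ratio - reported))

print(median_ms)
```

Across the four sizes the plain-R version takes roughly 1.36 to 1.48 times as long as logSumExp() at the median.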

Appendix

Session information

R Under development (unstable) (2015-02-27 r67909)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] markdown_0.7.7          microbenchmark_1.4-2    matrixStats_0.14.0-9000
[4] ggplot2_1.0.0           knitr_1.9.3             R.devices_2.13.0       
[7] R.utils_2.0.0           R.oo_1.19.0             R.methodsS3_1.7.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.4           GenomeInfoDb_1.3.13   formatR_1.0.3        
 [4] plyr_1.8.1            base64enc_0.1-3       tools_3.2.0          
 [7] digest_0.6.8          RSQLite_1.0.0         annotate_1.45.2      
[10] evaluate_0.5.7        gtable_0.1.2          R.cache_0.11.1-9000  
[13] lattice_0.20-30       DBI_0.3.1             parallel_3.2.0       
[16] mvtnorm_1.0-2         proto_0.3-10          R.rsp_0.20.0         
[19] genefilter_1.49.2     stringr_0.6.2         IRanges_2.1.41       
[22] S4Vectors_0.5.21      stats4_3.2.0          grid_3.2.0           
[25] Biobase_2.27.2        AnnotationDbi_1.29.17 XML_3.98-1.1         
[28] survival_2.38-1       multcomp_1.3-9        TH.data_1.0-6        
[31] reshape2_1.4.1        scales_0.2.4          MASS_7.3-39          
[34] splines_3.2.0         BiocGenerics_0.13.6   xtable_1.8-0         
[37] mime_0.2.1            colorspace_1.2-4      labeling_0.3         
[40] sandwich_2.3-2        munsell_0.4.2         Cairo_1.5-6          
[43] zoo_1.7-12           

Total processing time was 21.52 secs.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('logSumExp')

Copyright Henrik Bengtsson. Last updated on 2015-03-02 17:28:10 (-0800 UTC). Powered by RSP.
