
colRowAnys

hb edited this page Mar 3, 2015 · 2 revisions

matrixStats: Benchmark report


colAnys() and rowAnys() benchmarks

This report benchmarks the performance of colAnys() and rowAnys() against alternative methods.

Alternative methods

  • apply() + any()
  • colSums() > 0 or rowSums() > 0
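The alternatives listed above should give the same answer as colAnys() on a logical matrix without missing values, which is the kind of data this report uses. The following is a minimal sketch of that equivalence; the small matrix `X` is hypothetical example data, not part of the benchmark, and the comparison with matrixStats only runs if the package is installed.

```r
# Small logical matrix without missing values (hypothetical example data).
set.seed(1)
X <- matrix(sample(c(FALSE, TRUE), size = 20, replace = TRUE), nrow = 4, ncol = 5)
X[, 2] <- FALSE  # force one all-FALSE column so both outcomes occur

# Alternative 1: apply() + any()
by_apply <- apply(X, MARGIN = 2L, FUN = any)

# Alternative 2: colSums() > 0 (TRUE counts as 1, FALSE as 0)
by_sums <- colSums(X) > 0L

stopifnot(identical(by_apply, by_sums))

# Compare against matrixStats only if it is installed.
if (requireNamespace("matrixStats", quietly = TRUE)) {
  stopifnot(all(matrixStats::colAnys(X) == by_apply))
}
```

With missing values the alternatives can diverge: `any(c(NA, TRUE))` is `TRUE`, whereas a column sum that includes `NA` is `NA`, so the comparison `> 0` also yields `NA`.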

Data

> rmatrix <- function(nrow, ncol, mode = c("logical", "double", "integer", "index"), range = c(-100, 
+     +100), naProb = 0) {
+     mode <- match.arg(mode)
+     n <- nrow * ncol
+     if (mode == "logical") {
+         X <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else if (mode == "index") {
+         X <- seq_len(n)
+         mode <- "integer"
+     } else {
+         X <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(X) <- mode
+     if (naProb > 0) 
+         X[sample(n, size = naProb * n)] <- NA
+     dim(X) <- c(nrow, ncol)
+     X
+ }
> rmatrices <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rmatrix(nrow = scale * 1, ncol = scale * 1, ...)
+     data[[2]] <- rmatrix(nrow = scale * 10, ncol = scale * 10, ...)
+     data[[3]] <- rmatrix(nrow = scale * 100, ncol = scale * 1, ...)
+     data[[4]] <- t(data[[3]])
+     data[[5]] <- rmatrix(nrow = scale * 10, ncol = scale * 100, ...)
+     data[[6]] <- t(data[[5]])
+     names(data) <- sapply(data, FUN = function(x) paste(dim(x), collapse = "x"))
+     data
+ }
> data <- rmatrices(mode = "logical")
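The elements of `data` are named after their dimensions, which is why the sections below look matrices up with keys such as `data[["10x10"]]`. With the default `scale = 10`, the names are 10x10, 100x100, 1000x10, 10x1000, 100x1000 and 1000x100. A minimal sketch of the naming rule, using a hypothetical stand-in matrix rather than the benchmark data:

```r
# Hypothetical stand-in for one element of `data`; only the dimensions matter here.
X <- matrix(FALSE, nrow = 1000, ncol = 10)

# Same expression that rmatrices() applies to each matrix to build its name.
label <- paste(dim(X), collapse = "x")

# Transposing swaps the dimensions, and therefore the name.
label_t <- paste(dim(t(X)), collapse = "x")

print(c(label, label_t))
```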

Results

10x10 matrix

> X <- data[["10x10"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   649471 34.7    1168576  62.5  1168576  62.5
Vcells 12125146 92.6   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAnys = colAnys(X), `apply+any` = apply(X, MARGIN = 2L, FUN = any), 
+     `colSums > 0` = (colSums(X) > 0L), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648130 34.7    1168576  62.5  1168576  62.5
Vcells 12121512 92.5   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAnys = rowAnys(X), `apply+any` = apply(X, MARGIN = 1L, FUN = any), 
+     `rowSums > 0` = (rowSums(X) > 0L), unit = "ms")

Table: Benchmarking of colAnys(), apply+any() and colSums > 0() on 10x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  colAnys         0.0023   0.0035   0.0047   0.0050   0.0058   0.0192
3  colSums > 0     0.0058   0.0073   0.0088   0.0089   0.0100   0.0381
2  apply+any       0.0481   0.0520   0.0550   0.0533   0.0550   0.1190

   expr               min       lq     mean   median       uq      max
1  colAnys          1.000    1.000    1.000    1.000    1.000     1.00
3  colSums > 0      2.499    2.111    1.853    1.769    1.733     1.98
2  apply+any       20.823   14.994   11.594   10.653    9.532     6.18

Table: Benchmarking of rowAnys(), apply+any() and rowSums > 0() on 10x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  rowAnys         0.0027   0.0037   0.0049   0.0052   0.0058   0.0192
3  rowSums > 0     0.0058   0.0069   0.0088   0.0085   0.0100   0.0381
2  apply+any       0.0481   0.0520   0.0541   0.0535   0.0547   0.1193

   expr               min       lq     mean   median       uq      max
1  rowAnys          1.000    1.000     1.00    1.000    1.000     1.00
3  rowSums > 0      2.143    1.895     1.81    1.629    1.733     1.98
2  apply+any       17.856   14.207    11.08   10.294    9.466     6.20

Figure: Benchmarking of colAnys(), apply+any() and colSums > 0() on 10x10 data as well as rowAnys(), apply+any() and rowSums > 0() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAnys() and rowAnys() on 10x10 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr           min       lq     mean   median       uq      max
colAnys      2.311    3.466    4.744    5.005    5.775    19.25
rowAnys      2.695    3.658    4.886    5.198    5.775    19.25

expr           min       lq     mean   median       uq      max
colAnys      1.000    1.000     1.00    1.000        1        1
rowAnys      1.166    1.055     1.03    1.039        1        1

Figure: Benchmarking of colAnys() and rowAnys() on 10x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.
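The relative-time panels in these tables appear to be each timing statistic divided by the corresponding statistic of the first (reference) row. That is an assumption about how the report derives them, not something the report states, but the sketch below reproduces the printed relative values for the colAnys()/rowAnys() comparison on the 10x10 data. The data frame `stats` is simply the top panel of that table typed in by hand.

```r
# Top panel of the colAnys()/rowAnys() table for the 10x10 data, copied from the report.
stats <- data.frame(
  expr   = c("colAnys", "rowAnys"),
  min    = c(2.311, 2.695),
  lq     = c(3.466, 3.658),
  mean   = c(4.744, 4.886),
  median = c(5.005, 5.198),
  uq     = c(5.775, 5.775),
  max    = c(19.25, 19.25)
)

# Divide every timing column by the value in the reference (first) row.
timings <- as.matrix(stats[, -1])
relative <- sweep(timings, MARGIN = 2L, STATS = timings[1, ], FUN = "/")
rownames(relative) <- stats$expr

print(round(relative, 3))
```

For the three-method tables the same division on the printed values only approximately matches the relative panel, presumably because the report divides unrounded timings.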

100x100 matrix

> X <- data[["100x100"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648192 34.7    1168576  62.5  1168576  62.5
Vcells 12122694 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAnys = colAnys(X), `apply+any` = apply(X, MARGIN = 2L, FUN = any), 
+     `colSums > 0` = (colSums(X) > 0L), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648186 34.7    1168576  62.5  1168576  62.5
Vcells 12127737 92.6   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAnys = rowAnys(X), `apply+any` = apply(X, MARGIN = 1L, FUN = any), 
+     `rowSums > 0` = (rowSums(X) > 0L), unit = "ms")

Table: Benchmarking of colAnys(), apply+any() and colSums > 0() on 100x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  colAnys         0.0065   0.0092   0.0127   0.0123   0.0150   0.0293
3  colSums > 0     0.0192   0.0358   0.0397   0.0397   0.0429   0.1174
2  apply+any       0.4273   0.6562   0.6697   0.6929   0.7195   0.8284

   expr               min       lq     mean   median       uq      max
1  colAnys          1.000    1.000    1.000    1.000    1.000    1.000
3  colSums > 0      2.941    3.875    3.117    3.219    2.859    4.013
2  apply+any       65.287   71.012   52.622   56.246   47.919   28.315

Table: Benchmarking of rowAnys(), apply+any() and rowSums > 0() on 100x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  rowAnys         0.0150   0.0169   0.0191   0.0189   0.0200   0.0354
3  rowSums > 0     0.0385   0.0404   0.0439   0.0418   0.0450   0.0670
2  apply+any       0.3538   0.3657   0.4183   0.3717   0.4333   0.7245

   expr               min       lq     mean   median       uq      max
1  rowAnys          1.000    1.000    1.000    1.000     1.00    1.000
3  rowSums > 0      2.564    2.386    2.293    2.214     2.25    1.891
2  apply+any       23.563   21.590   21.870   19.704    21.64   20.456

Figure: Benchmarking of colAnys(), apply+any() and colSums > 0() on 100x100 data as well as rowAnys(), apply+any() and rowSums > 0() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAnys() and rowAnys() on 100x100 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr           min       lq     mean   median       uq      max
colAnys      6.545     9.24    12.73    12.32    15.01    29.26
rowAnys     15.014    16.94    19.13    18.86    20.02    35.42

expr           min       lq     mean   median       uq      max
colAnys      1.000    1.000    1.000    1.000    1.000     1.00
rowAnys      2.294    1.833    1.503    1.531    1.333     1.21

Figure: Benchmarking of colAnys() and rowAnys() on 100x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

1000x10 matrix

> X <- data[["1000x10"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648232 34.7    1168576  62.5  1168576  62.5
Vcells 12122935 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAnys = colAnys(X), `apply+any` = apply(X, MARGIN = 2L, FUN = any), 
+     `colSums > 0` = (colSums(X) > 0L), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648226 34.7    1168576  62.5  1168576  62.5
Vcells 12127978 92.6   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAnys = rowAnys(X), `apply+any` = apply(X, MARGIN = 1L, FUN = any), 
+     `rowSums > 0` = (rowSums(X) > 0L), unit = "ms")

Table: Benchmarking of colAnys(), apply+any() and colSums > 0() on 1000x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  colAnys         0.0023   0.0039   0.0059   0.0062   0.0069   0.0558
3  colSums > 0     0.0162   0.0181   0.0202   0.0212   0.0219   0.0362
2  apply+any       0.2202   0.2229   0.2408   0.2256   0.2371   0.3472

   expr               min       lq     mean   median       uq      max
1  colAnys          1.000    1.000    1.000    1.000    1.000   1.0000
3  colSums > 0      6.997    4.699    3.414    3.437    3.166   0.6483
2  apply+any       95.281   57.886   40.711   36.621   34.218   6.2205

Table: Benchmarking of rowAnys(), apply+any() and rowSums > 0() on 1000x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  rowAnys         0.0200   0.0208   0.0229   0.0229   0.0237   0.0466
3  rowSums > 0     0.0385   0.0406   0.0428   0.0433   0.0447   0.0554
2  apply+any       0.2194   0.2227   0.2426   0.2260   0.2379   0.3757

   expr               min       lq     mean   median       uq      max
1  rowAnys          1.000    1.000    1.000    1.000    1.000    1.000
3  rowSums > 0      1.923    1.954    1.871    1.891    1.886    1.190
2  apply+any       10.961   10.712   10.603    9.865   10.048    8.066

Figure: Benchmarking of colAnys(), apply+any() and colSums > 0() on 1000x10 data as well as rowAnys(), apply+any() and rowSums > 0() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAnys() and rowAnys() on 1000x10 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr           min       lq     mean   median       uq      max
colAnys      2.311     3.85    5.914     6.16     6.93    55.82
rowAnys     20.018    20.79   22.879    22.91    23.68    46.58

expr           min       lq     mean   median       uq      max
colAnys      1.000    1.000    1.000    1.000    1.000   1.0000
rowAnys      8.662    5.399    3.869    3.719    3.417   0.8345

Figure: Benchmarking of colAnys() and rowAnys() on 1000x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

10x1000 matrix

> X <- data[["10x1000"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648262 34.7    1168576  62.5  1168576  62.5
Vcells 12123503 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAnys = colAnys(X), `apply+any` = apply(X, MARGIN = 2L, FUN = any), 
+     `colSums > 0` = (colSums(X) > 0L), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648256 34.7    1168576  62.5  1168576  62.5
Vcells 12128546 92.6   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAnys = rowAnys(X), `apply+any` = apply(X, MARGIN = 1L, FUN = any), 
+     `rowSums > 0` = (rowSums(X) > 0L), unit = "ms")

Table: Benchmarking of colAnys(), apply+any() and colSums > 0() on 10x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  colAnys         0.0154   0.0179   0.0268   0.0252   0.0329   0.0543
3  colSums > 0     0.0266   0.0300   0.0428   0.0381   0.0524   0.1012
2  apply+any       1.6950   1.7841   2.3478   1.8484   2.6893   6.7325

   expr               min       lq     mean   median       uq      max
1  colAnys          1.000    1.000    1.000    1.000    1.000    1.000
3  colSums > 0      1.725    1.677    1.599    1.511    1.591    1.865
2  apply+any      110.069   99.663   87.677   73.302   81.705  124.034

Table: Benchmarking of rowAnys(), apply+any() and rowSums > 0() on 10x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  rowAnys         0.0339   0.0358   0.0430   0.0393   0.0491   0.0731
3  rowSums > 0     0.0474   0.0493   0.0601   0.0549   0.0654   0.1382
2  apply+any       1.6965   1.7806   2.2477   1.8491   2.1284   4.5579

   expr               min       lq     mean   median       uq      max
1  rowAnys          1.000    1.000    1.000    1.000    1.000    1.000
3  rowSums > 0      1.398    1.376    1.398    1.397    1.333    1.889
2  apply+any       50.078   49.736   52.324   47.092   43.364   62.314

Figure: Benchmarking of colAnys(), apply+any() and colSums > 0() on 10x1000 data as well as rowAnys(), apply+any() and rowSums > 0() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAnys() and rowAnys() on 10x1000 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr           min       lq     mean   median       uq      max
colAnys      15.40     17.9    26.78    25.22    32.91    54.28
rowAnys      33.88     35.8    42.96    39.27    49.08    73.14

expr           min       lq     mean   median       uq      max
colAnys        1.0        1    1.000    1.000    1.000    1.000
rowAnys        2.2        2    1.604    1.557    1.491    1.347

Figure: Benchmarking of colAnys() and rowAnys() on 10x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

100x1000 matrix

> X <- data[["100x1000"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648306 34.7    1168576  62.5  1168576  62.5
Vcells 12123901 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAnys = colAnys(X), `apply+any` = apply(X, MARGIN = 2L, FUN = any), 
+     `colSums > 0` = (colSums(X) > 0L), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648300 34.7    1168576  62.5  1168576  62.5
Vcells 12173944 92.9   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAnys = rowAnys(X), `apply+any` = apply(X, MARGIN = 1L, FUN = any), 
+     `rowSums > 0` = (rowSums(X) > 0L), unit = "ms")

Table: Benchmarking of colAnys(), apply+any() and colSums > 0() on 100x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  colAnys         0.0177   0.0223   0.0472   0.0395   0.0679    0.127
3  colSums > 0     0.1290   0.1320   0.1983   0.1825   0.2520    0.390
2  apply+any       3.3456   3.6267   5.0826   4.0397   6.0815   22.921

   expr               min       lq     mean   median       uq      max
1  colAnys          1.000    1.000    1.000    1.000    1.000     1.00
3  colSums > 0      7.282    5.914    4.203    4.624    3.708     3.07
2  apply+any      188.923  162.426  107.736  102.379   89.506   180.43

Table: Benchmarking of rowAnys(), apply+any() and rowSums > 0() on 100x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  rowAnys         0.1240   0.1293   0.1549   0.1519   0.1728   0.2221
3  rowSums > 0     0.3345   0.3703   0.4350   0.4177   0.5139   0.6232
2  apply+any       3.3861   4.3966   5.8292   5.3917   6.6093  22.0532

   expr               min       lq     mean   median       uq      max
1  rowAnys          1.000    1.000    1.000     1.00    1.000    1.000
3  rowSums > 0      2.699    2.863    2.809     2.75    2.973    2.806
2  apply+any       27.317   33.991   37.639    35.50   38.238   99.286

Figure: Benchmarking of colAnys(), apply+any() and colSums > 0() on 100x1000 data as well as rowAnys(), apply+any() and rowSums > 0() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAnys() and rowAnys() on 100x1000 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr           min       lq     mean   median       uq      max
colAnys      17.71    22.33    47.18    39.46    67.94    127.0
rowAnys     123.96   129.34   154.87   151.87   172.84    222.1

expr           min       lq     mean   median       uq      max
colAnys          1    1.000    1.000    1.000    1.000    1.000
rowAnys          7    5.793    3.283    3.849    2.544    1.748

Figure: Benchmarking of colAnys() and rowAnys() on 100x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

1000x100 matrix

> X <- data[["1000x100"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648337 34.7    1168576  62.5  1168576  62.5
Vcells 12124353 92.6   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAnys = colAnys(X), `apply+any` = apply(X, MARGIN = 2L, FUN = any), 
+     `colSums > 0` = (colSums(X) > 0L), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   648331 34.7    1168576  62.5  1168576  62.5
Vcells 12174396 92.9   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAnys = rowAnys(X), `apply+any` = apply(X, MARGIN = 1L, FUN = any), 
+     `rowSums > 0` = (rowSums(X) > 0L), unit = "ms")

Table: Benchmarking of colAnys(), apply+any() and colSums > 0() on 1000x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  colAnys         0.0050   0.0083   0.0208   0.0204   0.0320   0.0589
3  colSums > 0     0.1136   0.1164   0.1682   0.1621   0.2179   0.2768
2  apply+any       1.8809   1.9681   2.5582   2.6518   2.8833   3.8815

   expr               min       lq     mean   median       uq      max
1  colAnys           1.00     1.00    1.000    1.000    1.000    1.000
3  colSums > 0      22.69    14.07    8.079    7.943    6.819    4.699
2  apply+any       375.80   237.76  122.897  129.963   90.239   65.901

Table: Benchmarking of rowAnys(), apply+any() and rowSums > 0() on 1000x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

   expr               min       lq     mean   median       uq      max
1  rowAnys         0.1163   0.1201   0.1380   0.1334   0.1386   0.5054
3  rowSums > 0     0.3361   0.3397   0.3917   0.3644   0.4619   0.6113
2  apply+any       1.9367   2.3575   2.8386   2.7830   3.0758   4.3946

   expr               min       lq     mean   median       uq      max
1  rowAnys          1.000    1.000    1.000    1.000    1.000    1.000
3  rowSums > 0      2.891    2.829    2.839    2.732    3.333    1.209
2  apply+any       16.659   19.628   20.576   20.864   22.194    8.695

Figure: Benchmarking of colAnys(), apply+any() and colSums > 0() on 1000x100 data as well as rowAnys(), apply+any() and rowSums > 0() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAnys() and rowAnys() on 1000x100 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr           min       lq     mean   median       uq      max
colAnys      5.005    8.277    20.82     20.4    31.95     58.9
rowAnys    116.258  120.106   137.96    133.4   138.58    505.4

expr           min       lq     mean   median       uq      max
colAnys       1.00     1.00    1.000    1.000    1.000    1.000
rowAnys      23.23    14.51    6.628    6.537    4.337    8.582

Figure: Benchmarking of colAnys() and rowAnys() on 1000x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R Under development (unstable) (2015-02-27 r67909)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] markdown_0.7.7          microbenchmark_1.4-2    matrixStats_0.14.0-9000
[4] ggplot2_1.0.0           knitr_1.9.3             R.devices_2.13.0       
[7] R.utils_2.0.0           R.oo_1.19.0             R.methodsS3_1.7.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.4         splines_3.2.0       MASS_7.3-39        
 [4] munsell_0.4.2       lattice_0.20-30     colorspace_1.2-4   
 [7] R.cache_0.11.1-9000 multcomp_1.3-9      stringr_0.6.2      
[10] plyr_1.8.1          tools_3.2.0         grid_3.2.0         
[13] gtable_0.1.2        TH.data_1.0-6       survival_2.38-1    
[16] digest_0.6.8        R.rsp_0.20.0        reshape2_1.4.1     
[19] formatR_1.0.3       base64enc_0.1-3     mime_0.2.1         
[22] evaluate_0.5.7      labeling_0.3        sandwich_2.3-2     
[25] scales_0.2.4        mvtnorm_1.0-2       zoo_1.7-12         
[28] Cairo_1.5-6         proto_0.3-10       

Total processing time was 17.31 secs.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('colAnys')

Copyright Henrik Bengtsson. Last updated on 2015-03-02 16:58:09 (-0800 UTC). Powered by RSP.


[Benchmark reports](Benchmark reports)
