colRowAlls

hb edited this page Mar 3, 2015 · 2 revisions

matrixStats: Benchmark report


colAlls() and rowAlls() benchmarks

This report benchmarks the performance of colAlls() and rowAlls() against alternative methods.

Alternative methods

  • apply() + all()
  • colSums() == n or rowSums() == n
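As a rough illustration of why the alternatives listed above are comparable, the sketch below checks that they agree on a tiny logical matrix without missing values (the benchmark data is generated with naProb = 0). The matrices X and Y are made up for this example, and the matrixStats call only runs if the package is installed. The alternatives can differ once NA values are present, as the last part shows.

```r
# Small logical matrix: column 1 is all TRUE, column 2 contains a FALSE.
X <- matrix(c(TRUE, TRUE, TRUE,
              TRUE, FALSE, TRUE), nrow = 3, ncol = 2)

# Alternative 1: apply() + all(), one all() call per column.
by_apply <- apply(X, MARGIN = 2L, FUN = all)

# Alternative 2: a column is all TRUE iff its count of TRUEs equals nrow(X).
by_sums <- colSums(X) == nrow(X)

# Both alternatives agree on NA-free logical data.
stopifnot(identical(by_apply, by_sums))

# matrixStats::colAlls(), evaluated only if the package is available.
if (requireNamespace("matrixStats", quietly = TRUE)) {
  print(matrixStats::colAlls(X))
}

# With missing values the alternatives are no longer interchangeable:
# column 2 of Y holds NA, FALSE, TRUE.
Y <- X
Y[1, 2] <- NA
na_apply <- apply(Y, MARGIN = 2L, FUN = all)  # FALSE: a FALSE decides the result
na_sums <- colSums(Y) == nrow(Y)              # NA: the column sum itself is NA
print(na_apply)
print(na_sums)
```

The row-wise variants follow the same pattern with MARGIN = 1L, rowSums() and ncol(X).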

Data

> rmatrix <- function(nrow, ncol, mode = c("logical", "double", "integer", "index"), range = c(-100, 
+     +100), naProb = 0) {
+     mode <- match.arg(mode)
+     n <- nrow * ncol
+     if (mode == "logical") {
+         X <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     }     else if (mode == "index") {
+         X <- seq_len(n)
+         mode <- "integer"
+     }     else {
+         X <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(X) <- mode
+     if (naProb > 0) 
+         X[sample(n, size = naProb * n)] <- NA
+     dim(X) <- c(nrow, ncol)
+     X
+ }
> rmatrices <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rmatrix(nrow = scale * 1, ncol = scale * 1, ...)
+     data[[2]] <- rmatrix(nrow = scale * 10, ncol = scale * 10, ...)
+     data[[3]] <- rmatrix(nrow = scale * 100, ncol = scale * 1, ...)
+     data[[4]] <- t(data[[3]])
+     data[[5]] <- rmatrix(nrow = scale * 10, ncol = scale * 100, ...)
+     data[[6]] <- t(data[[5]])
+     names(data) <- sapply(data, FUN = function(x) paste(dim(x), collapse = "x"))
+     data
+ }
> data <- rmatrices(mode = "logical")
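The shapes produced by rmatrices() with the default scale = 10 can be worked out from the multipliers in its body. The standalone sketch below recomputes only the names and does not call rmatrices() itself. The variables mult and shapes are introduced here for illustration.

```r
scale <- 10

# (nrow, ncol) multipliers in the order rmatrices() creates its matrices.
# Entries 4 and 6 are the transposes of entries 3 and 5.
mult <- list(c(1, 1), c(10, 10), c(100, 1), c(1, 100), c(10, 100), c(100, 10))

# Same naming rule as in rmatrices(): dimensions joined by "x".
shapes <- sapply(mult, function(m) paste(scale * m, collapse = "x"))
print(shapes)
```

These names are the keys used to select each matrix in the Results sections, for example data[["10x10"]].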

Results

10x10 matrix

> X <- data[["10x10"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647388 34.6    1168576  62.5  1168576  62.5
Vcells 12122077 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAlls = colAlls(X), `apply+all` = apply(X, MARGIN = 2L, FUN = all), 
+     `colSums==n` = (colSums(X) == nrow(X)), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   646344 34.6    1168576  62.5  1168576  62.5
Vcells 12119264 92.5   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAlls = rowAlls(X), `apply+all` = apply(X, MARGIN = 1L, FUN = all), 
+     `rowSums==n` = (rowSums(X) == ncol(X)), unit = "ms")

Table: Benchmarking of colAlls(), apply+all() and colSums==n() on 10x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 colAlls 0.0035 0.0050 0.0087 0.0062 0.0081 0.2314
3 colSums==n 0.0065 0.0104 0.0139 0.0127 0.0152 0.1070
2 apply+all 0.0547 0.0585 0.0743 0.0806 0.0866 0.1609
expr min lq mean median uq max
1 colAlls 1.000 1.000 1.000 1.000 1.000 1.0000
3 colSums==n 1.889 2.076 1.611 2.062 1.881 0.4626
2 apply+all 15.776 11.689 8.583 13.092 10.713 0.6955

Table: Benchmarking of rowAlls(), apply+all() and rowSums==n() on 10x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 rowAlls 0.0035 0.0054 0.0093 0.0065 0.0089 0.2009
3 rowSums==n 0.0069 0.0117 0.0150 0.0133 0.0167 0.1070
2 apply+all 0.0539 0.0810 0.0840 0.0839 0.0885 0.1717
expr min lq mean median uq max
1 rowAlls 1.000 1.000 1.000 1.000 1.000 1.0000
3 rowSums==n 1.999 2.178 1.615 2.029 1.891 0.5326
2 apply+all 15.549 15.033 9.047 12.822 9.999 0.8544

Figure: Benchmarking of colAlls(), apply+all() and colSums==n() on 10x10 data as well as rowAlls(), apply+all() and rowSums==n() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAlls() and rowAlls() on 10x10 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr min lq mean median uq max
colAlls 3.465 5.006 8.655 6.160 8.085 231.4
rowAlls 3.466 5.391 9.286 6.545 8.855 200.9
expr min lq mean median uq max
colAlls 1 1.000 1.000 1.000 1.000 1.0000
rowAlls 1 1.077 1.073 1.062 1.095 0.8686

Figure: Benchmarking of colAlls() and rowAlls() on 10x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.
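The relative times in the bottom panels appear to be each method's time divided by the reference method's time (colAlls() or rowAlls()) for the same statistic. The sketch below recomputes the median ratios from the rounded 10x10 column-wise medians. Because the inputs are rounded, the results only approximate the reported 2.062 and 13.092, which were presumably computed from unrounded timings. The variable names are chosen for this example.

```r
# Median times in milliseconds, copied from the 10x10 column-wise table.
median_ms <- c(colAlls = 0.0062, `colSums==n` = 0.0127, `apply+all` = 0.0806)

# Relative time: divide by the reference method (colAlls).
relative <- median_ms / median_ms[["colAlls"]]
print(round(relative, 2))

# The same medians in microseconds; 6.2 corresponds to the 6.160
# listed for colAlls in the colAlls()/rowAlls() comparison table.
median_us <- 1000 * median_ms
print(median_us)
```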

100x100 matrix

> X <- data[["100x100"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647218 34.6    1168576  62.5  1168576  62.5
Vcells 12121895 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAlls = colAlls(X), `apply+all` = apply(X, MARGIN = 2L, FUN = all), 
+     `colSums==n` = (colSums(X) == nrow(X)), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647212 34.6    1168576  62.5  1168576  62.5
Vcells 12126938 92.6   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAlls = rowAlls(X), `apply+all` = apply(X, MARGIN = 1L, FUN = all), 
+     `rowSums==n` = (rowSums(X) == ncol(X)), unit = "ms")

Table: Benchmarking of colAlls(), apply+all() and colSums==n() on 100x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 colAlls 0.0042 0.0054 0.0094 0.0089 0.0108 0.0296
3 colSums==n 0.0189 0.0216 0.0284 0.0258 0.0345 0.1070
2 apply+all 0.3622 0.3761 0.4381 0.4358 0.4500 0.6356
expr min lq mean median uq max
1 colAlls 1.000 1.000 1.000 1.000 1.000 1.00
3 colSums==n 4.454 3.999 3.039 2.913 3.196 3.61
2 apply+all 85.535 69.765 46.807 49.212 41.745 21.44

Table: Benchmarking of rowAlls(), apply+all() and rowSums==n() on 100x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 rowAlls 0.0158 0.0173 0.0229 0.0208 0.0260 0.0531
3 rowSums==n 0.0397 0.0423 0.0509 0.0477 0.0591 0.0820
2 apply+all 0.3622 0.3753 0.4551 0.4300 0.5124 0.6560
expr min lq mean median uq max
1 rowAlls 1.000 1.000 1.00 1.000 1.000 1.000
3 rowSums==n 2.512 2.444 2.22 2.296 2.274 1.544
2 apply+all 22.950 21.665 19.85 20.684 19.718 12.348

Figure: Benchmarking of colAlls(), apply+all() and colSums==n() on 100x100 data as well as rowAlls(), apply+all() and rowSums==n() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAlls() and rowAlls() on 100x100 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr min lq mean median uq max
colAlls 4.235 5.391 9.359 8.855 10.78 29.64
rowAlls 15.784 17.324 22.925 20.789 25.98 53.12
expr min lq mean median uq max
colAlls 1.000 1.000 1.000 1.000 1.00 1.000
rowAlls 3.727 3.213 2.449 2.348 2.41 1.792

Figure: Benchmarking of colAlls() and rowAlls() on 100x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

1000x10 matrix

> X <- data[["1000x10"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647258 34.6    1168576  62.5  1168576  62.5
Vcells 12122130 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAlls = colAlls(X), `apply+all` = apply(X, MARGIN = 2L, FUN = all), 
+     `colSums==n` = (colSums(X) == nrow(X)), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647252 34.6    1168576  62.5  1168576  62.5
Vcells 12127173 92.6   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAlls = rowAlls(X), `apply+all` = apply(X, MARGIN = 1L, FUN = all), 
+     `rowSums==n` = (rowSums(X) == ncol(X)), unit = "ms")

Table: Benchmarking of colAlls(), apply+all() and colSums==n() on 1000x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 colAlls 0.0027 0.0054 0.0092 0.0073 0.010 0.0604
3 colSums==n 0.0169 0.0231 0.0310 0.0331 0.037 0.0550
2 apply+all 0.2213 0.2916 0.3240 0.3311 0.378 0.6167
expr min lq mean median uq max
1 colAlls 1.000 1.000 1.000 1.000 1.000 1.0000
3 colSums==n 6.283 4.285 3.387 4.525 3.692 0.9108
2 apply+all 82.103 54.091 35.377 45.252 37.765 10.2036

Table: Benchmarking of rowAlls(), apply+all() and rowSums==n() on 1000x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 rowAlls 0.0189 0.0235 0.0319 0.0325 0.0364 0.0597
3 rowSums==n 0.0400 0.0462 0.0615 0.0647 0.0706 0.1413
2 apply+all 0.2283 0.2991 0.3797 0.3886 0.4300 0.9570
expr min lq mean median uq max
1 rowAlls 1.000 1.000 1.000 1.000 1.000 1.000
3 rowSums==n 2.123 1.967 1.929 1.988 1.942 2.368
2 apply+all 12.102 12.737 11.916 11.947 11.820 16.038

Figure: Benchmarking of colAlls(), apply+all() and colSums==n() on 1000x10 data as well as rowAlls(), apply+all() and rowSums==n() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAlls() and rowAlls() on 1000x10 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr min lq mean median uq max
colAlls 2.696 5.391 9.159 7.316 10.01 60.44
rowAlls 18.863 23.483 31.867 32.529 36.38 59.67
expr min lq mean median uq max
colAlls 1.000 1.000 1.000 1.000 1.000 1.0000
rowAlls 6.997 4.356 3.479 4.446 3.634 0.9873

Figure: Benchmarking of colAlls() and rowAlls() on 1000x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

10x1000 matrix

> X <- data[["10x1000"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647288 34.6    1168576  62.5  1168576  62.5
Vcells 12122697 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAlls = colAlls(X), `apply+all` = apply(X, MARGIN = 2L, FUN = all), 
+     `colSums==n` = (colSums(X) == nrow(X)), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647282 34.6    1168576  62.5  1168576  62.5
Vcells 12127740 92.6   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAlls = rowAlls(X), `apply+all` = apply(X, MARGIN = 1L, FUN = all), 
+     `rowSums==n` = (rowSums(X) == ncol(X)), unit = "ms")

Table: Benchmarking of colAlls(), apply+all() and colSums==n() on 10x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 colAlls 0.0162 0.0192 0.0333 0.0362 0.0458 0.0570
3 colSums==n 0.0281 0.0312 0.0555 0.0577 0.0681 0.1132
2 apply+all 1.8347 2.0046 2.6249 2.2674 3.2392 6.4387
expr min lq mean median uq max
1 colAlls 1.000 1.00 1.000 1.000 1.000 1.000
3 colSums==n 1.738 1.62 1.668 1.596 1.487 1.986
2 apply+all 113.470 104.14 78.909 62.658 70.708 113.012

Table: Benchmarking of rowAlls(), apply+all() and rowSums==n() on 10x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 rowAlls 0.0335 0.0366 0.0527 0.0500 0.0643 0.1005
3 rowSums==n 0.0474 0.0749 0.0838 0.0805 0.1080 0.1636
2 apply+all 1.9367 2.0818 2.8903 3.0681 3.4016 6.8553
expr min lq mean median uq max
1 rowAlls 1.000 1.000 1.000 1.000 1.00 1.000
3 rowSums==n 1.414 2.047 1.591 1.608 1.68 1.628
2 apply+all 57.826 56.926 54.883 61.306 52.91 68.229

Figure: Benchmarking of colAlls(), apply+all() and colSums==n() on 10x1000 data as well as rowAlls(), apply+all() and rowSums==n() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAlls() and rowAlls() on 10x1000 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr min lq mean median uq max
colAlls 16.17 19.25 33.26 36.19 45.81 56.97
rowAlls 33.49 36.57 52.66 50.05 64.29 100.47
expr min lq mean median uq max
colAlls 1.000 1.0 1.000 1.000 1.000 1.000
rowAlls 2.071 1.9 1.583 1.383 1.403 1.764

Figure: Benchmarking of colAlls() and rowAlls() on 10x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

100x1000 matrix

> X <- data[["100x1000"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647332 34.6    1168576  62.5  1168576  62.5
Vcells 12123092 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAlls = colAlls(X), `apply+all` = apply(X, MARGIN = 2L, FUN = all), 
+     `colSums==n` = (colSums(X) == nrow(X)), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647326 34.6    1168576  62.5  1168576  62.5
Vcells 12173135 92.9   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAlls = rowAlls(X), `apply+all` = apply(X, MARGIN = 1L, FUN = all), 
+     `rowSums==n` = (rowSums(X) == ncol(X)), unit = "ms")

Table: Benchmarking of colAlls(), apply+all() and colSums==n() on 100x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 colAlls 0.0185 0.0223 0.0456 0.0312 0.0678 0.1166
3 colSums==n 0.1290 0.1544 0.1951 0.1898 0.2254 0.2810
2 apply+all 3.3795 3.5262 4.6763 4.1238 5.2897 16.5361
expr min lq mean median uq max
1 colAlls 1.000 1.000 1.000 1.000 1.000 1.000
3 colSums==n 6.979 6.913 4.282 6.086 3.327 2.409
2 apply+all 182.884 157.923 102.632 132.250 78.073 141.768

Table: Benchmarking of rowAlls(), apply+all() and rowSums==n() on 100x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 rowAlls 0.1240 0.1434 0.1636 0.1546 0.1848 0.2625
3 rowSums==n 0.3353 0.3403 0.4177 0.3830 0.4925 0.6048
2 apply+all 3.4896 3.9577 5.3721 5.1284 6.0241 14.8438
expr min lq mean median uq max
1 rowAlls 1.000 1.000 1.000 1.000 1.000 1.000
3 rowSums==n 2.705 2.373 2.554 2.478 2.666 2.304
2 apply+all 28.152 27.600 32.844 33.181 32.602 56.539

Figure: Benchmarking of colAlls(), apply+all() and colSums==n() on 100x1000 data as well as rowAlls(), apply+all() and rowSums==n() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAlls() and rowAlls() on 100x1000 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr min lq mean median uq max
colAlls 18.48 22.33 45.56 31.18 67.75 116.6
rowAlls 123.96 143.40 163.56 154.56 184.78 262.5
expr min lq mean median uq max
colAlls 1.000 1.000 1.00 1.000 1.000 1.000
rowAlls 6.708 6.422 3.59 4.957 2.727 2.251

Figure: Benchmarking of colAlls() and rowAlls() on 100x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

1000x100 matrix

> X <- data[["1000x100"]]
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647363 34.6    1168576  62.5  1168576  62.5
Vcells 12123548 92.5   35610798 271.7 68120027 519.8
> colStats <- microbenchmark(colAlls = colAlls(X), `apply+all` = apply(X, MARGIN = 2L, FUN = all), 
+     `colSums==n` = (colSums(X) == nrow(X)), unit = "ms")
> X <- t(X)
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   647357 34.6    1168576  62.5  1168576  62.5
Vcells 12173591 92.9   35610798 271.7 68120027 519.8
> rowStats <- microbenchmark(rowAlls = rowAlls(X), `apply+all` = apply(X, MARGIN = 1L, FUN = all), 
+     `rowSums==n` = (rowSums(X) == ncol(X)), unit = "ms")

Table: Benchmarking of colAlls(), apply+all() and colSums==n() on 1000x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 colAlls 0.0050 0.0096 0.0251 0.0308 0.0379 0.0701
3 colSums==n 0.1143 0.1705 0.2268 0.2115 0.2458 0.7849
2 apply+all 1.9548 2.8966 3.3878 3.4569 3.6802 8.5598
expr min lq mean median uq max
1 colAlls 1.00 1.00 1.000 1.000 1.000 1.0
3 colSums==n 22.84 17.72 9.043 6.869 6.482 11.2
2 apply+all 390.49 300.94 135.078 112.248 97.053 122.2

Table: Benchmarking of rowAlls(), apply+all() and rowSums==n() on 1000x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr min lq mean median uq max
1 rowAlls 0.1139 0.1332 0.1538 0.1584 0.1748 0.2240
3 rowSums==n 0.3345 0.3642 0.4209 0.3801 0.4900 0.7903
2 apply+all 1.9275 2.1744 2.9963 2.9761 3.5169 5.9514
expr min lq mean median uq max
1 rowAlls 1.000 1.000 1.000 1.00 1.000 1.000
3 rowSums==n 2.936 2.734 2.736 2.40 2.804 3.527
2 apply+all 16.916 16.325 19.477 18.79 20.123 26.564

Figure: Benchmarking of colAlls(), apply+all() and colSums==n() on 1000x100 data as well as rowAlls(), apply+all() and rowSums==n() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colAlls() and rowAlls() on 1000x100 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr min lq mean median uq max
colAlls 5.006 9.625 25.08 30.8 37.92 70.06
rowAlls 113.947 133.195 153.84 158.4 174.77 224.04
expr min lq mean median uq max
colAlls 1.00 1.00 1.000 1.000 1.000 1.000
rowAlls 22.76 13.84 6.134 5.144 4.609 3.198

Figure: Benchmarking of colAlls() and rowAlls() on 1000x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R Under development (unstable) (2015-02-27 r67909)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] markdown_0.7.7          microbenchmark_1.4-2    matrixStats_0.14.0-9000
[4] ggplot2_1.0.0           knitr_1.9.3             R.devices_2.13.0       
[7] R.utils_2.0.0           R.oo_1.19.0             R.methodsS3_1.7.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.4         splines_3.2.0       MASS_7.3-39        
 [4] munsell_0.4.2       lattice_0.20-30     colorspace_1.2-4   
 [7] R.cache_0.11.1-9000 multcomp_1.3-9      stringr_0.6.2      
[10] plyr_1.8.1          tools_3.2.0         grid_3.2.0         
[13] gtable_0.1.2        TH.data_1.0-6       survival_2.38-1    
[16] digest_0.6.8        R.rsp_0.20.0        reshape2_1.4.1     
[19] formatR_1.0.3       base64enc_0.1-3     mime_0.2.1         
[22] evaluate_0.5.7      labeling_0.3        sandwich_2.3-2     
[25] scales_0.2.4        mvtnorm_1.0-2       zoo_1.7-12         
[28] Cairo_1.5-6         proto_0.3-10       

Total processing time was 18.55 secs.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('colAlls')
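The call above relies on an internal (`:::`) function of the development version listed in the session information, so it may not be available in other matrixStats versions. As a fallback, a single comparison can be rerun by hand. The sketch below mirrors the 100x100 column-wise benchmark from this report, assuming the matrixStats and microbenchmark packages are installed. The matrix is generated directly with sample() rather than through rmatrices(), so it will not contain the same values as the report's data.

```r
set.seed(1)

# 100x100 logical matrix without missing values.
X <- matrix(sample(c(FALSE, TRUE), size = 100 * 100, replace = TRUE),
            nrow = 100, ncol = 100)

# Run the benchmark only if both packages are available.
if (requireNamespace("matrixStats", quietly = TRUE) &&
    requireNamespace("microbenchmark", quietly = TRUE)) {
  colStats <- microbenchmark::microbenchmark(
    colAlls      = matrixStats::colAlls(X),
    `apply+all`  = apply(X, MARGIN = 2L, FUN = all),
    `colSums==n` = (colSums(X) == nrow(X)),
    unit = "ms"
  )
  print(colStats)
}
```

Absolute timings will differ from the tables in this report, since they depend on hardware, R version and package versions.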

Copyright Henrik Bengtsson. Last updated on 2015-03-02 16:57:01 (-0800 UTC). Powered by RSP.
