count
matrixStats: Benchmark report
This report benchmarks the performance of count() against alternative methods.
- sum(x == value)
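As a point of reference, the two approaches compute the same quantity: the number of elements of `x` equal to `value`. The sketch below checks this on a small vector. The choice of `value <- 42L` and the sampled data are assumptions made for illustration, not taken from the report. The matrixStats call is guarded so the snippet also runs where the package is not installed, and `value` is passed by name because the position of that argument may differ between package versions.

```r
# Illustrative data: 1000 integers drawn from -100..100 (an assumption,
# not the report's own data generator).
set.seed(1)
x <- sample(-100:100, size = 1000, replace = TRUE)
value <- 42L  # hypothetical target value

# Base-R alternative: builds a logical vector of length(x), then sums it.
n_base <- sum(x == value)

# matrixStats::count() should return the same number when available.
if (requireNamespace("matrixStats", quietly = TRUE)) {
  n_count <- matrixStats::count(x, value = value)
  stopifnot(n_count == n_base)
}

n_base
```

Note that `x == value` allocates an intermediate logical vector as long as `x`, which is one plausible reason the base-R expression falls behind as `n` grows.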
> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+ mode <- match.arg(mode)
+ if (mode == "logical") {
+ x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+ } else {
+ x <- runif(n, min = range[1], max = range[2])
+ }
+ storage.mode(x) <- mode
+ if (naProb > 0)
+ x[sample(n, size = naProb * n)] <- NA
+ x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+ set.seed(seed)
+ data <- list()
+ data[[1]] <- rvector(n = scale * 100, ...)
+ data[[2]] <- rvector(n = scale * 1000, ...)
+ data[[3]] <- rvector(n = scale * 10000, ...)
+ data[[4]] <- rvector(n = scale * 1e+05, ...)
+ data[[5]] <- rvector(n = scale * 1e+06, ...)
+ names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+ data
+ }
> data <- rvectors(mode = mode)
> x <- data[["n=1000"]]
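Lookups such as `data[["n=1000"]]` depend on the names that rvectors() derives from the vector lengths. With the default `scale = 10`, the smallest vector has 10 * 100 = 1000 elements. A minimal sketch of just that naming step (the variable names `sizes` and `labels` are introduced here for illustration):

```r
# Reproduce the element names that rvectors() assigns for scale = 10.
scale <- 10
sizes <- as.integer(scale * c(100, 1000, 10000, 1e5, 1e6))
labels <- sprintf("n=%d", sizes)
print(labels)
```

These labels correspond to the five vector sizes benchmarked in this report.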
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753532 93.7 2637877 140.9 2637877 140.9
Vcells 18332974 139.9 35610798 271.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on integer+n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 0.0054 | 0.0062 | 0.0094 | 0.0065 | 0.0069 | 0.2806 |
| sum(x == value) | 0.0092 | 0.0100 | 0.0103 | 0.0100 | 0.0106 | 0.0189 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.0000 |
| sum(x == value) | 1.714 | 1.625 | 1.099 | 1.529 | 1.528 | 0.0672 |

Figure: Benchmarking of count() and sum(x == value) on integer+n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.
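The relative panel can be read as each timing statistic divided by the corresponding statistic of the reference expression. The sketch below recomputes it from the rounded integer+n=1000 timings shown above, taking count as the reference. Because the inputs are the rounded table values rather than the raw measurements, the results only approximately match the published relative panel.

```r
# Rounded timings (ms) copied from the integer+n=1000 table.
times <- rbind(
  count             = c(min = 0.0054, lq = 0.0062, mean = 0.0094,
                        median = 0.0065, uq = 0.0069, max = 0.2806),
  `sum(x == value)` = c(min = 0.0092, lq = 0.0100, mean = 0.0103,
                        median = 0.0100, uq = 0.0106, max = 0.0189)
)

# Divide every column by the reference row, so the reference becomes 1.
relative <- sweep(times, MARGIN = 2, STATS = times["count", ], FUN = "/")
print(round(relative, 3))
```

A relative value below 1, as in the max column here, means the alternative's slowest run was faster than the slowest run of count(), which for such short timings is likely driven by a single outlier.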
> x <- data[["n=10000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753454 93.7 2637877 140.9 2637877 140.9
Vcells 18333268 139.9 35610798 271.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on integer+n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 0.0177 | 0.0246 | 0.0311 | 0.0293 | 0.0348 | 0.1201 |
| sum(x == value) | 0.0901 | 0.1434 | 0.1403 | 0.1488 | 0.1536 | 0.1909 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.00 | 1.000 | 1.000 | 1.000 | 1.00 |
| sum(x == value) | 5.087 | 5.82 | 4.504 | 5.085 | 4.409 | 1.59 |

Figure: Benchmarking of count() and sum(x == value) on integer+n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=100000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753466 93.7 2637877 140.9 2637877 140.9
Vcells 18333276 139.9 35610798 271.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on integer+n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 0.1494 | 0.1684 | 0.2043 | 0.1917 | 0.2331 | 0.3353 |
| sum(x == value) | 0.9112 | 0.9333 | 1.2753 | 1.3679 | 1.5494 | 1.9317 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| sum(x == value) | 6.101 | 5.542 | 6.242 | 7.136 | 6.647 | 5.761 |

Figure: Benchmarking of count() and sum(x == value) on integer+n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=1000000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753478 93.7 2637877 140.9 2637877 140.9
Vcells 18333796 139.9 35610798 271.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on integer+n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.595 | 1.663 | 2.07 | 1.845 | 2.579 | 2.843 |
| sum(x == value) | 10.362 | 14.014 | 18.27 | 18.179 | 19.452 | 54.041 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.00 |
| sum(x == value) | 6.499 | 8.427 | 8.828 | 9.855 | 7.543 | 19.01 |

Figure: Benchmarking of count() and sum(x == value) on integer+n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=10000000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753490 93.7 2637877 140.9 2637877 140.9
Vcells 18333804 139.9 35610798 271.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on integer+n=10000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 15.36 | 15.59 | 18.76 | 17.5 | 19.58 | 37.07 |
| sum(x == value) | 143.90 | 171.43 | 190.82 | 186.8 | 204.55 | 443.83 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1 | 1.00 | 1.00 | 1.00 | 1.00 |
| sum(x == value) | 9.366 | 11 | 10.17 | 10.67 | 10.45 | 11.97 |

Figure: Benchmarking of count() and sum(x == value) on integer+n=10000000 data. Outliers are displayed as crosses. Times are in milliseconds.
> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+ mode <- match.arg(mode)
+ if (mode == "logical") {
+ x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+ } else {
+ x <- runif(n, min = range[1], max = range[2])
+ }
+ storage.mode(x) <- mode
+ if (naProb > 0)
+ x[sample(n, size = naProb * n)] <- NA
+ x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+ set.seed(seed)
+ data <- list()
+ data[[1]] <- rvector(n = scale * 100, ...)
+ data[[2]] <- rvector(n = scale * 1000, ...)
+ data[[3]] <- rvector(n = scale * 10000, ...)
+ data[[4]] <- rvector(n = scale * 1e+05, ...)
+ data[[5]] <- rvector(n = scale * 1e+06, ...)
+ names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+ data
+ }
> data <- rvectors(mode = mode)
> x <- data[["n=1000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753502 93.7 2637877 140.9 2637877 140.9
Vcells 23889312 182.3 42812957 326.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on double+n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 2 | sum(x == value) | 0.0127 | 0.0135 | 0.0147 | 0.0142 | 0.0150 | 0.0262 |
| 1 | count | 0.0135 | 0.0142 | 0.0168 | 0.0150 | 0.0167 | 0.0920 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 2 | sum(x == value) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 1 | count | 1.061 | 1.057 | 1.138 | 1.054 | 1.115 | 3.515 |

Figure: Benchmarking of count() and sum(x == value) on double+n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=10000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753514 93.7 2637877 140.9 2637877 140.9
Vcells 23889565 182.3 42812957 326.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on double+n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 0.0393 | 0.0664 | 0.0725 | 0.0781 | 0.0812 | 0.1186 |
| sum(x == value) | 0.0770 | 0.1293 | 0.1284 | 0.1355 | 0.1363 | 0.2560 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| sum(x == value) | 1.961 | 1.948 | 1.772 | 1.734 | 1.678 | 2.159 |

Figure: Benchmarking of count() and sum(x == value) on double+n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=100000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753526 93.7 2637877 140.9 2637877 140.9
Vcells 23889833 182.3 42812957 326.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on double+n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 0.3565 | 0.572 | 0.6148 | 0.6086 | 0.6504 | 1.509 |
| sum(x == value) | 0.7676 | 1.199 | 1.4942 | 1.2921 | 1.3624 | 14.545 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.00 | 1.000 | 1.000 | 1.000 |
| sum(x == value) | 2.153 | 2.096 | 2.43 | 2.123 | 2.095 | 9.641 |

Figure: Benchmarking of count() and sum(x == value) on double+n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=1000000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753538 93.7 2637877 140.9 2637877 140.9
Vcells 23889841 182.3 42812957 326.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on double+n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 3.751 | 4.363 | 5.70 | 6.036 | 6.716 | 10.89 |
| sum(x == value) | 8.060 | 12.240 | 13.91 | 13.683 | 14.530 | 72.72 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| sum(x == value) | 2.149 | 2.805 | 2.441 | 2.267 | 2.164 | 6.676 |

Figure: Benchmarking of count() and sum(x == value) on double+n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n=10000000"]]
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1753550 93.7 2637877 140.9 2637877 140.9
Vcells 23890161 182.3 42812957 326.7 68120027 519.8
> stats <- microbenchmark(count = count(x, value), `sum(x == value)` = sum(x == value), unit = "ms")

Table: Benchmarking of count() and sum(x == value) on double+n=10000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 37.61 | 41.24 | 46.89 | 46.26 | 51.05 | 90.78 |
| sum(x == value) | 90.16 | 106.57 | 116.83 | 114.62 | 128.46 | 152.56 |

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| count | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| sum(x == value) | 2.397 | 2.584 | 2.491 | 2.478 | 2.516 | 1.681 |

Figure: Benchmarking of count() and sum(x == value) on double+n=10000000 data. Outliers are displayed as crosses. Times are in milliseconds.
R Under development (unstable) (2015-02-27 r67909)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] markdown_0.7.7 microbenchmark_1.4-2 matrixStats_0.14.0-9000
[4] ggplot2_1.0.0 knitr_1.9.3 R.devices_2.13.0
[7] R.utils_2.0.0 R.oo_1.19.0 R.methodsS3_1.7.0
loaded via a namespace (and not attached):
[1] Rcpp_0.11.4 GenomeInfoDb_1.3.13 formatR_1.0.3
[4] plyr_1.8.1 base64enc_0.1-3 tools_3.2.0
[7] digest_0.6.8 RSQLite_1.0.0 annotate_1.45.2
[10] evaluate_0.5.7 gtable_0.1.2 R.cache_0.11.1-9000
[13] lattice_0.20-30 DBI_0.3.1 parallel_3.2.0
[16] mvtnorm_1.0-2 proto_0.3-10 R.rsp_0.20.0
[19] genefilter_1.49.2 stringr_0.6.2 IRanges_2.1.41
[22] S4Vectors_0.5.21 stats4_3.2.0 grid_3.2.0
[25] Biobase_2.27.2 AnnotationDbi_1.29.17 XML_3.98-1.1
[28] survival_2.38-1 multcomp_1.3-9 TH.data_1.0-6
[31] reshape2_1.4.1 scales_0.2.4 MASS_7.3-39
[34] splines_3.2.0 BiocGenerics_0.13.6 xtable_1.8-0
[37] mime_0.2.1 colorspace_1.2-4 labeling_0.3
[40] sandwich_2.3-2 munsell_0.4.2 Cairo_1.5-6
[43] zoo_1.7-12

Total processing time was 59.73 secs.
To reproduce this report, do:
html <- matrixStats:::benchmark('count')

Copyright Henrik Bengtsson. Last updated on 2015-03-02 17:27:04 (-0800 UTC). Powered by RSP.