madDiff

hb edited this page Mar 3, 2015 · 2 revisions

matrixStats: Benchmark report


madDiff() benchmarks

This report benchmarks the performance of madDiff() against alternative methods.
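For orientation, the quantity being timed can be approximated in base R. The sketch below assumes that madDiff() with its default arguments amounts to the MAD of the first-order differences, divided by sqrt(2) so that the result estimates the standard deviation of the original series. The name `mad_diff_sketch` is hypothetical and not part of matrixStats; the packaged function may differ in details such as NA handling, trimming, and higher-order differences.

```r
# Hypothetical base-R stand-in for madDiff(); assumes first-order
# differences and the usual MAD consistency constant of 1.4826.
mad_diff_sketch <- function(x, constant = 1.4826) {
  d <- diff(x)                                  # sequential differences
  stats::mad(d, constant = constant) / sqrt(2)  # rescale to the scale of x
}

set.seed(1)
x <- rnorm(1e4, mean = 0, sd = 2)
mad_diff_sketch(x)  # expected to land close to 2 for Gaussian noise
```

Under this assumption, mad() and diff() are the two building blocks of the estimator, which is why they make natural baselines in the timings that follow.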

Alternative methods

  • N/A

Data type "integer"

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (naProb > 0) 
+         x[sample(n, size = naProb * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = "integer")
> data <- data[1:4]
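
Two details of the data setup are easy to miss: the list names come from the vector lengths, so with the default scale of 10 the first four entries run from n=1000 to n=1000000, and integer data is produced by coercing uniform doubles with storage.mode(), which truncates toward zero rather than rounding. A small illustration with made-up values (the variable names `sizes`, `labels`, and `v` are illustrative only):

```r
# Names generated for the default scale = 10
sizes <- 10 * c(100, 1000, 10000, 1e5, 1e6)
labels <- sprintf("n=%d", sizes)
labels[1:4]   # the four sizes kept by data[1:4]

# Same coercion that rvector() applies for mode = "integer"
v <- c(-99.7, -0.4, 0.4, 99.7)   # illustrative values, not benchmark data
storage.mode(v) <- "integer"
v             # -99 0 0 99
typeof(v)     # "integer"
```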

Results

n=1000 vector

All elements

> x <- data[["n=1000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on integer+n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff     0.0320  0.0375  0.0541  0.0549  0.0602  0.1548
1  madDiff  0.1136  0.1272  0.2120  0.1879  0.2019  2.0433
2  mad      0.1632  0.1698  0.2458  0.2597  0.2820  0.5586

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
1  madDiff   3.554   3.390   3.917   3.425   3.351  13.204
2  mad       5.108   4.523   4.543   4.733   4.680   3.609

Figure: Benchmarking of madDiff(), mad() and diff() on integer+n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.
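
The relative panel in each table is the absolute panel divided, column by column, by the diff row, which is the fastest expression in every table of this report. A minimal sketch using the min and median columns of the integer n=1000 table; small discrepancies against the printed ratios are expected because the inputs used here are already rounded:

```r
# Absolute times in milliseconds, copied from the integer n=1000 table
abs_ms <- rbind(
  diff    = c(min = 0.0320, median = 0.0549),
  madDiff = c(min = 0.1136, median = 0.1879),
  mad     = c(min = 0.1632, median = 0.2597)
)

# Divide every row by the reference row, one column at a time
rel <- sweep(abs_ms, 2, abs_ms["diff", ], "/")
round(rel, 2)
```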

n=10000 vector

All elements

> x <- data[["n=10000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on integer+n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff     0.2086  0.2131  0.2315  0.2171  0.2258  0.4215
1  madDiff  0.4804  0.4874  0.5105  0.4933  0.5018  0.8003
2  mad      0.8327  0.8404  0.8803  0.8450  0.8571  1.5783

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
1  madDiff   2.303   2.287   2.205   2.272   2.223   1.899
2  mad       3.991   3.944   3.802   3.892   3.796   3.744

Figure: Benchmarking of madDiff(), mad() and diff() on integer+n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=100000 vector

All elements

> x <- data[["n=100000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on integer+n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff      1.981   2.458   3.103   2.813   3.283   24.41
1  madDiff   3.914   4.623   5.566   5.062   5.952   23.02
2  mad       7.120   8.308   9.807   9.266  10.783   26.76

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
1  madDiff   1.976   1.881   1.794   1.800   1.813   0.943
2  mad       3.595   3.380   3.160   3.295   3.285   1.096

Figure: Benchmarking of madDiff(), mad() and diff() on integer+n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=1000000 vector

All elements

> x <- data[["n=1000000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on integer+n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff      20.45   25.31   30.39   27.47   34.82   58.20
1  madDiff   50.91   58.14   67.17   65.20   73.52   96.57
2  mad       78.32   88.83  101.15   97.60  110.25  151.38

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
1  madDiff   2.489   2.297   2.210   2.374   2.111   1.659
2  mad       3.829   3.510   3.328   3.554   3.166   2.601

Figure: Benchmarking of madDiff(), mad() and diff() on integer+n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.

Data type "double"

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (naProb > 0) 
+         x[sample(n, size = naProb * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = "double")
> data <- data[1:4]

Results

n=1000 vector

All elements

> x <- data[["n=1000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on double+n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff     0.0250  0.0293  0.0345  0.0312  0.0335  0.0793
1  madDiff  0.1621  0.1663  0.1898  0.1702  0.1875  0.4415
2  mad      0.1886  0.1921  0.2247  0.1948  0.2533  0.3707

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
1  madDiff   6.477   5.684   5.497   5.457   5.598   5.568
2  mad       7.538   6.566   6.508   6.247   7.563   4.675

Figure: Benchmarking of madDiff(), mad() and diff() on double+n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=10000 vector

All elements

> x <- data[["n=10000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on double+n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff     0.1544  0.1821  0.2385  0.2466  0.2762  0.3391
2  mad      1.0725  1.4882  1.5265  1.5887  1.6630  2.1692
1  madDiff  1.2157  1.2800  1.7161  1.7614  1.8509  6.5377

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
2  mad       6.948   8.173   6.399   6.443   6.021   6.396
1  madDiff   7.875   7.030   7.194   7.144   6.701  19.277

Figure: Benchmarking of madDiff(), mad() and diff() on double+n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=100000 vector

All elements

> x <- data[["n=100000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on double+n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff      1.554   2.167   2.905   2.635   3.408   19.64
1  madDiff   8.938  12.120  14.145  13.881  15.037   34.53
2  mad      11.645  14.376  17.339  17.292  18.811   48.10

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
1  madDiff   5.753   5.594   4.870   5.268   4.413   1.758
2  mad       7.495   6.636   5.969   6.562   5.520   2.449

Figure: Benchmarking of madDiff(), mad() and diff() on double+n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=1000000 vector

All elements

> x <- data[["n=1000000"]]
> stats <- microbenchmark(madDiff = madDiff(x), mad = mad(x), diff = diff(x), unit = "ms")

Table: Benchmarking of madDiff(), mad() and diff() on double+n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

Times (ms):

   expr        min      lq    mean  median      uq     max
3  diff      23.46   27.85   33.77   30.45   33.92   73.98
2  mad       95.18  106.58  121.81  114.99  126.52  417.49
1  madDiff   98.11  110.41  124.72  120.01  131.82  405.31

Relative times (diff = 1):

   expr        min      lq    mean  median      uq     max
3  diff      1.000   1.000   1.000   1.000   1.000   1.000
2  mad       4.058   3.827   3.607   3.776   3.730   5.643
1  madDiff   4.183   3.965   3.693   3.941   3.886   5.478

Figure: Benchmarking of madDiff(), mad() and diff() on double+n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R Under development (unstable) (2015-02-27 r67909)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] markdown_0.7.7          microbenchmark_1.4-2    matrixStats_0.14.0-9000
[4] ggplot2_1.0.0           knitr_1.9.3             R.devices_2.13.0       
[7] R.utils_2.0.0           R.oo_1.19.0             R.methodsS3_1.7.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.4           GenomeInfoDb_1.3.13   formatR_1.0.3        
 [4] plyr_1.8.1            base64enc_0.1-3       tools_3.2.0          
 [7] digest_0.6.8          RSQLite_1.0.0         annotate_1.45.2      
[10] evaluate_0.5.7        gtable_0.1.2          R.cache_0.11.1-9000  
[13] lattice_0.20-30       DBI_0.3.1             parallel_3.2.0       
[16] mvtnorm_1.0-2         proto_0.3-10          R.rsp_0.20.0         
[19] genefilter_1.49.2     stringr_0.6.2         IRanges_2.1.41       
[22] S4Vectors_0.5.21      stats4_3.2.0          grid_3.2.0           
[25] Biobase_2.27.2        AnnotationDbi_1.29.17 XML_3.98-1.1         
[28] survival_2.38-1       multcomp_1.3-9        TH.data_1.0-6        
[31] reshape2_1.4.1        scales_0.2.4          MASS_7.3-39          
[34] splines_3.2.0         BiocGenerics_0.13.6   xtable_1.8-0         
[37] mime_0.2.1            colorspace_1.2-4      labeling_0.3         
[40] sandwich_2.3-2        munsell_0.4.2         Cairo_1.5-6          
[43] zoo_1.7-12           

Total processing time was 1.09 mins.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('madDiff')

Copyright Henrik Bengtsson. Last updated on 2015-03-02 17:29:21 (-0800 UTC). Powered by RSP.
