varDiff

hb edited this page Mar 3, 2015 · 2 revisions

matrixStats: Benchmark report


varDiff() benchmarks

This report benchmarks the performance of varDiff() against alternative methods.

Alternative methods

  • N/A
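The report times varDiff() alongside var() and diff(), which hints at how the estimator is built. As a minimal sketch, assuming (the report does not state this) that varDiff() with default arguments takes the sample variance of the first-order differences and halves it, a base-R stand-in could look as follows. The name var_diff_sketch is hypothetical and is not part of matrixStats.

```r
# Hypothetical stand-in for varDiff(); NOT the matrixStats implementation.
# Assumption: the variance is estimated from first-order differences, and the
# factor 1/2 corrects for Var(x[i+1] - x[i]) = 2 * Var(x) under independence.
var_diff_sketch <- function(x) {
  d <- diff(x)                          # sequential differences, length(x) - 1 values
  if (length(d) < 2L) return(NA_real_)  # too few differences for a variance
  var(d) / 2
}

set.seed(1)
x <- rnorm(1e5)                         # i.i.d. data with true variance 1
c(var = var(x), var_diff = var_diff_sketch(x))
```

If that reading is right, a naive version costs one diff() pass plus one var() pass. The tables in this report show varDiff() running faster than diff() alone for vectors of length 10000 and up.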

Data type "integer"

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (naProb > 0) 
+         x[sample(n, size = naProb * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = "integer")
> data <- data[1:4]
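
Two details of the setup above are easy to miss. With the default scale = 10 the five vectors have lengths 1000 through 10000000, so data[1:4] drops the largest one. And for integer data the storage.mode<- coercion truncates the uniform draws toward zero. A small base-R sketch of both follows; the variable names sizes, labels and xi are illustrative only.

```r
# Lengths and names produced by rvectors() with the default scale = 10.
scale <- 10
sizes <- as.integer(scale * c(100, 1000, 10000, 1e5, 1e6))
labels <- sprintf("n=%d", sizes)
labels[1:4]                        # the names kept by data[1:4]

# Effect of the integer coercion used in rvector().
set.seed(1)
x <- runif(5, min = -100, max = 100)
xi <- x
storage.mode(xi) <- "integer"      # truncates toward zero, e.g. -46.9 becomes -46
rbind(double = x, integer = xi)
```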

Results

n=1000 vector

All elements

> x <- data[["n=1000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on integer+n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min     lq   mean median     uq    max
2     var 0.0316 0.0341 0.0417 0.0377 0.0520 0.0793
3    diff 0.0316 0.0352 0.0425 0.0389 0.0512 0.0874
1 varDiff 0.0423 0.0466 0.0609 0.0518 0.0722 0.3969

     expr   min    lq  mean median     uq   max
2     var 1.000 1.000 1.000  1.000 1.0000 1.000
3    diff 1.000 1.034 1.020  1.031 0.9852 1.102
1 varDiff 1.341 1.367 1.461  1.372 1.3889 5.005

Figure: Benchmarking of varDiff(), var() and diff() on integer+n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.
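
The relative times in the bottom panel appear to be each column of the top panel divided by its value in the first row, with rows apparently ordered by median time. The sketch below redoes that arithmetic for two columns of the integer n=1000 table, using the rounded values printed above. Because the report presumably divides unrounded timings, the results agree with the printed ratios only to about two decimals.

```r
# Top-panel times (ms) copied from the integer, n=1000 table.
stats <- data.frame(
  expr   = c("var", "diff", "varDiff"),
  min    = c(0.0316, 0.0316, 0.0423),
  median = c(0.0377, 0.0389, 0.0518)
)

# Divide every timing column by its value in the first (reference) row.
rel <- stats
rel[-1] <- lapply(stats[-1], function(col) col / col[1])
print(rel, digits = 4)
```

Read this way, a median ratio of about 1.37 for varDiff() means it took roughly 37% longer than var() on this vector.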

n=10000 vector

All elements

> x <- data[["n=10000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on integer+n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min     lq   mean median     uq    max
2     var 0.0982 0.1014 0.1233 0.1055 0.1551 0.2240
1 varDiff 0.1263 0.1290 0.1547 0.1343 0.1798 0.2487
3    diff 0.2110 0.2148 0.2565 0.2181 0.3134 0.4158

     expr   min    lq  mean median    uq   max
2     var 1.000 1.000 1.000  1.000 1.000 1.000
1 varDiff 1.286 1.271 1.254  1.274 1.159 1.110
3    diff 2.149 2.118 2.079  2.067 2.020 1.856

Figure: Benchmarking of varDiff(), var() and diff() on integer+n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=100000 vector

All elements

> x <- data[["n=100000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on integer+n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min     lq  mean median    uq    max
2     var 0.7691 0.9649 1.212  1.213 1.428  1.901
1 varDiff 0.9878 1.0284 1.427  1.547 1.602  2.553
3    diff 2.0356 2.3825 3.122  2.985 3.344 13.797

     expr   min    lq  mean median    uq   max
2     var 1.000 1.000 1.000  1.000 1.000 1.000
1 varDiff 1.284 1.066 1.177  1.275 1.122 1.343
3    diff 2.647 2.469 2.575  2.460 2.342 7.256

Figure: Benchmarking of varDiff(), var() and diff() on integer+n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=1000000 vector

All elements

> x <- data[["n=1000000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on integer+n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min    lq  mean median    uq    max
2     var  9.672 12.62 14.80  14.88 16.45  30.25
1 varDiff 11.899 15.06 19.30  18.38 21.26  36.62
3    diff 21.382 30.02 37.36  33.40 37.69 344.88

     expr   min    lq  mean median    uq   max
2     var 1.000 1.000 1.000  1.000 1.000  1.00
1 varDiff 1.230 1.193 1.304  1.235 1.293  1.21
3    diff 2.211 2.378 2.525  2.245 2.292 11.40

Figure: Benchmarking of varDiff(), var() and diff() on integer+n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.

Data type "double"

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), naProb = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (naProb > 0) 
+         x[sample(n, size = naProb * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n=%d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = "double")
> data <- data[1:4]

Results

n=1000 vector

All elements

> x <- data[["n=1000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on double+n=1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min     lq   mean median     uq    max
3    diff 0.0262 0.0316 0.0393 0.0346 0.0479 0.1132
2     var 0.0281 0.0316 0.0414 0.0370 0.0508 0.0662
1 varDiff 0.0397 0.0420 0.0545 0.0474 0.0664 0.2067

     expr   min    lq  mean median    uq   max
3    diff 1.000 1.000 1.000  1.000 1.000 1.000
2     var 1.073 1.000 1.053  1.067 1.060 0.585
1 varDiff 1.515 1.329 1.385  1.367 1.385 1.827

Figure: Benchmarking of varDiff(), var() and diff() on double+n=1000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=10000 vector

All elements

> x <- data[["n=10000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on double+n=10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min     lq   mean median     uq    max
2     var 0.0820 0.0847 0.1035 0.0907 0.1313 0.2383
1 varDiff 0.1101 0.1141 0.1286 0.1166 0.1215 0.2029
3    diff 0.1532 0.1669 0.1956 0.1769 0.2412 0.3026

     expr   min    lq  mean median     uq    max
2     var 1.000 1.000 1.000  1.000 1.0000 1.0000
1 varDiff 1.343 1.348 1.242  1.287 0.9252 0.8514
3    diff 1.869 1.970 1.890  1.951 1.8372 1.2698

Figure: Benchmarking of varDiff(), var() and diff() on double+n=10000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=100000 vector

All elements

> x <- data[["n=100000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on double+n=100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min     lq   mean median    uq    max
2     var 0.5998 0.6296 0.8676 0.9362 1.010  1.321
1 varDiff 0.7992 1.0149 1.2725 1.2948 1.508  2.005
3    diff 1.6207 2.1910 2.6907 2.4173 2.763 13.277

     expr   min    lq  mean median    uq    max
2     var 1.000 1.000 1.000  1.000 1.000  1.000
1 varDiff 1.333 1.612 1.467  1.383 1.493  1.518
3    diff 2.702 3.480 3.101  2.582 2.737 10.050

Figure: Benchmarking of varDiff(), var() and diff() on double+n=100000 data. Outliers are displayed as crosses. Times are in milliseconds.

n=1000000 vector

All elements

> x <- data[["n=1000000"]]
> stats <- microbenchmark(varDiff = varDiff(x), var = var(x), diff = diff(x), unit = "ms")

Table: Benchmarking of varDiff(), var() and diff() on double+n=1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

     expr    min    lq   mean median    uq   max
2     var  7.138  7.39  9.509   9.68 10.67 14.78
1 varDiff 10.347 12.24 16.218  15.50 17.09 35.92
3    diff 21.823 26.80 32.124  30.88 35.89 46.96

     expr   min    lq  mean median    uq   max
2     var 1.000 1.000 1.000  1.000 1.000 1.000
1 varDiff 1.450 1.656 1.706  1.601 1.601 2.430
3    diff 3.057 3.627 3.378  3.190 3.362 3.177

Figure: Benchmarking of varDiff(), var() and diff() on double+n=1000000 data. Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R Under development (unstable) (2015-02-27 r67909)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] markdown_0.7.7          microbenchmark_1.4-2    matrixStats_0.14.0-9000
[4] ggplot2_1.0.0           knitr_1.9.3             R.devices_2.13.0       
[7] R.utils_2.0.0           R.oo_1.19.0             R.methodsS3_1.7.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.4           GenomeInfoDb_1.3.13   formatR_1.0.3        
 [4] plyr_1.8.1            base64enc_0.1-3       tools_3.2.0          
 [7] digest_0.6.8          RSQLite_1.0.0         annotate_1.45.2      
[10] evaluate_0.5.7        gtable_0.1.2          R.cache_0.11.1-9000  
[13] lattice_0.20-30       DBI_0.3.1             parallel_3.2.0       
[16] mvtnorm_1.0-2         proto_0.3-10          R.rsp_0.20.0         
[19] genefilter_1.49.2     stringr_0.6.2         IRanges_2.1.41       
[22] S4Vectors_0.5.21      stats4_3.2.0          grid_3.2.0           
[25] Biobase_2.27.2        AnnotationDbi_1.29.17 XML_3.98-1.1         
[28] survival_2.38-1       multcomp_1.3-9        TH.data_1.0-6        
[31] reshape2_1.4.1        scales_0.2.4          MASS_7.3-39          
[34] splines_3.2.0         BiocGenerics_0.13.6   xtable_1.8-0         
[37] mime_0.2.1            colorspace_1.2-4      labeling_0.3         
[40] sandwich_2.3-2        munsell_0.4.2         Cairo_1.5-6          
[43] zoo_1.7-12           

Total processing time was 26.27 secs.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('varDiff')

Copyright Henrik Bengtsson. Last updated on 2015-03-02 17:38:42 (-0800 UTC). Powered by RSP.
