Finalize API for basic statistics functions #32

johnmyleswhite · 2013-12-08T16:28:44Z

This branch restarts work on adding NA-skipping support to our basic statistics functions. The code is quite repetitive and can be DRY'ed out in a future pass.

For now, what I'd like to do is agree on what functionality these functions should offer. So far, I've taken every function I'm replacing and added a skipna keyword that lets the caller skip over NA values. For skewness and kurtosis, this keyword has to be passed on to the function that computes the center when the center is not pre-specified, so the center is now also a keyword, called m. (FWIW, I'm not a big fan of specifying centers that aren't the mean, so we might take that out. I'd argue it also doesn't belong in Base: neither R nor SciPy offers that functionality. I'm not sure why we do.)
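To make the proposed semantics concrete, here is a minimal, language-neutral sketch in Python rather than the branch's actual code. It assumes `None` stands in for NA, that a statistic over data containing NA returns NA unless `skipna` is set, and that skewness is the population form (third central moment over the 1.5 power of the second). The helper `_usable` is hypothetical.

```python
NA = None  # stand-in for a missing value (an assumption of this sketch)

def _usable(xs, skipna):
    """Return the non-NA values, or None when an NA is present and not skipped."""
    if skipna:
        return [x for x in xs if x is not NA]
    if any(x is NA for x in xs):
        return None
    return list(xs)

def mean(xs, skipna=False):
    vals = _usable(xs, skipna)
    if vals is None:
        return NA  # NA propagates unless the caller asks to skip it
    if not vals:
        raise ValueError("no non-NA values to average")
    return sum(vals) / len(vals)

def skewness(xs, skipna=False, m=None):
    vals = _usable(xs, skipna)
    if vals is None:
        return NA
    if not vals:
        raise ValueError("no non-NA values")
    if m is None:
        # The center is computed from the same filtered values, which is
        # why skipna has to reach the center computation as well.
        m = sum(vals) / len(vals)
    n = len(vals)
    m2 = sum((x - m) ** 2 for x in vals) / n
    m3 = sum((x - m) ** 3 for x in vals) / n
    return m3 / m2 ** 1.5
```

With these definitions, `mean([1, 2, NA, 3])` propagates NA, while `mean([1, 2, NA, 3], skipna=True)` averages the three observed values. Passing `m=0` to `skewness` centers the moments at zero instead of the mean.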

Things we're not doing that R does:

- For mean, std and var, R also offers the ability to trim out extreme data points.
- For median, R also offers the ability to use the lo or hi median, which is simply the lower or higher value in an array with an even number of elements, instead of their average.
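For reference, the two omitted behaviors can be sketched as follows. This is an illustrative Python sketch, not R's implementation: the function names and the `kind` parameter are hypothetical, and `trim` is assumed to be the fraction of observations dropped from each end of the sorted data.

```python
import math

def trimmed_mean(xs, trim=0.0):
    """Mean after dropping a fraction `trim` of the sorted values from each end."""
    if not 0.0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    vals = sorted(xs)
    if not vals:
        raise ValueError("empty input")
    k = math.floor(len(vals) * trim)  # number of values dropped per end
    kept = vals[k:len(vals) - k]
    return sum(kept) / len(kept)

def median(xs, kind="mean"):
    """Median; for even-length input, `kind` picks "lo", "hi", or their average."""
    vals = sorted(xs)
    n = len(vals)
    if n == 0:
        raise ValueError("empty input")
    mid = n // 2
    if n % 2 == 1:
        return vals[mid]  # odd length: the middle value is unambiguous
    lo, hi = vals[mid - 1], vals[mid]
    if kind == "lo":
        return lo
    if kind == "hi":
        return hi
    return (lo + hi) / 2
```

For example, `trimmed_mean([1, 2, 3, 4, 100], trim=0.2)` drops one value from each end and averages the rest, and `median([1, 2, 3, 4], kind="lo")` returns the lower of the two middle values rather than their average.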

Unlike R, Julia Base expects that we will offer:

- Slices to compute column-wise, row-wise and other dimension-wise functions.
- Use of non-standard centers.
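A rough sketch of how a dimension-wise reduction might combine with the skipna keyword, again in Python with `None` standing in for NA. The helper `along` and its `dim` convention (1 collapses rows and yields one result per column, 2 yields one result per row) are assumptions made for illustration, not an existing API.

```python
NA = None  # stand-in for a missing value (an assumption of this sketch)

def mean(xs, skipna=False):
    """Mean that propagates NA unless skipna is set."""
    if skipna:
        vals = [x for x in xs if x is not NA]
    elif any(x is NA for x in xs):
        return NA
    else:
        vals = list(xs)
    if not vals:
        raise ValueError("no non-NA values to average")
    return sum(vals) / len(vals)

def along(f, rows, dim, **kwargs):
    """Apply the reduction f along one dimension of a list-of-rows matrix."""
    if dim == 1:
        # Collapse the rows: one result per column.
        return [f(list(col), **kwargs) for col in zip(*rows)]
    if dim == 2:
        # Collapse the columns: one result per row.
        return [f(list(row), **kwargs) for row in rows]
    raise ValueError("dim must be 1 or 2")
```

Keyword arguments are forwarded unchanged, so `along(mean, [[1, NA], [3, 4]], 1, skipna=True)` skips the NA in the second column only, and without `skipna` that column's result is NA.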

Draft of NA-skipping stats functions

f157e88

simonster mentioned this pull request Jul 2, 2014

Implement reductions with optional skipna argument #101

Merged
