This repository has been archived by the owner on May 4, 2019. It is now read-only.

Finalize API for basic statistics functions #32

Open
wants to merge 1 commit into
base: master


johnmyleswhite
Member

This branch restarts the work of adding NA-skipping support to our basic statistics functions. The code is quite repetitive and can be DRY'ed out in a later pass.

For now, what I'd like to do is agree on what functionality these functions should offer. So far, I've taken every function I'm replacing and added a skipna keyword that allows one to skip over NA values. For skewness and kurtosis, this keyword has to be passed on to the function that computes the center when the center is not pre-specified, so the center is now also a keyword, called m. (FWIW, I'm not a big fan of specifying centers that aren't the mean, so we might take that out. I'd argue it also doesn't belong in Base: neither R nor SciPy offers that functionality. I'm not sure why we do.)
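To make the proposed semantics concrete, here is a rough, language-neutral sketch in Python. It is not the code in this branch. The `NA` sentinel, the `_values` helper, and the use of population moments for skewness are all assumptions made for illustration; only the keyword names `skipna` and `m` come from the description above.

```python
class _NAType:
    """Singleton placeholder for a missing value (a stand-in for NA)."""
    def __repr__(self):
        return "NA"

NA = _NAType()

def _values(xs, skipna):
    """Hypothetical helper: usable values, or None if an NA blocks the result."""
    if skipna:
        return [x for x in xs if x is not NA]
    if any(x is NA for x in xs):
        return None
    return list(xs)

def mean(xs, skipna=False):
    vals = _values(xs, skipna)
    if vals is None:
        return NA  # an NA in the input propagates to the result
    return sum(vals) / len(vals)

def skewness(xs, m=None, skipna=False):
    vals = _values(xs, skipna)
    if vals is None:
        return NA
    # When no center is given, compute it with the same skipna setting,
    # which is why skipna has to be forwarded to the centering function.
    if m is None:
        m = mean(xs, skipna=skipna)
    n = len(vals)
    m2 = sum((x - m) ** 2 for x in vals) / n  # second moment about m
    m3 = sum((x - m) ** 3 for x in vals) / n  # third moment about m
    return m3 / m2 ** 1.5
```

With `data = [1.0, 2.0, NA, 6.0]`, `mean(data)` and `skewness(data)` both return `NA`, while passing `skipna=True` computes the statistic over the three remaining values. Empty or all-NA inputs are not handled in this sketch.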

Things we're not doing that R does:

  • For mean, R also offers the ability to trim out extreme data points via its trim argument; sd and var have no equivalent option.
  • For median, R also offers the ability to use the lo or hi median, which is simply the lower or higher value in an array with an even number of elements, instead of their average.
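For reference, the two omitted features could look roughly like the Python sketch below. The function names, the `kind` parameter, and the restriction of `trim` to the range [0, 0.5) are assumptions made for this illustration. They do not mirror R's signatures and are not part of this branch.

```python
def trimmed_mean(xs, trim=0.0):
    """Mean after dropping a fraction `trim` of the points from each end."""
    if not 0.0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    vals = sorted(xs)
    k = int(len(vals) * trim)  # number of points dropped per end
    kept = vals[k:len(vals) - k]
    return sum(kept) / len(kept)

def median(xs, kind="mid"):
    """Median with a choice of lo, hi, or averaged middle value."""
    vals = sorted(xs)
    n = len(vals)
    lo = vals[(n - 1) // 2]  # lower middle element
    hi = vals[n // 2]        # upper middle element (same as lo when n is odd)
    if kind == "lo":
        return lo
    if kind == "hi":
        return hi
    return (lo + hi) / 2
```

For `[1, 2, 3, 4]` the lo median is 2, the hi median is 3, and the default is their average, 2.5. For an odd-length input all three coincide.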

Unlike R, Julia Base expects that we will offer:

  • Slices to compute column-wise, row-wise and other dimension-wise functions.
  • Use of non-standard centers, that is, centers other than the mean.
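A minimal Python sketch of what these two expectations amount to, using a list of rows as a stand-in for a matrix. The helper `reduce_along`, its `dim` numbering (1 collapses rows and yields one value per column, 2 yields one value per row), and the choice to keep the n - 1 denominator when a center is supplied are all assumptions for illustration, not the Base API.

```python
def mean(xs):
    return sum(xs) / len(xs)

def var(xs, m=None):
    """Variance about center m; defaults to the mean when m is not given."""
    n = len(xs)
    if m is None:
        m = mean(xs)
    # The n - 1 denominator is kept in both cases for simplicity;
    # a real implementation might choose differently for a known center.
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def reduce_along(matrix, f, dim):
    """Hypothetical helper: apply f per column (dim=1) or per row (dim=2)."""
    if dim == 1:
        return [f(list(col)) for col in zip(*matrix)]
    if dim == 2:
        return [f(list(row)) for row in matrix]
    raise ValueError("dim must be 1 or 2")

matrix = [[1.0, 2.0],
          [3.0, 6.0]]
column_means = reduce_along(matrix, mean, 1)
row_means = reduce_along(matrix, mean, 2)
```

Each statistic would need to accept both a dimension and a center, which is where much of the repetition in the code comes from.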
