High percentile value raises error #289

Open
schlick opened this issue Aug 14, 2019 · 2 comments

schlick commented Aug 14, 2019

Math.percentile sometimes raises a NoMethodError for a high percentile. Example:

require 'facets/math/percentile'
Math.percentile([1,2,3], 90)
NoMethodError: undefined method `-' for nil:NilClass
	from /Users/michael/.rbenv/versions/2.3.6/lib/ruby/gems/2.3.0/gems/facets-3.1.0/lib/standard/facets/math/percentile.rb:33:in `percentile'
	from (irb):4
	from /Users/michael/.rbenv/versions/2.3.6/bin/irb:11:in `<main>'

schlick commented Aug 14, 2019

At line 29 of percentile.rb:

s1 = sorted_array[whole]

whole is 3 in the provided example, so sorted_array[whole] indexes one past the end of the three-element array and s1 is nil. Line 33 then tries to subtract s0 from nil.
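
The failure can be reproduced without facets. This is a minimal sketch, assuming only what the trace and the comment above show: an index one past the end of the sorted array, followed by a subtraction. The definition of s0 as sorted_array[whole - 1] is an assumption made for illustration, not a quote from the facets source.

```ruby
# Illustrative reproduction of the nil arithmetic; not the facets source itself.
sorted_array = [1, 2, 3]
whole = 3                      # value reported above for percentile 90

s0 = sorted_array[whole - 1]   # assumed definition: the last element, 3
s1 = sorted_array[whole]       # one past the end: Array#[] returns nil, not an IndexError

error = begin
  s1 - s0                      # nil does not respond to `-`
rescue NoMethodError => e
  e
end

puts s1.inspect                # prints "nil"
puts error.class               # prints "NoMethodError"
```

Because Array#[] silently returns nil for an out-of-range index, the problem only surfaces at the later subtraction, which is why the trace points at line 33 rather than line 29.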

savfischer commented Oct 21, 2020

@schlick and all future people who encounter this problem:

The implementation of percentile in facets appears to account only for condition 1 in the "Estimation of percentiles" description at the NIST link, which is "For 0<k<N".

I encountered the same bug you did. I think it shows up most often with small arrays at high percentiles, where k is often >= N, which is condition 3. In that case the algorithm should just return sorted_array.last.
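
To see why small arrays hit this so easily, here is a rough sketch that computes the rank as p/100 * (N + 1) and reports which case it falls under. The helper name nist_condition and the returned symbols are made up for illustration, and the boundaries follow the description in this comment rather than a direct reading of the NIST text.

```ruby
# Hypothetical helper: classifies a percentile request by the rank rule discussed in this thread.
def nist_condition(size, percentile)
  rank = (percentile.to_f / 100) * (size + 1)
  k = rank.truncate            # integer part of the rank
  if k >= size
    [rank, :use_last]          # condition 3: k >= N
  elsif k.zero?
    [rank, :use_first]         # k == 0
  else
    [rank, :interpolate]       # condition 1: 0 < k < N
  end
end

p nist_condition(3, 90)[1]     # prints :use_last (the case from this issue)
p nist_condition(3, 50)[1]     # prints :interpolate
p nist_condition(3, 10)[1]     # prints :use_first
p nist_condition(100, 90)[1]   # prints :interpolate
```

With three elements, any percentile of 75 or above already gives a rank of at least 3, so the out-of-range case is the common one rather than an edge case.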

I have no idea how to contribute to this repo or whether it is still being maintained, but I think the code below is a full implementation of the NIST algorithm for calculating percentiles. Seeing the worked example of this calculation method on the Wikipedia page was very helpful for me.

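def get_percentile(array, percentile)
  array = array.sort
  percentile = percentile.to_f / 100
  # Rank on a 1-based scale: p * (N + 1)
  rank = percentile * (array.length + 1)
  if rank >= array.length
    # Condition 3 (k >= N): clamp to the largest value
    array.last
  elsif rank.truncate == 0
    # k == 0: clamp to the smallest value
    array.first
  else
    k = rank.truncate # integer part of the rank
    d = rank % 1      # fractional part of the rank
    # Linear interpolation between the k-th and (k+1)-th sorted values (1-based)
    array[k - 1] + (d * (array[k] - array[k - 1]))
  end
end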
