Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Experimental changes in rolling_var related to #7900 #7916

Closed
wants to merge 1 commit into from

Conversation

jaimefrio
Copy link
Contributor

Added logic to rolling_var to detect windows where all non-NaN
values are identical.

Need to assess both correctness and performance impact.

@@ -1161,6 +1161,7 @@ def roll_var(ndarray[double_t] input, int win, int minp, int ddof=1):
Numerically stable implementation using Welford's method.
"""
cdef double val, prev, mean_x = 0, ssqdm_x = 0, nobs = 0, delta
cdef double rep = NaN, nrep = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are nobs and nrep doubles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nobs was a double before my changes (see here). I seem to recall that I tried changing it to Py_ssize_t and there was some performance degradation, so I let it be. But I may very well be making this up, don't remember the exact details...

Since the effect of this new modifications on performance have to be measured, I'll try things both ways and add some information on the commit message if it stays a double.

@jaimefrio
Copy link
Contributor Author

I have refactored the code quite a bit, see what you think of this, @seth-p. The good news is that the performance differences seems negligible, and may even favor the new implementation, which I really find hard to understand. Following are timings over a random array of size double values, using a window of window items. Timed values are the minimum of enough repeats to take 0.1 sec. total time:

                                       master         new
size:     100, window:    10, time: 4.06e-05 sec. 4.06e-05 sec. 
size:     100, window:    30, time: 4.23e-05 sec. 4.19e-05 sec. 
size:     300, window:    10, time: 4.68e-05 sec. 4.56e-05 sec. 
size:     300, window:    30, time: 4.39e-05 sec. 3.98e-05 sec. 
size:     300, window:   100, time: 5.34e-05 sec. 4.23e-05 sec. 
size:    1000, window:    10, time: 9.48e-05 sec. 8.66e-05 sec. 
size:    1000, window:    30, time: 8.01e-05 sec. 9.36e-05 sec. 
size:    1000, window:   100, time: 7.76e-05 sec. 1.22e-04 sec. 
size:    1000, window:   300, time: 7.72e-05 sec. 1.08e-04 sec. 
size:    3000, window:    10, time: 1.34e-04 sec. 1.63e-04 sec. 
size:    3000, window:    30, time: 1.77e-04 sec. 1.38e-04 sec. 
size:    3000, window:   100, time: 1.67e-04 sec. 1.34e-04 sec. 
size:    3000, window:   300, time: 1.63e-04 sec. 1.59e-04 sec. 
size:    3000, window:  1000, time: 1.29e-04 sec. 1.47e-04 sec. 
size:   10000, window:    10, time: 4.34e-04 sec. 4.12e-04 sec. 
size:   10000, window:    30, time: 4.39e-04 sec. 3.18e-04 sec. 
size:   10000, window:   100, time: 3.55e-04 sec. 2.82e-04 sec. 
size:   10000, window:   300, time: 3.59e-04 sec. 2.89e-04 sec. 
size:   10000, window:  1000, time: 4.66e-04 sec. 3.09e-04 sec. 
size:   10000, window:  3000, time: 4.58e-04 sec. 3.64e-04 sec. 
size:   30000, window:    10, time: 1.25e-03 sec. 1.02e-03 sec. 
size:   30000, window:    30, time: 8.98e-04 sec. 7.80e-04 sec. 
size:   30000, window:   100, time: 8.39e-04 sec. 1.09e-03 sec. 
size:   30000, window:   300, time: 8.08e-04 sec. 8.19e-04 sec. 
size:   30000, window:  1000, time: 1.06e-03 sec. 6.99e-04 sec. 
size:   30000, window:  3000, time: 7.99e-04 sec. 6.68e-04 sec. 
size:   30000, window: 10000, time: 6.61e-04 sec. 6.29e-04 sec. 
size:  100000, window:    10, time: 2.46e-03 sec. 2.25e-03 sec. 
size:  100000, window:    30, time: 2.83e-03 sec. 2.29e-03 sec. 
size:  100000, window:   100, time: 2.63e-03 sec. 2.44e-03 sec. 
size:  100000, window:   300, time: 2.50e-03 sec. 2.54e-03 sec. 
size:  100000, window:  1000, time: 2.41e-03 sec. 2.61e-03 sec. 
size:  100000, window:  3000, time: 2.63e-03 sec. 2.34e-03 sec. 
size:  100000, window: 10000, time: 2.38e-03 sec. 2.24e-03 sec. 
size:  300000, window:    10, time: 8.60e-03 sec. 8.85e-03 sec. 
size:  300000, window:    30, time: 8.44e-03 sec. 7.96e-03 sec. 
size:  300000, window:   100, time: 8.74e-03 sec. 8.11e-03 sec. 
size:  300000, window:   300, time: 8.72e-03 sec. 8.40e-03 sec. 
size:  300000, window:  1000, time: 8.45e-03 sec. 8.09e-03 sec. 
size:  300000, window:  3000, time: 8.76e-03 sec. 8.42e-03 sec. 
size:  300000, window: 10000, time: 9.08e-03 sec. 8.51e-03 sec. 
size: 1000000, window:    10, time: 3.17e-02 sec. 3.06e-02 sec. 
size: 1000000, window:    30, time: 3.21e-02 sec. 2.97e-02 sec. 
size: 1000000, window:   100, time: 3.40e-02 sec. 2.82e-02 sec. 
size: 1000000, window:   300, time: 3.17e-02 sec. 3.05e-02 sec. 
size: 1000000, window:  1000, time: 2.93e-02 sec. 3.12e-02 sec. 
size: 1000000, window:  3000, time: 2.94e-02 sec. 2.75e-02 sec. 
size: 1000000, window: 10000, time: 3.34e-02 sec. 2.81e-02 sec.

@jreback
Copy link
Contributor

jreback commented Aug 5, 2014

can you add a test that validates this new behaviour (well not new, but 'correct')?

@jreback jreback added this to the 0.15.0 milestone Aug 5, 2014
@jaimefrio
Copy link
Contributor Author

It may take a couple of days until I get around to it, but if we think this is worth adding I will definitely write some tests for it.

@jreback
Copy link
Contributor

jreback commented Aug 5, 2014

sure no problem

I think definitely worthwhile avoiding pathological cases (because of rounding / precision issues) if possible (as u have shown)

@seth-p
Copy link
Contributor

seth-p commented Aug 5, 2014

Agreed, I think it's worth handling degenerate cases accurately.

if nobs == 1:
# pathological case
val = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps val = 0 if (nobs > ddof) else NaN?

Added logic to `rolling_var` to detect windows where all non-NaN
values are identical.

Need to assess both correctness and performance impact.
@jreback
Copy link
Contributor

jreback commented Sep 10, 2014

how's this coming?

@jreback
Copy link
Contributor

jreback commented Sep 14, 2014

@jaimefrio status?

@jaimefrio
Copy link
Contributor Author

I was having some rebase problems last night, but it is mostly ready, Will get back to it tonight and push a hopefully final version.

@jaimefrio
Copy link
Contributor Author

Couldn't solve my rebase issues, so created a new PR #8271. Closing this

@jaimefrio jaimefrio closed this Sep 15, 2014
@jaimefrio jaimefrio deleted the zero_var_detection branch September 16, 2014 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Numeric Operations Arithmetic, Comparison, and Logical operations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants