Description
I started using the library in practice and found a drawback of our current design. Say I want to compute both the bias correction and the variance of an estimator. If I naively call resample.jackknife.variance and resample.jackknife.bias_corrected, the jackknife estimates are computed twice, which is expensive. The interface should allow me to reuse precomputed jackknife estimates (I am talking about the jackknife here, but the same is true for the bootstrap).
I am not sure yet how to best achieve this. Here is my idea so far.
Currently, we have in resample.jackknife the signature def variance(fn, sample). It expects two mandatory arguments and I think that should not change. However, we could make it so that if one passes None for fn, then sample is interpreted as the precomputed replicas. This is not ambiguous, because fn is never None under normal circumstances.
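A minimal sketch of how the fn=None convention could look for the variance. The jackknife helper and the private _replicas function below are stand-ins written for this illustration, not the library's actual implementation:

```python
import numpy as np


def jackknife(fn, sample):
    """Stand-in for the library function: leave-one-out replicas of fn."""
    sample = np.asarray(sample)
    return np.array([fn(np.delete(sample, i, axis=0)) for i in range(len(sample))])


def _replicas(fn, sample):
    # Hypothetical helper: if fn is None, `sample` is taken to be the
    # precomputed replicas; otherwise the replicas are computed here.
    if fn is None:
        return np.asarray(sample)
    return jackknife(fn, sample)


def variance(fn, sample):
    """Jackknife variance estimate, (n - 1) / n * sum((theta_i - mean)^2)."""
    thetas = _replicas(fn, sample)
    n = len(thetas)
    return (n - 1) / n * np.sum((thetas - np.mean(thetas, axis=0)) ** 2, axis=0)


data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
j_thetas = jackknife(np.mean, data)  # expensive step, done once

v_direct = variance(np.mean, data)   # existing call style
v_reused = variance(None, j_thetas)  # proposed: reuse the replicas
```

Both calls go through the same code path after the replicas are resolved, so existing callers would not be affected.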
This approach works for all jackknife tools, but resample.bootstrap.confidence_interval adds further complications. More precisely, the baseline idea does not work when the "student" or "bca" methods are used. The "student" method needs fn(sample) in addition to the replicas, and "bca" needs fn(sample) and the jackknife replicas on top of that.
I think the basic idea can still work if we make the call to confidence_interval like this:
thetas = bootstrap(my_fn, data)
theta = my_fn(data)
j_thetas = jackknife(my_fn, data)
confidence_interval(None, thetas, ci_method="percentile") # ok, works
confidence_interval(None, (thetas, theta), ci_method="student") # ok, additional information passed as tuple
confidence_interval(None, (thetas, theta, j_thetas), ci_method="bca") # ok, same
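A rough sketch of how the tuple convention could be unpacked on the receiving side. The helper unpack_precomputed and the cl parameter are assumptions made for this illustration. Only the "percentile" interval is actually computed here; the "student" and "bca" formulas are left out on purpose:

```python
import numpy as np

# Precomputed inputs each method expects when fn is None, in tuple order.
_EXPECTED = {
    "percentile": ("thetas",),
    "student": ("thetas", "theta"),
    "bca": ("thetas", "theta", "j_thetas"),
}


def unpack_precomputed(sample, ci_method):
    """Hypothetical helper: name the parts passed via `sample` when fn is None."""
    names = _EXPECTED[ci_method]
    if len(names) == 1:
        # A bare array of bootstrap replicas.
        parts = (sample,)
    else:
        if not isinstance(sample, tuple) or len(sample) != len(names):
            raise ValueError(f"ci_method={ci_method!r} expects a tuple {names}")
        parts = sample
    return dict(zip(names, parts))


def confidence_interval(fn, sample, cl=0.95, ci_method="percentile"):
    if fn is not None:
        raise NotImplementedError("this sketch only covers the fn=None path")
    inputs = unpack_precomputed(sample, ci_method)
    if ci_method == "percentile":
        thetas = np.asarray(inputs["thetas"])
        alpha = 1 - cl
        lower, upper = np.quantile(thetas, [alpha / 2, 1 - alpha / 2])
        return lower, upper
    raise NotImplementedError(f"{ci_method!r} is not part of this sketch")


thetas = np.arange(101.0)  # stand-in for bootstrap replicas
lower, upper = confidence_interval(None, thetas, cl=0.9)
```

One thing the sketch makes visible: the order of the tuple elements carries meaning, so a wrongly ordered tuple can only be caught by a length check, not by name.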
Any thoughts?