
better bootstrapping #285

Merged: 34 commits merged into pangeo-data:master on Feb 4, 2020

Conversation

@aaronspring (Collaborator) commented Dec 29, 2019

Description

  • speed-up in bootstrap functions (bootstrap_func for stats and bootstrap_compute for skill):
    • xr.quantile exchanged for dask.map_blocks(np.percentile) (see the sketch below)
    • properly implemented handling of lazy results for chunked inputs
    • the user is warned when chunking is potentially (un)necessary
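
A minimal sketch of the map_blocks idea (the helper name _dask_percentile and its exact signature are illustrative assumptions, not necessarily the code in this PR): apply np.percentile blockwise along the reduced axis after making sure that axis sits in a single chunk.

import dask.array as dask_array
import numpy as np

def _dask_percentile(arr, axis=0, q=95):
    """Blockwise np.percentile over a dask array; the reduced axis must end up in one chunk."""
    if len(arr.chunks[axis]) > 1:
        # np.percentile needs the whole reduced axis inside each block
        arr = arr.rechunk({axis: -1})
    return dask_array.map_blocks(
        np.percentile, arr, axis=axis, q=q, drop_axis=axis
    )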

How I got there - 2 working gists:

Still to do:

  • notebook demonstrating effective chunking
  • generalize my_quantile with a dim argument

Waiting for PRs to merge first:

Closes #145

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue): the previously slow speed of compute_skill can be considered a bug
  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

pytest

Checklist (while developing)

  • I have added docstrings to all new functions.
  • I have commented my code, particularly in hard-to-understand areas
  • Tests added for pytest, if necessary.
  • I have updated the sphinx documentation, if necessary.

Pre-Merge Checklist (final steps)

  • I have rebased onto master or develop (wherever I am merging) and dealt with any conflicts.
  • I have squashed commits to a reasonable amount, and force-pushed the squashed commits.

@aaronspring (Collaborator, Author):

Performance increase on my 2-core MacBook Pro (2018):

asv continuous master AS_fix_bootstrapping

       before           after         ratio
     [f2e95592]       [21f41994]
     <master>         <AS_fix_bootstrapping>
-       3.74±0.2s         463±10ms     0.12  benchmarks_perfect_model.Compute.time_bootstrap_perfect_model

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

This roughly 8x performance increase comes from removing the unnecessary .compute() calls.
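
As a rough, hypothetical illustration of why this matters (not the actual climpred code): calling .compute() on every intermediate result forces dask to evaluate many small graphs, while keeping results lazy and computing once lets dask fuse the whole workload.

import dask.array as da

arr = da.random.random((500, 90, 90), chunks=(500, 30, 30))

# eager: one scheduler round-trip per iteration
eager = [(arr * i).mean(axis=0).compute() for i in range(20)]

# lazy: build a single graph for all iterations, then compute once
lazy = da.stack([(arr * i).mean(axis=0) for i in range(20)]).compute()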

@pytest.mark.parametrize('chunk', [True, False])
def test_dask_percentile_implemented_faster_xr_quantile(control3d, chunk):
    chunk_dim, dim = 'x', 'time'
    # chunk_dim, dim = 'time', 'x'  # fails, why?
@aaronspring (Collaborator, Author) commented Jan 2, 2020

my_quantile works as expected in bootstrap_compute because it is applied along the first dimension (bootstrap there, time here), but it fails on axis != 0 when NaNs are present.

This small demo runs without assertion errors:

# a is a DataArray with dims ('time', 'lon', 'lat') and no NaNs
q = 0.05

# case 1: reduce along 'time' (axis 0), chunk along 'lon'
chunk_dim, dim = 'lon', 'time'
ac = a.chunk({chunk_dim: 2})
acp = a.quantile(dim=dim, q=q)
%time acp2 = my_quantile(ac, dim, q).compute()
xr.testing.assert_allclose(acp, acp2)
acl = ac.load()
%time acp = my_quantile(acl, dim, q).compute()
xr.testing.assert_allclose(acp, acp2)

# case 2: reduce along 'lon' (axis 1), chunk along 'time'
chunk_dim, dim = 'time', 'lon'
acp = a.quantile(dim=dim, q=q)
ac = a.chunk({chunk_dim: 2})
%time acp2 = my_quantile(ac, dim, q).compute()
xr.testing.assert_allclose(acp, acp2)
acl = ac.load()
%time acp = my_quantile(acl, dim, q).compute()
xr.testing.assert_allclose(acp, acp2)

Resolved review threads on climpred/bootstrap.py.
return np.percentile(arr, axis=axis, q=q)


def my_quantile(ds, dim='bootstrap', q=0.95):
@aaronspring (Collaborator, Author) commented:

See the tests. This implementation works great if dim is axis=0 and there are no missing values. I don't really get why the tests fail.
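
For context, a hedged sketch of what a my_quantile-style wrapper could look like, assuming the input is an xarray DataArray, dim is the leading axis (as it is after concatenating over 'bootstrap'), and there are no NaNs; the body is illustrative, not the exact implementation in this PR.

import dask.array as dask_array
import numpy as np
import xarray as xr

def my_quantile(da_in, dim='bootstrap', q=0.95):
    """Quantile along `dim` via blockwise np.percentile; `dim` is expected to be axis 0."""
    axis = da_in.dims.index(dim)
    data = da_in.data
    if isinstance(data, dask_array.Array):
        if len(data.chunks[axis]) > 1:
            # the reduced axis must live in a single chunk
            data = data.rechunk({axis: -1})
        result = dask_array.map_blocks(
            np.percentile, data, axis=axis, q=q * 100, drop_axis=axis
        )
    else:
        result = np.percentile(data, axis=axis, q=q * 100)
    dims = [d for d in da_in.dims if d != dim]
    coords = {k: v for k, v in da_in.coords.items() if dim not in v.dims}
    return xr.DataArray(result, dims=dims, coords=coords)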

A collaborator commented:

I don't have time to do this right now. Hopefully you'll be able to work this out when responding to comments. I can look another time after you incorporate the comments.

Resolved review threads on climpred/comparisons.py and climpred/utils.py.
@bradyrx (Collaborator) left a comment

A number of refactoring/docstring changes. Going to rebase this and pull it down locally to test timing and the breaking tests.

Resolved review threads on CHANGELOG.rst, asv_bench/asv.conf.json, asv_bench/benchmarks/benchmarks_perfect_model.py, climpred/bootstrap.py, and climpred/utils.py.
@bradyrx (Collaborator) commented Jan 12, 2020

So I've mentioned this in individual comments, but right now this just encourages/enables dask to be used to chunk the original object. This deals with the fact that it's embarrassingly parallel in space: we are computing the same quantities at each grid cell and they don't depend on each other. So for sufficiently large datasets, this speeds things up. That's why you get a speedup with compute_perfect_model with dask. It's one iteration, but we're chunking in space.

However, this doesn't address the embarrassingly parallel iteration problem at all. I think we need to leverage something more advanced like multiprocessing or something from dask, maybe map_blocks or something similar. @ahuang11 posted a suggested solution for how to parallelize computing multiple things at once, for instance.

Andrew asked this here... it may be subtle and we just need to wrap the iterations? I'll test on a small example locally with a different function. https://groups.google.com/forum/#!searchin/xarray/andrew$20huang%7Csort:date/xarray/KBwllWmMY30/v98z03lcAgAJ
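
A minimal illustration of the space-chunking point above (the file name, chunk sizes, and the 'member' dimension are made up for the example): chunking the input along spatial dimensions lets dask process grid cells independently, whereas parallelizing over the bootstrap iterations themselves would need something extra, e.g. wrapping the iteration loop in dask.delayed or map_blocks.

import xarray as xr
from dask.distributed import Client

client = Client()  # local scheduler, one worker per core

# hypothetical gridded input; chunk in space only
ds = xr.open_dataset('perfect_model_output.nc', chunks={'lon': 24, 'lat': 24})

# each spatial chunk is processed independently by the workers;
# the bootstrap iterations themselves still run sequentially
result = ds.mean('member')  # stand-in for compute_perfect_model / bootstrap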

But my stats were:

Compute function

Macbook Pro:

  • No chunking: 24.8s
  • Chunking (with Client): 11.4s

Casper with 36 cores/workers:

  • No chunking: 15.7s
  • Chunking: 17.9s

Bootstrapping:

Macbook Pro:

  • Chunking: 2min29s

Casper:

  • Chunking: 2min57s

@bradyrx (Collaborator) left a comment

This looks like a lot of comments, but each of them should only take a few seconds. I was really nitpicky here because I think it's important for us to commit really clean code with good docstrings, etc., so we don't have to worry as much about refactoring later (and so it's easy on new contributors).

Review threads on HOWTOCONTRIBUTE.rst, climpred/tests/test_bootstrap.py, and climpred/utils.py (mostly resolved).
@bradyrx (Collaborator) commented Jan 29, 2020

This PR is also ready, except that I need to look into rebasing. Basically, I implemented the requests. my_quantile works for dim='bootstrap' being axis=0, which is always the case after the concat. I see my_quantile rather as a placeholder until xr.quantile is daskified, and therefore this is ok as is.

My benchmark tests are failing, but I think it's because it needs to be rebased. Is xr.quantile not daskified at the highest level? As in, the xarray folks haven't implemented this?

@aaronspring (Collaborator, Author):

nope

@aaronspring (Collaborator, Author):

Somehow docs fail in Travis but not in the RTD check.

@bradyrx (Collaborator) commented Jan 30, 2020

> Somehow docs fail in Travis but not in the RTD check.

@aaronspring, this is because the RTD check just makes sure that the docs can build. The notebook that broke is pre-compiled so that RTD doesn't need to use the memory to compile it. Travis wipes the notebook clean and tries to compile it to see if we have any breaking code. Looks like there's some breaking code in there that needs to be fixed.

I can look back over this PR later today

@bradyrx (Collaborator) commented Jan 30, 2020

Two things need to be fixed for Travis:

  1. flake8:

climpred/tests/test_checks.py:20:1: E303 too many blank lines (3)

  2. notebook:
CellExecutionError in examples/subseasonal/weekly-subx-example.ipynb:

------------------
fcstds=decode_cf(fcstds,'init')
fcstds['init']=pd.to_datetime(fcstds.init.values.astype(str))
fcstds['init']=pd.to_datetime(fcstds['init'].dt.strftime('%Y%m%d 00:00'))
------------------
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-b3735a2f029c> in <module>
      1 fcstds=decode_cf(fcstds,'init')
      2 fcstds['init']=pd.to_datetime(fcstds.init.values.astype(str))
----> 3 fcstds['init']=pd.to_datetime(fcstds['init'].dt.strftime('%Y%m%d 00:00'))

~/miniconda/envs/climpred-dev/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
    736             result = convert_listlike(arg, format)
    737     elif is_list_like(arg):
--> 738         cache_array = _maybe_cache(arg, format, cache, convert_listlike)
    739         if not cache_array.empty:
    740             result = _convert_and_box_cache(arg, cache_array)

~/miniconda/envs/climpred-dev/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _maybe_cache(arg, format, cache, convert_listlike)
    145     if cache:
    146         # Perform a quicker unique check
--> 147         if not should_cache(arg):
    148             return cache_array
    149

~/miniconda/envs/climpred-dev/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in should_cache(arg, unique_share, check_count)
    114     assert 0 < unique_share < 1, "unique_share must be in next bounds: (0; 1)"
    115
--> 116     unique_elements = set(islice(arg, check_count))
    117     if len(unique_elements) > check_count * unique_share:
    118         do_caching = False

TypeError: unhashable type: 'DataArray'

@aaronspring (Collaborator, Author):

xr.quantile is now daskified in xarray 0.15.

Do you understand the notebook error? Haven’t touched that...

@bradyrx (Collaborator) commented Jan 31, 2020

> xr.quantile is now daskified in xarray 0.15.

Great.

> Do you understand the notebook error? Haven’t touched that...

I just tried reproducing it locally and it ran fine, so I re-triggered the travis build to see if it was just a weird error. I'll follow up after the build is done...

@bradyrx (Collaborator) commented Jan 31, 2020

That didn't work. @aaronspring, try replacing the cell:

fcstds=decode_cf(fcstds,'init')
fcstds['init']=pd.to_datetime(fcstds.init.values.astype(str))
fcstds['init']=pd.to_datetime(fcstds['init'].dt.strftime('%Y%m%d 00:00'))

with:

fcstds=decode_cf(fcstds,'init')

I'm not convinced those pd.to_datetime lines are needed.

@aaronspring (Collaborator, Author):

This now passes; fixed with pandas==0.25.3. But there is a bug in bootstrap_hindcast for probabilistic metrics, which I only found because of asv: #317

@bradyrx (Collaborator) commented Feb 3, 2020

Thanks @aaronspring! We can keep #317 for another PR. Probably something we will talk about during OSM/your week here.

I'll review/merge later today or tomorrow. Have some deadlines to meet today.

@aaronspring (Collaborator, Author) commented Feb 3, 2020

The implementation of my_quantile is a bit dirty, but it is tested for the dim=first_axis case and gives an incredible speedup; see https://github.com/bradyrx/climpred/issues/316

I would rather implement this sneaky new my_quantile than not have this performance increase.

@bradyrx (Collaborator) commented Feb 4, 2020

This is great @aaronspring, thanks so much. Merging now!

@bradyrx merged commit 5310e82 into pangeo-data:master on Feb 4, 2020
Successfully merging this pull request may close these issues:

  • parallelize bootstrap

3 participants