Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix decode cf with dask #2047

Merged
merged 2 commits into from
Apr 12, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions doc/whats-new.rst
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,8 @@ Enhancements
Bug fixes
~~~~~~~~~

- Fixed ``decode_cf`` function to operate lazily on dask arrays
(:issue:`1372`). By `Ryan Abernathey <https://github.com/rabernat>`_.
- Fixed labeled indexing with slice bounds given by xarray objects with
datetime64 or timedelta64 dtypes (:issue:`1240`).
By `Stephan Hoyer <https://github.com/shoyer>`_.
Expand Down
7 changes: 4 additions & 3 deletions xarray/conventions.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
from .coding import times, variables
from .coding.variables import SerializationWarning
from .core import duck_array_ops, indexing
from .core.pycompat import OrderedDict, basestring, iteritems
from .core.pycompat import OrderedDict, basestring, iteritems, dask_array_type
from .core.variable import IndexVariable, Variable, as_variable


Expand Down Expand Up @@ -490,8 +490,9 @@ def decode_cf_variable(name, var, concat_characters=True, mask_and_scale=True,
del attributes['dtype']
data = BoolTypeArray(data)

return Variable(dimensions, indexing.LazilyOuterIndexedArray(data),
attributes, encoding=encoding)
if not isinstance(data, dask_array_type):
data = indexing.LazilyOuterIndexedArray(data)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the offending line which caused the dask arrays to stop being dask arrays. I have simply bypassed it for dask arrays. This didn't break any tests. But can we think of any unforeseen negative consequences? What does LazilyOuterIndexedArray do here? I have to admit that I don't understand the details of our new indexing wrappers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LazilyOuterIndexedArray makes indexing on non-dask arrays lazy. It's not necessary or useful for dask arrays.

return Variable(dimensions, data, attributes, encoding=encoding)


def decode_cf_variables(variables, attributes, concat_characters=True,
Expand Down
13 changes: 12 additions & 1 deletion xarray/tests/test_conventions.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@

from . import (
IndexerMaker, TestCase, assert_array_equal, raises_regex, requires_netCDF4,
requires_netcdftime, unittest)
requires_netcdftime, unittest, requires_dask)
from .test_backends import CFEncodedDataTest

B = IndexerMaker(indexing.BasicIndexer)
Expand Down Expand Up @@ -332,6 +332,17 @@ def test_decode_cf_datetime_transition_to_invalid(self):

assert_array_equal(ds_decoded.time.values, expected)

@requires_dask
def test_decode_cf_with_dask(self):
import dask
original = Dataset({
't': ('t', [0, 1, 2], {'units': 'days since 2000-01-01'}),
'foo': ('t', [0, 0, 0], {'coordinates': 'y', 'units': 'bar'}),
'y': ('t', [5, 10, -999], {'_FillValue': -999})
}).chunk({'t': 1})
decoded = conventions.decode_cf(original)
assert dask.is_dask_collection(decoded.y.data)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an appropriate place to test for this? There is no other dask stuff in test_conventions.py, but it seemed to make sense there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a fine place to me.



class CFEncodedInMemoryStore(WritableCFDataStore, InMemoryDataStore):
pass
Expand Down