Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encode/Decode complex fill values #363

Merged
merged 5 commits into from
Feb 26, 2019
Merged

Encode/Decode complex fill values #363

merged 5 commits into from
Feb 26, 2019

Conversation

jakirkham
Copy link
Member

@jakirkham jakirkham commented Dec 15, 2018

Fixes #244

Provides a simple implementation of encoding/decoding complex fill values and tests that it works in a basic case.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • Docs build locally (e.g., run tox -e docs)
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

Implement a simple test to try constructing Zarr Arrays with complex
data to see what issues crop up so as to fix them.
Provides a simple test to see if Zarr Arrays can handle storing a
variety of complex typed fill values.
Simply places the real and imaginary values into a list with each one
being encoded/decoded using the floating type used to represent them
independently. This keeps the encoded complex fill values human
readable, follows the existing floating value encoding/decoding rules,
and represents the data in a way that JSON can easily represent it.
@jakirkham jakirkham mentioned this pull request Dec 15, 2018
@jakirkham jakirkham added this to the v2.3 milestone Dec 15, 2018
@jakirkham
Copy link
Member Author

Adding to the v2.3 milestone to make it easier to keep track of this. However if we opt for a different approach or decide to pass on complex numbers for now, we can always remove it from the milestone.

Copy link
Contributor

@rabernat rabernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be great to be able to encode complex data in zarr. I have immediate use cases for this.

I have a few minor suggestions about testing, but I don't consider them to be blockers for merging this.

elif dtype.kind in 'c':
v = (encode_fill_value(v.real, dtype.type().real.dtype),
encode_fill_value(v.imag, dtype.type().imag.dtype))
return v
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it this encoding / decoding of fill values is necessary is necessary because json does not support serialization of complex data types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right.

This will encode to a JSON list, which is nice for readability. Though we could certainly imagine other ways of handling this (e.g. base64 encoded string).

Any preferences?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems totally reasonable.

a = np.linspace(0, 1, z.shape[0], dtype=dtype)
a -= 1j * a
z[:] = a
assert_array_almost_equal(a, z[:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the special treatment required for fill values, would it make sense to add an explicit check that complex data with missing values is handled correctly? I see that the metadata encoding is checked below, but not the actual filling procedure (as far as I can tell).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate a bit on what kind of representation of missing data we are discussing? We handle NaN currently, but maybe this is not what you are thinking about.

Copy link
Contributor

@rabernat rabernat Dec 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is to make sure that missing data (the kind that triggers the fill_value machinery to be activated) is present in the array. For example, by adding

a[0] = np.nan

However, I might be misunderstanding what fill_value does in zarr. Is fill_value only used to represent uninitialized portions of the array? Or are all nans explicitly translated to fill_value before serialization?

Copy link
Member Author

@jakirkham jakirkham Feb 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK fill_value is only used to handle uninitialized chunks. Am not aware of any special behavior around NaNs in Zarr. Does that help?

Edit: IOW this assert_array_almost_equal line is testing the uninitialized chunks are constructed with the fill_value.

@jakirkham
Copy link
Member Author

Thanks for the feedback @rabernat. Responded inline above. Happy to work with you to address these concerns.

@jakirkham
Copy link
Member Author

Any other thoughts on this? Currently thinking it would be good to merge this by end of week.

@jakirkham jakirkham mentioned this pull request Feb 20, 2019
@jakirkham jakirkham merged commit e20afc2 into zarr-developers:master Feb 26, 2019
@jakirkham jakirkham deleted the encode_complex_fill_value branch February 26, 2019 08:18
@jakirkham
Copy link
Member Author

Please let me know if anything else occurs to you, @rabernat. Happy to address in a follow-up.

@jakirkham
Copy link
Member Author

As this followed PR ( #309 ), it effectively started testing N5Store with complex data, which doesn't work as evidenced by the CI failure. Have added PR ( #411 ) to correct this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants