Complex dtype error #244

Closed
alimanfoo opened this issue Mar 7, 2018 · 4 comments
Labels
bug Potential issues with the zarr-python library

@alimanfoo
Member

Minimal, reproducible code sample, a copy-pastable example if possible

zarr.array(1j*np.arange(10))

Problem description

Above code snippet generates an exception:

TypeError: Object of type 'complex128' is not JSON serializable

This happens because the default fill value of zero is cast to the array's complex dtype before it is written to the JSON metadata, and JSON has no representation for complex numbers. The workaround is to pass fill_value=None when creating the array.
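The underlying limitation can be reproduced without zarr. The sketch below is only an illustration: it uses Python's built-in complex type in place of NumPy's complex128, and try_dump_fill_value is a hypothetical helper, not part of zarr.

```python
import json


def try_dump_fill_value(fill_value):
    """Return the JSON text, or the error message if serialization fails."""
    try:
        return json.dumps({"fill_value": fill_value})
    except TypeError as exc:
        return str(exc)


print(try_dump_fill_value(0))     # a real zero serializes fine
print(try_dump_fill_value(0j))    # a complex zero raises TypeError
print(try_dump_fill_value(None))  # None is written as JSON null
```

The last call shows why fill_value=None sidesteps the error: null is valid JSON, so nothing complex ever reaches the encoder.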

Version and installation information

Please provide the following:

  • Value of zarr.__version__: 2.2.0
  • Value of numcodecs.__version__: 0.5.4
  • Version of Python interpreter: 3.6
  • Operating system (Linux/Windows/Mac): Linux
  • How Zarr was installed (e.g., "using pip into virtual environment", or "using conda"): pip into venv
@alimanfoo alimanfoo added the bug Potential issues with the zarr-python library label Mar 7, 2018
@chairmank

To fix this problem, what do you think about evolving the specification

https://zarr.readthedocs.io/en/stable/spec/v2.html#fill-value-encoding

so that the fill value in the metadata JSON is a base64-encoded binary representation of a value with the specified data type? For a given NumPy-style type string (even for structs), we can unambiguously decode the binary representation. To distinguish it from legacy fill_value specifications, this binary representation could be packaged in a special JSON object. Examples are below.

For backwards compatibility, continue to parse JSON numeric literals:

{
    "dtype": "<f8",
    "fill_value": 0.9868011474609375
}

For backwards compatibility, continue to recognize strings like "NaN" as special values:

{
    "dtype": "<f8",
    "fill_value": "NaN"
}

New-style binary representation of the fill value. "AAAAAAAA+H8=" is the base64 encoding of '\x00\x00\x00\x00\x00\x00\xf8\x7f', which is NaN as a little-endian float64.

{
    "dtype": "<f8",
    "fill_value": {
      "data": "AAAAAAAA+H8="
    }
}
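A decoder for the three forms above could look roughly like the following. This is a minimal sketch of the proposal, not anything in zarr: decode_fill_value and STRUCT_FORMATS are hypothetical names, and only the "<f8" dtype from the examples is handled.

```python
import base64
import struct

# Assumption for this sketch: only "<f8" is mapped. A real decoder would
# need an entry (or a NumPy-based path) for every supported dtype.
STRUCT_FORMATS = {"<f8": "<d"}


def decode_fill_value(metadata):
    """Decode a fill value from a parsed metadata dict (hypothetical helper)."""
    dtype = metadata["dtype"]
    fill_value = metadata["fill_value"]
    if isinstance(fill_value, dict):
        # New style: base64-encoded bytes interpreted according to dtype.
        raw = base64.b64decode(fill_value["data"])
        (value,) = struct.unpack(STRUCT_FORMATS[dtype], raw)
        return value
    if isinstance(fill_value, str):
        # Legacy special values such as "NaN".
        return float(fill_value)
    # Legacy JSON numeric literal.
    return fill_value
```

Wrapping the binary form in a JSON object is what keeps the three cases distinguishable: a bare string stays reserved for the legacy special values.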

@chairmank

@jakirkham suggested a similar idea on #216

@jakirkham
Member

jakirkham commented Dec 10, 2018

As a potentially quicker fix, we could do something similar to what we do with datetime/timedelta types and cast the complex values to something else (maybe a record with floats).

zarr.array((1j*np.arange(10)).view(np.dtype([("real", np.float64), ("imag", np.float64)])))

Edit: I should add that this makes the fill value less human readable (it is effectively base64 encoded, as that's how structured arrays are handled currently). The data may also need an explicit cast afterwards before it can be used as complex. So while this is a viable workaround, it's probably less preferable to directly handling the encoding/decoding problem.

@jakirkham
Member

After giving this some thought, I figured a reasonable solution would be to just make the fill value for complex numbers a list of two floats (the real and imaginary parts). This keeps the fill values human readable, is supported by JSON, and is pretty easy for us to handle (as it can leverage how we currently handle float value encoding/decoding).
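As a rough illustration of the two-float idea, here is a sketch using only the standard library. The helper names are hypothetical and this is not the code from any PR; it assumes non-finite floats are written as the strings "NaN", "Infinity" and "-Infinity".

```python
import json
import math


def encode_float(x):
    # JSON has no literal for non-finite floats, so write them as strings.
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    return x


def decode_float(v):
    # float() accepts numbers as well as "NaN", "Infinity" and "-Infinity".
    return float(v)


def encode_complex_fill_value(z):
    # A complex fill value becomes a [real, imag] list of encoded floats.
    return [encode_float(z.real), encode_float(z.imag)]


def decode_complex_fill_value(v):
    real, imag = v
    return complex(decode_float(real), decode_float(imag))


meta = json.dumps({"dtype": "<c16", "fill_value": encode_complex_fill_value(1.5 - 2j)})
restored = decode_complex_fill_value(json.loads(meta)["fill_value"])
```

Because each component goes through the same float helpers, special values need no extra handling on the complex side.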

I went ahead and added a PR (#363) to demonstrate this approach for discussion, though I would be perfectly happy to just move ahead with it if it is agreeable.
