Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow NaN attributes in datasets #152

Merged
merged 4 commits into from
Feb 8, 2023

Conversation

kwilcox
Copy link
Member

@kwilcox kwilcox commented Feb 7, 2023

Datasets with NaN value attributes currently raise a ValueError: Out of range float values are not JSON compliant error. This adds a starlette.responses.JsonResponse subclass that explicitly allows nan values. The class is taken straight from starlette and only changes the allow_nan parameter.

Datasets with NaN value attributes currently raise a
`ValueError: Out of range float values are not JSON
compliant` error. This adds a custom starlette
JsonResponse class that explicitly allows nan values.
@jhamman
Copy link
Contributor

jhamman commented Feb 7, 2023

Thanks @kwilcox - I'll let @abkfenris make the final call but this looks good to me.

@kwilcox
Copy link
Member Author

kwilcox commented Feb 7, 2023

Tests pass for me locally (admittedly on python 3.11), I'll poke around! Helps to be on the right branch... looking...

@abkfenris
Copy link
Member

Tests pass for me locally (admittedly on python 3.11), I'll poke around! Helps to be on the right branch... looking...

And that's why I refreshed the PR before replying after a call went a lot longer than expected when I started to write a response.

It looks like it's hitting a different metadata test, which I'm guessing is also having a hard time with NaNs, and might require a deeper dive into the Zarr spec: https://github.com/xarray-contrib/xpublish/blob/ce9fa67b132cb78b795b9d1b18dbf7728d7f545a/tests/test_rest_api.py#L281-L284

@kwilcox I'd also love to hear how you're using xpublish at some point.

@kwilcox
Copy link
Member Author

kwilcox commented Feb 7, 2023

Well, the tests are passing. I didn't consult the Zarr spec on what to do with NaN values. I'm not a huge fan of the JSON response format this PR implements now... the JSON returned isn't valid JSON (it uses JavaScript NaN objects instead of null). I'll take a look at the Zarr docs and see what I can come up with for a better solution. I do have a bunch of datasets with NaN attribute values.

@abkfenris
Copy link
Member

I was really hoping to quickly give high fives and approve it, but it looks like you found a really messy messy spot for a PR!

It sounds like there is a difference between what Zarr-Python does (mainly due to standard lib JSON), and the Zarr spec. It sounds like in the Zarr spec they are not currently happy with the handling of NaNs in attributes. Some Zarr-Python encoded datasets aren't openable in languages that have more restrictive JSON parsers,

A few entry points to the attribute encoding discussion that I found:

@mwengren
Copy link

mwengren commented Feb 8, 2023

@kwilcox I'd also love to hear how you're using xpublish at some point

Me too!

Copy link
Member

@abkfenris abkfenris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this matches up with how ds.to_zarr() formats .zmetadata, and even if it's not the most compatible, it's a known incompatibility with v2, so that works for me.

@jhamman it might be worth pulling Zarr support out into it's own plugin. Then we can tinker with compatibility options and follow v3 changes at a different pace.

@abkfenris
Copy link
Member

Thanks Kyle!

@abkfenris abkfenris merged commit f9b3b87 into xpublish-community:main Feb 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants