
compatibility with zarr dtypes refactor #10333

Open
@d-v-b

What is your issue?

This is an issue to track compatibility between xarray and the in-progress zarr-python data types refactoring effort.

We are working on a new data type model for zarr-python. Why? Zarr-python 2 used numpy dtypes internally, and zarr v2 (the format) also used the numpy data type model. Tying the spec so closely to numpy proved problematic for zarr implementations in other languages.

Zarr v3 introduced a new data type model that looks much less like numpy dtypes. The v3 spec defines fewer dtypes than numpy supports, for example, and the v3 dtypes model doesn't track endianness. So we shipped zarr-python 3 with zarr v3 support for only the data types described in the zarr v3 spec, which left out some important numpy data types:

| type | string code | zarr v3 spec |
| --- | --- | --- |
| fixed-length ascii strings | S | PR |
| fixed-length unicode strings | U | PR |
| datetime64 | M | numpy.datetime64 |
| timedelta64 | m | numpy.timedelta64 |
| fixed-length raw bytes | V | None yet |
| structured data types | V | None yet |
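
For readers less familiar with numpy's type codes, here is a minimal sketch of how the string codes in the table show up on the numpy side, via the `kind` attribute of a dtype. It uses only numpy, not zarr-python. The specific lengths, time units, and field names (`S10`, `[ns]`, `x`, `y`) are arbitrary choices for illustration and are not taken from the zarr specs. The last lines show that numpy dtypes also carry byte order, which the v3 dtype model does not track.

```python
import numpy as np

# One example numpy dtype per row of the table. The sizes, units, and
# field names are illustrative assumptions, not anything zarr mandates.
examples = {
    "fixed-length ascii strings": np.dtype("S10"),
    "fixed-length unicode strings": np.dtype("U10"),
    "datetime64": np.dtype("datetime64[ns]"),
    "timedelta64": np.dtype("timedelta64[ns]"),
    "fixed-length raw bytes": np.dtype("V8"),
    "structured data types": np.dtype([("x", "i4"), ("y", "f8")]),
}

# numpy reports a one-character kind code for every dtype.
kinds = {name: dt.kind for name, dt in examples.items()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")

# numpy distinguishes byte order: these two dtypes have the same size
# but are not equal, regardless of the platform's native byte order.
big = np.dtype(">i4")
little = np.dtype("<i4")
print(big == little, big.itemsize, little.itemsize)
```

Raw bytes and structured dtypes both report kind `V`, which is why the two rows share a string code.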

Support for these missing numpy data types is being added in this PR against zarr-python. It's turned into quite an effort. In parallel with the zarr-python implementation, we are also writing up language-agnostic specs for these data types, so that other zarr implementations can easily support them. See the third column of the table.

I opened a compatibility PR against xarray that sources zarr-python from the new dtypes branch. When the compatibility PR indicates that all tests are passing, and when we are satisfied that there are no remaining questions about the impact of zarr-python's new dtype model on xarray, we can close this issue.

We are looking to release this functionality in zarr-python 3.1, but I can't give a timeline for that yet. Until then, I'm happy to answer any questions people have about this effort.
