Backwards compatibility for reading Array.filters and Array.codecs #2194

Closed
@TomAugspurger

Zarr version

v3

Numcodecs version

n/a

Python Version

n/a

Operating System

n/a

Installation

n/a

Description

As part of getting xarray ready for zarr v3, I'm looking at how to handle the codec and filter API.

The primary / first place this is accessed is https://github.com/pydata/xarray/blob/1c6300c415efebac15f5ee668a3ef6419dbeab63/xarray/backends/zarr.py#L555-L556, which just reads the values of .filters and .compressor to place them in the DataArray.encoding. A few questions:

  1. I'd like to add a .codecs property to the CodecPipeline ABC. This is straightforward for BatchedCodecPipeline, which AFAICT is the only concrete codec pipeline. Does anyone foresee an issue with that? I'm not sure why that class is abstract and loadable through the config.
  2. Is it fair to say that filters is the same as array_array_codecs?
  3. Is it fair to say that compressor is the same as array_bytes_codecs?
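To make questions 2 and 3 concrete, here is a minimal sketch of partitioning a flat codec list by codec kind, which is roughly what a `.codecs`-based reader would need to do before filling in `filters` and `compressor`. The classes below are hypothetical stand-ins, not zarr-python's real codec classes, and `split_codecs` is an illustrative name; whether each group really corresponds to `filters` or `compressor` is exactly what the questions above are asking.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the three codec kinds (assumption, not zarr's classes).
class ArrayArrayCodec: ...
class ArrayBytesCodec: ...
class BytesBytesCodec: ...


@dataclass
class Transpose(ArrayArrayCodec):
    order: tuple


@dataclass
class Bytes(ArrayBytesCodec):
    endian: str = "little"


@dataclass
class Gzip(BytesBytesCodec):
    level: int = 5


def split_codecs(codecs):
    """Group a flat codec sequence by kind, preserving order within each group."""
    return {
        "array_array": tuple(c for c in codecs if isinstance(c, ArrayArrayCodec)),
        "array_bytes": tuple(c for c in codecs if isinstance(c, ArrayBytesCodec)),
        "bytes_bytes": tuple(c for c in codecs if isinstance(c, BytesBytesCodec)),
    }


groups = split_codecs([Transpose(order=(1, 0)), Bytes(), Gzip(level=3)])
# Under the mapping proposed in question 2, filters would be groups["array_array"].
filters = groups["array_array"]
```

Keeping all three groups separate avoids baking in an answer to question 3 before it is settled.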

There's also https://github.com/pydata/xarray/blob/1c6300c415efebac15f5ee668a3ef6419dbeab63/xarray/backends/zarr.py#L79, which accesses Codec.codec_id. I'm not sure yet how to handle that, but right now the best option is maybe .to_dict()["name"] (or we could have .to_dict() access a codec_id)?
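One way the `codec_id` lookup could be bridged on the caller's side is a small helper that prefers `codec_id` when it exists and otherwise falls back to `.to_dict()["name"]`. This is only a sketch: `codec_name` is a hypothetical helper, and the two classes are stand-ins that mimic the two styles described above rather than real numcodecs or zarr classes.

```python
def codec_name(codec):
    """Return an identifier for a codec, tolerating both API styles.

    Hypothetical helper: prefers a ``codec_id`` attribute and otherwise
    falls back to the ``"name"`` entry of ``to_dict()``.
    """
    codec_id = getattr(codec, "codec_id", None)
    if codec_id is not None:
        return codec_id
    return codec.to_dict()["name"]


# Stand-in for a codec exposing ``codec_id`` (assumption for illustration).
class OldStyleCodec:
    codec_id = "zlib"


# Stand-in for a codec exposing only ``to_dict()`` (assumption for illustration).
class NewStyleCodec:
    def to_dict(self):
        return {"name": "gzip", "configuration": {"level": 5}}


names = [codec_name(OldStyleCodec()), codec_name(NewStyleCodec())]
```

A helper like this keeps the compatibility logic in one place until the library settles on a single attribute.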

Steps to reproduce

n/a

Additional output

No response


Labels

bug
