Buffer uses signed bytes with v2 compressors #2735

Closed
@QuLogic

Zarr version

3.0.1

Numcodecs version

0.15.0

Python Version

3.13.1

Operating System

Fedora Rawhide

Installation

Fedora package

Description

I'm looking at cgohlke/imagecodecs#123. After fixing some imports and setting zarr_format=2, I can run many more tests, but several fail with mismatched types: the imagecodecs compressors expect uint8_t but receive signed char.

I have traced this to Buffer requiring dtype='b', along with casts in cpu.Buffer.from_bytes.

If I modify those checks/casts to use the unsigned dtype='B', then I can get imagecodecs tests to pass.
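To illustrate the failure mode, here is a minimal standard-library sketch. It is not the zarr or imagecodecs code: strict_uint8_encode is a hypothetical stand-in for a compiled codec that only accepts unsigned bytes, and array('b') stands in for a NumPy array with dtype='b' (both expose the buffer format code 'b').

```python
from array import array


def strict_uint8_encode(buf):
    """Hypothetical stand-in for a compiled codec that only accepts unsigned bytes."""
    mv = memoryview(buf)
    if mv.format != "B":
        raise ValueError(f"expected unsigned bytes ('B'), got format {mv.format!r}")
    return bytes(mv)


# A signed-byte buffer, exposing the same format code as dtype='b'.
signed = array("b", [-1, 0, 1])

try:
    strict_uint8_encode(signed)
    rejected = False
except ValueError:
    rejected = True

# Reinterpreting the same memory as unsigned bytes is a zero-copy cast.
unsigned_view = memoryview(signed).cast("B")
encoded = strict_uint8_encode(unsigned_view)
```

The underlying bytes are identical in both cases; only the declared element type differs, which is why switching the checks/casts to 'B' is enough for a strict consumer to accept the buffer.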

I see this came in with GPU support in #1967. Was this intentional, or did no one notice that the NumPy b dtype is a signed byte? It seems odd to treat bytes as signed, since regular Python treats them as unsigned (e.g., b'\xff'[0] == 255, not -1).
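The signed/unsigned distinction can be checked with the standard library alone. The struct and memoryview format codes 'b' and 'B' used here mean the same thing as the NumPy type codes of the same name (int8 and uint8 respectively).

```python
import struct

raw = b"\xff"

# Indexing Python bytes always yields an unsigned value.
as_python_byte = raw[0]

# The same single byte read as unsigned ('B') and as signed ('b').
as_unsigned = struct.unpack("B", raw)[0]
as_signed = struct.unpack("b", raw)[0]

# A memoryview over bytes defaults to 'B'; casting to 'b' reinterprets it as signed.
default_format = memoryview(raw).format
via_signed_view = memoryview(raw).cast("b")[0]
```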

If this is intentional, the migration guide should document it as a change that breaks compressors for zarr_format=2.

Steps to reproduce

Install imagecodecs, modify its tests to use zarr.storage.MemoryStore instead of zarr.MemoryStore and to set zarr_format=2, then run the tests.

Additional output

No response

Labels

bug (Potential issues with the zarr-python library)