Description
Zarr version
3.0.1
Numcodecs version
0.15.0
Python Version
3.12.8
Operating System
Mac
Installation
micromamba create -n zarr3 python 'zarr>=3'
Description
When writing a float array with fill_value=np.nan to a version 2 store, the spec requires the fill value to be encoded as a string (i.e. "NaN"). This is necessary because JSON doesn't support a NaN literal. However, with zarr>=3 the fill value is not encoded as a string. The resulting .zarray is thus not valid JSON and can't be read by other JSON parsers.
This behaviour could be a regression of #412.
The bug was originally found by @lkluft.
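To illustrate the encoding the spec calls for, here is a minimal sketch using only the standard library. The helper name encode_fill_value is hypothetical and not part of zarr; it only shows why non-finite floats have to become strings before serialization.

```python
import json
import math


def encode_fill_value(value):
    """Hypothetical helper: map non-finite floats to the strings used by the Zarr v2 spec."""
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
    return value


# Python's json module emits a bare NaN token by default, which is not valid JSON.
lenient = json.dumps({"fill_value": float("nan")})

# With allow_nan=False the same call raises instead of producing invalid output.
try:
    json.dumps({"fill_value": float("nan")}, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# Encoding the value as a string first yields a document any JSON parser accepts.
encoded = json.dumps({"fill_value": encode_fill_value(float("nan"))}, allow_nan=False)

print(lenient)
print(encoded)
```

The first printed line shows the invalid form reported here, the second the string form the spec expects. The sketch only handles plain Python floats; how zarr itself should implement the fix is not implied.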
Steps to reproduce
Using this Python script as check_nan_encoding.py:
import zarr
import numpy as np
import numcodecs
import sys

print("zarr version", zarr.__version__)
print("numcodecs version", numcodecs.__version__)
print("python version", sys.version)

def make_array(**kwargs):
    if zarr.__version__ >= "3":
        return zarr.create_array(zarr_format=2, **kwargs)
    else:
        return zarr.create(**kwargs)

def to_str(buffer):
    if zarr.__version__ >= "3":
        return buffer.to_bytes().decode("utf-8")
    else:
        return buffer.decode("utf-8")

def get_fill_value_line(store):
    return [line.strip(" ,")
            for line in to_str(store[".zarray"]).split("\n")
            if "fill_value" in line][0]

store = {}
z = make_array(
    store=store,
    shape=(1,),
    chunks=(1,),
    dtype="f4",
    fill_value=np.nan,
)
print(get_fill_value_line(store))
When running in a zarr>=3 environment:
micromamba create -n zarr3 python 'zarr>=3'
micromamba run -n zarr3 python3 check_nan_encoding.py
the following is printed:
zarr version 3.0.1
numcodecs version 0.15.0
python version 3.12.8 | packaged by conda-forge | (main, Dec 5 2024, 14:19:53) [Clang 18.1.8 ]
"fill_value": NaN
This is not valid JSON.
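One way to check this without a second tool is to make Python's own parser strict. The sketch below uses the parse_constant hook of json.loads, which is called for the NaN, Infinity and -Infinity tokens; the function name is_strict_json is an assumption for illustration.

```python
import json


def _reject_constant(name):
    # json.loads calls this hook for the tokens NaN, Infinity and -Infinity.
    raise ValueError(f"non-standard JSON constant: {name}")


def is_strict_json(text):
    """Return True if text parses as JSON without NaN/Infinity extensions."""
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        return False
    return True


print(is_strict_json('{"fill_value": NaN}'))    # False
print(is_strict_json('{"fill_value": "NaN"}'))  # True
```

Passing the decoded contents of .zarray to such a check would flag the output above while accepting the string-encoded form.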
When running the same code in a zarr>=2,<3 environment:
micromamba env create -n zarr2 -c conda-forge python 'zarr>=2,<3'
micromamba run -n zarr2 python3 check_nan_encoding.py
zarr version 2.18.4
numcodecs version 0.15.0
python version 3.13.1 | packaged by conda-forge | (main, Jan 13 2025, 09:45:31) [Clang 18.1.8 ]
"fill_value": "NaN"
This is correct according to the spec.
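For completeness, a reader of such metadata maps the string back to a float. The following is a rough sketch for float dtypes only; decode_fill_value is a hypothetical name and not zarr's API, and fill values of other dtypes are not handled.

```python
import json
import math

# String spellings the Zarr v2 spec uses for non-finite float fill values.
_SPECIAL_FLOATS = {"NaN": math.nan, "Infinity": math.inf, "-Infinity": -math.inf}


def decode_fill_value(raw):
    """Hypothetical helper: turn a JSON fill_value for a float dtype into a Python value."""
    if isinstance(raw, str):
        return _SPECIAL_FLOATS[raw]
    return raw


metadata = json.loads('{"dtype": "<f4", "fill_value": "NaN"}')
fill = decode_fill_value(metadata["fill_value"])
print(math.isnan(fill))
```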
Additional output
No response