Open
Labels: bug, references generation
Description
I encountered this bug when working on a custom GRIB parser in https://github.com/MeteoSwiss/icon-ch-vzarr.
When trying to serialize virtual datasets as kerchunk references, the custom codec information is silently dropped. The array metadata looks like this (note the codecs entry):
ArrayV3Metadata(shape=(1, 1147980),
data_type=Float64(endianness='little'),
chunk_grid=RegularChunkGrid(chunk_shape=(1, 1147980)),
chunk_key_encoding=DefaultChunkKeyEncoding(name='default',
separator='/'),
fill_value=np.float64(0.0),
codecs=(EccodesCodec(),),
attributes={'_earthkit': {'b64message': 'R1JJQv//AAIAAAAAAAAAqgAAABUBANcA/w8BAQfpAQEUAAABAQAAABwCAP4AB+kBARU0AwAAAAAAAAAAAAAAAAEAAAAjAwAAEYRMAAAAZQYAAAEBF2Q9oldJWbZE0lSjzW4rwAAAACIEAAAAAAAAAgCXAAAAAAAAAABnAAAAAAL///////8AAAAVBQAAAfAAAEOIgACACgAAAAAAAAAGBv8AAAAFBzc3Nzc=',
'bitsPerValue': 16},
'long_name': '2m Temperature',
'standard_name': 'air_temperature',
'units': 'K'},
dimension_names=('valid_time', 'values'),
zarr_format=3,
node_type='array',
storage_transformers=())
However, the codec (filters/compressor) information is missing from the references (both entries are null):
{"version":1,"refs":{".zgroup":"{\"zarr_format\":2}",".zattrs":"{}","T_2M\/0.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010120",3424926372,2296130],"T_2M\/1.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010121",3427222332,2296130],"T_2M\/2.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010122",3424926372,2296130],"T_2M\/3.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010123",3424926372,2296130],"T_2M\/.zarray":"{\"shape\":[4,1147980],\"chunks\":[1,1147980],\"dtype\":\"<f8\",\"fill_value\":0.0,\"order\":\"C\",\"filters\":null,\"dimension_separator\":\".\",\"compressor\":null,\"attributes\":{},\"zarr_format\":2}","T_2M\/.zattrs":"{\"standard_name\":\"air_temperature\",\"long_name\":\"2m Temperature\",\"units\":\"K\",\"_earthkit\":{\"bitsPerValue\":16,\"b64message\":\"R1JJQv\/\/AAIAAAAAAAAAqgAAABUBANcA\/w8BAQfpAQEUAAABAQAAABwCAP4AB+kBARU0AwAAAAAAAAAAAAAAAAEAAAAjAwAAEYRMAAAAZQYAAAEBF2Q9oldJWbZE0lSjzW4rwAAAACIEAAAAAAAAAgCXAAAAAAAAAABnAAAAAAL\/\/\/\/\/\/\/8AAAAVBQAAAfAAAEOIgACACgAAAAAAAAAGBv8AAAAFBzc3Nzc=\"},\"_ARRAY_DIMENSIONS\":[\"valid_time\",\"values\"]}","CLCL\/0.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010120",3721779870,2296130],"CLCL\/1.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010121",3724075830,2296130],"CLCL\/2.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010122",3721779870,2296130],"CLCL\/3.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010123",3721779870,2296130],"CLCL\/.zarray":"{\"shape\":[4,1147980],\"chunks\":[1,1147980],\"dtype\":\"<f8\",\"fill_value\":0.0,\"order\":\"C\",\"filters\":null,\"dimension_separator\":\".\",\"compressor\":null,\"attributes\":{},\"zarr_format\":2}","CLCL\/.zattrs":"{\"standard_name\":\"unknown\",\"long_name\":\"Cloud Cover (800 hPa - 
Soil)\",\"units\":\"%\",\"_earthkit\":{\"bitsPerValue\":16,\"b64message\":\"R1JJQv\/\/AAIAAAAAAAAAqgAAABUBANcA\/w8BAQfpAQEUAAABAQAAABwCAP4AB+kBARU0BAAAAAAAAAAAAAAAAAEAAAAjAwAAEYRMAAAAZQYAAAEBF2Q9oldJWbZE0lSjzW4rwAAAACIEAAAAAAYWAgCXAAAAAAAAAABkAAABOIABAAAAAAAAAAAVBQAAAfAAAEOIgACACgAAAAAAAAAGBv8AAAAFBzc3Nzc=\"},\"_ARRAY_DIMENSIONS\":[\"valid_time\",\"values\"]}","TOT_PREC\/0.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010120",3792671464,194],"TOT_PREC\/1.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010121",3794978942,194],"TOT_PREC\/2.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010122",3792683114,194],"TOT_PREC\/3.0":["\/store_new\/mch\/msopr\/osm\/KENDA-CH1\/ANA25\/det\/iaf2025010123",3792680428,194],"TOT_PREC\/.zarray":"{\"shape\":[4,1147980],\"chunks\":[1,1147980],\"dtype\":\"<f8\",\"fill_value\":0.0,\"order\":\"C\",\"filters\":null,\"dimension_separator\":\".\",\"compressor\":null,\"attributes\":{},\"zarr_format\":2}","TOT_PREC\/.zattrs":"{\"standard_name\":\"unknown\",\"long_name\":\"Total Precipitation (Accumulation)\",\"units\":\"kg m-2\",\"_earthkit\":{\"bitsPerValue\":0,\"b64message\":\"R1JJQv\/\/AAIAAAAAAAAAwgAAABUBANcA\/w8BAQfpAQEUAAABAQAAABwCAP4AB+kBARU0BQAAAAAAAAAAAAAAAAEAAAAjAwAAEYRMAAAAZQYAAAEBF2Q9oldJWbZE0lSjzW4rwAAAADoEAAAACAE0AgCXAAAAAAAAAAABAAAAAAD\/\/\/\/\/\/\/8H6QEBFAAAAQAAAAABAgAAAAAA\/wAAAAAAAAAVBQAAAfAAAEOIgACACgAAAAAAAAAGBv8AAAAFBzc3Nzc=\"},\"_ARRAY_DIMENSIONS\":[\"valid_time\",\"values\"]}","valid_time\/0":"base64:AAAAAAAAAAABAAAAAAAAAAIAAAAAAAAAAwAAAAAAAAA=","valid_time\/.zarray":"{\"shape\":[4],\"chunks\":[4],\"dtype\":\"<i8\",\"fill_value\":null,\"order\":\"C\",\"filters\":null,\"dimension_separator\":\".\",\"compressor\":null,\"attributes\":{},\"zarr_format\":2}","valid_time\/.zattrs":"{\"units\":\"hours since 2025-01-01 20:00:00\",\"calendar\":\"proleptic_gregorian\",\"_ARRAY_DIMENSIONS\":[\"valid_time\"]}"}}
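A quick way to confirm the symptom is to scan the references for arrays whose .zarray entry has neither filters nor a compressor. The sketch below assumes the kerchunk version 1 JSON layout shown above; the function name arrays_without_codecs, the trimmed-down example dictionary, and the COMPRESSED array are hypothetical and only serve as illustration.

```python
import json


def arrays_without_codecs(references: dict) -> list[str]:
    """Return the names of arrays whose .zarray has no filters and no compressor."""
    missing = []
    for key, value in references["refs"].items():
        if not key.endswith("/.zarray"):
            continue
        # In kerchunk references, metadata entries are stored as JSON strings.
        meta = json.loads(value)
        if meta.get("filters") is None and meta.get("compressor") is None:
            missing.append(key.rsplit("/", 1)[0])
    return missing


# Trimmed-down stand-in for the references above (not the real file).
example = {
    "version": 1,
    "refs": {
        ".zgroup": '{"zarr_format":2}',
        "T_2M/.zarray": json.dumps(
            {"shape": [4, 1147980], "chunks": [1, 1147980], "dtype": "<f8",
             "filters": None, "compressor": None, "zarr_format": 2}
        ),
        # Hypothetical array with a compressor, for contrast.
        "COMPRESSED/.zarray": json.dumps(
            {"shape": [4], "chunks": [4], "dtype": "<f8",
             "filters": None, "compressor": {"id": "zlib", "level": 1},
             "zarr_format": 2}
        ),
    },
}

print(arrays_without_codecs(example))
```

Note that arrays which legitimately have no codecs (such as the inlined valid_time coordinate) would also be reported, so the result needs to be compared against the original array metadata.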
The issue seems to be here:
VirtualiZarr/virtualizarr/utils.py, line 138 in f3149d6:
v2_codecs = [
If my custom codec is a subclass of ArrayBytesCodec, it will be excluded. There is also a TODO left there, which may refer to this.
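The effect described above can be illustrated with a minimal sketch. It does not reproduce the VirtualiZarr code: the classes below are plain stand-ins for the zarr codec base classes, and the filter condition (skipping every ArrayBytesCodec instance) is an assumption based on the description in this issue. The names v2_codec_names and ZlibStandIn are hypothetical.

```python
# Stand-ins for the zarr codec base classes (assumption: not the real zarr classes).
class ArrayBytesCodec:
    pass


class BytesBytesCodec:
    pass


class EccodesCodec(ArrayBytesCodec):
    """Stand-in for the custom GRIB codec, which subclasses ArrayBytesCodec."""


class ZlibStandIn(BytesBytesCodec):
    """Hypothetical compressor-like codec, for contrast."""


def v2_codec_names(codecs) -> list[str]:
    """Keep only codecs that are not ArrayBytesCodec instances (assumed filter)."""
    return [
        type(codec).__name__
        for codec in codecs
        if not isinstance(codec, ArrayBytesCodec)
    ]


# The custom codec is the only entry, so nothing survives the filter.
print(v2_codec_names((EccodesCodec(),)))
# A bytes-to-bytes codec passes through, while the custom codec is still dropped.
print(v2_codec_names((EccodesCodec(), ZlibStandIn())))
```

With an empty list of v2 codecs, both filters and compressor would end up as null in the .zarray metadata, which matches the references shown above.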
I added a reproducible example here (run it with uv run virtualize_kenda.py): https://gist.github.com/frazane/d26fd8925aea11cadf5bb012d81c5c2e