Closed
Description
With input.decode_cf set to false (the default), nc2zarr loads encoded data into memory. The first time a variable is written, its encoded data is correctly written as-is, even if the variable carries the scale_factor or add_offset attributes. But when nc2zarr (actually the xarray.to_zarr() method) attempts to append a new time step to existing data, it uses the existing variable's encoding information to encode the already encoded data a second time before writing it. This leads to corrupt datasets whenever scale_factor or add_offset attributes are used.
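The effect of the second encoding pass can be illustrated with a small, dependency-free sketch. This is a simplified model of CF packing, not code from nc2zarr or xarray: the helpers cf_encode and cf_decode and the sample values are hypothetical, and it assumes the usual packing rule decoded = packed * scale_factor + add_offset.

```python
# Simplified model of CF packing (hypothetical helpers, not xarray code).

def cf_encode(values, scale_factor, add_offset):
    """Pack values: (value - add_offset) / scale_factor, rounded to an integer."""
    return [round((v - add_offset) / scale_factor) for v in values]


def cf_decode(packed, scale_factor, add_offset):
    """Unpack values: packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


scale_factor, add_offset = 0.5, 10.0
physical = [10.0, 12.5, 15.0]

# What sits in memory when decoding is disabled: already packed values.
packed = cf_encode(physical, scale_factor, add_offset)

# First write: the packed values are stored unchanged.
first_write = packed

# Append: the stored encoding is applied to the packed values a second time.
appended = cf_encode(packed, scale_factor, add_offset)

# A reader decodes both slices once.
decoded_first = cf_decode(first_write, scale_factor, add_offset)
decoded_appended = cf_decode(appended, scale_factor, add_offset)

print(decoded_first)     # physical values are recovered
print(decoded_appended)  # packed integers come back instead of physical values
```

In this model the appended slice decodes to the packed integers rather than the physical values, so slices written by the first write and by later appends are no longer consistent with each other.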
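The root cause lies in the xarray.to_zarr() method, which behaves inconsistently: it writes already encoded data unchanged on the first write but applies the stored variable encoding again on append. A new encode_cf keyword argument would also help.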