Document how chunking interacts with NetCDF compression #6940

@trexfeathers

Description

📚 Documentation

Some brief documentation should be written about how chunking can affect NetCDF compression. I assume this is because each chunk is compressed independently, with no re-use of compression state between chunks, but I am not sure.

It is hard to know where this ought to be placed; that would be easier after #6868.

import dask.array as da
import iris
from iris.util import make_gridcube


N_X = 2560
N_Y = 1920

# Identical rows, repeated down the y dimension: highly redundant data.
data = da.broadcast_to(da.arange(N_X), (N_Y, N_X))

# The same data with three different Dask chunkings (-1 = whole dimension).
x_chunked = make_gridcube(N_X, N_Y)
x_chunked.data = data.rechunk([-1, 20])  # split along x
y_chunked = make_gridcube(N_X, N_Y)
y_chunked.data = data.rechunk([20, -1])  # split along y
no_chunked = make_gridcube(N_X, N_Y)
no_chunked.data = data.rechunk([-1, -1])  # a single chunk

iris.save(x_chunked, "x_chunked.nc", zlib=True)
iris.save(y_chunked, "y_chunked.nc", zlib=True)
iris.save(no_chunked, "no_chunked.nc", zlib=True)
Resulting file sizes:

  • x_chunked.nc: 2.5M
  • y_chunked.nc: 1.9M
  • no_chunked.nc: 149K
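The independent-chunk hypothesis can be sketched without Iris or NetCDF at all: zlib (the same DEFLATE codec used for `zlib=True`) can only exploit redundancy within the buffer it is given, so compressing many chunks separately forfeits any redundancy *between* chunks. A minimal sketch (the 4 KiB row size and one-chunk-per-row layout are illustrative choices, not taken from the issue):

```python
import zlib

import numpy as np

# One incompressible "row", repeated many times: all the redundancy in this
# data is *between* rows, none of it within a single row.
rng = np.random.default_rng(0)
row = rng.bytes(4096)
data = row * 512  # 2 MiB in total

# Compress everything as one buffer: zlib can refer back to earlier rows.
whole = len(zlib.compress(data))

# Compress each row as its own independent "chunk": no shared state, so
# every chunk pays the full cost of its (incompressible) contents.
chunked = sum(
    len(zlib.compress(data[i : i + 4096])) for i in range(0, len(data), 4096)
)

print(f"one buffer: {whole} B, per-chunk total: {chunked} B")
```

Here the per-chunk total comes out far larger than the single-buffer size, mirroring the pattern above where the unchunked file is much smaller than either chunked file. (In a real NetCDF-4 file, HDF5 also adds per-chunk index overhead on top of this effect.)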
