Discussion: zarr-developers/zarr-python#2904
This benchmark writes a (100, 1000, 1000) ndarray of float32 data split into 10 chunks along the first dimension.
| Component | Shape | nbytes |
|---|---|---|
| Chunk | (10, 1000, 1000) | 40,000,000 |
| Array | (100, 1000, 1000) | 400,000,000 |
Peak memory usage is about 1.1 GiB.
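For reference, the nbytes in the table follow directly from the shapes and the 4-byte float32 itemsize, and the array size converts to the MiB figure quoted for the uncompressed read below:

```python
import numpy as np

itemsize = np.dtype("float32").itemsize      # 4 bytes
chunk_nbytes = 10 * 1000 * 1000 * itemsize   # 40,000,000
array_nbytes = 100 * 1000 * 1000 * itemsize  # 400,000,000

print(array_nbytes / 2**20)                  # ~381.5 MiB, the read-uncompressed figure below
```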
Questions: what's the memory overhead of zstd? Do we know the uncompressed size? Can we tell zstd that?
Can we effectively readinto the decompression buffer? Maybe...
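On the zstd questions: with the python-zstandard package, the frame header can record the decompressed size (`zstandard.get_frame_parameters` exposes it as `content_size`), and the streaming decompressor's reader supports `readinto`, so decompressing into a preallocated buffer looks possible in principle. A rough sketch under those assumptions, not zarr's current codec path:

```python
import numpy as np
import zstandard

def zstd_decompress_into(compressed: bytes, out: np.ndarray) -> int:
    """Sketch: decompress a zstd frame straight into the memory backing `out`.

    Not what zarr's ZstdCodec does today; `out` must be C-contiguous and at
    least as large as the decompressed data.
    """
    # If the compressor knew the input size, the frame header records it,
    # which answers "do we know the uncompressed size?".
    params = zstandard.get_frame_parameters(compressed)
    print("content size recorded in frame:", params.content_size)

    dest = memoryview(out).cast("B")
    reader = zstandard.ZstdDecompressor().stream_reader(compressed)
    n = 0
    while n < len(dest):
        read = reader.readinto(dest[n:])
        if read == 0:  # end of frame
            break
        n += read
    return n
```

For frames that don't record their size, `ZstdDecompressor.decompress(data, max_output_size=...)` is a way to bound the allocation, though it still returns a freshly allocated bytes object rather than writing into caller-owned memory.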
Why does buf.as_numpy_array apparently allocate memory?
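One way to answer that empirically is to compare against a known zero-copy view; `np.shares_memory` shows whether a conversion reuses the source buffer or allocates a new one. This is a generic check on plain bytes, not an inspection of the Buffer internals:

```python
import numpy as np

raw = bytes(40_000_000)                             # stand-in for bytes read from the store

zero_copy = np.frombuffer(raw, dtype=np.uint8)      # wraps the existing buffer, no allocation
copied = np.frombuffer(raw, dtype=np.uint8).copy()  # allocates a second ~40 MB block

print(np.shares_memory(zero_copy, np.frombuffer(raw, dtype=np.uint8)))  # True
print(np.shares_memory(zero_copy, copied))                              # False
```

If `as_numpy_array` behaves like the `.copy()` line, that would explain the extra allocation.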
For this special case (uncompressed data), the peak memory usage ought to be about the size of the ndarray. Currently, it's about 2x.
This is probably because LocalStore uses path.read_bytes, and then we copy those bytes into an array using prototype.buffer.from_bytes. See here.
Optimally, we would use readinto to fill the memory backing the out ndarray. With enough effort that's probably doable; given how rare uncompressed data is in practice, it might not be worthwhile.
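As a sketch of what that readinto path could look like for an uncompressed chunk (the path and out array here are hypothetical, and this bypasses the Buffer prototype entirely):

```python
import numpy as np

def read_uncompressed_chunk_into(path: str, out: np.ndarray) -> None:
    """Fill a preallocated, C-contiguous ndarray directly from an uncompressed
    chunk file, skipping the intermediate bytes object from path.read_bytes()."""
    dest = memoryview(out).cast("B")
    with open(path, "rb") as f:
        n = 0
        while n < len(dest):
            read = f.readinto(dest[n:])
            if read == 0:
                raise EOFError(f"{path} is shorter than the destination buffer")
            n += read

# Hypothetical usage matching the benchmark's shapes:
# out = np.empty((100, 1000, 1000), dtype="float32")
# read_uncompressed_chunk_into("path/to/chunk-0", out[0:10])  # first chunk -> first 10 planes
```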
As a test for what's possible, sol.py implements basic reads for compressed and uncompressed data.
- read uncompressed: 381.5 MiB (~1x the size of the array. Best we can do.)
- read compressed: 734.1 MiB (size of the array + size of the compressed data. Best we can do.)