Problem description
Checksums for individual chunks would let us verify the integrity of the data we're loading. The existing mechanisms for checksumming data are inadequate for the following reasons:
- Checksum of the entire array's data: This cannot be verified when only a subset of the data is loaded.
- Checksum of each individual chunk, recorded by a filter as part of the chunk: This does not protect against chunks being swapped, and does not help with building a persistent cache of previously read chunks.
Recording the checksums in the .zarray file could work, but may be problematic for larger data sets, since the metadata would grow with the number of chunks.
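
To make the idea concrete, here is a minimal sketch of checksums kept outside the chunks and keyed by chunk key. It is not an existing zarr feature: the store is modeled as a plain dict of bytes, and `chunk_digest`, `build_manifest`, and `verify_chunk` are hypothetical names used only for illustration. SHA-256 is an arbitrary choice of hash.

```python
import hashlib


def chunk_digest(data: bytes) -> str:
    """Return a hex digest of one chunk's stored bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(store: dict) -> dict:
    """Map each chunk key to its digest, skipping metadata keys such as '.zarray'."""
    return {
        key: chunk_digest(value)
        for key, value in store.items()
        if not key.startswith(".")
    }


def verify_chunk(store: dict, manifest: dict, key: str) -> bool:
    """Check a single chunk against the manifest without reading any other chunk."""
    return chunk_digest(store[key]) == manifest[key]


# Hypothetical store: a dict standing in for a chunk store.
store = {".zarray": b"{}", "0.0": b"chunk-a", "0.1": b"chunk-b"}
manifest = build_manifest(store)

# Swapping two chunks leaves each chunk internally intact, but the
# key-to-digest mapping no longer matches, so the swap is detected.
store["0.0"], store["0.1"] = store["0.1"], store["0.0"]
swap_detected = not verify_chunk(store, manifest, "0.0")
```

Because the digest is known before the chunk is read, it could also serve as the key of a persistent cache of previously read chunks. Where the manifest itself should live is left open here.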
see also: