Description
(This is on the list for V1)
HDF5 can read and write data using a scale-offset filter, which reduces the stored precision of the data as one stage in a filter/compression pipeline. From the h5py documentation it is not obvious that a reading library like pyfive has to do anything special to read such data: if all that has happened is that the precision has been cut before the data was compressed, then the decompression pipeline would not need to do anything new.
However, that doesn't sound like a scale-offset filter at all, and if we look a bit further under the hood (e.g. the hdf5 docs), we find that the algorithm is more complicated: the minimum number of bits and the minimum value must be stored alongside the compressed data, so that they are available for decompression and for reconstructing the values afterwards.
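To make the reconstruction step concrete, here is a minimal sketch of what the integer case looks like after the bit-unpacking stage: each stored value is an offset from the chunk minimum, `minbits` wide, and decoding just adds the minimum back. The function name and signature are illustrative, not part of any HDF5 or pyfive API, and the sketch ignores fill values and the floating-point (D-scaling) variant.

```python
import numpy as np

def decode_scaleoffset_int(unpacked, minimum, minbits):
    """Sketch of the post-decompression step for integer scale-offset data.

    `unpacked` holds the minbits-wide offsets already extracted from the
    bit-packed chunk; decoding adds the stored minimum back to each one.
    """
    if minbits == 0:
        # Special case: all values in the chunk were identical, so only the
        # minimum is stored and every element decodes to it.
        return np.full_like(unpacked, minimum)
    return unpacked + minimum

offsets = np.array([0, 3, 7, 2], dtype=np.int64)
print(decode_scaleoffset_int(offsets, minimum=100, minbits=3))
# -> [100 103 107 102]
```

The real filter also has to bit-unpack the offsets from the chunk bytes and handle fill values and floating-point scaling, which is where most of the implementation effort will go.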
Next steps would be to write some tests, with both integer and floating-point test data (definitely both) to compress. We'd need to see what appears in the filter pipeline message, and understand what the H5Z_FILTER_SCALEOFFSET filter entry looks like. Registered filter details can be found here.
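For generating that test data, h5py can apply the scale-offset filter directly via the `scaleoffset` argument to `create_dataset` (for integers, 0 asks HDF5 to pick the minimum number of bits itself; for floats, the value is the number of decimal digits to retain). A sketch (file and dataset names are arbitrary):

```python
import numpy as np
import h5py

with h5py.File("scaleoffset_test.h5", "w") as f:
    # Integer dataset: scaleoffset=0 lets HDF5 choose minbits, which is
    # lossless for integer data.
    f.create_dataset("ints", data=np.arange(100, dtype="i4"),
                     chunks=(20,), scaleoffset=0, compression="gzip")
    # Float dataset: scaleoffset=3 keeps 3 digits after the decimal point
    # (D-scaling), which is lossy.
    f.create_dataset("floats", data=np.linspace(0.0, 1.0, 100),
                     chunks=(20,), scaleoffset=3, compression="gzip")
```

Dumping such a file (e.g. with `h5dump -p`) should then show the scale-offset entry, with its client data, in the filter pipeline message.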
Note: Should we consider the n-bit filter (H5Z_FILTER_NBIT) at the same time? Will we get it for free during the implementation?
Note: In the implementation, also consider to what extent we want to support hdf5plugin (or something similar), including its Blosc extras, so we can deal with any Blosc filtering operation.
Note: We don't think NetCDF4 supports any of this.