Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
106 changes: 104 additions & 2 deletions doc/p5dump.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,12 @@ p5dump
******

``pyfive`` includes a command line tool ``p5dump`` which can be used to dump the contents of an HDF5 file to the
terminal. This is similar to the ``ncdump`` tool included with the NetCDF library, or the ``h5dump`` tool included
terminal (e.g ``p5dump myfile.hdf5``). This is similar to the ``ncdump`` tool included with the NetCDF library, or the ``h5dump`` tool included
with the HDF5 library, but like the rest of pyfive, is implemented in pure Python without any dependencies on the
HDF5 C library.

It is not identical to either of these tools, though the default output is very close to that of ``ncdump``.
When called with `-s` (e.g ``p5dump -s myfile.hdf5``, or simply ``p5dump myfile.hdf5``) the output provides extra information for chunked
When called with `-s` (e.g ``p5dump -s myfile.hdf5``) the output provides extra information for chunked
datasets, including the locations of the start and end of the chunk index b-tree
and the location of the first data chunk for that variable. This extra information is useful for understanding
the performance of data access for chunked variables, particularly when accessing data in object stores such as
Expand All @@ -16,4 +16,106 @@ performance may be sub-optimal - in this situation, if you have control over the
consider using the ``h5repack`` tool from the standard HDF5 distribution to make a copy of the file with the
chunk index and attributes stored contiguously. All tools which read HDF5 files will benefit from this.

A ``p5dump`` example:

.. code-block:: console

$ p5dump myfile.hdf5
File: myfile.hdf5 {
dimensions:
lon = 8;
bounds2 = 2;
lat = 5;
variables:
float64 lat_bnds(lat, bounds2) ;
float32 bounds2(bounds2) ;
float64 lat(lat) ;
lat:units = "degrees_north" ;
lat:standard_name = "latitude" ;
lat:bounds = "lat_bnds" ;
float64 lon_bnds(lon, bounds2) ;
float64 lon(lon) ;
lon:units = "degrees_east" ;
lon:standard_name = "longitude" ;
lon:bounds = "lon_bnds" ;
float64 time ;
time:units = "days since 2018-12-01" ;
time:standard_name = "time" ;
float64 q(lat, lon) ;
q:project = "research" ;
q:standard_name = "specific_humidity" ;
q:units = "1" ;
q:coordinates = "time" ;
q:cell_methods = "area: mean" ;
// global attributes:
q:Conventions = "CF-1.12" ;
}

With the ``-s`` option, extra attributes are displayed: *_Storage*,
*_n_chunks*, *_chunk_shape*, *_btree_range*, *_first_chunk*, and
*_compression*:

.. code-block:: console

$ p5dump -s myfile.hdf5
File: example_field_0.nc {
dimensions:
lat = 5;
lon = 8;
bounds2 = 2;
variables:
float64 lat_bnds(lat, bounds2) ;
lat_bnds:_Storage = "Chunked" ;
lat_bnds:_n_chunks = 1 ;
lat_bnds:_chunk_shape = (5, 2) ;
lat_bnds:_btree_range = (6144, 6144) ;
lat_bnds:_first_chunk = 10808 ;
lat_bnds:_compression = "gzip(4)" ;
float32 bounds2(bounds2) ;
bounds2:_Storage = "Contiguous" ;
float64 lat(lat) ;
lat:units = "degrees_north" ;
lat:standard_name = "latitude" ;
lat:bounds = "lat_bnds" ;
lat:_Storage = "Chunked" ;
lat:_n_chunks = 1 ;
lat:_chunk_shape = (5,) ;
lat:_btree_range = (10836, 10836) ;
lat:_first_chunk = 12932 ;
lat:_compression = "gzip(4)" ;
float64 lon_bnds(lon, bounds2) ;
lon_bnds:_Storage = "Chunked" ;
lon_bnds:_n_chunks = 1 ;
lon_bnds:_chunk_shape = (8, 2) ;
lon_bnds:_btree_range = (12959, 12959) ;
lon_bnds:_first_chunk = 15575 ;
lon_bnds:_compression = "gzip(4)" ;
float64 lon(lon) ;
lon:units = "degrees_east" ;
lon:standard_name = "longitude" ;
lon:bounds = "lon_bnds" ;
lon:_Storage = "Chunked" ;
lon:_n_chunks = 1 ;
lon:_chunk_shape = (8,) ;
lon:_btree_range = (15621, 15621) ;
lon:_first_chunk = 17717 ;
lon:_compression = "gzip(4)" ;
float64 time ;
time:units = "days since 2018-12-01" ;
time:standard_name = "time" ;
time:_Storage = "Contiguous" ;
float64 q(lat, lon) ;
q:project = "research" ;
q:standard_name = "specific_humidity" ;
q:units = "1" ;
q:coordinates = "time" ;
q:cell_methods = "area: mean" ;
q:_Storage = "Chunked" ;
q:_n_chunks = 4 ;
q:_chunk_shape = (3, 4) ;
q:_btree_range = (20663, 20663) ;
q:_first_chunk = 17755 ;
q:_compression = "gzip(4)" ;
// global attributes:
q:Conventions = "CF-1.12" ;
}
Loading