Should open_dataset recommend using open_datatree for HDF5 files? #9891

Open
@kafitzgerald

What is your issue?

For context, I'm at the Pangeo hack day following AGU with the DataTree group, and while getting started I noticed that open_dataset is a bit quiet about not reading in all of the file's contents for HDF5 files.

open_datatree now does this nicely, or you can pass the group keyword to open_dataset to read a single group, but it could be nice to push users in that direction and let them know that groups aren't being read by default.

Not sure on implementation and/or if this is necessarily desirable in all cases, but just a thought from the perspective of someone new to DataTree.
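
One possible shape for this, purely as a sketch and not a proposal for how xarray should implement it: a thin wrapper that opens the dataset as usual and, when no group was requested, checks whether the file has any non-root groups and warns if so. The names format_group_hint and open_dataset_with_hint are hypothetical, the wording of the message is made up, and using xr.open_groups to list the groups is an assumption about what is cheap enough to do here (it opens every group lazily, which may not be acceptable for all files or backends).

```python
import warnings


def format_group_hint(group_paths, max_shown=5):
    """Return a hint message if any non-root groups exist, else None.

    Hypothetical helper: `group_paths` is any iterable of group paths
    such as "/" or "/Grid".
    """
    children = sorted(p for p in group_paths if p not in ("", "/"))
    if not children:
        return None
    shown = ", ".join(children[:max_shown])
    extra = len(children) - max_shown
    if extra > 0:
        shown += f", ... (+{extra} more)"
    return (
        f"File contains {len(children)} group(s) that were not read: {shown}. "
        "Use xarray.open_datatree to read the full hierarchy, or pass "
        "group=... to open_dataset to read a single group."
    )


def open_dataset_with_hint(path, **kwargs):
    """Hypothetical wrapper: open_dataset plus a warning about unread groups."""
    import xarray as xr  # imported lazily so the helper above has no dependencies

    ds = xr.open_dataset(path, **kwargs)
    if kwargs.get("group") is None:
        # Assumption: open_groups returns a dict mapping group paths to Datasets
        groups = xr.open_groups(path, engine=kwargs.get("engine"))
        hint = format_group_hint(groups.keys())
        for group_ds in groups.values():
            group_ds.close()
        if hint:
            warnings.warn(hint, UserWarning, stacklevel=2)
    return ds
```

With something like this, calling open_dataset_with_hint on a file that has groups would return the same Dataset as open_dataset but also emit a UserWarning naming the groups that were skipped. Whether a warning is the right mechanism (versus a note in the repr or the docs) is an open question.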

Reproducible example:

import geocat.datafiles as gdf
import xarray as xr

# Resolve the sample file path once and reuse it for both calls
path = gdf.get('hdf_files/3B-MO.MS.MRG.3IMERG.20140701-S000000-E235959.07.V03D.HDF5')

# Reads the full group hierarchy
dt = xr.open_datatree(path)
dt

# Reads only the root group
ds = xr.open_dataset(path)
ds

open_dataset only reads in the root-level attributes and gives no indication that the file contains anything more.

open_datatree successfully reads the groups, data variables, etc. as expected.
