Intake, catalogs, and datatree #134
Description
Thanks @TomNicholas and sorry for creating issue noise. I guess I got a bit carried away with these comments in the readme:
- Has functions for mapping user-supplied functions over every node in the tree,
- Automatically dispatches some of xarray.Dataset's API over every node in the tree (such as .isel),
I was thinking that maybe the datatree abstraction could be a more formalised and ultimately 'xarray native' approach to the problems that have been tackled by e.g. intake-esm and intake-thredds. Leaves in the tree could be compositions over netCDF files, which may be aggregated JSON indexes. I guess I was thinking that some sort of formalism over a nested data structure could help with composing dask computational graphs. I have run into issues where the scheduler gets overloaded, or just takes forever to start, for calculations across large datasets composed with e.g. `xarray.open_mfdataset`.
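To make the idea a bit more concrete, here is a rough sketch of what I mean by mapping a function over a nested catalog whose leaves are collections of files. It uses plain Python dicts only, not the datatree or intake API; the `catalog` layout, the file names, and the `map_over_leaves` helper are all hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical catalog: inner dicts are groups, lists are leaves that
# stand in for collections of netCDF files (names are made up).
catalog: Dict[str, Any] = {
    "ocean": {
        "model_a": ["a_2000.nc", "a_2001.nc"],
        "model_b": ["b_2000.nc"],
    },
    "atmos": {
        "model_a": ["c_2000.nc", "c_2001.nc", "c_2002.nc"],
    },
}


def map_over_leaves(
    tree: Dict[str, Any],
    func: Callable[[str, Any], Any],
    path: str = "",
) -> Dict[str, Any]:
    """Apply func(path, leaf) to every leaf, keeping the tree structure."""
    result: Dict[str, Any] = {}
    for name, node in tree.items():
        node_path = f"{path}/{name}"
        if isinstance(node, dict):
            # Group node: recurse into the subtree.
            result[name] = map_over_leaves(node, func, node_path)
        else:
            # Leaf node: apply the user-supplied function.
            result[name] = func(node_path, node)
    return result


# Example: count the files behind each leaf without opening any of them.
file_counts = map_over_leaves(catalog, lambda path, files: len(files))
```

In a real setup the function applied to each leaf might lazily open the files, so that each node carries its own smaller graph instead of one very large one, but that part is speculation on my side.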
I wonder if @andersy005, @mdurant or @rsignell have any experience with, or thoughts on, whether it makes sense to build an interface between this library and intake?
Originally posted by @pbranson in #97 (comment)