This repository was archived by the owner on May 11, 2019. It is now read-only.

Description
I think we should discuss whether to use xarray as the common interface for all of the benchmarks we evaluate as part of this project. There are pros and cons either way. I bring this up because I noticed the direct use of h5py in #4.
Pros include:
- Xarray provides a common interface on which we can build real-world science problems without writing a custom interface to each storage API (that's what xarray does)
- Within Pangeo, we are promoting the use of high-level data structures (typically xarray, but Iris as well)
Cons include:
- There are some known performance problems with xarray backends. Some of these are not specific to a particular storage format and can potentially be side-stepped by using the lower-level storage API.
- Using xarray assumes that we have implemented each API in a fair / equivalent way. An incomplete or ill-performing implementation could bias the results against one backend.
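To make the trade-off concrete, here is a minimal sketch of how the same read could be timed through xarray and through the lower-level API directly. The helper names (`read_with_xarray`, `read_with_h5py`, `best_time`) and the file and variable names in the usage comment are hypothetical, not something taken from #4. Note that `xr.open_dataset` applies CF decoding (e.g. mask and scale) by default, so the two readers are not guaranteed to do equivalent work, which is exactly the fairness concern above.

```python
import time

import numpy as np


def read_with_xarray(path, var, engine=None):
    """Read one variable through xarray; `engine` selects the backend."""
    import xarray as xr  # lazy import: the timing harness itself needs no xarray

    with xr.open_dataset(path, engine=engine) as ds:
        # .values loads the data into a numpy array before the file is closed
        return ds[var].values


def read_with_h5py(path, var):
    """Read the same variable directly through h5py, bypassing xarray."""
    import h5py  # lazy import, same reason as above

    with h5py.File(path, "r") as f:
        return f[var][...]


def best_time(reader, repeats=3):
    """Call `reader` `repeats` times; return (best wall-clock seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = reader()
        best = min(best, time.perf_counter() - start)
    return best, result


# Hypothetical usage; "data.nc" and "temperature" are placeholder names:
# t_xr, a = best_time(lambda: read_with_xarray("data.nc", "temperature", engine="h5netcdf"))
# t_h5, b = best_time(lambda: read_with_h5py("data.nc", "temperature"))
# np.testing.assert_allclose(a, b)  # check both paths return the same data
```

If the xarray timing is much worse than the direct one for the same data, that points at the backend implementation rather than the storage format.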
My vote would be to use xarray until we find that we need more fine-grained tests. I think this will make implementing real-world workflows easier, and it will help us xarray developers understand chokepoints in the backends that we currently support.