I'd like to give a short recap of the current proposals for refactoring the AbstractDataStore class, as discussed with @shoyer, @jhamman, and @alexamici.
Proposal 1: Store returns:
- xr.Variables with the list of filters to apply to every variable
- dataset attributes
- encodings
Before building the xr.Dataset, xarray applies to each variable only the filters selected by the backend.
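To make Proposal 1 concrete, here is a minimal sketch of how the split of responsibilities could look. It uses only the standard library: `Variable` is a stand-in for xr.Variable, and the filter names (`mask`, `scale`), the `FILTERS` registry, and the `store_output` / `decode` helpers are all hypothetical names chosen for illustration, not existing xarray API.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Stand-in for xr.Variable (assumption: stdlib only, not the real class)."""
    dims: tuple
    data: list
    attrs: dict = field(default_factory=dict)


def mask_fill_value(var):
    """Hypothetical filter: replace values equal to attrs['_FillValue'] with None."""
    attrs = dict(var.attrs)
    fill = attrs.pop("_FillValue")
    return Variable(var.dims, [None if x == fill else x for x in var.data], attrs)


def apply_scale(var):
    """Hypothetical filter: multiply non-missing values by attrs['scale_factor']."""
    attrs = dict(var.attrs)
    scale = attrs.pop("scale_factor")
    data = [None if x is None else x * scale for x in var.data]
    return Variable(var.dims, data, attrs)


# Filters live on the xarray side and are looked up by name.
FILTERS = {"mask": mask_fill_value, "scale": apply_scale}


def store_output():
    """What a Proposal 1 store might return: variables, filter names, attrs, encodings."""
    variables = {
        "t": Variable(("x",), [1, 2, -999], {"scale_factor": 0.5, "_FillValue": -999}),
        "id": Variable(("x",), [10, 20, 30]),
    }
    filters = {"t": ["mask", "scale"], "id": []}  # selected by the backend
    return variables, filters, {"title": "demo"}, {}


def decode(variables, filters):
    """xarray side: apply to each variable only the filters the backend selected."""
    decoded = {}
    for name, var in variables.items():
        for filter_name in filters[name]:
            var = FILTERS[filter_name](var)
        decoded[name] = var
    return decoded
```

Calling `decode(*store_output()[:2])` would run `mask` and then `scale` on `t` and leave `id` untouched. A backend that wants full control could simply return an empty filter list for every variable.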
Proposal 2: Store returns:
- xr.Variables with all needed filters applied (configured by xarray),
- dataset attributes
- encodings
Xarray builds the xr.Dataset
Proposal 3: Store returns:
- xr.Dataset
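As a rough illustration of Proposal 3, the store could assemble the whole dataset itself, including the split between data variables and coordinates. The sketch below is stdlib only: a plain dict stands in for xr.Dataset, each variable is a `(dims, data, attrs)` tuple, and `DatasetBackEnd`, `split_coordinates`, and the `open` method are hypothetical names. It assumes coordinates are identified through a CF-style `coordinates` attribute.

```python
def split_coordinates(variables):
    """Split variables into data variables and coordinates.

    Assumption: a variable's 'coordinates' attribute holds a
    space-separated list of coordinate variable names (CF style).
    """
    coord_names = set()
    for _dims, _data, attrs in variables.values():
        coord_names.update(attrs.get("coordinates", "").split())
    data_vars = {k: v for k, v in variables.items() if k not in coord_names}
    coords = {k: v for k, v in variables.items() if k in coord_names}
    return data_vars, coords


class DatasetBackEnd:
    """Hypothetical Proposal 3 backend: returns a complete dataset-like object."""

    def __init__(self, raw, decode_coordinates=True, **kwargs):
        self._raw = raw
        self._decode_coordinates = decode_coordinates

    def open(self):
        if self._decode_coordinates:
            data_vars, coords = split_coordinates(self._raw)
        else:
            data_vars, coords = dict(self._raw), {}
        # A plain dict stands in for xr.Dataset here.
        return {"data_vars": data_vars, "coords": coords, "attrs": {}}
```

With this shape, xarray would not need to post-process anything; the trade-off is that every backend has to handle coordinate decoding on its own.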
Before going on, I'd like to collect pros and cons. As I understand it:
Proposal 1
pros:
- the backend is free to decide which representation to provide.
- more control for the backend (? not necessarily true: the backend can decide to apply all the filters internally and give xarray an empty list of filters to apply)
- the logic to enable / disable filters would live in xarray.
- all the filters (applied by xarray) should have a similar interface.
- maybe registered filters could be used by other backends
cons:
- confusing backend-xarray interface.
- more difficult to define interfaces. More conflicts (registered filters with the same name...)
- need more structure to define this interface, more code to maintain.
Proposal 2
pros:
- the backend-xarray interface is clearer: the backend and xarray have distinct, well-defined tasks.
- interface would be minimal and easier to implement
- no intermediate representations
- less code to maintain
cons:
- less control over filters.
- more complex explicit definition of the interface (every filter must understand what decode_times means in its case)
- more complexity inside the filters
The minimal interface would be something like this:
class AbstractBackEnd:
    # Same signature as open_dataset; the remaining arguments are elided here.
    def __init__(self, path, decode_times=True, **kwargs):
        raise NotImplementedError

    def get_variables(self):
        """Return a dictionary mapping variable names to xr.Variable objects."""
        raise NotImplementedError

    def get_attrs(self):
        """Return the dataset attributes."""
        raise NotImplementedError

    def get_encoding(self):
        """Return the encodings."""
        raise NotImplementedError

    def close(self):
        pass
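A concrete backend following this interface might look like the sketch below. It is stdlib only and purely illustrative: `InMemoryBackEnd` and `build_dataset` are hypothetical names, variables are `(dims, data, attrs)` tuples rather than real xr.Variable objects, and the handling of the `decode_times` flag (converting integer day offsets tagged with `units = "days since 1970-01-01"`) is an assumption about what a backend could do, not a description of xarray's decoding.

```python
from datetime import date, timedelta


class InMemoryBackEnd:
    """Hypothetical backend, duck-typed to the minimal interface above."""

    def __init__(self, raw, decode_times=True, **kwargs):
        self._raw = raw  # {name: (dims, data, attrs)}
        self._decode_times = decode_times

    def get_variables(self):
        """Return variables with all requested decoding already applied."""
        variables = {}
        for name, (dims, data, attrs) in self._raw.items():
            if self._decode_times and attrs.get("units") == "days since 1970-01-01":
                # The backend decides what decode_times means for its own data.
                data = [date(1970, 1, 1) + timedelta(days=d) for d in data]
                attrs = {k: v for k, v in attrs.items() if k != "units"}
            variables[name] = (dims, data, attrs)
        return variables

    def get_attrs(self):
        return {"source": "in-memory"}

    def get_encoding(self):
        return {}

    def close(self):
        pass


def build_dataset(backend):
    """Stand-in for the xarray side: assemble the pieces, no further decoding."""
    try:
        return {
            "variables": backend.get_variables(),
            "attrs": backend.get_attrs(),
            "encoding": backend.get_encoding(),
        }
    finally:
        backend.close()
```

The point of the sketch is that `build_dataset` never inspects the variables: all decoding decisions stay inside the backend, and xarray only forwards the `open_dataset` arguments and assembles the result.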
Proposal 3
pros w.r.t. proposal 2:
- decode_coordinates is done by the backend, like the other filters.
cons?
Any suggestions?