-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add new backend api documentation #4810
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aurghs
I have mostly minor suggestions.
One high-level comment is that it would be nice to have an example implementation with pseudo-code. That would make it easier to understand.
class YourBackendArray(BackendEntrypoint):
def open_dataset(...):
ds = ...
# example built-in decoder use
return ds
open_dataset_parameters = ...
def guess_can_open(filename_obj):
# do checks
return True
doc/internals.rst
Outdated
|
||
- Create a class that inherits from Xarray py:class:`~xarray.backend.commonBackendEntrypoint` | ||
- Implement the method ``open_dataset`` that returns an instance of :py:class:`~xarray.Dataset` | ||
- Declare such a class as an external plugin in your setup.py. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we link to the entrypoints documentation here.
- Declare such a class as an external plugin in your setup.py. | |
- Declare such a class as an external plugin in your ``setup.py``. |
EDIT: I see it's linked below. Won't hurt to add a link here also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think that an example would go a long way. It could also be a good idea to point to the simplest of the implemented backends.
I tried to look at some of the backends that are now available, e.g. pydap and one thing that's missing from the description here is the DataStore
.
Not for here but I did also not find much on what a LazilyOuterIndexedArray
is etc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the need for examples, but overall I think this is pretty well structured. In case it is of use, I find this pretty helpful: https://documentation.divio.com/how-to-guides/
I think it would be useful to have a comparison between the old and new API, but I'm not sure if the old API ever was part of the public API (if not this is not necessary).
BackendArray
, CachingFileManager
, LazilyOuterIndexedArray
, and LazilyVectorizedIndexedArray
are not yet public API (I think?), so if we want to change that I think we should make sure their purpose is properly documented (it looks like that's what you were going to do, though).
This makes internals
a fairly long document, so we should think about splitting this into different pages (not in this PR). I think @andersy005 started something like that in #4835.
doc/internals.rst
Outdated
to integrate any code in Xarray; all you need to do is approaching the | ||
following steps: | ||
|
||
- Create a class that inherits from Xarray py:class:`~xarray.backend.commonBackendEntrypoint` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Create a class that inherits from Xarray py:class:`~xarray.backend.commonBackendEntrypoint` | |
- Create a class that inherits from :py:class:`~xarray.backends.common.BackendEntrypoint` |
also, I think I read somewhere that the official spelling of the package is xarray
(I can't remember where, though). If that is still valid, I think we should make sure the documentation follows that.
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Co-authored-by: keewis <keewis@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Co-authored-by: keewis <keewis@users.noreply.github.com>
…cumentation-draft � Conflicts: � doc/api-hidden.rst � doc/internals.rst
@jp-dark - Given your recent experience developing a backend using this new API, it would be great if you could provide some comments on this PR. |
@jhamman As I'm currently implementing xarray backends for reading weather radar data, I've already closely followed the adventures of @aurghs and @alexamici. To the current state of the backend documentation I can say that it is very well written and understandable (also for non native speakers) and I could easily follow the train of thought. From my perspective: I would not try to show extensive examples in the beginning, but would extend/rearrange documentation after first users issued problems/asked questions. So that only things get special documentation which are not immediately clear from users perspective. This is a really remarkable extension which will boost usage of xarray. Thanks! |
If you support more complex indexing as `explicit indexing` or | ||
`numpy indexing`, you can have a look to the implemetation of Zarr backend and Scipy backend, | ||
currently available in :py:mod:`~xarray.backends` module. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be useful to add a discussion here on how a raw key is transformed to an ExplicitIndexer
when querying an xarray DataArray
with a third-party backend. This would be especially useful for picking good test cases for querying DataArray
s and debugging problems in edge-cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try to add something about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at what to add here. I started writing some more details, but I think It would become a little bit too long.
Today we should merge. So I suggest to merge it as it is, and I'll add later a dedicated section.
doc/internals.rst
Outdated
|
||
- Implement a function that returns an instance :py:class:``Dataset`` | ||
|
||
- Create a `BackendEntrypoint`` instance with your function as input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Create a `BackendEntrypoint`` instance with your function as input. | |
- Create a ``BackendEntrypoint`` instance with your function as input. |
doc/internals.rst
Outdated
|
||
**Output** | ||
|
||
```BackendEntrypoint`.open_dataset`` output shall be an instance of Xarray :py:class:`Dataset` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
```BackendEntrypoint`.open_dataset`` output shall be an instance of Xarray :py:class:`Dataset` | |
``BackendEntrypoint.open_dataset`` output shall be an instance of Xarray :py:class:`Dataset` |
doc/internals.rst
Outdated
While ``open_dataset`` is mandatory, ``open_dataset_parameters`` and ``guess_can_open`` are optional. | ||
|
||
|
||
BackendEntrypoint.open_dataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider documenting the interface as a class? That might make it easier to document (e.g., in the form of docstrings).
Co-authored-by: Stephan Hoyer <shoyer@google.com>
Co-authored-by: Stephan Hoyer <shoyer@google.com>
Co-authored-by: Julia Dark <github@email.jpdark.com>
Co-authored-by: Julia Dark <github@email.jpdark.com>
…indow * upstream/master: add polyval to polyfit see also (pydata#5020) mention map_blocks in the docstring of apply_ufunc (pydata#5011) Switch backend API to v2 (pydata#4989) WIP: add new backend api documentation (pydata#4810) pin netCDF4=1.5.3 in min-all-deps (pydata#4982) fix matplotlib errors for single level discrete colormaps (pydata#4256) Adapt exception handling in CFTimeIndex.__sub__ and __rsub__ (pydata#5006) Update options.py (pydata#5000) Adjust tests to use updated pandas syntax for offsets (pydata#4537) add a combine_attrs parameter to Dataset.merge (pydata#4895) Support for dask.graph_manipulation (pydata#4965) raise on passing axis to Dataset.reduce methods (pydata#4940)
…-tasks * upstream/master: Fix regression in decoding large standard calendar times (pydata#5050) Fix sticky sidebar responsiveness on small screens (pydata#5039) Flexible indexes refactoring notes (pydata#4979) add a install xarray step to the upstream-dev CI (pydata#5044) Adds Dataset.query() method, analogous to pandas DataFrame.query() (pydata#4984) run tests on python 3.9 (pydata#5040) Add date attribute to datetime accessor (pydata#4994) 📚 New theme & rearrangement of the docs (pydata#4835) upgrade ci-trigger to the most recent version (pydata#5037) GH5005 fix documentation on open_rasterio (pydata#5021) GHA for automatically canceling previous CI runs (pydata#5025) Implement GroupBy.__getitem__ (pydata#3691) conventions: decode unsigned integers to signed if _Unsigned=false (pydata#4966) Added support for numpy.bool_ (pydata#4986) Add additional str accessor methods for DataArray (pydata#4622) add polyval to polyfit see also (pydata#5020) mention map_blocks in the docstring of apply_ufunc (pydata#5011) Switch backend API to v2 (pydata#4989) WIP: add new backend api documentation (pydata#4810) pin netCDF4=1.5.3 in min-all-deps (pydata#4982)
add backend documentation
rename
store_spec
infilename_or_obj
in backend entrypoint methodguess_can_open
Related Update Documentation for backend Implementation #4803