Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement load_dataset() and load_dataarray() #2917

Merged
merged 5 commits into from
May 16, 2019

Conversation

dnowacki-usgs
Copy link
Contributor

Implement load_dataset() and load_dataarray()

BUG: Fixes #2887 by adding @shoyer solution for load_dataset and load_dataarray, wrappers around open_dataset and open_dataarray which open, load, and close the file and return the Dataset/DataArray
TST: Add tests for sequentially opening and writing to files using new functions
DOC: Add to whats-new.rst. Also a tiny change to the open_dataset docstring

Updates formatting to use .format() instead of % operator. Changed all instances of % to .format() and added test for using tuple as key, which errored using % operator.
@dnowacki-usgs dnowacki-usgs changed the title Load dataset Implement load_dataset() and load_dataarray() Apr 25, 2019
xarray/backends/api.py Show resolved Hide resolved
xarray/backends/api.py Outdated Show resolved Hide resolved
@shoyer
Copy link
Member

shoyer commented Apr 25, 2019

What do you think un-deprecating xarray.tutoral.load_dataset as part of this change? It might be useful to have that stick around for consistency.

@dnowacki-usgs
Copy link
Contributor Author

Seems like undeprecating it makes sense as a function with identical functionality is being added here. What if we just imported the new load_dataset as xarray.tutorial.load_dataset for the purposes of the tutorial?

@shoyer
Copy link
Member

shoyer commented Apr 29, 2019

What if we just imported the new load_dataset as xarray.tutorial.load_dataset for the purposes of the tutorial?

tutorial.load_dataset() is a little different. It downloads the indicated file from GitHub to open instead of opening it directly.

@dnowacki-usgs
Copy link
Contributor Author

Forgive me, I don't understand what you mean by download vs open. I see that tutorial.load_dataset() is different from this PR in that it keeps the file handle open instead of closing it. Is that the distinction you are referring to? Otherwise all the gallery/examples I have checked call tutorial.load_dataset() at the beginning and that's it. Unless I'm missing something?

xarray/xarray/tutorial.py

Lines 92 to 107 in 6d93a95

def load_dataset(*args, **kwargs):
"""
`load_dataset` will be removed a future version of xarray. The current
behavior of this function can be achived by using
`tutorial.open_dataset(...).load()`.
See Also
--------
open_dataset
"""
warnings.warn(
"load_dataset` will be removed in a future version of xarray. The "
"current behavior of this function can be achived by using "
"`tutorial.open_dataset(...).load()`.",
DeprecationWarning, stacklevel=2)
return open_dataset(*args, **kwargs).load()

@shoyer
Copy link
Member

shoyer commented Apr 30, 2019

Forgive me, I don't understand what you mean by download vs open. I see that tutorial.load_dataset() is different from this PR in that it keeps the file handle open instead of closing it. Is that the distinction you are referring to?

The difference I was referring to is that xarray.tutorial.load_dataset() needs to call xarray.tutorial.open_dataset() internally, not xarray.open_dataset(). But otherwise it can look exactly like xarray.load_dataset, e.g., it should be:

# in tutorial.py
def load_dataset(*args, **kwargs):
    # this calls tutorial.open_dataset, not xarray.open_dataset
    with open_dataset(*args, **kwargs) as ds:
        return ds.load() 

@dnowacki-usgs
Copy link
Contributor Author

Got it, thanks. I can make this change, assuming the dask argument used for deprecating it in the first place isn't stronger than maintaining this consistency.

@shoyer
Copy link
Member

shoyer commented May 1, 2019

I think the dask argument is really an argument that tutorial.open_dataset() needs to exist. But if we're also including a xarray.load_dataset() function, then for consistency it makes sense to have both tutorial options, too.

@dnowacki-usgs
Copy link
Contributor Author

Sounds good. This latest commit un-deprecates tutorial.load_dataset() and makes it parallel to xarray.load_dataset() in terms of functionality.

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this to api.rst, after open_dataset/open_dataarray?

xarray/backends/api.py Outdated Show resolved Hide resolved
doc/whats-new.rst Outdated Show resolved Hide resolved
BUG: Fixes pydata#2887 by adding @shoyer solution for load_dataset and load_dataarray, wrappers around open_dataset and open_dataarray which open, load, and close the file and return the Dataset/DataArray
TST: Add tests for sequentially opening and writing to files using new functions
DOC: Add to whats-new.rst. Also a tiny change to the open_dataset docstring

Update docstrings and check for cache in kwargs

Undeprecate load_dataset

Add to api.rst, fix whats-new.rst typo, raise error instead of warning
@dcherian
Copy link
Contributor

LGTM. Thanks @dnowacki-usgs

@shoyer shoyer merged commit bd78b7f into pydata:master May 16, 2019
@shoyer
Copy link
Member

shoyer commented May 16, 2019

LGTM, too. Thanks @dnowacki-usgs !

dcherian added a commit to dcherian/xarray that referenced this pull request May 29, 2019
* upstream/master:
  cfgrib is now part of conda-forge (pydata#2992)
  Add fill_value for concat and auto_combine (pydata#2964)
  Remove deprecated pytest.config usages (pydata#2988)
  Add transpose_coords option to DataArray.transpose (pydata#2556)
  Fix rolling.constuct() example (pydata#2967)
  Implement load_dataset() and load_dataarray() (pydata#2917)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Safely open / close netCDF files without resource locking
4 participants