Update docstring for compute and persist #8903

Merged on Apr 2, 2024 (9 commits)
18 changes: 16 additions & 2 deletions xarray/core/dataarray.py
@@ -1126,6 +1126,8 @@ def load(self, **kwargs) -> Self:
"""Manually trigger loading of this array's data from disk or a
remote source into memory and return this array.

Unlike compute, the original dataset is modified and returned.

Normally, it should not be necessary to call this method in user code,
because all xarray functions should either work on deferred data or
load data automatically. However, this method can be necessary when
@@ -1148,8 +1150,9 @@ def load(self, **kwargs) -> Self:

def compute(self, **kwargs) -> Self:
"""Manually trigger loading of this array's data from disk or a
- remote source into memory and return a new array. The original is
- left unaltered.
+ remote source into memory and return a new array.
+
+ Unlike load, the original is left unaltered.

Normally, it should not be necessary to call this method in user code,
because all xarray functions should either work on deferred data or
@@ -1161,6 +1164,11 @@ def compute(self, **kwargs) -> Self:
**kwargs : dict
Additional keyword arguments passed on to ``dask.compute``.

Returns
-------
object : DataArray
New object with the data and all coordinates as in-memory arrays.

See Also
--------
dask.compute
@@ -1174,12 +1182,18 @@ def persist(self, **kwargs) -> Self:
This keeps them as dask arrays but encourages them to keep data in
memory. This is particularly useful when on a distributed machine.
When on a single machine consider using ``.compute()`` instead.
Unlike load but like compute, the original dataset is left unaltered.

Parameters
----------
**kwargs : dict
Additional keyword arguments passed on to ``dask.persist``.

Returns
-------
object : DataArray
New object with all dask-backed data and coordinates as persisted dask arrays.

See Also
--------
dask.persist
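
The differences that these docstrings describe can be checked with a short sketch. It is not part of the PR; it assumes `numpy`, `xarray`, and `dask` are installed, and the variable names are illustrative only. It uses `DataArray.chunks`, which is `None` when the underlying data is not a dask array.

```python
import numpy as np
import xarray as xr

# A small dask-backed DataArray (chunking requires dask to be installed).
lazy = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y")).chunk({"x": 1})

# compute() returns a new in-memory array; the original stays lazy.
computed = lazy.compute()
compute_returns_new = computed is not lazy
original_still_lazy = lazy.chunks is not None
computed_in_memory = computed.chunks is None

# persist() also returns a new object, but its data is still a dask array.
persisted = lazy.persist()
persist_returns_new = persisted is not lazy
persisted_is_dask = persisted.chunks is not None

# load() modifies the original in place and returns that same object.
loaded = lazy.load()
load_returns_self = loaded is lazy
original_now_in_memory = lazy.chunks is None
```

Calling `load()` last is deliberate: once it has run, `lazy` itself holds in-memory data, so the earlier checks on the untouched original would no longer hold.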
11 changes: 11 additions & 0 deletions xarray/core/dataset.py
@@ -1005,6 +1005,11 @@ def compute(self, **kwargs) -> Self:
**kwargs : dict
Additional keyword arguments passed on to ``dask.compute``.

Returns
-------
object : Dataset
New object with lazy data variables and coordinates as in-memory arrays.

See Also
--------
dask.compute
@@ -1037,12 +1042,18 @@ def persist(self, **kwargs) -> Self:
operation keeps the data as dask arrays. This is particularly useful
when using the dask.distributed scheduler and you want to load a large
amount of data into distributed memory.
Unlike load but like compute, the original dataset is left unaltered.

Parameters
----------
**kwargs : dict
Additional keyword arguments passed on to ``dask.persist``.

Returns
-------
object : Dataset
New object with all dask-backed coordinates and data variables as persisted dask arrays.

See Also
--------
dask.persist
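
The same behaviour can be sketched for `Dataset`, again as an illustration that is not part of the PR. It assumes `numpy`, `xarray`, and `dask` are installed; the variable name `a` and the chunk size are arbitrary choices for the example.

```python
import numpy as np
import xarray as xr

# A dask-backed Dataset with one data variable split into two chunks.
ds = xr.Dataset({"a": ("x", np.arange(4))}).chunk({"x": 2})

# persist() returns a new Dataset whose variables are still dask arrays.
persisted = ds.persist()
persist_returns_new = persisted is not ds
persisted_chunks = persisted["a"].chunks

# compute() returns a new Dataset whose variables are in-memory arrays.
in_memory = ds.compute()
in_memory_chunks = in_memory["a"].chunks

# Neither call alters the original, which is still chunked.
original_chunks = ds["a"].chunks
```

On a single machine with the default scheduler, `persist()` still evaluates the graph eagerly, so the practical difference from `compute()` is mainly the type of the returned data.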