Commit 0b6045b

small changes.
1 parent 8df0ca4 commit 0b6045b

File tree

3 files changed: +13 −13 lines


doc/dask.rst

Lines changed: 8 additions & 8 deletions
@@ -87,10 +87,8 @@ for the full disclaimer). By default, :py:meth:`~xarray.open_mfdataset` will chu
 netCDF file into a single Dask array; again, supply the ``chunks`` argument to
 control the size of the resulting Dask arrays. In more complex cases, you can
 open each file individually using :py:meth:`~xarray.open_dataset` and merge the result, as
-described in :ref:`combining data`. If you have a distributed cluster running,
-passing the keyword argument ``parallel=True`` to :py:meth:`~xarray.open_mfdataset`
-will speed up the reading of large multi-file datasets by executing those read tasks
-in parallel using ``dask.delayed``.
+described in :ref:`combining data`. Passing the keyword argument ``parallel=True`` to :py:meth:`~xarray.open_mfdataset` will speed up the reading of large multi-file datasets by
+executing those read tasks in parallel using ``dask.delayed``.
 
 You'll notice that printing a dataset still shows a preview of array values,
 even if they are actually Dask arrays. We can do this quickly with Dask because
@@ -157,6 +155,12 @@ explicit conversion step. One notable exception is indexing operations: to
 enable label based indexing, xarray will automatically load coordinate labels
 into memory.
 
+.. tip::
+
+   By default, Dask uses its multi-threaded scheduler, which distributes work across
+   multiple cores and allows for processing some datasets that do not fit into memory.
+   For running across a cluster, `set up the distributed scheduler <https://docs.dask.org/en/latest/setup.html>`_.
+
 The easiest way to convert an xarray data structure from lazy Dask arrays into
 *eager*, in-memory NumPy arrays is to use the :py:meth:`~xarray.Dataset.load` method:
 
@@ -417,7 +421,3 @@ With analysis pipelines involving both spatial subsetting and temporal resamplin
 
 6. The dask `diagnostics <https://docs.dask.org/en/latest/understanding-performance.html>`_ can be
    useful in identifying performance bottlenecks.
-
-7. Installing the optional `bottleneck <https://github.com/kwgoodman/bottleneck>`_ library
-   will result in greatly reduced memory usage when using :py:meth:`~xarray.Dataset.rolling`
-   on dask arrays,
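The ``parallel=True`` behavior the revised doc text describes can be pictured with a small sketch. This is not xarray's implementation: it uses a stdlib thread pool in place of ``dask.delayed``, and ``open_one`` is a hypothetical stand-in for ``open_dataset``.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for xarray.open_dataset: "opens" one file.
def open_one(path):
    return {"path": path, "data": [0, 1, 2]}

paths = ["a.nc", "b.nc", "c.nc"]

# Without parallel=True: files are opened one after another.
serial = [open_one(p) for p in paths]

# With parallel=True, xarray wraps each open in dask.delayed so the reads
# can run concurrently; a thread pool reproduces the same shape here.
with ThreadPoolExecutor(max_workers=3) as pool:
    concurrent = list(pool.map(open_one, paths))

assert concurrent == serial  # same results either way; the reads just overlap
```

Either path yields the same merged inputs; the benefit of the parallel form grows with the number of files and the latency of each open.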

doc/index.rst

Lines changed: 3 additions & 3 deletions
@@ -11,11 +11,11 @@ intuitive, more concise, and less error-prone developer experience.
 The package includes a large and growing library of domain-agnostic functions
 for advanced analytics and visualization with these data structures.
 
-Xarray is particularly tailored to working with netCDF_ files, which were the
+Xarray is inspired by and borrows heavily from pandas_, the popular data
+analysis package focused on labelled tabular data.
+It is particularly tailored to working with netCDF_ files, which were the
 source of xarray's data model, and integrates tightly with dask_ for parallel
 computing.
-It is inspired by and borrows heavily from pandas_, the popular data
-analysis package focused on labelled tabular data.
 
 .. _NumPy: http://www.numpy.org
 .. _pandas: http://pandas.pydata.org

xarray/core/dataset.py

Lines changed: 2 additions & 2 deletions
@@ -613,7 +613,7 @@ def sizes(self) -> Mapping[Hashable, int]:
         """
         return self.dims
 
-    def load(self: T, **kwargs) -> T:
+    def load(self, **kwargs) -> "Dataset":
         """Manually trigger loading and/or computation of this dataset's data
         from disk or a remote source into memory and return this dataset.
         Unlike compute, the original dataset is modified and returned.
@@ -771,7 +771,7 @@ def _dask_postpersist(dsk, info, *args):
 
         return Dataset._construct_direct(variables, *args)
 
-    def compute(self: T, **kwargs) -> T:
+    def compute(self, **kwargs) -> "Dataset":
         """Manually trigger loading and/or computation of this dataset's data
         from disk or a remote source into memory and return a new dataset.
         Unlike load, the original dataset is left unaltered.
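The signature change in this file swaps a bound ``TypeVar`` on ``self`` for a plain string forward reference. A minimal sketch of the difference between the two annotation styles (the ``Box`` classes are illustrative, not xarray's):

```python
from typing import TypeVar

T = TypeVar("T", bound="Box")

class Box:
    # Style removed by this commit: annotating self with a bound TypeVar
    # lets a static checker infer the subclass type from a subclass call.
    def load(self: T) -> T:
        return self

    # Style introduced by this commit: a string forward reference always
    # annotates the return as the base class, even when called on a subclass.
    def compute(self) -> "Box":
        return self

class SubBox(Box):
    pass

b = SubBox()
# At runtime both styles return the same object; the difference is only in
# what a type checker infers for b.load() versus b.compute().
assert b.load() is b and b.compute() is b
```

The forward-reference form is simpler but loses subclass precision under static analysis, which is the trade-off this commit accepts.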
