Merge branch 'main' into groupby-dask
* main:
  Add `DataTree.persist` (#9682)
  Typing annotations for arithmetic overrides (e.g., DataArray + Dataset) (#9688)
  Raise `ValueError` for unmatching chunks length in `DataArray.chunk()` (#9689)
  Fix inadvertent deep-copying of child data in DataTree (#9684)
  new blank whatsnew (#9679)
  v2024.10.0 release summary (#9678)
  drop the length from `numpy`'s fixed-width string dtypes (#9586)
  fixing behaviour for group parameter in `open_datatree` (#9666)
  Use zarr v3 dimension_names (#9669)
  fix(zarr): use inplace array.resize for zarr 2 and 3 (#9673)
  implement `dask` methods on `DataTree` (#9670)
  support `chunks` in `open_groups` and `open_datatree` (#9660)
  Compatibility for zarr-python 3.x (#9552)
  Update to_dataframe doc to match current behavior (#9662)
  Reduce graph size through writing indexes directly into graph for ``map_blocks`` (#9658)
dcherian committed Oct 29, 2024
2 parents f826b65 + 0c6cded commit 00ef8c5
Showing 33 changed files with 2,462 additions and 437 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci-additional.yaml
@@ -92,7 +92,7 @@ jobs:
shell: bash -l {0}
env:
CONDA_ENV_FILE: ci/requirements/environment.yml
- PYTHON_VERSION: "3.11"
+ PYTHON_VERSION: "3.12"

steps:
- uses: actions/checkout@v4
6 changes: 3 additions & 3 deletions ci/install-upstream-wheels.sh
@@ -45,15 +45,15 @@ python -m pip install \
--pre \
--upgrade \
pyarrow
- # manually install `pint` to pull in new dependencies
- python -m pip install --upgrade pint
+ # manually install `pint`, `donfig`, and `crc32c` to pull in new dependencies
+ python -m pip install --upgrade pint donfig crc32c
python -m pip install \
--no-deps \
--upgrade \
git+https://github.com/dask/dask \
git+https://github.com/dask/dask-expr \
git+https://github.com/dask/distributed \
- git+https://github.com/zarr-developers/zarr.git@main \
+ git+https://github.com/zarr-developers/zarr \
git+https://github.com/Unidata/cftime \
git+https://github.com/pypa/packaging \
git+https://github.com/hgrecco/pint \
5 changes: 5 additions & 0 deletions doc/api.rst
@@ -656,6 +656,7 @@ This interface echoes that of ``xarray.Dataset``.
DataTree.has_attrs
DataTree.is_empty
DataTree.is_hollow
DataTree.chunksizes

Dictionary Interface
--------------------
@@ -968,6 +969,10 @@ DataTree methods
DataTree.to_dict
DataTree.to_netcdf
DataTree.to_zarr
DataTree.chunk
DataTree.load
DataTree.compute
DataTree.persist

.. ..
3 changes: 2 additions & 1 deletion doc/user-guide/io.rst
@@ -823,8 +823,9 @@ For example:
.. ipython:: python
import zarr
+ from numcodecs.blosc import Blosc
- compressor = zarr.Blosc(cname="zstd", clevel=3, shuffle=2)
+ compressor = Blosc(cname="zstd", clevel=3, shuffle=2)
ds.to_zarr("foo.zarr", encoding={"foo": {"compressor": compressor}})
.. note::
92 changes: 63 additions & 29 deletions doc/whats-new.rst
@@ -14,34 +14,18 @@ What's New
np.random.seed(123456)
.. _whats-new.2024.09.1:
.. _whats-new.2024.10.1:

v2024.09.1 (unreleased)
-----------------------
v.2024.10.1 (unreleased)
------------------------

New Features
~~~~~~~~~~~~
- ``DataTree`` related functionality is now exposed in the main ``xarray`` public
API. This includes: ``xarray.DataTree``, ``xarray.open_datatree``, ``xarray.open_groups``,
``xarray.map_over_datasets``, ``xarray.group_subtrees``,
``xarray.register_datatree_accessor`` and ``xarray.testing.assert_isomorphic``.
By `Owen Littlejohns <https://github.com/owenlittlejohns>`_,
`Eni Awowale <https://github.com/eni-awowale>`_,
`Matt Savoie <https://github.com/flamingbear>`_,
`Stephan Hoyer <https://github.com/shoyer>`_ and
`Tom Nicholas <https://github.com/TomNicholas>`_.
- A migration guide for users of the prototype `xarray-contrib/datatree repository <https://github.com/xarray-contrib/datatree>`_ has been added, and can be found in the `DATATREE_MIGRATION_GUIDE.md` file in the repository root.
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- Added zarr backends for :py:func:`open_groups` (:issue:`9430`, :pull:`9469`).
By `Eni Awowale <https://github.com/eni-awowale>`_.
- Added :py:meth:`DataTree.persist` method (:issue:`9675`, :pull:`9682`).
By `Sam Levang <https://github.com/slevang>`_.
- Support lazy grouping by dask arrays, and allow specifying ordered groups with ``UniqueGrouper(labels=["a", "b", "c"])``
(:issue:`2852`, :issue:`757`).
By `Deepak Cherian <https://github.com/dcherian>`_.
- Added support for vectorized interpolation using additional interpolators
from the ``scipy.interpolate`` module (:issue:`9049`, :pull:`9526`).
By `Holly Mandel <https://github.com/hollymandel>`_.
- Implement handling of complex numbers (netcdf4/h5netcdf) and enums (h5netcdf) (:issue:`9246`, :issue:`3297`, :pull:`9509`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.

Breaking changes
~~~~~~~~~~~~~~~~
@@ -55,6 +39,62 @@ Deprecations
provide expected group labels using the ``labels`` kwarg to a grouper object such as
:py:class:`grouper.UniqueGrouper` or :py:class:`grouper.BinGrouper`.

Bug fixes
~~~~~~~~~

- Fix inadvertent deep-copying of child data in DataTree.
By `Stephan Hoyer <https://github.com/shoyer>`_.

Documentation
~~~~~~~~~~~~~


Internal Changes
~~~~~~~~~~~~~~~~
- ``persist`` methods now route through the :py:class:`xr.core.parallelcompat.ChunkManagerEntrypoint` (:pull:`9682`).
By `Sam Levang <https://github.com/slevang>`_.

.. _whats-new.2024.10.0:

v2024.10.0 (Oct 24th, 2024)
---------------------------

This release brings official support for `xarray.DataTree`, and compatibility with zarr-python v3!

Aside from these two huge features, it also improves support for vectorised interpolation and fixes various bugs.

Thanks to the 31 contributors to this release:
Alfonso Ladino, DWesl, Deepak Cherian, Eni, Etienne Schalk, Holly Mandel, Ilan Gold, Illviljan, Joe Hamman, Justus Magin, Kai Mühlbauer, Karl Krauth, Mark Harfouche, Martey Dodoo, Matt Savoie, Maximilian Roos, Patrick Hoefler, Peter Hill, Renat Sibgatulin, Ryan Abernathey, Spencer Clark, Stephan Hoyer, Tom Augspurger, Tom Nicholas, Vecko, Virgile Andreani, Yvonne Fröhlich, carschandler, joseph nowak, mgunyho and owenlittlejohns

New Features
~~~~~~~~~~~~
- ``DataTree`` related functionality is now exposed in the main ``xarray`` public
API. This includes: ``xarray.DataTree``, ``xarray.open_datatree``, ``xarray.open_groups``,
``xarray.map_over_datasets``, ``xarray.group_subtrees``,
``xarray.register_datatree_accessor`` and ``xarray.testing.assert_isomorphic``.
By `Owen Littlejohns <https://github.com/owenlittlejohns>`_,
`Eni Awowale <https://github.com/eni-awowale>`_,
`Matt Savoie <https://github.com/flamingbear>`_,
`Stephan Hoyer <https://github.com/shoyer>`_,
`Tom Nicholas <https://github.com/TomNicholas>`_,
`Justus Magin <https://github.com/keewis>`_, and
`Alfonso Ladino <https://github.com/aladinor>`_.
- A migration guide for users of the prototype `xarray-contrib/datatree repository <https://github.com/xarray-contrib/datatree>`_ has been added, and can be found in the `DATATREE_MIGRATION_GUIDE.md` file in the repository root.
By `Tom Nicholas <https://github.com/TomNicholas>`_.
- Support for Zarr-Python 3 (:issue:`95515`, :pull:`9552`).
By `Tom Augspurger <https://github.com/TomAugspurger>`_,
`Ryan Abernathey <https://github.com/rabernat>`_ and
`Joe Hamman <https://github.com/jhamman>`_.
- Added zarr backends for :py:func:`open_groups` (:issue:`9430`, :pull:`9469`).
By `Eni Awowale <https://github.com/eni-awowale>`_.
- Added support for vectorized interpolation using additional interpolators
from the ``scipy.interpolate`` module (:issue:`9049`, :pull:`9526`).
By `Holly Mandel <https://github.com/hollymandel>`_.
- Implement handling of complex numbers (netcdf4/h5netcdf) and enums (h5netcdf) (:issue:`9246`, :issue:`3297`, :pull:`9509`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
- Fix passing missing arguments to when opening hdf5 and netCDF4 datatrees
(:issue:`9427`, :pull: `9428`).
By `Alfonso Ladino <https://github.com/aladinor>`_.

Bug fixes
~~~~~~~~~
@@ -78,6 +118,8 @@ Bug fixes
<https://github.com/josephnowak>`_.
- Fix binning by multiple variables where some bins have no observations. (:issue:`9630`).
By `Deepak Cherian <https://github.com/dcherian>`_.
- Fix issue where polyfit wouldn't handle non-dimension coordinates. (:issue:`4375`, :pull:`9369`)
By `Karl Krauth <https://github.com/Karl-Krauth>`_.

Documentation
~~~~~~~~~~~~~
@@ -88,12 +130,9 @@ Documentation
By `Owen Littlejohns <https://github.com/owenlittlejohns>`_, `Matt Savoie <https://github.com/flamingbear>`_, and
`Tom Nicholas <https://github.com/TomNicholas>`_.



Internal Changes
~~~~~~~~~~~~~~~~


.. _whats-new.2024.09.0:

v2024.09.0 (Sept 11, 2024)
@@ -161,17 +200,12 @@ Bug fixes
date "0001-01-01". (:issue:`9108`, :pull:`9116`) By `Spencer Clark
<https://github.com/spencerkclark>`_ and `Deepak Cherian
<https://github.com/dcherian>`_.
- Fix issue where polyfit wouldn't handle non-dimension coordinates. (:issue:`4375`, :pull:`9369`)
By `Karl Krauth <https://github.com/Karl-Krauth>`_.
- Fix issue with passing parameters to ZarrStore.open_store when opening
datatree in zarr format (:issue:`9376`, :pull:`9377`).
By `Alfonso Ladino <https://github.com/aladinor>`_
- Fix deprecation warning that was raised when calling ``np.array`` on an ``xr.DataArray``
in NumPy 2.0 (:issue:`9312`, :pull:`9393`)
By `Andrew Scherer <https://github.com/andrew-s28>`_.
- Fix passing missing arguments to when opening hdf5 and netCDF4 datatrees
(:issue:`9427`, :pull: `9428`).
By `Alfonso Ladino <https://github.com/aladinor>`_.
- Fix support for using ``pandas.DateOffset``, ``pandas.Timedelta``, and
``datetime.timedelta`` objects as ``resample`` frequencies
(:issue:`9408`, :pull:`9413`).
4 changes: 3 additions & 1 deletion pyproject.toml
@@ -37,6 +37,7 @@ accel = ["scipy", "bottleneck", "numbagg", "numba>=0.54", "flox", "opt_einsum"]
complete = ["xarray[accel,etc,io,parallel,viz]"]
dev = [
"hypothesis",
"jinja2",
"mypy",
"pre-commit",
"pytest",
@@ -49,7 +50,7 @@ dev = [
"sphinx_autosummary_accessors",
"xarray[complete]",
]
- io = ["netCDF4", "h5netcdf", "scipy", 'pydap; python_version<"3.10"', "zarr<3", "fsspec", "cftime", "pooch"]
+ io = ["netCDF4", "h5netcdf", "scipy", 'pydap; python_version<"3.10"', "zarr", "fsspec", "cftime", "pooch"]
etc = ["sparse"]
parallel = ["dask[complete]"]
viz = ["cartopy", "matplotlib", "nc-time-axis", "seaborn"]
@@ -124,6 +125,7 @@ module = [
"nc_time_axis.*",
"netCDF4.*",
"netcdftime.*",
"numcodecs.*",
"opt_einsum.*",
"pint.*",
"pooch.*",
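Because the `io` extra above no longer pins `zarr<3`, an environment may end up with either zarr-python 2.x or 3.x. As a rough sketch (not code from this commit), downstream code could inspect the installed major version with the standard library. The helper names `parse_major` and `installed_major` are hypothetical and introduced only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def parse_major(version_string: str) -> int:
    """Return the leading integer of a version string such as '3.0.0b1'."""
    return int(version_string.split(".")[0])


def installed_major(package: str = "zarr") -> Optional[int]:
    """Return the major version of an installed package, or None if absent."""
    try:
        return parse_major(version(package))
    except PackageNotFoundError:
        return None


# Report which zarr major version, if any, is available in this environment.
major = installed_major("zarr")
if major is None:
    print("zarr is not installed")
else:
    print(f"zarr major version: {major}")
```

Reading the version from package metadata avoids importing `zarr` just to decide which code path to take.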