| 1 | +.. include:: ../common_links.inc |
| 2 | + |
| 3 | +====================== |
| 4 | +Iris ❤️ :term:`Xarray` |
| 5 | +====================== |
| 6 | + |
| 7 | +There is a lot of overlap between Iris and :term:`Xarray`, but some important |
| 8 | +differences too. Below is a summary of the most important differences, so that |
| 9 | +you can be prepared, and to help you choose the best package for your use case. |
| 10 | + |
| 11 | +Overall Experience |
| 12 | +------------------ |
| 13 | + |
| 14 | +Iris is the more specialised package, focussed on making it as easy |
| 15 | +as possible to work with meteorological and climatological data. Iris |
| 16 | +is built to natively handle many key concepts, such as the CF conventions, |
| 17 | +coordinate systems and bounded coordinates. Iris offers a smaller toolkit of |
| 18 | +operations compared to Xarray, particularly around APIs for sophisticated |
| 19 | +computation such as array manipulation and multi-processing. |
| 20 | + |
| 21 | +Xarray's more generic data model and community-driven development give it a |
| 22 | +richer range of operations and broader possible uses. Using Xarray |
| 23 | +specifically for meteorology/climatology may require deeper knowledge |
| 24 | +compared to using Iris, and you may prefer to add Xarray plugins |
| 25 | +such as :ref:`cfxarray` to get the best experience. Advanced users can likely |
| 26 | +achieve better performance with Xarray than with Iris. |
| 27 | + |
| 28 | +Conversion |
| 29 | +---------- |
| 30 | +There are multiple ways to convert between Iris and Xarray objects. |
| 31 | + |
| 32 | +* Xarray includes the :meth:`~xarray.DataArray.to_iris` and |
| 33 | + :meth:`~xarray.DataArray.from_iris` methods - detailed in the |
| 34 | + `Xarray IO notes on Iris`_. Since Iris evolves independently of Xarray, be |
| 35 | + vigilant for concepts that may be lost during the conversion. |
| 36 | +* Because both packages are closely linked to the :term:`NetCDF Format`, it is |
| 37 | + feasible to save a NetCDF file using one package then load that file using |
| 38 | + the other package. This will be lossy in places, as both Iris and Xarray |
| 39 | + are opinionated on how certain NetCDF concepts relate to their data models. |
| 40 | +* The Iris development team are exploring an improved 'bridge' between the two |
| 41 | + packages. Follow the conversation on GitHub: `iris#4994`_. This project is |
| 42 | + expressly intended to be as lossless as possible. |
| 43 | + |
| 44 | +Regridding |
| 45 | +---------- |
| 46 | +Iris and Xarray offer a range of regridding methods - both natively and via |
| 47 | +additional packages such as `iris-esmf-regrid`_ and `xESMF`_ - which overlap |
| 48 | +in places |
| 49 | +but tend to cover a different set of use cases (e.g. Iris handles unstructured |
| 50 | +meshes but offers access to fewer ESMF methods). The behaviour of these |
| 51 | +regridders also differs slightly (even between different regridders attached to |
| 52 | +the same package), so the appropriate package to use depends heavily on the |
| 53 | +particulars of the use case. |
| 54 | + |
| 55 | +Plotting |
| 56 | +-------- |
| 57 | +Xarray and Iris have a large overlap of functionality when creating |
| 58 | +:term:`Matplotlib` plots and both support the plotting of multidimensional |
| 59 | +coordinates. This means the experience is largely similar using either package. |
| 60 | + |
| 61 | +Xarray supports further plotting backends through external packages (e.g. |
| 62 | +Bokeh through `hvPlot`_), and its plotting interface should feel familiar to |
| 63 | +users of `pandas`_. It also supports some plot types that Iris does not, so |
| 64 | +it can be used for a wider variety of plots, and it makes quick, "out of the |
| 65 | +box" customisations easier. However, any further customisation still |
| 66 | +requires knowledge of :term:`Matplotlib`. |
| 67 | + |
| 68 | +In both cases, :term:`Cartopy` can be used. Iris does more work |
| 69 | +automatically for the user here, creating Cartopy |
| 70 | +:class:`~cartopy.mpl.geoaxes.GeoAxes` for latitude and longitude coordinates, |
| 71 | +whereas the user has to do this manually in Xarray. |
| 72 | + |
| 73 | +Statistics |
| 74 | +---------- |
| 75 | +The two libraries are broadly comparable, with similar capabilities, |
| 76 | +performance and laziness. Iris is more specialised in places, offering some |
| 77 | +functions that are unique to it and a masked-data tolerance in most statistics. |
| 78 | +Xarray seems more approachable, however, with some less specialised but more |
| 79 | +convenient solutions (these tend to be wrappers around :term:`Dask` functions). |
| 80 | + |
| 81 | +Laziness and Multi-Processing with :term:`Dask` |
| 82 | +----------------------------------------------- |
| 83 | +Iris and Xarray both support lazy data and out-of-core processing through |
| 84 | +utilisation of Dask. |
| 85 | + |
| 86 | +While both Iris and Xarray expose :term:`NumPy` conveniences at the API level |
| 87 | +(e.g. the ``ndim`` attribute), only Xarray exposes Dask conveniences. For |
| 88 | +example, :attr:`xarray.DataArray.chunks` reports the underlying Dask array |
| 89 | +chunks and :meth:`xarray.DataArray.chunk` changes them. The Iris API instead |
| 90 | +takes control of such concepts, and user control is only possible by manipulating |
| 91 | +the underlying Dask array directly (accessed via :meth:`iris.cube.Cube.core_data`). |
| 92 | + |
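A minimal sketch of the Xarray side of this, assuming Xarray and Dask are installed. The array shape and chunk sizes are arbitrary choices for illustration.

```python
import numpy as np
import xarray as xr

# An in-memory (non-lazy) array: there are no Dask chunks to report.
da = xr.DataArray(np.zeros((4, 6)), dims=("y", "x"))
in_memory_chunks = da.chunks

# Convert to a Dask-backed array with explicit chunk sizes per dimension.
lazy = da.chunk({"y": 2, "x": 3})

# Rechunk along one dimension only; the other dimension keeps its chunks.
rechunked = lazy.chunk({"x": 6})

print(in_memory_chunks)
print(lazy.chunks)
print(rechunked.chunks)
```

With Iris, the comparable step would mean working on the Dask array returned by ``cube.core_data()`` rather than calling a method on the cube itself.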
| 93 | +:class:`xarray.DataArray`\ s comply with `NEP-18`_, allowing NumPy arrays to be |
| 94 | +based on them, and they also include the necessary extra members for Dask |
| 95 | +arrays to be based on them too. Neither of these is currently possible with |
| 96 | +Iris :class:`~iris.cube.Cube`\ s, although it is an ambition for the future. |
| 97 | + |
| 98 | +NetCDF File Control |
| 99 | +------------------- |
| 100 | +(More info: :term:`NetCDF Format`) |
| 101 | + |
| 102 | +Unlike Iris, Xarray generally provides full control of major file structures, |
| 103 | +i.e. dimensions + variables, including their order in the file. It mostly |
| 104 | +respects these in a file input, and can reproduce them on output. |
| 105 | +However, attribute handling is not so complete: like Iris, it interprets and |
| 106 | +modifies some recognised aspects, and can add some extra attributes not in the |
| 107 | +input. |
| 108 | + |
| 109 | +.. todo: |
| 110 | + More detail on dates and fill values (@pp-mo suggestion). |
| 111 | + |
| 112 | +The handling of dates and fill values has some special problems here. |
| 113 | + |
| 114 | +Ultimately, nearly everything wanted in a particular result file can be |
| 115 | +achieved in Xarray via the override mechanisms it provides (`loading keywords`_ |
| 116 | +and the '`encoding`_' dictionaries). |
| 117 | + |
| 118 | +Missing Data |
| 119 | +------------ |
| 120 | +Xarray uses :data:`numpy.nan` to represent missing values and this will support |
| 121 | +many simple use cases assuming the data are floats. Iris enables more |
| 122 | +sophisticated missing data handling by representing missing values as masks |
| 123 | +(:class:`numpy.ma.MaskedArray` for real data and :class:`dask.array.Array` |
| 124 | +for lazy data), which allows data to be of any data type and to include a |
| 125 | +mask, :data:`~numpy.nan`\ s, or both. |
| 126 | + |
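The difference between the two representations can be sketched with NumPy alone. The values below are invented, and this illustrates the underlying array types rather than either package's own API.

```python
import numpy as np

int_data = np.array([1, 2, 3, 4])

# NaN-style missing data: NaN is a float value, so integer data must be
# converted to float before an element can be marked as missing.
nan_style = int_data.astype(float)
nan_style[1] = np.nan

# Mask-style missing data: the integer dtype is kept and a separate
# boolean mask records which elements are missing.
masked = np.ma.masked_array(int_data, mask=[False, True, False, False])

# Both approaches can skip the missing element when computing statistics.
nan_mean = float(np.nanmean(nan_style))
masked_mean = float(masked.mean())
```

Here the masked array keeps its integer dtype, whereas the NaN-based version has to become a float array.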
| 127 | +.. _cfxarray: |
| 128 | + |
| 129 | +`cf-xarray`_ |
| 130 | +------------- |
| 131 | +Iris has a data model entirely based on :term:`CF Conventions`. Xarray has a |
| 132 | +data model based on :term:`NetCDF Format`, with cf-xarray acting as a |
| 133 | +translation layer into CF. Xarray/cf-xarray methods can be called, and data |
| 134 | +accessed, with CF-like arguments (e.g. axis, standard name) and there are |
| 135 | +some CF-specific utilities (similar to Iris utilities). Iris tends to cover |
| 136 | +more of CF, and to be stricter about it. |
| 137 | + |
| 138 | + |
| 139 | +.. seealso:: |
| 140 | + |
| 141 | + * `Xarray IO notes on Iris`_ |
| 142 | + * `Xarray notes on other NetCDF libraries`_ |
| 143 | + |
| 144 | +.. _Xarray IO notes on Iris: https://docs.xarray.dev/en/stable/user-guide/io.html#iris |
| 145 | +.. _Xarray notes on other NetCDF libraries: https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#what-other-netcdf-related-python-libraries-should-i-know-about |
| 146 | +.. _loading keywords: https://docs.xarray.dev/en/stable/generated/xarray.open_dataset.html#xarray.open_dataset |
| 147 | +.. _encoding: https://docs.xarray.dev/en/stable/user-guide/io.html#writing-encoded-data |
| 148 | +.. _xESMF: https://github.com/pangeo-data/xESMF/ |
| 149 | +.. _seaborn: https://seaborn.pydata.org/ |
| 150 | +.. _hvPlot: https://hvplot.holoviz.org/ |
| 151 | +.. _pandas: https://pandas.pydata.org/ |
| 152 | +.. _NEP-18: https://numpy.org/neps/nep-0018-array-function-protocol.html |
| 153 | +.. _cf-xarray: https://github.com/xarray-contrib/cf-xarray |
| 154 | +.. _iris#4994: https://github.com/SciTools/iris/issues/4994 |