some docs updates #2746

Merged
merged 15 commits into from
Mar 12, 2019
2 changes: 2 additions & 0 deletions doc/index.rst
@@ -52,6 +52,7 @@ Documentation
* :doc:`reshaping`
* :doc:`combining`
* :doc:`time-series`
* :doc:`weather-climate`
* :doc:`pandas`
* :doc:`io`
* :doc:`dask`
@@ -70,6 +71,7 @@ Documentation
reshaping
combining
time-series
weather-climate
pandas
io
dask
13 changes: 8 additions & 5 deletions doc/io.rst
@@ -1,11 +1,11 @@
.. _io:

Serialization and IO
====================
Reading and writing files
=========================

xarray supports direct serialization and IO to several file formats, from
simple :ref:`io.pickle` files to the more flexible :ref:`io.netcdf`
format.
format (recommended).

.. ipython:: python
:suppress:
@@ -739,11 +739,14 @@ options are listed on the PseudoNetCDF page.
.. _PseudoNetCDF: http://github.com/barronh/PseudoNetCDF


Formats supported by Pandas
---------------------------
CSV and other formats supported by Pandas
-----------------------------------------

For more options (tabular formats and CSV files in particular), consider
exporting your objects to pandas and using its broad range of `IO tools`_.
For CSV files, one might also consider `xarray_extras`_.

.. _xarray_extras: https://xarray-extras.readthedocs.io/en/latest/api/csv.html

.. _IO tools: http://pandas.pydata.org/pandas-docs/stable/io.html
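
The pandas route mentioned above can be sketched as follows. This is a minimal illustration, not a recommended recipe: the array, its name ``foo`` and its coordinates are made up for the example, and an in-memory buffer stands in for a real CSV file.

```python
import io

import numpy as np
import xarray as xr

# A small, made-up 2D array; the name becomes the CSV column header.
da = xr.DataArray(
    np.arange(6).reshape(2, 3),
    coords={"x": [10, 20], "y": ["a", "b", "c"]},
    dims=["x", "y"],
    name="foo",
)

# Convert to a pandas DataFrame (one row per (x, y) pair), then let
# pandas handle the CSV serialization.
df = da.to_dataframe()
buffer = io.StringIO()  # stand-in for a file path
df.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Going through ``to_dataframe`` flattens the array into long form, so each dimension becomes an index column in the CSV output.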

4 changes: 4 additions & 0 deletions doc/plotting.rst
@@ -39,6 +39,10 @@ For more extensive plotting applications consider the following projects:
data structures for building even complex visualizations easily." Includes
native support for xarray objects.

- `hvplot <https://hvplot.pyviz.org/>`_: ``hvplot`` makes it very easy to produce
dynamic plots (backed by ``Holoviews`` or ``Geoviews``) by adding a ``hvplot``
accessor to DataArrays.

- `Cartopy <http://scitools.org.uk/cartopy/>`_: Provides cartographic
tools.

1 change: 1 addition & 0 deletions doc/related-projects.rst
@@ -13,6 +13,7 @@ Geosciences
- `aospy <https://aospy.readthedocs.io>`_: Automated analysis and management of gridded climate data.
- `infinite-diff <https://github.com/spencerahill/infinite-diff>`_: xarray-based finite-differencing, focused on gridded climate/meteorology data
- `marc_analysis <https://github.com/darothen/marc_analysis>`_: Analysis package for CESM/MARC experiments and output.
- `MetPy <https://unidata.github.io/MetPy/dev/index.html>`_: A collection of tools in Python for reading, visualizing, and performing calculations with weather data.
- `MPAS-Analysis <http://mpas-analysis.readthedocs.io>`_: Analysis for simulations produced with Model for Prediction Across Scales (MPAS) components and the Accelerated Climate Model for Energy (ACME).
- `OGGM <http://oggm.org/>`_: Open Global Glacier Model
- `Oocgcm <https://oocgcm.readthedocs.io/>`_: Analysis of large gridded geophysical datasets
137 changes: 0 additions & 137 deletions doc/time-series.rst
@@ -212,140 +212,3 @@ Data that has indices outside of the given ``tolerance`` are set to ``NaN``.

For more examples of using grouped operations on a time dimension, see
:ref:`toy weather data`.


.. _CFTimeIndex:

Non-standard calendars and dates outside the Timestamp-valid range
------------------------------------------------------------------

Through the standalone ``cftime`` library and a custom subclass of
:py:class:`pandas.Index`, xarray supports a subset of the indexing
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for
dates from non-standard calendars commonly used in climate science or dates
using a standard calendar, but outside the `Timestamp-valid range`_
(approximately between years 1678 and 2262).

.. note::

As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects
will be used to represent times (either in indexes, as a
:py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if
any of the following are true:

- The dates are from a non-standard calendar
- Any dates are outside the Timestamp-valid range.

Otherwise pandas-compatible dates from a standard calendar will be
represented with the ``np.datetime64[ns]`` data type, enabling the use of a
:py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]``
and their full set of associated features.

For example, you can create a DataArray indexed by a time
coordinate with dates from a no-leap calendar and a
:py:class:`~xarray.CFTimeIndex` will automatically be used:

.. ipython:: python

from itertools import product
from cftime import DatetimeNoLeap
dates = [DatetimeNoLeap(year, month, 1) for year, month in
product(range(1, 3), range(1, 13))]
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

xarray also includes a :py:func:`~xarray.cftime_range` function, which enables
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For
instance, we can create the same dates and DataArray we created above using:

.. ipython:: python

dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

- `Partial datetime string indexing`_ using strictly `ISO 8601-format`_ partial
datetime strings:

.. ipython:: python

da.sel(time='0001')
da.sel(time=slice('0001-05', '0002-02'))

- Access of basic datetime components via the ``dt`` accessor (in this case
just "year", "month", "day", "hour", "minute", "second", "microsecond",
"season", "dayofyear", and "dayofweek"):

.. ipython:: python

da.time.dt.year
da.time.dt.month
da.time.dt.season
da.time.dt.dayofyear
da.time.dt.dayofweek

- Group-by operations based on datetime accessor attributes (e.g. by month of
the year):

.. ipython:: python

da.groupby('time.month').sum()

- Interpolation using :py:class:`cftime.datetime` objects:

.. ipython:: python

da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])

- Interpolation using datetime strings:

.. ipython:: python

da.interp(time=['0001-01-15', '0001-02-15'])

- Differentiation:

.. ipython:: python

da.differentiate('time')

- Serialization:

.. ipython:: python

da.to_netcdf('example-no-leap.nc')
xr.open_dataset('example-no-leap.nc')

- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

.. ipython:: python

da.resample(time='81T', closed='right', label='right', base=3).mean()

.. note::


For some use-cases it may still be useful to convert from
a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
despite the difference in calendar types. The recommended way of doing this
is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
method:

.. ipython:: python
:okwarning:

modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(range(24), [('time', modern_times)])
da
datetimeindex = da.indexes['time'].to_datetimeindex()
da['time'] = datetimeindex

However, in this case take care to perform only operations that do not
depend on differences between dates. Operations that do depend on them
(e.g. differentiation, interpolation, or upsampling with resample) could
introduce subtle and silent errors due to the difference in calendar types
between the dates encoded in your data and the dates stored in memory.

.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#partial-string-indexing
160 changes: 160 additions & 0 deletions doc/weather-climate.rst
@@ -0,0 +1,160 @@
.. _weather-climate:

Weather and climate data
========================

.. ipython:: python
:suppress:

import xarray as xr

``xarray`` can leverage metadata that follows the `Climate and Forecast (CF) conventions`_ when it is present. Examples include automatic labelling of plots with descriptive names and units (see :ref:`plotting`) and support for non-standard calendars used in climate science through the ``cftime`` module (see :ref:`CFTimeIndex`). There are also a number of geosciences-focused projects that build on xarray (see :ref:`related-projects`).

.. _Climate and Forecast (CF) conventions: http://cfconventions.org

.. _metpy_accessor:

CF-compliant coordinate variables
---------------------------------

`MetPy`_ adds a ``metpy`` accessor that allows accessing coordinates with appropriate CF metadata using the generic names ``x``, ``y``, ``vertical`` and ``time``. There is also a ``cartopy_crs`` attribute that provides projection information, parsed from the appropriate CF metadata, as a `Cartopy`_ projection object. See `their documentation`_ for more information.

.. _`MetPy`: https://unidata.github.io/MetPy/dev/index.html
.. _`their documentation`: https://unidata.github.io/MetPy/dev/tutorials/xarray_tutorial.html#coordinates
.. _`Cartopy`: https://scitools.org.uk/cartopy/docs/latest/crs/projections.html

.. _CFTimeIndex:

Non-standard calendars and dates outside the Timestamp-valid range
------------------------------------------------------------------

Through the standalone ``cftime`` library and a custom subclass of
:py:class:`pandas.Index`, xarray supports a subset of the indexing
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for
dates from non-standard calendars commonly used in climate science or dates
using a standard calendar, but outside the `Timestamp-valid range`_
(approximately between years 1678 and 2262).

.. note::

As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects
will be used to represent times (either in indexes, as a
:py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if
any of the following are true:

- The dates are from a non-standard calendar
- Any dates are outside the Timestamp-valid range.

Otherwise pandas-compatible dates from a standard calendar will be
represented with the ``np.datetime64[ns]`` data type, enabling the use of a
:py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]``
and their full set of associated features.

For example, you can create a DataArray indexed by a time
coordinate with dates from a no-leap calendar and a
:py:class:`~xarray.CFTimeIndex` will automatically be used:

.. ipython:: python

from itertools import product
from cftime import DatetimeNoLeap
dates = [DatetimeNoLeap(year, month, 1) for year, month in
product(range(1, 3), range(1, 13))]
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

xarray also includes a :py:func:`~xarray.cftime_range` function, which enables
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For
instance, we can create the same dates and DataArray we created above using:

.. ipython:: python

dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

- `Partial datetime string indexing`_ using strictly `ISO 8601-format`_ partial
datetime strings:

.. ipython:: python

da.sel(time='0001')
da.sel(time=slice('0001-05', '0002-02'))

- Access of basic datetime components via the ``dt`` accessor (in this case
just "year", "month", "day", "hour", "minute", "second", "microsecond",
"season", "dayofyear", and "dayofweek"):

.. ipython:: python

da.time.dt.year
da.time.dt.month
da.time.dt.season
da.time.dt.dayofyear
da.time.dt.dayofweek

- Group-by operations based on datetime accessor attributes (e.g. by month of
the year):

.. ipython:: python

da.groupby('time.month').sum()

- Interpolation using :py:class:`cftime.datetime` objects:

.. ipython:: python

da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])

- Interpolation using datetime strings:

.. ipython:: python

da.interp(time=['0001-01-15', '0001-02-15'])

- Differentiation:

.. ipython:: python

da.differentiate('time')

- Serialization:

.. ipython:: python

da.to_netcdf('example-no-leap.nc')
xr.open_dataset('example-no-leap.nc')

- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

.. ipython:: python

da.resample(time='81T', closed='right', label='right', base=3).mean()

.. note::


For some use-cases it may still be useful to convert from
a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
despite the difference in calendar types. The recommended way of doing this
is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
method:

.. ipython:: python
:okwarning:

modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(range(24), [('time', modern_times)])
da
datetimeindex = da.indexes['time'].to_datetimeindex()
da['time'] = datetimeindex

However, in this case take care to perform only operations that do not
depend on differences between dates. Operations that do depend on them
(e.g. differentiation, interpolation, or upsampling with resample) could
introduce subtle and silent errors due to the difference in calendar types
between the dates encoded in your data and the dates stored in memory.

.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#partial-string-indexing
2 changes: 1 addition & 1 deletion doc/whats-new.rst
@@ -87,7 +87,7 @@ Bug fixes
from higher frequencies to lower frequencies. Datapoints outside the bounds
of the original time coordinate are now filled with NaN (:issue:`2197`). By
`Spencer Clark <https://github.com/spencerkclark>`_.
- Line plots with the `x` argument set to a non-dimensional coord now plot the correct data for 1D DataArrays.
- Line plots with the ``x`` argument set to a non-dimensional coord now plot the correct data for 1D DataArrays.
(:issue:`2725`). By `Tom Nicholas <http://github.com/TomNicholas>`_.
- Subtracting a scalar ``cftime.datetime`` object from a
:py:class:`CFTimeIndex` now results in a :py:class:`pandas.TimedeltaIndex`