
Merge remote-tracking branch 'upstream/master' into yohai-ds_scatter
* upstream/master:
  Rework whats-new for 0.12
  Add whats-new for 0.12.1
  Release 0.12.0
  enable loading remote hdf5 files (pydata#2782)
  Push back finalizing deprecations for 0.12 (pydata#2809)
  Drop failing tests writing multi-dimensional arrays as attributes (pydata#2810)
  some docs updates (pydata#2746)
  Add support for cftime.datetime coordinates with coarsen (pydata#2778)
  Don't use deprecated np.asscalar() (pydata#2800)
  Improve name concat (pydata#2792)
  Add `Dataset.drop_dims` (pydata#2767)
  Quarter offset implemented (base is now latest pydata-master). (pydata#2721)
  Add use_cftime option to open_dataset (pydata#2759)
  Bugfix/reduce no axis (pydata#2769)
  'standard' now refers to 'gregorian' in cftime_range (pydata#2771)
dcherian committed Mar 18, 2019
2 parents 8cd8722 + a5ca64a commit ee662b4
Showing 36 changed files with 1,568 additions and 428 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -35,6 +35,7 @@ pip-log.txt
.tox
nosetests.xml
.cache
.mypy_cache
.ropeproject/
.tags*
.testmon*
1 change: 1 addition & 0 deletions doc/api.rst
@@ -87,6 +87,7 @@ Dataset contents
Dataset.swap_dims
Dataset.expand_dims
Dataset.drop
Dataset.drop_dims
Dataset.set_coords
Dataset.reset_coords

7 changes: 7 additions & 0 deletions doc/data-structures.rst
@@ -408,6 +408,13 @@ operations keep around coordinates:
    list(ds[['x']])
    list(ds.drop('temperature'))

To remove a dimension, you can use the :py:meth:`~xarray.Dataset.drop_dims` method.
Any variables using that dimension are dropped:

.. ipython:: python

    ds.drop_dims('time')

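As a more self-contained illustration (a sketch only; the small dataset below is
made up and is not the ``ds`` constructed earlier on this page), dropping a
dimension removes every variable defined on it:

.. code-block:: python

    import numpy as np
    import xarray as xr

    ds = xr.Dataset(
        {"temperature": (("time", "x"), np.random.randn(3, 2)),
         "pressure": (("x",), np.random.randn(2))},
        coords={"time": [10, 20, 30], "x": [1, 2]},
    )
    # 'temperature' uses the 'time' dimension, so it is dropped along with the
    # 'time' coordinate; 'pressure' only uses 'x' and is kept.
    ds.drop_dims("time")
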
As an alternative to dictionary-like modifications, you can use
:py:meth:`~xarray.Dataset.assign` and :py:meth:`~xarray.Dataset.assign_coords`.
These methods return a new dataset with additional (or replaced) values:
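A minimal sketch of how these methods behave (the dataset and variable names
below are hypothetical, not the ``ds`` used elsewhere on this page):

.. code-block:: python

    import numpy as np
    import xarray as xr

    ds = xr.Dataset({"temperature": ("x", np.array([271.3, 272.1, 273.9]))},
                    coords={"x": [0, 1, 2]})
    # assign adds or replaces data variables; assign_coords does the same for
    # coordinates.  Both return a new Dataset and leave the original unchanged.
    ds2 = ds.assign(temperature_c=ds.temperature - 273.15)
    ds3 = ds2.assign_coords(x=["a", "b", "c"])
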
2 changes: 2 additions & 0 deletions doc/index.rst
@@ -52,6 +52,7 @@ Documentation
* :doc:`reshaping`
* :doc:`combining`
* :doc:`time-series`
* :doc:`weather-climate`
* :doc:`pandas`
* :doc:`io`
* :doc:`dask`
@@ -70,6 +71,7 @@ Documentation
reshaping
combining
time-series
weather-climate
pandas
io
dask
11 changes: 9 additions & 2 deletions doc/indexing.rst
@@ -229,8 +229,8 @@ arrays). However, you can do normal indexing with dimension names:
Using indexing to *assign* values to a subset of a dataset (e.g.,
``ds[dict(space=0)] = 1``) is not yet supported.

Dropping labels
---------------
Dropping labels and dimensions
------------------------------

The :py:meth:`~xarray.Dataset.drop` method returns a new object with the listed
index labels along a dimension dropped:
@@ -241,6 +241,13 @@ index labels along a dimension dropped:
``drop`` is both a ``Dataset`` and ``DataArray`` method.

Use :py:meth:`~xarray.Dataset.drop_dims` to drop a full dimension from a Dataset.
Any variables using that dimension are also dropped:

.. ipython:: python

    ds.drop_dims('time')

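To make the distinction concrete, here is a hedged sketch (with a made-up
dataset, independent of the ``ds`` used above): ``drop`` removes particular
labels along a dimension, while ``drop_dims`` removes the dimension itself
together with every variable that uses it.

.. code-block:: python

    import numpy as np
    import xarray as xr

    ds = xr.Dataset(
        {"temperature": (("time", "space"), np.random.randn(4, 3))},
        coords={"time": [1, 2, 3, 4], "space": ["a", "b", "c"]},
    )
    # drop removes selected labels along a dimension but keeps the dimension:
    ds.drop([1, 2], dim="time")
    # drop_dims removes the dimension itself, plus every variable that uses it:
    ds.drop_dims("time")
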
.. _masking with where:

13 changes: 8 additions & 5 deletions doc/io.rst
@@ -1,11 +1,11 @@
.. _io:

Serialization and IO
====================
Reading and writing files
=========================

xarray supports direct serialization and IO to several file formats, from
simple :ref:`io.pickle` files to the more flexible :ref:`io.netcdf`
format.
format (recommended).
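
As a quick orientation, a netCDF round trip might look like the following
sketch (assuming a netCDF backend such as ``netCDF4`` or ``scipy`` is
installed; the filename is arbitrary). The sections below cover the available
options in detail:

.. code-block:: python

    import numpy as np
    import xarray as xr

    ds = xr.Dataset({"foo": ("x", np.arange(5))})
    ds.to_netcdf("example.nc")                # write a netCDF file
    reopened = xr.open_dataset("example.nc")  # lazily read it back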

.. ipython:: python
    :suppress:
@@ -739,11 +739,14 @@ options are listed on the PseudoNetCDF page.
.. _PseudoNetCDF: http://github.com/barronh/PseudoNetCDF


Formats supported by Pandas
---------------------------
CSV and other formats supported by Pandas
-----------------------------------------

For more options (tabular formats and CSV files in particular), consider
exporting your objects to pandas and using its broad range of `IO tools`_.
For CSV files, one might also consider `xarray_extras`_.

.. _xarray_extras: https://xarray-extras.readthedocs.io/en/latest/api/csv.html

.. _IO tools: http://pandas.pydata.org/pandas-docs/stable/io.html
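
For instance, a CSV round trip via pandas might look like the following sketch
(the filename and variable names are made up; ``Dataset.to_dataframe``,
pandas' ``to_csv``/``read_csv`` and ``DataFrame.to_xarray`` are the standard
methods involved):

.. code-block:: python

    import pandas as pd
    import xarray as xr

    ds = xr.Dataset({"foo": ("x", [1, 2, 3, 4])}, coords={"x": list("abcd")})
    # Convert to a pandas DataFrame, then use any of pandas' IO tools:
    ds.to_dataframe().to_csv("example.csv")
    # And back: read the CSV with pandas and rebuild an xarray object:
    ds_roundtrip = pd.read_csv("example.csv", index_col="x").to_xarray()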

4 changes: 4 additions & 0 deletions doc/plotting.rst
@@ -40,6 +40,10 @@ For more extensive plotting applications consider the following projects:
data structures for building even complex visualizations easily." Includes
native support for xarray objects.

- `hvplot <https://hvplot.pyviz.org/>`_: ``hvplot`` makes it very easy to produce
  dynamic plots (backed by ``Holoviews`` or ``Geoviews``) by adding an ``hvplot``
  accessor to DataArrays (see the sketch below this list).

- `Cartopy <http://scitools.org.uk/cartopy/>`_: Provides cartographic
tools.
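
As a rough sketch of the ``hvplot`` workflow mentioned above (it assumes
``hvplot`` is installed; the array here is synthetic):

.. code-block:: python

    import numpy as np
    import xarray as xr
    import hvplot.xarray  # noqa: F401 -- importing this registers the .hvplot accessor

    air = xr.DataArray(
        np.random.randn(10, 5, 6),
        dims=("time", "lat", "lon"),
        name="air",
    )
    # Returns an interactive HoloViews object; in a notebook, extra dimensions
    # (here 'time') get widgets for interactive exploration.
    air.hvplot()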

1 change: 1 addition & 0 deletions doc/related-projects.rst
@@ -13,6 +13,7 @@ Geosciences
- `aospy <https://aospy.readthedocs.io>`_: Automated analysis and management of gridded climate data.
- `infinite-diff <https://github.com/spencerahill/infinite-diff>`_: xarray-based finite-differencing, focused on gridded climate/meteorology data
- `marc_analysis <https://github.com/darothen/marc_analysis>`_: Analysis package for CESM/MARC experiments and output.
- `MetPy <https://unidata.github.io/MetPy/dev/index.html>`_: A collection of tools in Python for reading, visualizing, and performing calculations with weather data.
- `MPAS-Analysis <http://mpas-analysis.readthedocs.io>`_: Analysis for simulations produced with Model for Prediction Across Scales (MPAS) components and the Accelerated Climate Model for Energy (ACME).
- `OGGM <http://oggm.org/>`_: Open Global Glacier Model
- `Oocgcm <https://oocgcm.readthedocs.io/>`_: Analysis of large gridded geophysical datasets
137 changes: 0 additions & 137 deletions doc/time-series.rst
@@ -212,140 +212,3 @@ Data that has indices outside of the given ``tolerance`` are set to ``NaN``.
For more examples of using grouped operations on a time dimension, see
:ref:`toy weather data`.


.. _CFTimeIndex:

Non-standard calendars and dates outside the Timestamp-valid range
------------------------------------------------------------------

Through the standalone ``cftime`` library and a custom subclass of
:py:class:`pandas.Index`, xarray supports a subset of the indexing
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for
dates from non-standard calendars commonly used in climate science or dates
using a standard calendar, but outside the `Timestamp-valid range`_
(approximately between years 1678 and 2262).

.. note::

   As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects
   will be used to represent times (either in indexes, as a
   :py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if
   any of the following are true:

   - The dates are from a non-standard calendar
   - Any dates are outside the Timestamp-valid range.

   Otherwise pandas-compatible dates from a standard calendar will be
   represented with the ``np.datetime64[ns]`` data type, enabling the use of a
   :py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]``
   and their full set of associated features.

For example, you can create a DataArray indexed by a time
coordinate with dates from a no-leap calendar and a
:py:class:`~xarray.CFTimeIndex` will automatically be used:

.. ipython:: python

    from itertools import product
    from cftime import DatetimeNoLeap
    dates = [DatetimeNoLeap(year, month, 1) for year, month in
             product(range(1, 3), range(1, 13))]
    da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

xarray also includes a :py:func:`~xarray.cftime_range` function, which enables
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For
instance, we can create the same dates and DataArray we created above using:

.. ipython:: python

    dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')
    da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

- `Partial datetime string indexing`_ using strictly `ISO 8601-format`_ partial
datetime strings:

  .. ipython:: python

     da.sel(time='0001')
     da.sel(time=slice('0001-05', '0002-02'))

- Access of basic datetime components via the ``dt`` accessor (in this case
just "year", "month", "day", "hour", "minute", "second", "microsecond",
"season", "dayofyear", and "dayofweek"):

  .. ipython:: python

     da.time.dt.year
     da.time.dt.month
     da.time.dt.season
     da.time.dt.dayofyear
     da.time.dt.dayofweek

- Group-by operations based on datetime accessor attributes (e.g. by month of
the year):

  .. ipython:: python

     da.groupby('time.month').sum()

- Interpolation using :py:class:`cftime.datetime` objects:

  .. ipython:: python

     da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])

- Interpolation using datetime strings:

  .. ipython:: python

     da.interp(time=['0001-01-15', '0001-02-15'])

- Differentiation:

  .. ipython:: python

     da.differentiate('time')

- Serialization:

  .. ipython:: python

     da.to_netcdf('example-no-leap.nc')
     xr.open_dataset('example-no-leap.nc')

- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

  .. ipython:: python

     da.resample(time='81T', closed='right', label='right', base=3).mean()

.. note::

   For some use-cases it may still be useful to convert from
   a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
   despite the difference in calendar types. The recommended way of doing this
   is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
   method:

   .. ipython:: python
      :okwarning:

      modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')
      da = xr.DataArray(range(24), [('time', modern_times)])
      da
      datetimeindex = da.indexes['time'].to_datetimeindex()
      da['time'] = datetimeindex

   However in this case one should use caution to only perform operations which
   do not depend on differences between dates (e.g. differentiation,
   interpolation, or upsampling with resample), as these could introduce subtle
   and silent errors due to the difference in calendar types between the dates
   encoded in your data and the dates stored in memory.

.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#partial-string-indexing