Commit eac662b

Merge remote-tracking branch 'upstream/master' into jbrockmendel-less24024b
2 parents a3c42f0 + d1b2a52 commit eac662b

55 files changed, +3097 -1870 lines changed

.travis.yml

Lines changed: 0 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -36,14 +36,6 @@ matrix:
3636
env:
3737
- JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="not slow and not network"
3838

39-
- dist: trusty
40-
env:
41-
- JOB="2.7, locale, slow, old NumPy" ENV_FILE="ci/deps/travis-27-locale.yaml" LOCALE_OVERRIDE="zh_CN.UTF-8" PATTERN="slow"
42-
addons:
43-
apt:
44-
packages:
45-
- language-pack-zh-hans
46-
4739
- dist: trusty
4840
env:
4941
- JOB="2.7" ENV_FILE="ci/deps/travis-27.yaml" PATTERN="not slow"
@@ -60,14 +52,6 @@ matrix:
6052
env:
6153
- JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36.yaml" PATTERN="not slow and not network" PANDAS_TESTING_MODE="deprecate" COVERAGE=true
6254

63-
- dist: trusty
64-
env:
65-
- JOB="3.7, NumPy dev" ENV_FILE="ci/deps/travis-37-numpydev.yaml" PATTERN="not slow and not network" TEST_ARGS="-W error" PANDAS_TESTING_MODE="deprecate"
66-
addons:
67-
apt:
68-
packages:
69-
- xsel
70-
7155
# In allow_failures
7256
- dist: trusty
7357
env:

asv_bench/benchmarks/period.py

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
1-
from pandas import (DataFrame, Series, Period, PeriodIndex, date_range,
2-
period_range)
1+
from pandas import (
2+
DataFrame, Period, PeriodIndex, Series, date_range, period_range)
33

44

55
class PeriodProperties(object):
@@ -94,7 +94,7 @@ def time_value_counts(self, typ):
9494
class Indexing(object):
9595

9696
def setup(self):
97-
self.index = PeriodIndex(start='1985', periods=1000, freq='D')
97+
self.index = period_range(start='1985', periods=1000, freq='D')
9898
self.series = Series(range(1000), index=self.index)
9999
self.period = self.index[500]
100100
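
Reviewer note: the benchmark change above swaps range-style `PeriodIndex(start=...)` construction for `period_range`. A minimal sketch of the replacement pattern, assuming pandas 0.24 or later is installed; the variable names are illustrative and not taken from the benchmark class.

```python
import pandas as pd

# Range-style construction goes through period_range rather than the
# PeriodIndex constructor, whose start/periods arguments this commit
# documents as deprecated.
index = pd.period_range(start="1985", periods=1000, freq="D")

series = pd.Series(range(1000), index=index)
period = index[500]  # a single pd.Period taken from the index

# Label-based lookup with a Period returns the value stored at that label.
value = series.loc[period]
```

The resulting object is still a `PeriodIndex`, so code that consumes the index should not need to change.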

ci/azure/posix.yml

Lines changed: 20 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -20,21 +20,36 @@ jobs:
2020
CONDA_PY: "27"
2121
PATTERN: "not slow and not network"
2222

23-
py37_locale:
24-
ENV_FILE: ci/deps/azure-37-locale.yaml
25-
CONDA_PY: "37"
26-
PATTERN: "not slow and not network"
23+
py27_locale_slow_old_np:
24+
ENV_FILE: ci/deps/azure-27-locale.yaml
25+
CONDA_PY: "27"
26+
PATTERN: "slow"
2727
LOCALE_OVERRIDE: "zh_CN.UTF-8"
28+
EXTRA_APT: "language-pack-zh-hans"
2829

2930
py36_locale_slow:
3031
ENV_FILE: ci/deps/azure-36-locale_slow.yaml
3132
CONDA_PY: "36"
3233
PATTERN: "not slow and not network"
3334
LOCALE_OVERRIDE: "it_IT.UTF-8"
3435

36+
py37_locale:
37+
ENV_FILE: ci/deps/azure-37-locale.yaml
38+
CONDA_PY: "37"
39+
PATTERN: "not slow and not network"
40+
LOCALE_OVERRIDE: "zh_CN.UTF-8"
41+
42+
py37_np_dev:
43+
ENV_FILE: ci/deps/azure-37-numpydev.yaml
44+
CONDA_PY: "37"
45+
PATTERN: "not slow and not network"
46+
TEST_ARGS: "-W error"
47+
PANDAS_TESTING_MODE: "deprecate"
48+
EXTRA_APT: "xsel"
49+
3550
steps:
3651
- script: |
37-
if [ "$(uname)" == "Linux" ]; then sudo apt-get install -y libc6-dev-i386; fi
52+
if [ "$(uname)" == "Linux" ]; then sudo apt-get install -y libc6-dev-i386 $EXTRA_APT; fi
3853
echo "Installing Miniconda"
3954
ci/incremental/install_miniconda.sh
4055
export PATH=$HOME/miniconda3/bin:$PATH
File renamed without changes.
File renamed without changes.

doc/source/api.rst

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -3997,6 +3997,7 @@ objects.
39973997
api.extensions.register_index_accessor
39983998
api.extensions.ExtensionDtype
39993999
api.extensions.ExtensionArray
4000+
arrays.PandasArray
40004001

40014002
.. This is to prevent warnings in the doc build. We don't want to encourage
40024003
.. these methods.

doc/source/basics.rst

Lines changed: 32 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -71,8 +71,10 @@ the **array** property
7171
s.array
7272
s.index.array
7373
74-
Depending on the data type (see :ref:`basics.dtypes`), :attr:`~Series.array`
75-
be either a NumPy array or an :ref:`ExtensionArray <extending.extension-type>`.
74+
:attr:`~Series.array` will always be an :class:`~pandas.api.extensions.ExtensionArray`.
75+
The exact details of what an ``ExtensionArray`` is and why pandas uses them is a bit
76+
beyond the scope of this introduction. See :ref:`basics.dtypes` for more.
77+
7678
If you know you need a NumPy array, use :meth:`~Series.to_numpy`
7779
or :meth:`numpy.asarray`.
7880

@@ -81,10 +83,30 @@ or :meth:`numpy.asarray`.
8183
s.to_numpy()
8284
np.asarray(s)
8385
84-
For Series and Indexes backed by NumPy arrays (like we have here), this will
85-
be the same as :attr:`~Series.array`. When the Series or Index is backed by
86-
a :class:`~pandas.api.extension.ExtensionArray`, :meth:`~Series.to_numpy`
87-
may involve copying data and coercing values.
86+
When the Series or Index is backed by
87+
an :class:`~pandas.api.extension.ExtensionArray`, :meth:`~Series.to_numpy`
88+
may involve copying data and coercing values. See :ref:`basics.dtypes` for more.
89+
90+
:meth:`~Series.to_numpy` gives some control over the ``dtype`` of the
91+
resulting :class:`ndarray`. For example, consider datetimes with timezones.
92+
NumPy doesn't have a dtype to represent timezone-aware datetimes, so there
93+
are two possibly useful representations:
94+
95+
1. An object-dtype :class:`ndarray` with :class:`Timestamp` objects, each
96+
with the correct ``tz``
97+
2. A ``datetime64[ns]`` -dtype :class:`ndarray`, where the values have
98+
been converted to UTC and the timezone discarded
99+
100+
Timezones may be preserved with ``dtype=object``
101+
102+
.. ipython:: python
103+
104+
ser = pd.Series(pd.date_range('2000', periods=2, tz="CET"))
105+
ser.to_numpy(dtype=object)
106+
107+
Or thrown away with ``dtype='datetime64[ns]'``
108+
109+
ser.to_numpy(dtype="datetime64[ns]")
88110

89111
:meth:`~Series.to_numpy` gives some control over the ``dtype`` of the
90112
resulting :class:`ndarray`. For example, consider datetimes with timezones.
@@ -109,7 +131,7 @@ Or thrown away with ``dtype='datetime64[ns]'``
109131

110132
Getting the "raw data" inside a :class:`DataFrame` is possibly a bit more
111133
complex. When your ``DataFrame`` only has a single data type for all the
112-
columns, :attr:`DataFrame.to_numpy` will return the underlying data:
134+
columns, :meth:`DataFrame.to_numpy` will return the underlying data:
113135

114136
.. ipython:: python
115137
@@ -136,8 +158,9 @@ drawbacks:
136158

137159
1. When your Series contains an :ref:`extension type <extending.extension-type>`, it's
138160
unclear whether :attr:`Series.values` returns a NumPy array or the extension array.
139-
:attr:`Series.array` will always return the actual array backing the Series,
140-
while :meth:`Series.to_numpy` will always return a NumPy array.
161+
:attr:`Series.array` will always return an ``ExtensionArray``, and will never
162+
copy data. :meth:`Series.to_numpy` will always return a NumPy array,
163+
potentially at the cost of copying / coercing values.
141164
2. When your DataFrame contains a mixture of data types, :attr:`DataFrame.values` may
142165
involve copying data and coercing values to a common dtype, a relatively expensive
143166
operation. :meth:`DataFrame.to_numpy`, being a method, makes it clearer that the

doc/source/dsintro.rst

Lines changed: 6 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -146,11 +146,15 @@ If you need the actual array backing a ``Series``, use :attr:`Series.array`.
146146
147147
s.array
148148
149-
Again, this is often a NumPy array, but may instead be a
150-
:class:`~pandas.api.extensions.ExtensionArray`. See :ref:`basics.dtypes` for more.
151149
Accessing the array can be useful when you need to do some operation without the
152150
index (to disable :ref:`automatic alignment <dsintro.alignment>`, for example).
153151

152+
:attr:`Series.array` will always be an :class:`~pandas.api.extensions.ExtensionArray`.
153+
Briefly, an ExtensionArray is a thin wrapper around one or more *concrete* arrays like a
154+
:class:`numpy.ndarray`. Pandas knows how to take an ``ExtensionArray`` and
155+
store it in a ``Series`` or a column of a ``DataFrame``.
156+
See :ref:`basics.dtypes` for more.
157+
154158
While Series is ndarray-like, if you need an *actual* ndarray, then use
155159
:meth:`Series.to_numpy`.
156160
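
Reviewer note: the documentation change above distinguishes `Series.array` (always an `ExtensionArray`) from `Series.to_numpy()` (always an ndarray), and mentions using the array to sidestep index alignment. A rough sketch of that distinction, assuming pandas 0.24 or later; the sample values and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray

s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

arr = s.array      # an ExtensionArray wrapping the underlying data
nd = s.to_numpy()  # a plain numpy.ndarray

is_extension_array = isinstance(arr, ExtensionArray)
is_ndarray = isinstance(nd, np.ndarray)

# Same labels in a different order, to show the effect of alignment.
other = pd.Series([10.0, 20.0, 30.0], index=["c", "b", "a"])

aligned = s + other                  # values are matched by index label
positional = s.array + other.array   # no index, so matched by position
```

The check is written against the `ExtensionArray` base class rather than a concrete wrapper class, so it does not depend on what the NumPy-backed wrapper is called in a given pandas version.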

doc/source/whatsnew/v0.21.0.rst

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -654,7 +654,7 @@ Previous Behavior:
654654

655655
.. code-block:: ipython
656656
657-
In [1]: pi = pd.PeriodIndex(start='2000-01-01', freq='D', periods=10)
657+
In [1]: pi = pd.period_range(start='2000-01-01', freq='D', periods=10)
658658
659659
In [2]: s = pd.Series(np.arange(10), index=pi)
660660
@@ -674,7 +674,7 @@ New Behavior:
674674

675675
.. ipython:: python
676676
677-
pi = pd.PeriodIndex(start='2000-01-01', freq='D', periods=10)
677+
pi = pd.period_range(start='2000-01-01', freq='D', periods=10)
678678
679679
s = pd.Series(np.arange(10), index=pi)
680680

doc/source/whatsnew/v0.24.0.rst

Lines changed: 9 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -65,8 +65,11 @@ If you need an actual NumPy array, use :meth:`Series.to_numpy` or :meth:`Index.t
6565
idx.to_numpy()
6666
pd.Series(idx).to_numpy()
6767
68-
For Series and Indexes backed by normal NumPy arrays, this will be the same thing (and the same
69-
as ``.values``).
68+
For Series and Indexes backed by normal NumPy arrays, :attr:`Series.array` will return a
69+
new :class:`arrays.PandasArray`, which is a thin (no-copy) wrapper around a
70+
:class:`numpy.ndarray`. :class:`arrays.PandasArray` isn't especially useful on its own,
71+
but it does provide the same interface as any extension array defined in pandas or by
72+
a third-party library.
7073

7174
.. ipython:: python
7275
@@ -75,7 +78,7 @@ as ``.values``).
7578
ser.to_numpy()
7679
7780
We haven't removed or deprecated :attr:`Series.values` or :attr:`DataFrame.values`, but we
78-
recommend and using ``.array`` or ``.to_numpy()`` instead.
81+
highly recommend and using ``.array`` or ``.to_numpy()`` instead.
7982

8083
See :ref:`Dtypes <basics.dtypes>` and :ref:`Attributes and Underlying Data <basics.attrs>` for more.
8184

@@ -1148,7 +1151,7 @@ Deprecations
11481151
- Timezone converting a tz-aware ``datetime.datetime`` or :class:`Timestamp` with :class:`Timestamp` and the ``tz`` argument is now deprecated. Instead, use :meth:`Timestamp.tz_convert` (:issue:`23579`)
11491152
- :func:`pandas.api.types.is_period` is deprecated in favor of `pandas.api.types.is_period_dtype` (:issue:`23917`)
11501153
- :func:`pandas.api.types.is_datetimetz` is deprecated in favor of `pandas.api.types.is_datetime64tz` (:issue:`23917`)
1151-
- Creating a :class:`TimedeltaIndex` or :class:`DatetimeIndex` by passing range arguments `start`, `end`, and `periods` is deprecated in favor of :func:`timedelta_range` and :func:`date_range` (:issue:`23919`)
1154+
- Creating a :class:`TimedeltaIndex`, :class:`DatetimeIndex`, or :class:`PeriodIndex` by passing range arguments `start`, `end`, and `periods` is deprecated in favor of :func:`timedelta_range`, :func:`date_range`, or :func:`period_range` (:issue:`23919`)
11521155
- Passing a string alias like ``'datetime64[ns, UTC]'`` as the ``unit`` parameter to :class:`DatetimeTZDtype` is deprecated. Use :class:`DatetimeTZDtype.construct_from_string` instead (:issue:`23990`).
11531156
- In :meth:`Series.where` with Categorical data, providing an ``other`` that is not present in the categories is deprecated. Convert the categorical to a different dtype or add the ``other`` to the categories first (:issue:`24077`).
11541157
- :meth:`Series.clip_lower`, :meth:`Series.clip_upper`, :meth:`DataFrame.clip_lower` and :meth:`DataFrame.clip_upper` are deprecated and will be removed in a future version. Use ``Series.clip(lower=threshold)``, ``Series.clip(upper=threshold)`` and the equivalent ``DataFrame`` methods (:issue:`24203`)
@@ -1310,6 +1313,7 @@ Datetimelike
13101313
- Bug in :meth:`Series.combine_first` not properly aligning categoricals, so that missing values in ``self`` where not filled by valid values from ``other`` (:issue:`24147`)
13111314
- Bug in :func:`DataFrame.combine` with datetimelike values raising a TypeError (:issue:`23079`)
13121315
- Bug in :func:`date_range` with frequency of ``Day`` or higher where dates sufficiently far in the future could wrap around to the past instead of raising ``OutOfBoundsDatetime`` (:issue:`14187`)
1316+
- Bug in :func:`period_range` ignoring the frequency of ``start`` and ``end`` when those are provided as :class:`Period` objects (:issue:`20535`).
13131317
- Bug in :class:`PeriodIndex` with attribute ``freq.n`` greater than 1 where adding a :class:`DateOffset` object would return incorrect results (:issue:`23215`)
13141318
- Bug in :class:`Series` that interpreted string indices as lists of characters when setting datetimelike values (:issue:`23451`)
13151319
- Bug in :class:`Timestamp` constructor which would drop the frequency of an input :class:`Timestamp` (:issue:`22311`)
@@ -1406,6 +1410,7 @@ Conversion
14061410
^^^^^^^^^^
14071411

14081412
- Bug in :meth:`DataFrame.combine_first` in which column types were unexpectedly converted to float (:issue:`20699`)
1413+
- Bug in :meth:`DataFrame.clip` in which column types are not preserved and casted to float (:issue:`24162`)
14091414

14101415
Strings
14111416
^^^^^^^
