Commit 1221ccd

Merge branch 'master' into weighted-roll-var
2 parents 4b3e5eb + f4b4ec2 commit 1221ccd

53 files changed, 847 additions and 759 deletions

ci/deps/azure-37-locale.yaml

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ dependencies:
   - openpyxl
   - pytables
   - python-dateutil
-  - python=3.7.3
+  - python=3.7.*
   - pytz
   - s3fs
   - scipy

ci/deps/azure-37-numpydev.yaml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ name: pandas-dev
 channels:
   - defaults
 dependencies:
-  - python=3.7.3
+  - python=3.7.*
   - pytz
   - Cython>=0.28.2
   # universal

ci/deps/travis-37.yaml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ channels:
   - conda-forge
   - c3i_test
 dependencies:
-  - python=3.7.3
+  - python=3.7.*
   - botocore>=1.11
   - cython>=0.28.2
   - numpy

doc/source/user_guide/io.rst

Lines changed: 6 additions & 6 deletions
@@ -3572,7 +3572,7 @@ Closing a Store and using a context manager:
 Read/write API
 ''''''''''''''

-``HDFStore`` supports an top-level API using ``read_hdf`` for reading and ``to_hdf`` for writing,
+``HDFStore`` supports a top-level API using ``read_hdf`` for reading and ``to_hdf`` for writing,
 similar to how ``read_csv`` and ``to_csv`` work.

 .. ipython:: python
@@ -3687,7 +3687,7 @@ Hierarchical keys
 Keys to a store can be specified as a string. These can be in a
 hierarchical path-name like format (e.g. ``foo/bar/bah``), which will
 generate a hierarchy of sub-stores (or ``Groups`` in PyTables
-parlance). Keys can be specified with out the leading '/' and are **always**
+parlance). Keys can be specified without the leading '/' and are **always**
 absolute (e.g. 'foo' refers to '/foo'). Removal operations can remove
 everything in the sub-store and **below**, so be *careful*.

@@ -3825,7 +3825,7 @@ data.

 A query is specified using the ``Term`` class under the hood, as a boolean expression.

-* ``index`` and ``columns`` are supported indexers of a ``DataFrames``.
+* ``index`` and ``columns`` are supported indexers of ``DataFrames``.
 * if ``data_columns`` are specified, these can be used as additional indexers.

 Valid comparison operators are:
@@ -3917,7 +3917,7 @@ Use boolean expressions, with in-line function evaluation.

    store.select('dfq', "index>pd.Timestamp('20130104') & columns=['A', 'B']")

-Use and inline column reference
+Use inline column reference.

 .. ipython:: python

@@ -4593,8 +4593,8 @@ Performance
   write chunksize (default is 50000). This will significantly lower
   your memory usage on writing.
 * You can pass ``expectedrows=<int>`` to the first ``append``,
-  to set the TOTAL number of expected rows that ``PyTables`` will
-  expected. This will optimize read/write performance.
+  to set the TOTAL number of rows that ``PyTables`` will expect.
+  This will optimize read/write performance.
 * Duplicate rows can be written to tables, but are filtered out in
   selection (with the last items being selected; thus a table is
   unique on major, minor pairs)

doc/source/whatsnew/v0.25.1.rst

Lines changed: 5 additions & 6 deletions
@@ -25,8 +25,7 @@ Bug fixes
 Categorical
 ^^^^^^^^^^^

--
--
+- Bug in :meth:`Categorical.fillna` would replace all values, not just those that are ``NaN`` (:issue:`26215`)
 -

 Datetimelike
@@ -83,7 +82,7 @@ Indexing
 ^^^^^^^^

 - Bug in partial-string indexing returning a NumPy array rather than a ``Series`` when indexing with a scalar like ``.loc['2015']`` (:issue:`27516`)
-- Break reference cycle involving :class:`Index` to allow garbage collection of :class:`Index` objects without running the GC. (:issue:`27585`)
+- Break reference cycle involving :class:`Index` and other index classes to allow garbage collection of index objects without running the GC. (:issue:`27585`, :issue:`27840`)
 - Fix regression in assigning values to a single column of a DataFrame with a ``MultiIndex`` columns (:issue:`27841`).
 -

@@ -105,7 +104,7 @@ I/O
 ^^^

 - Avoid calling ``S3File.s3`` when reading parquet, as this was removed in s3fs version 0.3.0 (:issue:`27756`)
--
+- Better error message when a negative header is passed in :func:`pandas.read_csv` (:issue:`27779`)
 -

 Plotting
@@ -127,9 +126,9 @@ Reshaping
 ^^^^^^^^^

 - A ``KeyError`` is now raised if ``.unstack()`` is called on a :class:`Series` or :class:`DataFrame` with a flat :class:`Index` passing a name which is not the correct one (:issue:`18303`)
-- Bug in :meth:`DataFrame.crosstab` when ``margins`` set to ``True`` and ``normalize`` is not ``False``, an error is raised. (:issue:`27500`)
+- Bug in :meth:`DataFrame.crosstab` when ``margins`` set to ``True`` and ``normalize`` is not ``False``, an error is raised. (:issue:`27500`)
 - :meth:`DataFrame.join` now suppresses the ``FutureWarning`` when the sort parameter is specified (:issue:`21952`)
--
+- Bug in :meth:`DataFrame.join` raising with readonly arrays (:issue:`27943`)

 Sparse
 ^^^^^^

doc/source/whatsnew/v1.0.0.rst

Lines changed: 8 additions & 6 deletions
@@ -21,27 +21,27 @@ including other versions of pandas.
 Enhancements
 ~~~~~~~~~~~~

-.. _whatsnew_1000.enhancements.other:
-
 -
 -

+.. _whatsnew_1000.enhancements.other:
+
 Other enhancements
 ^^^^^^^^^^^^^^^^^^

-.. _whatsnew_1000.api_breaking:
-
 - Implemented :meth:`pandas.core.window.Window.var` and :meth:`pandas.core.window.Window.std` functions (:issue:`26597`)
 -

+.. _whatsnew_1000.api_breaking:
+
 Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-.. _whatsnew_1000.api.other:
-
 - :class:`pandas.core.groupby.GroupBy.transform` now raises on invalid operation names (:issue:`27489`).
 -

+.. _whatsnew_1000.api.other:
+
 Other API changes
 ^^^^^^^^^^^^^^^^^

@@ -87,6 +87,7 @@ Bug fixes
 Categorical
 ^^^^^^^^^^^

+- Added test to assert the :func:`fillna` raises the correct ValueError message when the value isn't a value from categories (:issue:`13628`)
 -
 -

@@ -165,6 +166,7 @@ Plotting

 - Bug in :meth:`Series.plot` not able to plot boolean values (:issue:`23719`)
 -
+- Bug in :meth:`DataFrame.plot` producing incorrect legend markers when plotting multiple series on the same axis (:issue:`18222`)
 - Bug in :meth:`DataFrame.plot` when ``kind='box'`` and data contains datetime or timedelta data. These types are now automatically dropped (:issue:`22799`)

 Groupby/resample/rolling

environment.yml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ channels:
 dependencies:
   # required
   - numpy>=1.15
-  - python=3.7.3
+  - python=3
   - python-dateutil>=2.6.1
   - pytz
1111

pandas/_libs/hashtable.pyx

Lines changed: 1 addition & 1 deletion
@@ -108,7 +108,7 @@ cdef class Int64Factorizer:
     def get_count(self):
         return self.count

-    def factorize(self, int64_t[:] values, sort=False,
+    def factorize(self, const int64_t[:] values, sort=False,
                   na_sentinel=-1, na_value=None):
         """
         Factorize values with nans replaced by na_sentinel
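
Note on the `const` qualifier added above: it is presumably related to the whatsnew entry in this commit about `DataFrame.join` raising with readonly arrays, though the diff itself does not say so. The sketch below only illustrates what "readonly" means for a NumPy buffer; the variable names are illustrative and not taken from pandas. As general Cython behaviour (not exercised here), a typed memoryview declared without `const` requests a writable buffer and refuses a read-only one.

```python
import numpy as np

values = np.array([3, 1, 3, 2], dtype=np.int64)
values.setflags(write=False)  # mark the underlying buffer as read-only

# The buffer exported by a read-only array is itself flagged read-only.
is_readonly = memoryview(values).readonly

# Any in-place write is rejected by NumPy.
try:
    values[0] = 10
    write_failed = False
except ValueError:
    write_failed = True

print(is_readonly, write_failed)
```

Declaring the parameter as `const int64_t[:]` tells Cython that the function only reads from the buffer, so both writable and read-only arrays can be passed.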

pandas/_libs/tslibs/nattype.pyx

Lines changed: 74 additions & 30 deletions
@@ -92,6 +92,9 @@ cdef class _NaT(datetime):
     # int64_t value
     # object freq

+    # higher than np.ndarray and np.matrix
+    __array_priority__ = 100
+
     def __hash__(_NaT self):
         # py3k needs this defined here
         return hash(self.value)
@@ -103,61 +106,102 @@ cdef class _NaT(datetime):
         if ndim == -1:
             return _nat_scalar_rules[op]

-        if ndim == 0:
+        elif util.is_array(other):
+            result = np.empty(other.shape, dtype=np.bool_)
+            result.fill(_nat_scalar_rules[op])
+            return result
+
+        elif ndim == 0:
             if is_datetime64_object(other):
                 return _nat_scalar_rules[op]
             else:
                 raise TypeError('Cannot compare type %r with type %r' %
                                 (type(self).__name__, type(other).__name__))
+
         # Note: instead of passing "other, self, _reverse_ops[op]", we observe
         # that `_nat_scalar_rules` is invariant under `_reverse_ops`,
         # rendering it unnecessary.
         return PyObject_RichCompare(other, self, op)

     def __add__(self, other):
+        if self is not c_NaT:
+            # cython __radd__ semantics
+            self, other = other, self
+
         if PyDateTime_Check(other):
             return c_NaT
-
+        elif PyDelta_Check(other):
+            return c_NaT
+        elif is_datetime64_object(other) or is_timedelta64_object(other):
+            return c_NaT
         elif hasattr(other, 'delta'):
             # Timedelta, offsets.Tick, offsets.Week
             return c_NaT
-        elif getattr(other, '_typ', None) in ['dateoffset', 'series',
-                                              'period', 'datetimeindex',
-                                              'datetimearray',
-                                              'timedeltaindex',
-                                              'timedeltaarray']:
-            # Duplicate logic in _Timestamp.__add__ to avoid needing
-            # to subclass; allows us to @final(_Timestamp.__add__)
-            return NotImplemented
-        return c_NaT
+
+        elif is_integer_object(other) or util.is_period_object(other):
+            # For Period compat
+            # TODO: the integer behavior is deprecated, remove it
+            return c_NaT
+
+        elif util.is_array(other):
+            if other.dtype.kind in 'mM':
+                # If we are adding to datetime64, we treat NaT as timedelta
+                # Either way, result dtype is datetime64
+                result = np.empty(other.shape, dtype="datetime64[ns]")
+                result.fill("NaT")
+                return result
+
+        return NotImplemented

     def __sub__(self, other):
         # Duplicate some logic from _Timestamp.__sub__ to avoid needing
         # to subclass; allows us to @final(_Timestamp.__sub__)
+        cdef:
+            bint is_rsub = False
+
+        if self is not c_NaT:
+            # cython __rsub__ semantics
+            self, other = other, self
+            is_rsub = True
+
         if PyDateTime_Check(other):
-            return NaT
+            return c_NaT
         elif PyDelta_Check(other):
-            return NaT
+            return c_NaT
+        elif is_datetime64_object(other) or is_timedelta64_object(other):
+            return c_NaT
+        elif hasattr(other, 'delta'):
+            # offsets.Tick, offsets.Week
+            return c_NaT

-        elif getattr(other, '_typ', None) == 'datetimeindex':
-            # a Timestamp-DatetimeIndex -> yields a negative TimedeltaIndex
-            return -other.__sub__(self)
+        elif is_integer_object(other) or util.is_period_object(other):
+            # For Period compat
+            # TODO: the integer behavior is deprecated, remove it
+            return c_NaT

-        elif getattr(other, '_typ', None) == 'timedeltaindex':
-            # a Timestamp-TimedeltaIndex -> yields a negative TimedeltaIndex
-            return (-other).__add__(self)
+        elif util.is_array(other):
+            if other.dtype.kind == 'm':
+                if not is_rsub:
+                    # NaT - timedelta64 we treat NaT as datetime64, so result
+                    # is datetime64
+                    result = np.empty(other.shape, dtype="datetime64[ns]")
+                    result.fill("NaT")
+                    return result
+
+                # timedelta64 - NaT we have to treat NaT as timedelta64
+                # for this to be meaningful, and the result is timedelta64
+                result = np.empty(other.shape, dtype="timedelta64[ns]")
+                result.fill("NaT")
+                return result
+
+            elif other.dtype.kind == 'M':
+                # We treat NaT as a datetime, so regardless of whether this is
+                # NaT - other or other - NaT, the result is timedelta64
+                result = np.empty(other.shape, dtype="timedelta64[ns]")
+                result.fill("NaT")
+                return result

-        elif hasattr(other, 'delta'):
-            # offsets.Tick, offsets.Week
-            neg_other = -other
-            return self + neg_other
-
-        elif getattr(other, '_typ', None) in ['period', 'series',
-                                              'periodindex', 'dateoffset',
-                                              'datetimearray',
-                                              'timedeltaarray']:
-            return NotImplemented
-        return NaT
+        return NotImplemented

     def __pos__(self):
         return NaT
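
Note on the new ndarray branch of `__sub__`: the result dtype depends on the dtype kind of the array and on which side `NaT` sits. The following is a rough pure-NumPy model of that decision, not the pandas implementation; `nat_sub_array` is a hypothetical helper name, and it uses `np.full` where the diff uses `np.empty` followed by `fill`.

```python
import numpy as np


def nat_sub_array(other, is_rsub=False):
    """Model `NaT - other` (or `other - NaT` when is_rsub) for an ndarray."""
    kind = other.dtype.kind
    if kind == "m" and not is_rsub:
        # NaT - timedelta64: NaT is treated as a datetime, result is datetime64
        fill_value = np.datetime64("NaT", "ns")
    elif kind in "mM":
        # timedelta64 - NaT, or any subtraction involving datetime64:
        # the result is timedelta64
        fill_value = np.timedelta64("NaT", "ns")
    else:
        # Other dtypes are left to the other operand to handle.
        return NotImplemented
    return np.full(other.shape, fill_value)


td = np.array([1, 2], dtype="timedelta64[ns]")
dt = np.array(["2019-01-01"], dtype="datetime64[ns]")

print(nat_sub_array(td).dtype)
print(nat_sub_array(td, is_rsub=True).dtype)
print(nat_sub_array(dt).dtype)
```

Every element of the returned array is `NaT`; only the dtype carries information, which is why the diff distinguishes the `is_rsub` case for timedelta64 input.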

pandas/compat/chainmap.py

Lines changed: 0 additions & 6 deletions
@@ -15,9 +15,3 @@ def __delitem__(self, key):
                 del mapping[key]
                 return
         raise KeyError(key)
-
-    # override because the m parameter is introduced in Python 3.4
-    def new_child(self, m=None):
-        if m is None:
-            m = {}
-        return self.__class__(m, *self.maps)
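
Note on the removed override: `collections.ChainMap.new_child` in the standard library already accepts an optional mapping and returns an instance of the calling class, so a subclass gets the same behaviour without redefining it. The sketch below uses a trimmed-down stand-in for the class in this file (only `__setitem__` is reproduced, as an assumption about its shape) to show that.

```python
from collections import ChainMap


class DeepChainMap(ChainMap):
    """Trimmed-down stand-in: writes go to the first mapping holding the key."""

    def __setitem__(self, key, value):
        for mapping in self.maps:
            if key in mapping:
                mapping[key] = value
                return
        self.maps[0][key] = value


parent = DeepChainMap({"a": 1})
child = parent.new_child()  # inherited from ChainMap, no override needed
child["b"] = 2              # new key lands in the child's own mapping

print(type(child).__name__, child.maps)
```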

pandas/core/algorithms.py

Lines changed: 1 addition & 11 deletions
@@ -28,13 +28,11 @@
     is_complex_dtype,
     is_datetime64_any_dtype,
     is_datetime64_ns_dtype,
-    is_datetime64tz_dtype,
     is_datetimelike,
     is_extension_array_dtype,
     is_float_dtype,
     is_integer,
     is_integer_dtype,
-    is_interval_dtype,
     is_list_like,
     is_numeric_dtype,
     is_object_dtype,
@@ -183,8 +181,6 @@ def _reconstruct_data(values, dtype, original):

     if is_extension_array_dtype(dtype):
         values = dtype.construct_array_type()._from_sequence(values)
-    elif is_datetime64tz_dtype(dtype) or is_period_dtype(dtype):
-        values = Index(original)._shallow_copy(values, name=None)
     elif is_bool_dtype(dtype):
         values = values.astype(dtype)

@@ -1645,19 +1641,13 @@ def take_nd(
         May be the same type as the input, or cast to an ndarray.
     """

-    # TODO(EA): Remove these if / elifs as datetimeTZ, interval, become EAs
-    # dispatch to internal type takes
     if is_extension_array_dtype(arr):
         return arr.take(indexer, fill_value=fill_value, allow_fill=allow_fill)
-    elif is_datetime64tz_dtype(arr):
-        return arr.take(indexer, fill_value=fill_value, allow_fill=allow_fill)
-    elif is_interval_dtype(arr):
-        return arr.take(indexer, fill_value=fill_value, allow_fill=allow_fill)

     if is_sparse(arr):
         arr = arr.to_dense()
     elif isinstance(arr, (ABCIndexClass, ABCSeries)):
-        arr = arr.values
+        arr = arr._values

     arr = np.asarray(arr)

pandas/core/arrays/categorical.py

Lines changed: 2 additions & 2 deletions
@@ -1840,8 +1840,8 @@ def fillna(self, value=None, method=None, limit=None):
                     raise ValueError("fill value must be in categories")

                 values_codes = _get_codes_for_values(value, self.categories)
-                indexer = np.where(values_codes != -1)
-                codes[indexer] = values_codes[values_codes != -1]
+                indexer = np.where(codes == -1)
+                codes[indexer] = values_codes[indexer]

             # If value is not a dict or Series it should be a scalar
             elif is_hashable(value):
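
Note on the indexer change: the old indexer selected every position where a fill value existed, so existing values were overwritten; the new one selects only positions whose code is `-1` (missing). A small NumPy-only sketch with made-up codes shows the difference; the arrays are illustrative and are not taken from the pandas test suite.

```python
import numpy as np

codes = np.array([0, -1, 1, -1])       # -1 marks a missing entry
values_codes = np.array([1, 0, 0, 1])  # codes of the per-position fill values

# Old logic: overwrite wherever a fill value exists (here, every position).
old = codes.copy()
old_indexer = np.where(values_codes != -1)
old[old_indexer] = values_codes[values_codes != -1]

# New logic: overwrite only the positions that were missing.
new = codes.copy()
new_indexer = np.where(codes == -1)
new[new_indexer] = values_codes[new_indexer]

print(old.tolist(), new.tolist())
```

With the new logic positions 0 and 2 keep their original codes, which matches the whatsnew entry for issue 26215 in this commit.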
