Skip to content

Fix bug in contains when looking up a string in a non-monotonic datet… #13574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 45 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
f3b114f
Fix bug in contains when looking up a string in a non-monotonic datet…
tjader Jul 6, 2016
cc0a188
BUG: Groupby.nth includes group key inconsistently #12839
adneu Jul 6, 2016
2655dae
In gbq, use googleapiclient instead of apiclient #13454 (#13458)
parthea Jul 7, 2016
f11b9c1
RLS: switch master from 0.18.2 to 0.19.0 (#13586)
jorisvandenbossche Jul 8, 2016
ba82b51
BUG: Datetime64Formatter not respecting ``formatter``
haleemur Jul 8, 2016
f95576b
BUG: Fix TimeDelta to Timedelta (#13600)
yui-knk Jul 9, 2016
5701c69
COMPAT: 32-bit compat fixes mainly in testing
jreback Jul 7, 2016
3c202b1
Added more exhaustive tests for __contains__.
tjader Jul 9, 2016
713eaa6
BUG: DatetimeIndex - Period shows ununderstandable error
sinhrks Jul 10, 2016
675a6e3
ENH: add downcast to pd.to_numeric
gfyoung Jul 10, 2016
1edc1df
CLN: remove radd workaround in ops.py
sinhrks Jul 10, 2016
2a96ab7
DEPR: rename Timestamp.offset to .freq
sinhrks Jul 10, 2016
c989570
CLN: Remove the engine parameter in CSVFormatter and to_csv
gfyoung Jun 10, 2016
c2cc68d
BUG: Block/DTI doesnt handle tzlocal properly
sinhrks Jul 10, 2016
2e8c993
BUG: Series contains NaT with object dtype comparison incorrect (#13592)
sinhrks Jul 11, 2016
5605f99
CLN/TST: Add tests for nan/nat mixed input (#13477)
sinhrks Jul 11, 2016
2f7fdd0
BUG: groupby apply on selected columns yielding scalar (GH13568) (#13…
jorisvandenbossche Jul 11, 2016
65849d3
TST: Clean up tests of DataFrame.sort_{index,values} (#13496)
IamJeffG Jul 11, 2016
8dbc0f4
DOC: asfreq clarify original NaNs are not filled (GH9963) (#13617)
jorisvandenbossche Jul 12, 2016
93b7d13
BUG: Invalid Timedelta op may raise ValueError
sinhrks Jul 12, 2016
dbd5330
CLN: Cleanup ops.py
sinhrks Jul 12, 2016
7c357d2
CLN: Removed outtype in DataFrame.to_dict (#13627)
gfyoung Jul 12, 2016
27d2915
CLN: Fix compile time warnings
yui-knk Jul 13, 2016
06103dd
Pin IPython for doc build to 4.x (see #13639)
jorisvandenbossche Jul 13, 2016
7dd4091
CLN: reorg type inference & introspection
jreback Jul 13, 2016
20de266
BLD: included pandas.api.* in setup.py (#13640)
gfyoung Jul 13, 2016
44f3229
DOC/BLD: pin IPython version to 4.2.0 (#13639) (#13647)
jorisvandenbossche Jul 14, 2016
6f0a020
TST: reorganize tools.tests (#13619)
sinhrks Jul 14, 2016
a711b42
BF(TST): allow AttributeError being raised (in addition to TypeError)…
yarikoptic Jul 14, 2016
084ceae
API, DEPR: Raise and Deprecate Reshape for Pandas Objects
gfyoung Jul 14, 2016
3f6d4bd
CLN: Fix compile time warnings
yui-knk Jul 14, 2016
c9a27ed
CLN: fix some issues in asv benchmark suite (#13630)
jorisvandenbossche Jul 14, 2016
05b976c
TST: add tests for Timestamp.toordinal/fromordinal
sinhrks Jul 15, 2016
71a0675
CLN: Initialization coincides with mapping, hence with uniqueness check
toobaz Jul 15, 2016
0a70b5f
API: Change Period('NAT') to return NaT
sinhrks Jul 15, 2016
1bee56e
BUG: construction of Series with integers on windows not default to i…
jreback Jul 15, 2016
d7c028d
CLN: Removed levels attribute from Categorical
gfyoung Jul 15, 2016
401b0ed
Fix bug in contains when looking up a string in a non-monotonic datet…
tjader Jul 6, 2016
1a86b3a
Added more exhaustive tests for __contains__.
tjader Jul 9, 2016
3bf7cce
Fix bug in contains when looking up a string in a non-monotonic datet…
tjader Jul 6, 2016
0f5a4e0
Added more exhaustive tests for __contains__.
tjader Jul 9, 2016
783ea6d
Fix bug in contains when looking up a string in a non-monotonic datet…
tjader Jul 6, 2016
592a09d
Added more exhaustive tests for __contains__.
tjader Jul 9, 2016
690e034
Fix bug in contains when looking up a string in a non-monotonic datet…
tjader Jul 6, 2016
d4348d3
Merge remote-tracking branch 'origin/bugfixes' into bugfixes
tjader Jul 16, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
API, DEPR: Raise and Deprecate Reshape for Pandas Objects
Author: gfyoung <gfyoung17@gmail.com>

Closes #13012 from gfyoung/categorical-reshape-validate and squashes the following commits:

3ad161d [gfyoung] API: Prevent invalid arguments to Categorical.reshape
  • Loading branch information
gfyoung authored and jreback committed Jul 14, 2016
commit 084ceaee135627680f4dd00115c3d6c7d930a22d
3 changes: 3 additions & 0 deletions doc/source/whatsnew/v0.19.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -256,6 +256,7 @@ API changes
~~~~~~~~~~~


- ``Index.reshape`` will raise a ``NotImplementedError`` exception when called (:issue: `12882`)
- Non-convertible dates in an excel date column will be returned without conversion and the column will be ``object`` dtype, rather than raising an exception (:issue:`10001`)
- ``eval``'s upcasting rules for ``float32`` types have been updated to be more consistent with NumPy's rules. New behavior will not upcast to ``float64`` if you multiply a pandas ``float32`` object by a scalar float64. (:issue:`12388`)
- An ``UnsupportedFunctionCall`` error is now raised if NumPy ufuncs like ``np.mean`` are called on groupby or resample objects (:issue:`12811`)
Expand Down Expand Up @@ -449,6 +450,8 @@ Furthermore:

Deprecations
^^^^^^^^^^^^
- ``Categorical.reshape`` has been deprecated and will be removed in a subsequent release (:issue:`12882`)
- ``Series.reshape`` has been deprecated and will be removed in a subsequent release (:issue:`12882`)

- ``compact_ints`` and ``use_unsigned`` have been deprecated in ``pd.read_csv()`` and will be removed in a future version (:issue:`13320`)
- ``buffer_lines`` has been deprecated in ``pd.read_csv()`` and will be removed in a future version (:issue:`13360`)
Expand Down
23 changes: 20 additions & 3 deletions pandas/core/categorical.py
Original file line number Diff line number Diff line change
Expand Up @@ -383,11 +383,28 @@ def itemsize(self):

def reshape(self, new_shape, *args, **kwargs):
"""
An ndarray-compatible method that returns
`self` because categorical instances cannot
actually be reshaped.
DEPRECATED: calling this method will raise an error in a
future release.

An ndarray-compatible method that returns `self` because
`Categorical` instances cannot actually be reshaped.

Parameters
----------
new_shape : int or tuple of ints
A 1-D array of integers that correspond to the new
shape of the `Categorical`. For more information on
the parameter, please refer to `np.reshape`.
"""
warn("reshape is deprecated and will raise "
"in a subsequent release", FutureWarning, stacklevel=2)

nv.validate_reshape(args, kwargs)

# while the 'new_shape' parameter has no effect,
# we should still enforce valid shape parameters
np.reshape(self.codes, new_shape)

return self

@property
Expand Down
26 changes: 24 additions & 2 deletions pandas/core/internals.py
Original file line number Diff line number Diff line change
Expand Up @@ -1839,7 +1839,7 @@ def convert(self, *args, **kwargs):
try:
values = values.reshape(shape)
values = _block_shape(values, ndim=self.ndim)
except AttributeError:
except (AttributeError, NotImplementedError):
pass
newb = make_block(values, ndim=self.ndim, placement=[rl])
blocks.append(newb)
Expand Down Expand Up @@ -3616,7 +3616,7 @@ def value_getitem(placement):
return value
else:
if value.ndim == self.ndim - 1:
value = value.reshape((1,) + value.shape)
value = _safe_reshape(value, (1,) + value.shape)

def value_getitem(placement):
return value
Expand Down Expand Up @@ -4686,6 +4686,28 @@ def rrenamer(x):
_transform_index(right, rrenamer))


def _safe_reshape(arr, new_shape):
"""
If possible, reshape `arr` to have shape `new_shape`,
with a couple of exceptions (see gh-13012):

1) If `arr` is a Categorical or Index, `arr` will be
returned as is.
2) If `arr` is a Series, the `_values` attribute will
be reshaped and returned.

Parameters
----------
arr : array-like, object to be reshaped
new_shape : int or tuple of ints, the new shape
"""
if isinstance(arr, ABCSeries):
arr = arr._values
if not isinstance(arr, Categorical):
arr = arr.reshape(new_shape)
return arr


def _transform_index(index, func):
"""
Apply function to all values found in index.
Expand Down
14 changes: 11 additions & 3 deletions pandas/core/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -843,14 +843,22 @@ def repeat(self, reps, *args, **kwargs):

def reshape(self, *args, **kwargs):
"""
Return the values attribute of `self` with shape `args`.
However, if the specified shape matches exactly the current
shape, `self` is returned for compatibility reasons.
DEPRECATED: calling this method will raise an error in a
future release. Please call ``.values.reshape(...)`` instead.

return an ndarray with the values shape
if the specified shape matches exactly the current shape, then
return self (for compat)

See also
--------
numpy.ndarray.reshape
"""
warnings.warn("reshape is deprecated and will raise "
"in a subsequent release. Please use "
".values.reshape(...) instead", FutureWarning,
stacklevel=2)

if len(args) == 1 and hasattr(args[0], '__iter__'):
shape = args[0]
else:
Expand Down
10 changes: 10 additions & 0 deletions pandas/indexes/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -957,6 +957,16 @@ def rename(self, name, inplace=False):
"""
return self.set_names([name], inplace=inplace)

def reshape(self, *args, **kwargs):
"""
NOT IMPLEMENTED: do not call this method, as reshaping is not
supported for Index objects and will raise an error.

Reshape an Index.
"""
raise NotImplementedError("reshaping is not supported "
"for Index objects")

@property
def _has_complex_internals(self):
# to disable groupby tricks in MultiIndex
Expand Down
7 changes: 4 additions & 3 deletions pandas/io/packers.py
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@
from pandas.core.generic import NDFrame
from pandas.core.common import PerformanceWarning
from pandas.io.common import get_filepath_or_buffer
from pandas.core.internals import BlockManager, make_block
from pandas.core.internals import BlockManager, make_block, _safe_reshape
import pandas.core.internals as internals

from pandas.msgpack import Unpacker as _Unpacker, Packer as _Packer, ExtType
Expand Down Expand Up @@ -622,8 +622,9 @@ def decode(obj):
axes = obj[u'axes']

def create_block(b):
values = unconvert(b[u'values'], dtype_for(b[u'dtype']),
b[u'compress']).reshape(b[u'shape'])
values = _safe_reshape(unconvert(
b[u'values'], dtype_for(b[u'dtype']),
b[u'compress']), b[u'shape'])

# locs handles duplicate column names, and should be used instead
# of items; see GH 9618
Expand Down
6 changes: 6 additions & 0 deletions pandas/tests/indexes/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -1413,6 +1413,12 @@ def test_take_fill_value(self):
with tm.assertRaises(IndexError):
idx.take(np.array([1, -5]))

def test_reshape_raise(self):
msg = "reshaping is not supported"
idx = pd.Index([0, 1, 2])
tm.assertRaisesRegexp(NotImplementedError, msg,
idx.reshape, idx.shape)

def test_reindex_preserves_name_if_target_is_list_or_ndarray(self):
# GH6552
idx = pd.Index([0, 1, 2])
Expand Down
68 changes: 41 additions & 27 deletions pandas/tests/series/test_analytics.py
Original file line number Diff line number Diff line change
Expand Up @@ -1554,49 +1554,63 @@ def test_shift_categorical(self):
assert_index_equal(s.values.categories, sp1.values.categories)
assert_index_equal(s.values.categories, sn2.values.categories)

def test_reshape_non_2d(self):
# GH 4554
x = Series(np.random.random(201), name='x')
self.assertTrue(x.reshape(x.shape, ) is x)
def test_reshape_deprecate(self):
x = Series(np.random.random(10), name='x')
tm.assert_produces_warning(FutureWarning, x.reshape, x.shape)

# GH 2719
a = Series([1, 2, 3, 4])
result = a.reshape(2, 2)
expected = a.values.reshape(2, 2)
tm.assert_numpy_array_equal(result, expected)
self.assertIsInstance(result, type(expected))
def test_reshape_non_2d(self):
# see gh-4554
with tm.assert_produces_warning(FutureWarning):
x = Series(np.random.random(201), name='x')
self.assertTrue(x.reshape(x.shape, ) is x)

# see gh-2719
with tm.assert_produces_warning(FutureWarning):
a = Series([1, 2, 3, 4])
result = a.reshape(2, 2)
expected = a.values.reshape(2, 2)
tm.assert_numpy_array_equal(result, expected)
self.assertIsInstance(result, type(expected))

def test_reshape_2d_return_array(self):
x = Series(np.random.random(201), name='x')
result = x.reshape((-1, 1))
self.assertNotIsInstance(result, Series)

result2 = np.reshape(x, (-1, 1))
self.assertNotIsInstance(result2, Series)
with tm.assert_produces_warning(FutureWarning):
result = x.reshape((-1, 1))
self.assertNotIsInstance(result, Series)

result = x[:, None]
expected = x.reshape((-1, 1))
assert_almost_equal(result, expected)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
result2 = np.reshape(x, (-1, 1))
self.assertNotIsInstance(result2, Series)

with tm.assert_produces_warning(FutureWarning):
result = x[:, None]
expected = x.reshape((-1, 1))
assert_almost_equal(result, expected)

def test_reshape_bad_kwarg(self):
a = Series([1, 2, 3, 4])

msg = "'foo' is an invalid keyword argument for this function"
tm.assertRaisesRegexp(TypeError, msg, a.reshape, (2, 2), foo=2)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
msg = "'foo' is an invalid keyword argument for this function"
tm.assertRaisesRegexp(TypeError, msg, a.reshape, (2, 2), foo=2)

msg = "reshape\(\) got an unexpected keyword argument 'foo'"
tm.assertRaisesRegexp(TypeError, msg, a.reshape, a.shape, foo=2)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
msg = "reshape\(\) got an unexpected keyword argument 'foo'"
tm.assertRaisesRegexp(TypeError, msg, a.reshape, a.shape, foo=2)

def test_numpy_reshape(self):
a = Series([1, 2, 3, 4])

result = np.reshape(a, (2, 2))
expected = a.values.reshape(2, 2)
tm.assert_numpy_array_equal(result, expected)
self.assertIsInstance(result, type(expected))
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
result = np.reshape(a, (2, 2))
expected = a.values.reshape(2, 2)
tm.assert_numpy_array_equal(result, expected)
self.assertIsInstance(result, type(expected))

result = np.reshape(a, a.shape)
tm.assert_series_equal(result, a)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
result = np.reshape(a, a.shape)
tm.assert_series_equal(result, a)

def test_unstack(self):
from numpy import nan
Expand Down
37 changes: 32 additions & 5 deletions pandas/tests/test_categorical.py
Original file line number Diff line number Diff line change
Expand Up @@ -4058,13 +4058,40 @@ def test_numpy_repeat(self):
msg = "the 'axis' parameter is not supported"
tm.assertRaisesRegexp(ValueError, msg, np.repeat, cat, 2, axis=1)

def test_reshape(self):
cat = pd.Categorical([], categories=["a", "b"])
tm.assert_produces_warning(FutureWarning, cat.reshape, 0)

with tm.assert_produces_warning(FutureWarning):
cat = pd.Categorical([], categories=["a", "b"])
self.assert_categorical_equal(cat.reshape(0), cat)

with tm.assert_produces_warning(FutureWarning):
cat = pd.Categorical([], categories=["a", "b"])
self.assert_categorical_equal(cat.reshape((5, -1)), cat)

with tm.assert_produces_warning(FutureWarning):
cat = pd.Categorical(["a", "b"], categories=["a", "b"])
self.assert_categorical_equal(cat.reshape(cat.shape), cat)

with tm.assert_produces_warning(FutureWarning):
cat = pd.Categorical(["a", "b"], categories=["a", "b"])
self.assert_categorical_equal(cat.reshape(cat.size), cat)

with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
msg = "can only specify one unknown dimension"
cat = pd.Categorical(["a", "b"], categories=["a", "b"])
tm.assertRaisesRegexp(ValueError, msg, cat.reshape, (-2, -1))

def test_numpy_reshape(self):
cat = pd.Categorical(["a", "b"], categories=["a", "b"])
self.assert_categorical_equal(np.reshape(cat, cat.shape), cat)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
cat = pd.Categorical(["a", "b"], categories=["a", "b"])
self.assert_categorical_equal(np.reshape(cat, cat.shape), cat)

msg = "the 'order' parameter is not supported"
tm.assertRaisesRegexp(ValueError, msg, np.reshape,
cat, cat.shape, order='F')
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
msg = "the 'order' parameter is not supported"
tm.assertRaisesRegexp(ValueError, msg, np.reshape,
cat, cat.shape, order='F')

def test_na_actions(self):

Expand Down