BUG: Fix Series doesn't work in pd.astype(). Now treat Series as dict. #16725

Merged
merged 9 commits into from
Jun 30, 2017
Make Series works the same as dict in pd.astype()
BranYang authored and jreback committed Jun 30, 2017
commit 2351a2e2830dcfee722c4ef40c0830c1410bdb5f
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.21.0.txt
@@ -94,6 +94,8 @@ Bug Fixes

Conversion
^^^^^^^^^^
+- Fix a bug where pd.astype() receives a Series as the `dtype` parameter and no action is taken. Now Series
Contributor

Bug in :func:`DataFrame.astype` where a Series passed as a dtype mapping would be ignored

Contributor

Add the issue number as well

Contributor Author

Added.

+  `dtype` works the same as a `dict`.
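
The behaviour this release note describes can be illustrated with a minimal sketch. It assumes a pandas version that includes this fix; the frame `df` and the mapping `dtype_map` are made-up names for illustration and are not part of the PR.

```python
import pandas as pd

# Hypothetical two-column frame used only for this illustration.
df = pd.DataFrame({'b': range(5), 'c': [0.0, 0.2, 0.4, 0.6, 0.8]})

# A Series whose index holds column labels and whose values hold target dtypes.
dtype_map = pd.Series({'b': 'float32', 'c': 'float32'})

# With the fix, passing the Series should behave like passing the equivalent dict.
from_series = df.astype(dtype_map)
from_dict = df.astype(dtype_map.to_dict())

pd.testing.assert_frame_equal(from_series, from_dict)
print(from_series.dtypes)
```

Before the fix, the Series mapping was reportedly ignored, so `from_series` would have kept the original dtypes.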

2 changes: 1 addition & 1 deletion pandas/core/generic.py
@@ -3507,7 +3507,7 @@ def astype(self, dtype, copy=True, errors='raise', **kwargs):
         -------
         casted : type of caller
         """
-        if isinstance(dtype, collections.Mapping):
+        if is_dict_like(dtype):
             if self.ndim == 1:  # i.e. Series
                 if len(dtype) > 1 or list(dtype.keys())[0] != self.name:
                     raise KeyError('Only the Series name can be used for '
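
A short sketch of why the one-line change in this hunk matters: a `Series` is not registered as a `Mapping`, so the old `isinstance` check skipped the mapping branch, whereas `is_dict_like` only looks for dict-style methods. The sketch uses the public `pandas.api.types.is_dict_like` and `collections.abc.Mapping` (the modern location of `collections.Mapping`); the variable names are illustrative and not taken from the PR.

```python
from collections.abc import Mapping

import pandas as pd
from pandas.api.types import is_dict_like

dtype_map = pd.Series({'b': 'float32'})

# The old check: a Series is not a Mapping, so this branch was never taken.
series_is_mapping = isinstance(dtype_map, Mapping)

# The new check: a Series has keys(), __getitem__ and __contains__.
series_is_dict_like = is_dict_like(dtype_map)

# The Series branch of astype: the only permitted key is the Series name.
s = pd.Series([1, 2, 3], name='b')
converted = s.astype(dtype_map)

try:
    s.astype(pd.Series({'other': 'float32'}))
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(series_is_mapping, series_is_dict_like, converted.dtype, raised_key_error)
```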
48 changes: 48 additions & 0 deletions pandas/tests/frame/test_dtypes.py
@@ -487,6 +487,54 @@ def test_astype_dict(self):
         assert_frame_equal(df, equiv)
         assert_frame_equal(df, original)

+    def test_astype_Series(self):
+        # GH16717
Contributor

I would instead modify the previous test and use parametrize

Contributor

then just add the issue number as an additional comment

Contributor Author

Done as you suggested.

+        a = Series(date_range('2010-01-04', periods=5))
+        b = Series(range(5))
+        c = Series([0.0, 0.2, 0.4, 0.6, 0.8])
+        d = Series(['1.0', '2', '3.14', '4', '5.4'])
+        df = DataFrame({'a': a, 'b': b, 'c': c, 'd': d})
+        original = df.copy(deep=True)
+
+        # change type of a subset of columns
+        result = df.astype(Series({'b': 'str', 'd': 'float32'}))
+        expected = DataFrame({
+            'a': a,
+            'b': Series(['0', '1', '2', '3', '4']),
+            'c': c,
+            'd': Series([1.0, 2.0, 3.14, 4.0, 5.4], dtype='float32')})
+        assert_frame_equal(result, expected)
+        assert_frame_equal(df, original)
+
+        result = df.astype(Series(
+            {'b': np.float32,
+             'c': 'float32',
+             'd': np.float64}))
+        expected = DataFrame({
+            'a': a,
+            'b': Series([0.0, 1.0, 2.0, 3.0, 4.0], dtype='float32'),
+            'c': Series([0.0, 0.2, 0.4, 0.6, 0.8], dtype='float32'),
+            'd': Series([1.0, 2.0, 3.14, 4.0, 5.4], dtype='float64')})
+        assert_frame_equal(result, expected)
+        assert_frame_equal(df, original)
+
+        # change all columns
+        result = df.astype(Series({'a': str, 'b': str, 'c': str, 'd': str}))
+        assert_frame_equal(result, df.astype(str))
+        assert_frame_equal(df, original)
+
+        # error should be raised when using something other than column labels
+        # in the keys of the dtype dict
+        pytest.raises(KeyError, df.astype, Series({'b': str, 2: str}))
+        pytest.raises(KeyError, df.astype, Series({'e': str}))
+        assert_frame_equal(df, original)
+
+        # if the dtypes provided are the same as the original dtypes, the
+        # resulting DataFrame should be the same as the original DataFrame
+        equiv = df.astype(Series({col: df[col].dtype for col in df.columns}))
+        assert_frame_equal(df, equiv)
+        assert_frame_equal(df, original)
+
     def test_astype_duplicate_col(self):
         a1 = Series([1, 2, 3, 4, 5], name='a')
         b = Series([0.1, 0.2, 0.4, 0.6, 0.8], name='b')
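
The review comment asking for `parametrize` could be addressed roughly as sketched below. This is an illustration of the idea and not the code that was eventually merged: the test name `test_astype_dict_like`, the parameter name `dtype_class`, and the reduced two-column frame are all assumptions made for brevity.

```python
import numpy as np
import pandas as pd
import pytest
from pandas.testing import assert_frame_equal


# Run the same assertions once with a dict and once with a Series mapping.
@pytest.mark.parametrize("dtype_class", [dict, pd.Series])
def test_astype_dict_like(dtype_class):
    # GH16717
    df = pd.DataFrame({'b': range(5), 'c': [0.0, 0.2, 0.4, 0.6, 0.8]})
    original = df.copy(deep=True)

    # change the dtype of every column through the mapping
    result = df.astype(dtype_class({'b': np.float32, 'c': 'float32'}))
    expected = pd.DataFrame({
        'b': pd.Series([0.0, 1.0, 2.0, 3.0, 4.0], dtype='float32'),
        'c': pd.Series([0.0, 0.2, 0.4, 0.6, 0.8], dtype='float32')})
    assert_frame_equal(result, expected)
    assert_frame_equal(df, original)

    # keys that are not column labels should raise
    with pytest.raises(KeyError):
        df.astype(dtype_class({'e': 'float32'}))
    assert_frame_equal(df, original)
```

Parametrizing over the container type keeps one test body for both inputs, so the dict and Series paths cannot drift apart.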