API/BUG: .apply will correctly infer output shape when axis=1 #18577

Merged
merged 12 commits into from
Feb 7, 2018
validate result_type kwarg
jreback committed Feb 7, 2018
commit ad9cbd95e7ba8866dcb81a7cdf50f88c682d1eac
4 changes: 4 additions & 0 deletions pandas/core/apply.py
@@ -39,6 +39,10 @@ def __init__(self, obj, func, broadcast, raw, reduce, result_type,
         self.args = args or ()
         self.kwds = kwds or {}

+        if result_type not in [None, 'reduce', 'broadcast', 'expand']:
+            raise ValueError("invalid value for result_type, must be one "
+                             "of {None, 'reduce', 'broadcast', 'expand'}")
+
         if broadcast is not None:
             warnings.warn("The broadcast argument is deprecated and will "
                           "be removed in a future version. You can specify "
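The check added in this hunk can be tried in isolation. The following is a minimal sketch, not the pandas implementation: `validate_result_type` and `ALLOWED_RESULT_TYPES` are hypothetical names introduced only for illustration, while the allowed values and the error message are copied from the diff.

```python
# Hypothetical stand-alone version of the validation added to __init__.
# The names below are assumptions for illustration; only the allowed
# values and the message text come from the diff.
ALLOWED_RESULT_TYPES = (None, 'reduce', 'broadcast', 'expand')


def validate_result_type(result_type):
    """Return result_type unchanged, or raise ValueError if it is not allowed."""
    if result_type not in ALLOWED_RESULT_TYPES:
        raise ValueError("invalid value for result_type, must be one "
                         "of {None, 'reduce', 'broadcast', 'expand'}")
    return result_type


# Collect the values that the check rejects.
rejected = []
for candidate in (None, 'reduce', 'broadcast', 'expand', 'foo', 1, 'infer'):
    try:
        validate_result_type(candidate)
    except ValueError:
        rejected.append(candidate)
```

Note that under this check `'infer'` is rejected as well, since it is not in the allowed list.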
2 changes: 1 addition & 1 deletion pandas/io/formats/style.py
@@ -510,7 +510,7 @@ def _apply(self, func, axis=0, subset=None, **kwargs):
         data = self.data.loc[subset]
         if axis is not None:
             result = data.apply(func, axis=axis,
Contributor Author

@TomAugspurger I had to add this workaround here to push a list-like result into a frame. We are breaking that behavior generally in this patch, but style relies on it here, so I am kind of OK with it (I still have to add some more tests). This comes from pandas/tests/io/formats/test_style.test_apply_axis, e.g.

In [5]: df = pd.DataFrame({'A': [0, 0], 'B': [1, 1]})
   ...: f = lambda x: ['val: {max}'.format(max=x.max()) for v in x]


In [6]: df
Out[6]: 
   A  B
0  0  1
1  0  1

In [7]: df.apply(f, axis=1)
Out[7]: 
0    [val: 1, val: 1]
1    [val: 1, val: 1]
dtype: object

In [8]: df.apply(f, axis=1, result_type='infer')
Out[8]: 
        A       B
0  val: 1  val: 1
1  val: 1  val: 1

I am not entirely sure why you need it this way.

Member

The snippet above is no longer correct: in the meantime we have chosen not to preserve the original column names when a list is returned, so this result will have column names [0, 1].

This has no impact on the style code?

-                                result_type='infer', **kwargs)
+                                result_type='expand', **kwargs)
             result.columns = data.columns
         else:
             result = func(data, **kwargs)
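The pattern used in this hunk (expand the list-like result, then restore the column labels) can be sketched outside of the style code. This is a minimal illustration, assuming a pandas version that includes this change; `label_row`, `as_series`, `expanded`, and `positional_columns` are names made up for the example, and the frame is the one from the review comment.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 0], 'B': [1, 1]})


def label_row(row):
    # Return a plain list with one label per element of the row.
    return ['val: {max}'.format(max=row.max()) for _ in row]


# Without result_type, list results are kept as a Series of lists.
as_series = df.apply(label_row, axis=1)

# With result_type='expand', each list is spread across columns.
expanded = df.apply(label_row, axis=1, result_type='expand')

# The expanded columns are labelled positionally, not with the
# original names, which is what the review comment points out.
positional_columns = list(expanded.columns)

# Restoring the labels by hand, as the style code does.
expanded.columns = df.columns
```

Assigning `expanded.columns` only works here because the function returns exactly one value per input column; a list of a different length would make the assignment fail.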
12 changes: 12 additions & 0 deletions pandas/tests/frame/test_apply.py
@@ -750,6 +750,18 @@ def test_result_type(self):
         expected.columns = columns
         assert_frame_equal(result, expected)

+    @pytest.mark.parametrize("result_type", ['foo', 1])
+    def test_result_type_error(self, result_type):
+        # allowed result_type
+        df = DataFrame(
+            np.tile(np.arange(3, dtype='int64'), 6).reshape(6, -1) + 1,
+            columns=['A', 'B', 'C'])
+
+        with pytest.raises(ValueError):
+            df.apply(lambda x: [1, 2, 3],
+                     axis=1,
+                     result_type=result_type)
+
     @pytest.mark.parametrize(
         "box",
         [lambda x: list(x),