GroupBy.transform bug/API change #8046

Closed
@wesm

Hi folks,

A Python for Data Analysis reader noted the following issue with recent versions of pandas (as of 1 year ago):

import pandas as pd
from pandas import DataFrame
import numpy as np

def demean(arr):
    return arr - arr.mean()

people = DataFrame(np.random.randn(5, 5),
                   columns=['a', 'b', 'c', 'd', 'e'],
                   index=['Joe', 'Steve', 'Wes', 'Jim', 'Travis'])
key = ['one', 'two', 'one', 'two', 'one']

On pandas 0.14.1, the group means after transform are not zero:

In [14]: people.groupby(key).transform(demean).groupby(key).mean()
Out[14]: 
            a         b         c         d         e
one -0.228006  0.246737  0.201117  0.250544  0.273858
two  0.342009 -0.370106 -0.301676 -0.375816 -0.410788

On the other hand, apply gives group means that are zero up to floating-point error:

In [15]: people.groupby(key).apply(demean).groupby(key).mean()
Out[15]: 
                a             b             c             d             e
one -3.700743e-17  7.401487e-17 -7.401487e-17  7.401487e-17  0.000000e+00
two  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  5.551115e-17

Looks like transform has undergone some work in recent times; any ideas? I need to look at the book text and see if I can triage by replacing transform with apply. At this point transform feels a little bit anachronistic.
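For anyone who wants to check their own install, here is a minimal sketch of a sanity test, not a fix. It assumes a pandas version where `transform` accepts both a callable and the string `'mean'`. The seeded generator and the names `demeaned`, `explicit`, and `group_means` are my own additions for illustration. The idea is that demeaning within groups should leave every group mean at zero up to floating-point error, and should match subtracting the broadcast group means directly.

```python
import numpy as np
import pandas as pd

# Seeded data so the check is repeatable (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
people = pd.DataFrame(rng.standard_normal((5, 5)),
                      columns=['a', 'b', 'c', 'd', 'e'],
                      index=['Joe', 'Steve', 'Wes', 'Jim', 'Travis'])
key = ['one', 'two', 'one', 'two', 'one']


def demean(arr):
    # Subtract the mean of whatever chunk the groupby machinery passes in.
    return arr - arr.mean()


# Group-wise demeaning through transform, as in the report above.
demeaned = people.groupby(key).transform(demean)

# Equivalent formulation: broadcast each group's mean back to the
# original rows, then subtract it from the original frame.
explicit = people - people.groupby(key).transform('mean')

# If transform behaves as intended, these are all ~0.
group_means = demeaned.groupby(key).mean()

print(np.allclose(group_means, 0.0))
print(np.allclose(demeaned, explicit))
```

If either check prints `False`, the installed version is likely showing the behaviour described above, and the `explicit` formulation may be a usable workaround since it does not pass a user function to `transform`.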
