Code Sample, a copy-pastable example if possible
import pandas as pd
# Create demo data for package delivery
df = pd.DataFrame({'package_id': [1, 1, 1, 2, 2, 3],
                   'status': ['Waiting', 'OnTheWay', 'Delivered', 'Waiting', 'OnTheWay', 'Waiting']})
# Status column: Make ordinal
delivery_status_type = pd.CategoricalDtype(categories=['Waiting', 'OnTheWay', 'Delivered'], ordered=True)
df['status'] = df['status'].astype(delivery_status_type)
# CALCULATE LAST-STATUS FOR EACH PACKAGE
# Way 1: (works OK)
# df['last_status'] = df.groupby('package_id')['status'].transform(lambda x: x.max())
# Way 2: Fails. Let's fix it.
df['last_status'] = df.groupby('package_id')['status'].transform(max)
df
Problem description
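The code above (Way 2) fails with the error below. Note that the traceback names the column 'delivery_status', whereas the sample above uses 'status'.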
AttributeError Traceback (most recent call last)
<ipython-input-26-174c91615d45> in <module>
----> 1 df['last_status'] = df.groupby('package_id')['delivery_status'].transform(max)
2 df
~\Miniconda3\lib\site-packages\pandas\core\groupby\generic.py in transform(self, func, *args, **kwargs)
1015 # cythonized aggregation and merge
1016 return self._transform_fast(
-> 1017 lambda: getattr(self, func)(*args, **kwargs), func
1018 )
1019
~\Miniconda3\lib\site-packages\pandas\core\groupby\generic.py in _transform_fast(self, func, func_nm)
1062 ids, _, ngroup = self.grouper.group_info
1063 cast = self._transform_should_cast(func_nm)
-> 1064 out = algorithms.take_1d(func()._values, ids)
1065 if cast:
1066 out = self._try_cast(out, self.obj)
AttributeError: 'Categorical' object has no attribute '_values'
Expected Output
Way 2 should run without error and produce the same result as Way 1:
a last_status column holding the highest-ranked status of each package.
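For reference, here is a minimal sketch of the result I expect, built with the Way 1 lambda from the sample above. The second variant, which takes the per-group maximum of the integer category codes and maps it back with `pd.Categorical.from_codes`, is only my assumption of a possible workaround and not a confirmed fix. The column name `last_status_from_codes` is made up for illustration.

```python
import pandas as pd

# Same demo data and ordered dtype as in the code sample above
status_type = pd.CategoricalDtype(
    categories=['Waiting', 'OnTheWay', 'Delivered'], ordered=True)
df = pd.DataFrame({
    'package_id': [1, 1, 1, 2, 2, 3],
    'status': ['Waiting', 'OnTheWay', 'Delivered',
               'Waiting', 'OnTheWay', 'Waiting'],
})
df['status'] = df['status'].astype(status_type)

# Way 1 from the report: a lambda sidesteps the failing call in Way 2
df['last_status'] = (
    df.groupby('package_id')['status'].transform(lambda x: x.max())
)

# Assumed alternative: take the per-group max of the integer codes,
# then rebuild an ordered categorical from those codes
codes = df['status'].cat.codes
max_codes = codes.groupby(df['package_id']).transform('max')
df['last_status_from_codes'] = pd.Categorical.from_codes(
    max_codes, dtype=status_type)

print(df)
```

Both columns should hold Delivered for package 1, OnTheWay for package 2 and Waiting for package 3.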
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 0.25.1
numpy : 1.16.5
pytz : 2019.3
dateutil : 2.8.0
pip : 19.2.3
setuptools : 41.4.0
Cython : None
pytest : 5.0.1
hypothesis : None
sphinx : 2.2.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None