Closed
Description
Trying to remove a nan category from a Categorical series fails if the categories are floats.
In the docs it says:
Note: As integer Series can’t include NaN, the categories were converted to object.
So the failure is probably linked to this: with floats, the categories remain float64, and nan != nan.
If this is intended behavior, perhaps it would be useful to add this to the docs?
import pandas as pd
df = pd.DataFrame({'a': pd.Categorical([1,2,3]),
                   'b': pd.Categorical(list('abc')),
                   'c': pd.Categorical([1.1,2.1,3.1])})
for col in df.columns:
    # add NaN as a category, show the result, then try to remove it again
    df[col].cat.add_categories(pd.np.nan, inplace=True)
    print(df[col])
    df[col].cat.remove_categories(pd.np.nan)
0 1
1 2
2 3
Name: a, dtype: category
Categories (4, object): [1, 2, 3, NaN]
0 a
1 b
2 c
Name: b, dtype: category
Categories (4, object): [a, b, c, NaN]
0 1.1
1 2.1
2 3.1
Name: c, dtype: category
Categories (4, float64): [1.1, 2.1, 3.1, NaN]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
d:\Anaconda\envs\py2k\lib\site-packages\pandas\core\categorical.pyc in _delegate_method(self, name, *args, **kwargs)
1643 from pandas import Series
1644 method = getattr(self.categorical, name)
-> 1645 res = method(*args, **kwargs)
1646 if not res is None:
1647 return Series(res, index=self.index)
d:\Anaconda\envs\py2k\lib\site-packages\pandas\core\categorical.pyc in remove_categories(self, removals, inplace)
753 not_included = removals - set(self._categories)
754 if len(not_included) != 0:
--> 755 raise ValueError("removals must all be in old categories: %s" % str(not_included))
756 new_categories = [ c for c in self._categories if c not in removals ]
757 return self.set_categories(new_categories, ordered=self.ordered, rename=False,
ValueError: removals must all be in old categories: set([nan])
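The check at line 753 of the traceback above is a set difference, which may explain why only the float column fails. Below is a minimal sketch of that effect in plain Python, not the pandas implementation. It assumes that iterating float64 categories yields a fresh float object for each element; the standard library's array('d') is used here as a stand-in for that storage. The helper remove_nan_aware is hypothetical and only illustrates one possible NaN-aware comparison.

```python
import array
import math

nan = float("nan")

# Object-style storage: the list holds a reference to the very same nan object.
object_categories = [1, 2, 3, nan]
# Float-style storage: array('d') stores raw doubles and builds a new float
# object on every access, so the nan that comes back is a different object.
float_categories = array.array("d", [1.1, 2.1, 3.1, nan])

removals = {nan}

# Same shape as the check in the traceback: removals - set(categories).
# Set lookup tries identity first, then ==, and nan == nan is False.
missing_obj = removals - set(object_categories)
missing_float = removals - set(float_categories)

print(len(missing_obj))    # 0: nan found by identity
print(len(missing_float))  # 1: nan reported as "not in old categories"


def is_nan(value):
    """Return True only for float NaN values; non-floats are never NaN."""
    return isinstance(value, float) and math.isnan(value)


def remove_nan_aware(categories, removals):
    """Hypothetical helper: drop removals from categories, matching NaN via isnan."""
    drop_nan = any(is_nan(r) for r in removals)
    plain = [r for r in removals if not is_nan(r)]
    return [c for c in categories
            if not (c in plain or (drop_nan and is_nan(c)))]


print(remove_nan_aware(float_categories, removals))  # [1.1, 2.1, 3.1]
```

If the assumption holds, this would also be consistent with columns a and b succeeding: their categories are object dtype, so the same nan object is stored and found by identity.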
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.9.final.0
python-bits: 64
pandas: 0.16.1
numpy: 1.9.2
...