This is similar to #324, but the underlying problem seems to be a pandas issue.
Let's start with a filewise and a segmented table, both using a 'spk' scheme with the labels 'a' and 'b', and both containing two entries labeled as 'a'.
import audformat
db = audformat.Database('db')
db.schemes['spk'] = audformat.Scheme('str', labels=['a', 'b'])
index = audformat.filewise_index(['f1', 'f2'])
db['files'] = audformat.Table(index)
db['files']['spk'] = audformat.Column(scheme_id='spk')
db['files']['spk'].set(['a', 'a'])
db.schemes['label'] = audformat.Scheme('int')
index = audformat.segmented_index(['f1', 'f1'], [0, 1], [1, 2])
db['segments'] = audformat.Table(index)
db['segments']['spk'] = audformat.Column(scheme_id='spk')
db['segments']['spk'].set(['a', 'a'])

The following behaves as expected:
>>> df = db['files'].get()
>>> df.spk.cat.categories
Index(['a', 'b'], dtype='object')
>>> df.loc['f1', 'spk'] = 'c'
...
TypeError: Cannot setitem on a Categorical with a new category (c), set the categories first
>>> df.iloc[0, 0] = 'c'
...
TypeError: Cannot setitem on a Categorical with a new category (c), set the categories first

>>> df = db['segments'].get()
>>> df.spk.cat.categories
Index(['a', 'b'], dtype='object')
>>> df.loc[audformat.segmented_index(['f1'], [0], [1]), 'spk'] = 'c'
...
TypeError: Cannot setitem on a Categorical with a new category (c), set the categories first
>>> df.iloc[0, 0] = 'c'
...
TypeError: Cannot setitem on a Categorical with a new category (c), set the categories first

But we can still force a forbidden label into the column, and lose the CategoricalDtype in the process, by assigning to several values at once:
>>> df = db['files'].get()
>>> df.loc[:, 'spk'] = 'c'
>>> df.spk.cat.categories
...
AttributeError: Can only use .cat accessor with a 'category' dtype
>>> df
     spk
file
f1     c
f2     c

>>> df = db['segments'].get()
>>> df.loc[:, 'spk'] = 'c'
>>> df.spk.cat.categories
...
AttributeError: Can only use .cat accessor with a 'category' dtype
>>> df
                                     spk
file start           end
f1   0 days 00:00:00 0 days 00:00:01   c
     0 days 00:00:01 0 days 00:00:02   c

I'm not sure yet if this is considered a feature or a bug in pandas.
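To check whether the behavior comes from pandas alone, the filewise case can be rebuilt without audformat. The following is only a sketch: `make_frame`, `try_assignment`, `single` and `whole_column` are hypothetical helper names, and the frame is a hand-built stand-in for `db['files'].get()`. The outcome of the whole-column assignment may differ between pandas versions, so the script reports what happens instead of assuming a result.

```python
import pandas as pd


def make_frame():
    # Stand-in for db['files'].get(): one categorical column
    # restricted to the labels 'a' and 'b'.
    dtype = pd.CategoricalDtype(categories=['a', 'b'])
    index = pd.Index(['f1', 'f2'], name='file')
    values = pd.Categorical(['a', 'a'], dtype=dtype)
    return pd.DataFrame({'spk': values}, index=index)


def try_assignment(assign):
    """Apply assign(df) to a fresh frame and describe the outcome."""
    df = make_frame()
    try:
        assign(df)
    except (TypeError, ValueError) as ex:
        return f'raised {type(ex).__name__}'
    return f'dtype is now {df["spk"].dtype}'


def single(df):
    # Address one cell with a label that is not a category.
    df.loc['f1', 'spk'] = 'c'


def whole_column(df):
    # Address all rows of the column at once.
    df.loc[:, 'spk'] = 'c'


print('pandas', pd.__version__)
print('single value:', try_assignment(single))
print('whole column:', try_assignment(whole_column))
```

If the whole-column line reports a dtype other than `category`, the problem is reproducible in pandas itself for the installed version.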
No upstream issue matches this directly, but two are related: pandas-dev/pandas#46820 and pandas-dev/pandas#40080.
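Until the pandas side is settled, one possible workaround on the user side is to validate the label before assigning and to build the replacement with the column's own dtype. This is a minimal sketch, not something audformat provides: `set_labels` is a hypothetical helper, and it only covers the case of setting every row to the same value.

```python
import pandas as pd


def set_labels(df, column, value):
    """Set every row of a categorical column to value, keeping its dtype."""
    dtype = df[column].dtype
    if not isinstance(dtype, pd.CategoricalDtype):
        raise TypeError(f"column '{column}' is not categorical")
    if value not in dtype.categories:
        raise ValueError(
            f"'{value}' is not one of {list(dtype.categories)}"
        )
    # Build the new values with the original dtype,
    # so pandas has no reason to cast the column.
    df[column] = pd.Categorical([value] * len(df), dtype=dtype)
    return df


# Hand-built example frame with the labels 'a' and 'b'.
dtype = pd.CategoricalDtype(categories=['a', 'b'])
df = pd.DataFrame(
    {'spk': pd.Categorical(['a', 'a'], dtype=dtype)},
    index=pd.Index(['f1', 'f2'], name='file'),
)
set_labels(df, 'spk', 'b')    # allowed label, dtype stays categorical
# set_labels(df, 'spk', 'c')  # would raise ValueError
```

The explicit membership check replaces the protection that is lost when several values are addressed at once.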