Drop fails when supplied with an object that implements __getitem__ #11740

Open
max-sixty opened this issue Dec 2, 2015 · 6 comments
Labels: Bug, Nested Data (data where the values are collections: lists, sets, dicts, objects, etc.)

Comments

@max-sixty
Contributor

When you pass drop a single item that implements __getitem__, drop silently fails to drop it. It does drop the item when it is wrapped in a list.

That's because, instead of checking is_list_like, drop calls _index_labels_to_array, which returns an empty list when supplied with an object that implements __getitem__. (If you want the technicals: list(obj) == [] here, because __getitem__ used to be the way of implementing iteration and list() still falls back to it. is_list_like ignores that case, I think correctly.)

Should _index_labels_to_array instead call is_list_like?

In [182]:

import pandas as pd

class GetItemObj(object):

    def __init__(self, n):
        self.n = n

    def __getitem__(self, item):
        # Only the key 'a' is valid; anything else (including 0) raises IndexError
        if item == 'a':
            return self.n
        else:
            raise IndexError

    def __hash__(self):
        return self.n

    def __repr__(self):
        return 'Item: {}'.format(self.n)

In [183]:

index = pd.Index([GetItemObj(i) for i in range(2)])
index
Out[183]:
Index([Item: 0, Item: 1], dtype='object')

In [186]:

index.drop(index[0])
Out[186]:
Index([Item: 0, Item: 1], dtype='object')

In [187]:

index.drop([index[0]])
Out[187]:
Index([Item: 1], dtype='object')
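
The explanation above can be checked without pandas. This is a minimal sketch, not the pandas code path itself: the class name LegacyIterObj and the variable names are made up for illustration. It shows that an object with only __getitem__ is still accepted by list() through the legacy sequence protocol, while an __iter__-based check (roughly what is_list_like is described as doing) does not treat it as a list.

```python
class LegacyIterObj(object):
    """Hypothetical stand-in for GetItemObj: defines __getitem__ but not __iter__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, item):
        if item == 'a':
            return self.n
        raise IndexError


obj = LegacyIterObj(0)

# list() falls back to the legacy protocol: it calls obj[0], obj[1], ...
# until IndexError. Here obj[0] raises immediately, so nothing is collected.
as_list = list(obj)

# An __iter__-based check does not see the object as iterable at all.
has_iter = hasattr(obj, '__iter__')

print(as_list)   # []
print(has_iter)  # False
```

So a conversion that relies on list(obj) ends up with no labels to drop, whereas a check based on __iter__ would leave the object alone as a single label.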
@kawochen
Contributor

kawochen commented Dec 2, 2015

I'm not sure about putting iterables in an index. Does your proposed method solve the case when your object defines __iter__?

@max-sixty
Contributor Author

@kawochen have a look at is_list_like - it does almost exactly that: https://github.com/pydata/pandas/blob/master/pandas/core/common.py#L2329. The headline of my suggestion is that we consolidate & encapsulate the list checking there, rather than doing it differently in different places.

Definitely agree putting iterables / list like objects in an index is not a good idea.

@kawochen
Contributor

kawochen commented Dec 2, 2015

I think I did follow. But I think the support would still be partial. I was thinking of something like:

class DictLikeObj(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))

(I am against making it easy to put iterables in Index)

@max-sixty
Contributor Author

Apologies @kawochen, I'm not sure I understand. What's that class for?
My class above was a demonstration; I'm not suggesting that it goes anywhere in pandas.
I can do a quick PR - that may make my point clearer.

@kawochen
Contributor

kawochen commented Dec 2, 2015

@MaximilianR that is a class whose instances I think would still fail to be dropped after your proposed change, so some iterables would work, some wouldn't.
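
If I read this point correctly, it can be sketched as follows. Note the assumptions: looks_list_like is a simplified, hypothetical stand-in for an __iter__-based check and is not pandas' actual is_list_like, and the variable names are made up. A hashable dict subclass defines __iter__, so such a check would classify it as a collection of labels (its keys) rather than as one label.

```python
class DictLikeObj(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))


def looks_list_like(obj):
    # Hypothetical, simplified check: iterable via __iter__ and not a string.
    return hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes))


d = DictLikeObj(a=1)

# The object is hashable, so it can serve as a single label...
label_hash = hash(d)

# ...but it also passes the iterable check, so it would be unpacked
# into its keys instead of being treated as that single label.
treated_as_list = looks_list_like(d)
unpacked = list(d)

print(treated_as_list)  # True
print(unpacked)         # ['a']
```

Under that assumption, objects that rely only on __getitem__ would be handled as single labels, while hashable objects that define __iter__ would not, which is the partial support being described.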

@jreback
Contributor

jreback commented Dec 3, 2015

Yeah, .drop is slightly different; not really sure why. Always for cleaning / consolidating. You can try changing it and see what breaks (if anything).

And there are slight differences between the iterables & the checking, and it's very, very subtle (e.g. the __hash__ DOES matter).

jreback added the Indexing (related to indexing on series/frames, not to indexes themselves) and Clean labels Dec 3, 2015
jbrockmendel added the Nested Data (data where the values are collections: lists, sets, dicts, objects, etc.) label Sep 22, 2020
mroeschke added the Bug label and removed the Clean and Indexing labels Apr 21, 2021