This repository was archived by the owner on Apr 10, 2024. It is now read-only.
Simplifying indexing (DataFrame.__getitem__) #22
Open
Description
The rules for exactly what DataFrame.__getitem__/__setitem__ does (pandas-dev/pandas#9595) are sufficiently complex and inconsistent that they are impossible to understand without extensive experimentation.
This makes for a rather embarrassing situation that we really should fix for pandas 2.0.
I made a proposal when this came up last year:
- Indexing with a string or list of strings does label-based selection on columns.
- All other indexing is position-based, NumPy style. (This includes indexing with a boolean array.)
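
To make the two rules concrete, here is a minimal sketch of how the dispatch could look. The helper name `proposed_getitem` is hypothetical and is not part of pandas. It only illustrates the proposal using the existing `.loc` and `.iloc` accessors. Treating a boolean Series as a plain positional mask (ignoring its index) is my reading of "position based", not something the proposal spells out.

```python
import pandas as pd


def proposed_getitem(df, key):
    """Hypothetical helper sketching the proposed rule; not pandas API."""
    # Rule 1: a string or a list of strings selects columns by label.
    if isinstance(key, str):
        return df.loc[:, key]
    # An empty list falls through to the positional branch here;
    # the proposal does not say how it should be handled.
    if isinstance(key, list) and key and all(isinstance(k, str) for k in key):
        return df.loc[:, key]
    # Rule 2: everything else is positional, NumPy style.
    # Assumption: a Series key is used as a plain array, without
    # index alignment, so a boolean Series acts as a positional mask.
    if isinstance(key, pd.Series):
        key = key.to_numpy()
    return df.iloc[key]


df = pd.DataFrame({"foo": ["bar", "baz", "bar"], "n": [10, 20, 30]})

by_label = proposed_getitem(df, "foo")              # one column, by label
by_labels = proposed_getitem(df, ["foo", "n"])      # several columns, by label
by_mask = proposed_getitem(df, df["foo"] == "bar")  # rows, by boolean mask
by_slice = proposed_getitem(df, slice(0, 2))        # rows, by position
by_int = proposed_getitem(df, 1)                    # a single row, by position
```

Under this sketch an integer key selects a row, as it would on a 2-D NumPy array.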
I still like my proposal, but more importantly, it satisfies two important criteria:
- The most common uses of DataFrame indexing work unchanged (df['foo'], df[['foo', 'bar']], and df[df['foo'] == 'bar'] might cover 80% of use cases).
- It's short and simple, with no exceptions.