BUG: NonConsolidatableBlock with ndim 1: 'take_nd' with len 1 gives wrong ndim #27785

Closed
@jorisvandenbossche


When doing an indexing operation on an EA column that results in a single row, the resulting SingleBlockManager / Block has an inconsistent internal state:

In [62]: df = pd.DataFrame({'a': pd.Series([1, 2, 3], dtype='Int64'), 'b': [1, 2, 3]})                                                                        

In [63]: df.loc[[0], 'a']._data                                                                                                                               
Out[63]: 
SingleBlockManager
Items: Int64Index([0], dtype='int64')
ExtensionBlock: slice(0, 1, 1), 1 x 1, dtype: Int64

In [64]: df.loc[[0], 'a']._data.ndim                                                                                                                          
Out[64]: 1

In [65]: df.loc[[0], 'a']._data._block.ndim                                                                                                                   
Out[65]: 2

In [66]: df.loc[[0], 'b']._data                                                                                                                               
Out[66]: 
SingleBlockManager
Items: Int64Index([0], dtype='int64')
IntBlock: 1 dtype: int64

In [67]: df.loc[[0], 'b']._data._block.ndim                                                                                                                   
Out[67]: 1

So df.loc[[0], 'a'] returns a Series that holds a 2D block, which leads to other bugs, although the problem is not directly visible in the repr (example: geopandas/geopandas#1078).

The reason for this is that Block.take_nd does not specify the ndim of the resulting new Block, and thus this is inferred in the NonConsolidatableBlock.__init__:

    # Maybe infer ndim from placement
    if ndim is None:
        if len(placement) != 1:
            ndim = 1
        else:
            ndim = 2

But len(placement) is 1 not only for a 2D block; it is also 1 for a 1D block whose values have length 1, so the inferred ndim is wrong in that case.
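To make the ambiguity concrete, here is a minimal pure-Python sketch of the inference quoted above. It is not the pandas source: `infer_ndim` and `take` are hypothetical names used only for illustration, and the `pass_ndim` flag merely models the idea of forwarding the known ndim instead of letting it be re-inferred.

```python
from typing import Optional, Sequence


def infer_ndim(placement: Sequence[int], ndim: Optional[int] = None) -> int:
    """Toy copy of the placement-based inference (hypothetical helper)."""
    if ndim is None:
        if len(placement) != 1:
            ndim = 1
        else:
            ndim = 2
    return ndim


def take(placement: Sequence[int], indexer: Sequence[int],
         ndim: int, pass_ndim: bool) -> int:
    """Hypothetical stand-in for a take on a block.

    Returns the ndim the new block would end up with, either inferred
    from the new placement or forwarded from the original block.
    """
    new_placement = [placement[i] for i in indexer]
    return infer_ndim(new_placement, ndim=ndim if pass_ndim else None)


# A 1D block with three values, as in the 'a' column above.
placement = range(3)

print(take(placement, [0, 1], ndim=1, pass_ndim=False))  # 1: inference happens to be right
print(take(placement, [0], ndim=1, pass_ndim=False))     # 2: single row is mistaken for a 2D block
print(take(placement, [0], ndim=1, pass_ndim=True))      # 1: forwarding ndim avoids the guess
```

In this toy model the wrong result only appears when the take selects exactly one element, which matches the df.loc[[0], 'a'] case above.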

This bug shows up with any EA, e.g. also Categorical. It seems to go back to pandas 0.23 and still manifests on master.


Labels: Bug; ExtensionArray (Extending pandas with custom dtypes or arrays); Internals (Related to non-user accessible pandas implementation)
