[BUG] loc does not preserve datatype with a single element #36247

@glyg

Opening a new issue as per @Dr-Irv's suggestion on #31763.
Sorry, the issue formatting got lost.

import pandas as pd
import numpy as np

not_preserved = pd.DataFrame(np.arange(24).reshape((8, 3)), columns=list("ABC"))
not_preserved['F'] = np.random.random(8)

print("Mixed datatype DataFrame")
print(not_preserved.dtypes)

print("Single value loc: ")
print(not_preserved.loc[0, ['A', 'B', 'C']].dtypes)

print("Single value through slice: ")
print(not_preserved.loc[0:0, ['A', 'B', 'C']].dtypes)

print("The columns are cast to float:")
print(not_preserved.loc[0, ['A', 'B', 'C']])

# This prints:
# Mixed datatype DataFrame
# A      int64
# B      int64
# C      int64
# F    float64
# dtype: object
# Single value loc: 
# float64
# Single value through slice: 
# A    int64
# B    int64
# C    int64
# dtype: object

# The columns are cast to float:
# A    0.0  
# B    1.0
# C    2.0

As a test:

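# A single cell keeps its integer type
assert isinstance(not_preserved.loc[0, "A"], np.int64)  # passes

# A one-row slice keeps the per-column dtype
assert not_preserved.loc[0:0, ["A", "B"]].dtypes["A"] == np.dtype("int64")  # passes

# A scalar row label returns a Series, and its single dtype is float64
assert not_preserved.loc[0, ["A", "B"]].dtype == np.dtype("int64")  # AssertionError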

This is with pandas 1.0.3, but it was already the case in 0.24.1. It seems that what changed was the behavior of a replace method downstream in my code.
Originally posted by @glyg in #31763 (comment)
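For anyone hitting the same thing, here is a minimal sketch of two possible workarounds. It is not an official recommendation. It assumes the upcast comes from the row being materialized as a Series across all columns (int64 and float64) before the column selection is applied, which is consistent with the output in the report but is not confirmed here. The names `df`, `cols`, `full_row`, `row_cols_first` and `row_via_slice` are illustrative only, and the integer dtype and random seed are pinned so the example behaves the same across platforms.

```python
import numpy as np
import pandas as pd

# Same layout as the report, with an explicit int64 dtype and a seeded
# float column (both are assumptions made for reproducibility).
df = pd.DataFrame(
    np.arange(24, dtype="int64").reshape((8, 3)), columns=list("ABC")
)
df["F"] = np.random.default_rng(0).random(8)

cols = ["A", "B", "C"]

# A full row spans int64 and float64 columns, so the resulting Series
# falls back to their common dtype, float64.
full_row = df.iloc[0]

# Workaround 1: narrow to the integer columns first, then take the row.
row_cols_first = df[cols].loc[0]

# Workaround 2: keep a one-row DataFrame via the slice, then take its only row.
row_via_slice = df.loc[0:0, cols].iloc[0]

print(full_row.dtype)
print(row_cols_first.dtype)
print(row_via_slice.dtype)
```

Both workarounds avoid building a row that mixes the integer columns with the float column, so the integer dtype has no reason to change.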
