Skip to content

API: creating DataFrame with no columns: object vs string dtype columns? #60338

Open
@jorisvandenbossche

Description

A typical case we encounter in the tests is starting from an empty DataFrame, and then adding some columns.

Simplied example of this pattern:

df = pd.DataFrame()
df["a"] = values
...

The dataframe starts with an empty Index columns, and the default dtype for an empty Index is object dtype. And then inserting string labels for the actual columns into that Index object, preserves the object dtype.

As long as we used object dtype for string column names, this was perfectly fine. But now that we will infer str dtype for actual string column names, it gets a bit annoying that the pattern above does not result in str but object colums.

This is not the best pattern, so maybe it's OK this does not give the ideal result. But at the same since we even use it quite regularly in our own tests, I suppose this is not that uncommon.

Metadata

Assignees

No one assigned

    Labels

    API DesignIndexRelated to the Index class or subclassesStringsString extension data type and string data

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions