ENH: Make _to_dataframe faster for extension array columns after pandas fix #8950

Open
@ilan-gold

Description

What is your issue?

Once pandas-dev/pandas#57676 is completed, we should be able to do the joins in the _to_dataframe method faster (we need to be able to handle the singleton case, which is the issue with pandas):

xarray/xarray/core/dataset.py

Lines 7170 to 7177 in 239309f

def _to_dataframe(self, ordered_dims: Mapping[Any, int]):
    columns = [k for k in self.variables if k not in self.dims]
    data = [
        self._variables[k].set_dims(ordered_dims).values.reshape(-1)
        for k in columns
    ]
    index = self.coords.to_index([*ordered_dims])
    return pd.DataFrame(dict(zip(columns, data)), index=index)

see discussion here
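
For context, here is a minimal sketch of why extension array columns need separate handling. It uses only pandas and NumPy, not xarray, and it is not the xarray implementation. The names `via_numpy`, `direct`, `numeric`, and `joined` are illustrative assumptions. It shows that a column routed through a NumPy array (as `.values.reshape(-1)` does above) loses a categorical dtype, while a column built directly from the extension array and then joined on a shared index keeps it.

```python
import numpy as np
import pandas as pd

# A small extension array (categorical) and a shared index.
cat = pd.Categorical(["a", "b", "a"])
index = pd.Index([0, 1, 2], name="x")

# Path 1: convert to a NumPy array first, similar in spirit to
# `.values.reshape(-1)`. The categorical dtype does not survive.
via_numpy = pd.DataFrame({"cat": np.asarray(cat).reshape(-1)}, index=index)

# Path 2: hand the extension array to pandas directly, so the dtype is kept.
direct = pd.DataFrame({"cat": cat}, index=index)

# NumPy-backed columns can be built as before, then joined with the
# extension array columns on the common index.
numeric = pd.DataFrame({"num": np.array([1.0, 2.0, 3.0])}, index=index)
joined = numeric.join(direct)

print(via_numpy.dtypes)
print(joined.dtypes)
```

The join is the step this issue hopes to speed up once the upstream pandas fix lands. The sketch only illustrates dtype preservation and says nothing about performance.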
