Dataset.to_dataframe() dimension order is not alphabetically sorted by default #9653

Closed
@mgunyho

Description

What happened?

Hi, I noticed that the documentation for Dataset.to_dataframe() says that "by default, dimensions are sorted alphabetically". This is in contrast with DataArray.to_dataframe(), where the order is given by the order of the dimensions in the DataArray, which was discussed in this comment.

However, it appears that Dataset.to_dataframe() does not in fact sort the dimensions alphabetically, as this example on current main (8f6e45b) shows:

import xarray as xr

# "foo" is declared with its dimensions in the order ("y", "x").
ds = xr.Dataset({
    "foo": xr.DataArray(0, coords=[("y", [1, 2, 3]), ("x", [4, 5, 6])]),
})
print(ds.to_dataframe())

I get

     foo
y x     
1 4    0
  5    0
  6    0
2 4    0
  5    0
  6    0
3 4    0
  5    0
  6    0

What did you expect to happen?

The dimensions in the output should be sorted alphabetically, like this:

     foo
x y     
4 1    0
  2    0
  3    0
5 1    0
  2    0
  3    0
6 1    0
  2    0
  3    0
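For comparison, the alphabetical ordering shown above can be requested explicitly. This is only a minimal sketch of a possible workaround, assuming the `dim_order` parameter of `Dataset.to_dataframe()` and that sorting the dimension names with `sorted()` is what "alphabetically" is meant to describe; the variable name `alphabetical` is mine.

```python
import xarray as xr

# Same dataset as in the report: "foo" has dimensions ("y", "x").
ds = xr.Dataset({
    "foo": xr.DataArray(0, coords=[("y", [1, 2, 3]), ("x", [4, 5, 6])]),
})

# Build the dimension order explicitly instead of relying on the default.
# ds.sizes maps dimension name -> length; sorted() returns the sorted names.
alphabetical = sorted(ds.sizes)

# dim_order sets the order of the MultiIndex levels in the resulting DataFrame.
df = ds.to_dataframe(dim_order=alphabetical)
print(df.index.names)
```

With this, the index levels come out as x, then y, regardless of what the default turns out to be.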

Minimal Complete Verifiable Example

See above

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.7 (main, Oct 1 2024, 00:00:00) [GCC 14.2.1 20240912 (Red Hat 14.2.1-3)]
python-bits: 64
OS: Linux
OS-release: 6.11.3-200.fc40.x86_64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.9.1.dev73+g8f6e45ba
pandas: 2.2.3
numpy: 1.26.4
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.0.3
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
