Dataset constructor always coerces 1D data variables with same name as dim to coordinates #8959

Open
@TomNicholas

What is your issue?

Whilst xarray's data model appears to allow 1D data variables that have the same name as their dimension, it seems to be impossible to actually create one using the Dataset constructor, because such variables are always converted to coordinate variables instead.

We can create a 1D data variable with the same name as its dimension like this:

In [9]: ds = xr.Dataset({'x': 0})

In [10]: ds
Out[10]: 
<xarray.Dataset> Size: 8B
Dimensions:  ()
Data variables:
    x        int64 8B 0

In [11]: ds.expand_dims('x')
Out[11]: 
<xarray.Dataset> Size: 8B
Dimensions:  (x: 1)
Dimensions without coordinates: x
Data variables:
    x        (x) int64 8B 0

so it seems to be a valid part of the data model.

But I can't get to that situation from the Dataset constructor. This should create the same dataset:

In [15]: ds = xr.Dataset(data_vars={'x': ('x', [0])})

In [16]: ds
Out[16]: 
<xarray.Dataset> Size: 8B
Dimensions:  (x: 1)
Coordinates:
  * x        (x) int64 8B 0
Data variables:
    *empty*

But actually it makes x a coordinate variable (and implicitly creates a pandas Index for it). This means that in this case there is no difference between using the data_vars and coords kwargs to the constructor:

In [17]: ds = xr.Dataset(coords={'x': ('x', [0])})

In [18]: ds
Out[18]: 
<xarray.Dataset> Size: 8B
Dimensions:  (x: 1)
Coordinates:
  * x        (x) int64 8B 0
Data variables:
    *empty*
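
For anyone who wants to check this programmatically, here is a minimal sketch that compares the three construction routes above. It assumes an xarray version that behaves like the one used for the outputs in this issue. The `classify` helper is hypothetical and only there for illustration.

```python
import xarray as xr

# Route 1: scalar data variable, then expand along a new dimension named "x".
via_expand = xr.Dataset({"x": 0}).expand_dims("x")

# Route 2: pass the 1D variable to the constructor as a data variable.
via_data_vars = xr.Dataset(data_vars={"x": ("x", [0])})

# Route 3: pass the same 1D variable to the constructor as a coordinate.
via_coords = xr.Dataset(coords={"x": ("x", [0])})


def classify(ds, name="x"):
    """Hypothetical helper: report whether `name` is a data variable or a coordinate."""
    return {"data_var": name in ds.data_vars, "coord": name in ds.coords}


for label, ds in [
    ("expand_dims", via_expand),
    ("data_vars kwarg", via_data_vars),
    ("coords kwarg", via_coords),
]:
    print(label, classify(ds))

# Whether the two constructor routes produce indistinguishable datasets.
print("constructor routes identical:", via_data_vars.identical(via_coords))
```

If the behaviour matches the outputs above, only the `expand_dims` route reports `x` as a data variable, and the two constructor routes compare as identical.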

This all seems weird to me. I would have thought that if a 1D data variable with the same name as its dimension is allowed, the constructor shouldn't coerce it into a coordinate variable. If anything, that's actively misleading.

Note that whilst this came up in the context of trying to avoid auto-creation of 1D indexes for coordinate variables, this issue is actually separate. (xref #8872 (comment))

cc @benbovy who probably has thoughts
