Iterating over a Dataset iterates only over its data_vars

This has been a small but persistent issue for me for a while. I suspect my perspective is shaped by my current use case, but I'm raising it here to test whether it holds more generally.

Currently `Dataset.keys()` returns both variables and coordinates (but not its `attrs` keys):

``` python
In [1]: import numpy as np
In [2]: import xarray as xr
In [5]: ds = xr.Dataset({'a': (('x', 'y'), np.random.rand(10, 2))})
In [12]: list(ds.keys())
Out[12]: ['a', 'x', 'y']
```

Is this conceptually correct? I would posit that a Dataset is a mapping of _keys to variables_, and the coordinates contain values that _label_ that data. 

**So should `Dataset.keys()` instead return just the keys of the Variables?**

We often pass a dataset around as a `Mapping` of keys to values, but when we then run a function over each of the keys, it runs on both the Variables' keys _and_ the Coordinates' (labels') keys.

In pandas, `DataFrame.keys()` returns just the columns, which is what we need. While I think the xarray design is in general much better in these areas, this is one area that pandas seems to get right. Because of the inconsistency between pandas and xarray, we're having to coerce our objects to pandas `DataFrame`s before passing them off to functions that pull out their keys. (This is also why we can't just look at `ds.data_vars.keys()`: doing so breaks that duck typing.)
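To make the duck-typing point concrete, here is a minimal sketch in plain Python. `ToyDataset`, its `keys_include_coords` flag, and the `summarize` helper are hypothetical names made up for illustration; this is not xarray's implementation, only a toy `Mapping` that can behave either way. It shows how a generic function that assumes only the `Mapping` protocol ends up touching coordinates under the current behaviour, and touches only the data variables under the proposed one.

``` python
from collections.abc import Mapping


class ToyDataset(Mapping):
    """Hypothetical stand-in for a dataset; not xarray's implementation."""

    def __init__(self, data_vars, coords, keys_include_coords):
        self._data_vars = dict(data_vars)
        self._coords = dict(coords)
        self._keys_include_coords = keys_include_coords

    def _visible(self):
        # Coordinates are merged in only when they count as keys.
        if self._keys_include_coords:
            return {**self._data_vars, **self._coords}
        return self._data_vars

    def __getitem__(self, key):
        return self._visible()[key]

    def __iter__(self):
        return iter(self._visible())

    def __len__(self):
        return len(self._visible())


def summarize(mapping):
    """Generic helper that assumes nothing beyond the Mapping protocol."""
    return {key: len(mapping[key]) for key in mapping.keys()}


data_vars = {"a": [0.1, 0.2, 0.3]}
coords = {"x": [0, 1, 2]}

current = ToyDataset(data_vars, coords, keys_include_coords=True)
proposed = ToyDataset(data_vars, coords, keys_include_coords=False)

print(summarize(current))   # {'a': 3, 'x': 3}
print(summarize(proposed))  # {'a': 3}
```

With the second behaviour, `summarize` can take a plain `dict`, a pandas `DataFrame`, or the dataset without the caller having to special-case coordinates.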

Does that make sense?


Iterating over a Dataset iterates only over its data_vars #884

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Iterating over a Dataset iterates only over its data_vars #884

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions