Skip to content

Warn on automatic coercion to coordinate variables in Dataset constructor #8979

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions doc/whats-new.rst
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,13 @@ New Features
`create_index=False`. (:pull:`8960`)
By `Tom Nicholas <https://github.com/TomNicholas>`_.

Deprecations
~~~~~~~~~~~~
- Passing variables with the same name as their only dimension to the :py:class:`Dataset` constructor will now raise a `PendingDeprecationWarning`.
This is to deprecate the current behaviour of auto-promoting such variables to coordinates. To avoid the warning pass such variables explicitly via the `coords` kwarg.
(:pull:`8979`)
By `Tom Nicholas <https://github.com/TomNicholas>`_.

Breaking changes
~~~~~~~~~~~~~~~~
- The PyNIO backend has been deleted (:issue:`4491`, :pull:`7301`).
Expand Down
34 changes: 27 additions & 7 deletions xarray/core/dataset.py
Original file line number Diff line number Diff line change
Expand Up @@ -673,16 +673,28 @@ def __init__(
self,
# could make a VariableArgs to use more generally, and refine these
# categories
data_vars: DataVars | None = None,
vars: Mapping[Any, Any] | None = None,
/,
coords: Mapping[Any, Any] | None = None,
attrs: Mapping[Any, Any] | None = None,
*,
data_vars: DataVars | None = None,
) -> None:
if data_vars is None:
data_vars = {}

if vars is not None and data_vars is not None:
raise ValueError("Cannot pass both vars and data_vars together")

if data_vars is not None:
# TODO deprecate the vars argument by replacing it with data_vars
# Passing a dimension variable to data_vars will also trigger the PendingDeprecationWarning below about promotion to coordinates
vars = data_vars

if vars is None:
vars = {}
if coords is None:
coords = {}

both_data_and_coords = set(data_vars) & set(coords)
both_data_and_coords = set(vars) & set(coords)
if both_data_and_coords:
raise ValueError(
f"variables {both_data_and_coords!r} are found in both data_vars and coords"
Expand All @@ -691,9 +703,17 @@ def __init__(
if isinstance(coords, Dataset):
coords = coords._variables

variables, coord_names, dims, indexes, _ = merge_data_and_coords(
data_vars, coords
)
variables, coord_names, dims, indexes, _ = merge_data_and_coords(vars, coords)

vars_promoted_to_coords = [
var_name for var_name in vars if var_name in coord_names
]
if vars_promoted_to_coords:
warnings.warn(
f"Variables {vars_promoted_to_coords} were automatically promoted to coordinates despite not being passed via the coords argument. "
f"In future this behaviour will be deprecated, so to ensure these variables are assigned as coordinates at creation, please pass them explicity via `coords={vars_promoted_to_coords}`",
PendingDeprecationWarning,
)

self._attrs = dict(attrs) if attrs else None
self._close = None
Expand Down
23 changes: 23 additions & 0 deletions xarray/tests/test_dataset.py
Original file line number Diff line number Diff line change
Expand Up @@ -669,6 +669,29 @@ class CustomIndex(Index): ...
# test coordinate variables copied
assert ds.variables["x"] is not coords.variables["x"]

@pytest.mark.filterwarnings("error")
def test_constructor_warn_on_promotion_to_coordinates(self) -> None:
# see GH issue 8959

with pytest.warns(
PendingDeprecationWarning, match="were automatically promoted"
):
Dataset({"x": ("x", [0, 1])})

with pytest.warns(
PendingDeprecationWarning, match="were automatically promoted"
):
Dataset(data_vars={"x": ("x", [0, 1])})

with pytest.raises(
ValueError, match="Cannot pass both vars and data_vars together"
):
Dataset({"x": 0}, data_vars={"y": 0})

# no warning should be raised
ds = Dataset(coords={"x": ("x", [0, 1])})
assert "x" in ds.coords

@pytest.mark.filterwarnings("ignore:return type")
def test_properties(self) -> None:
ds = create_test_data()
Expand Down
Loading