Basic examples for creating data structures fail type-checking #6576

Closed
@rsokl

Description

What happened?

The examples provided by this documentation reveal issues with the type annotations for DataArray and Dataset. Running mypy and pyright on these basic use cases, only slightly modified, produces type-checking errors.

What did you expect to happen?

The annotations for these classes should accommodate these common use-cases.

Minimal Complete Verifiable Example

# run mypy or pyright on the following file to reproduce the errors

import numpy as np
import xarray as xr
import pandas as pd

data = np.random.rand(4, 3)
locs = ["IA", "IL", "IN"]
times = pd.date_range("2000-01-01", periods=4)

foo = xr.DataArray(
    data,
    coords=[times, locs],  # error: List item 1 has incompatible type "List[str]"; expected "Tuple[Any, ...]"
    dims=["time", "space"],
)


temp = 15 + 8 * np.random.randn(2, 2, 3)
precip = 10 * np.random.rand(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]

A = {
    "temperature": (["x", "y", "time"], temp), 
    "precipitation": (["x", "y", "time"], precip),
}

C = {
    "lon": (["x", "y"], lon),
    "lat": (["x", "y"], lat),
    "time": pd.date_range("2014-09-06", periods=3),
    "reference_time": pd.Timestamp("2014-09-05"),
}

ds = xr.Dataset(
    A, # error: Argument 1 to "Dataset" has incompatible type "Dict[str, Tuple[List[str], Any]]"; expected "Optional[Mapping[Hashable, Any]]"
    coords=C,  # error: Argument "coords" to "Dataset" has incompatible type "Dict[str, Any]"; expected "Optional[Mapping[Hashable, Any]]"
)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

Some of these errors are circumvented when one provides a literal inline and thus exploits bidirectional inference, which may be why the mypy tests currently run in your CI miss them.

E.g.

from typing import Dict, Hashable, Any

def f(x: Dict[Hashable, Any]): ...

f({"hi": 1})  # this is ok -- uses bidirectional inference to see Dict[Hashable, Any]

x = {"hi": 1}
f(x)  # error: Dict is invariant in its key type, so Dict[str, int] is incompatible with Dict[Hashable, Any]
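
As a user-side workaround, the variable can be annotated where it is defined, so the checker sees the expected type before it infers Dict[str, int]. The sketch below uses a hypothetical function, count_keys, as a stand-in for an API annotated with Mapping[Hashable, Any]; it is not part of xarray.

```python
from typing import Any, Hashable, Mapping


def count_keys(x: Mapping[Hashable, Any]) -> int:
    # Hypothetical stand-in for an API whose argument is annotated
    # Mapping[Hashable, Any], like the Dataset arguments in the errors above.
    return len(x)


# Inferred as Dict[str, int]. A type checker is expected to reject
# count_keys(inferred), because Mapping is invariant in its key type.
inferred = {"hi": 1}

# The explicit annotation supplies the expected type up front, so the
# literal is checked directly against Mapping[Hashable, Any].
annotated: Mapping[Hashable, Any] = {"hi": 1}

n = count_keys(annotated)
```

Both calls behave the same at runtime; only the static check differs.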

This is a sticky situation, as the key type is invariant even in Mapping: python/typing#445. IMHO it would be great to tweak these annotations, e.g. Hashable -> Hashable | str | <other common coord types>, to ensure that users don't face such false positives.
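
One possible reading of that suggestion, sketched under assumptions: because the key type is invariant, widening has to happen as a union over whole mapping types, not over the key type alone. The alias KeyedMapping and the function describe are hypothetical names used only for illustration, and this is not necessarily the change xarray would make.

```python
from typing import Any, Hashable, List, Mapping, Union

# Hypothetical alias: a union of mapping types. Mapping[Hashable | str, Any]
# would still be invariant in its key, so each key type gets its own member.
KeyedMapping = Union[Mapping[Hashable, Any], Mapping[str, Any]]


def describe(x: KeyedMapping) -> List[str]:
    # Return the keys as sorted strings, whatever their original type.
    return sorted(str(k) for k in x)


# Inferred as Dict[str, int]; this should match the Mapping[str, Any]
# member of the union, so no explicit annotation is needed at the call site.
x = {"hi": 1, "bye": 2}
names = describe(x)
```

The trade-off is that every additional common key type needs its own member in the union.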

Environment

INSTALLED VERSIONS

commit: None
python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18)
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-153-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 0.19.0
pandas: 1.3.3
numpy: 1.20.3
scipy: 1.7.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: None
distributed: None
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 59.5.0
pip: 21.3
conda: None
pytest: 6.2.5
IPython: 7.28.0
sphinx: 4.5.0
