In #4622 @toddrjen points out that xr.testing.assert_equal does not test for the dtype, only for the value. Therefore the following does not raise an error:
import numpy as np
import xarray as xr
import pandas as pd
xr.testing.assert_equal(
    xr.DataArray(np.array(1, dtype=int)), xr.DataArray(np.array(1, dtype=float))
)
xr.testing.assert_equal(
    xr.DataArray(np.array(1, dtype=int)), xr.DataArray(np.array(1, dtype=object))
)
xr.testing.assert_equal(
    xr.DataArray(np.array("a", dtype=str)), xr.DataArray(np.array("a", dtype=object))
)
This comes back to numpy, i.e. the following is True:
np.array(1, dtype=int) == np.array(1, dtype=float)
Depending on the situation, either behavior may be desirable. Thus, I would suggest adding a check_dtype argument to xr.testing.assert_equal and also to DataArray.equals (and Dataset and Variable and identical). I have not seen such an option in numpy, but pandas has it (e.g. pd.testing.assert_series_equal(left, right, check_dtype=True, ...)). I would not change __eq__.
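To make the proposal concrete, here is a minimal sketch of how such a check could behave for DataArray (or Variable) inputs. The helper name assert_equal_with_dtype is hypothetical and not part of xarray; it simply wraps the existing xr.testing.assert_equal and adds a dtype comparison on top. It does not handle Dataset, which has no single dtype.

```python
import numpy as np
import xarray as xr


def assert_equal_with_dtype(a, b, check_dtype=True):
    """Hypothetical helper: value equality plus an optional dtype check."""
    # Value comparison, delegated to the existing xarray function.
    xr.testing.assert_equal(a, b)
    # Extra step suggested in this issue: compare the dtypes of the data.
    if check_dtype and a.dtype != b.dtype:
        raise AssertionError(f"dtypes differ: {a.dtype} != {b.dtype}")


int_da = xr.DataArray(np.array(1, dtype=int))
float_da = xr.DataArray(np.array(1, dtype=float))

# Passes: only the values are compared.
assert_equal_with_dtype(int_da, float_da, check_dtype=False)

# Raises AssertionError: the values match, the dtypes do not.
try:
    assert_equal_with_dtype(int_da, float_da, check_dtype=True)
except AssertionError as err:
    print(err)
```

Whether check_dtype should default to True, as in this sketch, is exactly the open question raised here.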
- Thoughts?
- What should the default be? We could try True first and see how many failures this creates?
- What to do with coords and indexes? pd.testing.assert_series_equal has a check_index_type keyword. Probably we need check_coords_type as well? This makes the whole thing much more complicated... See also "Coordinate dtype changing to object after xr.concat" (#4543).
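For the coords question, a rough sketch of what a coordinate dtype check might need to inspect. The helper coord_dtype_mismatches is hypothetical (it is not an xarray function, and check_coords_type does not exist yet); it only reports the coordinates shared by two objects whose dtypes differ, and leaves the value comparison to the existing functions.

```python
import numpy as np
import xarray as xr


def coord_dtype_mismatches(a, b):
    """Hypothetical helper: map coordinate name -> (dtype in a, dtype in b)
    for every coordinate present in both objects whose dtype differs."""
    shared = [name for name in a.coords if name in b.coords]
    return {
        name: (a.coords[name].dtype, b.coords[name].dtype)
        for name in shared
        if a.coords[name].dtype != b.coords[name].dtype
    }


# Same data; the "x" coordinate is integer in one object and float in the other.
a = xr.DataArray([10, 20], dims="x", coords={"x": np.array([0, 1], dtype="int64")})
b = xr.DataArray([10, 20], dims="x", coords={"x": np.array([0.0, 1.0], dtype="float64")})

print(coord_dtype_mismatches(a, b))  # reports the "x" coordinate
print(coord_dtype_mismatches(a, a))  # empty dict: nothing differs
```

A real implementation would also have to decide how to treat index objects, not just coordinate dtypes, which is part of the added complexity mentioned above.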