Description
What happened:
When using the hue keyword in a scatter plot to color the points based on a string variable, the color assignment in the plot is wrong (whereas the legend is correct).
What you expected to happen:
In the example, data of category "A" ranges between 0 and 2 in the u-direction and between 0 and 0.5 in the v-direction. Points in that rectangle should be orange (the color for "A") but are currently blue.
Minimal Complete Verifiable Example:
import xarray as xr
import numpy as np

# Column 0 is category "B", column 1 is category "A".
u = np.random.rand(50, 2) * np.array([1, 2])
v = np.random.rand(50, 2) * np.array([1, 0.5])

ds = xr.Dataset(
    {
        "u": (("x", "category"), u),
        "v": (("x", "category"), v),
    },
    coords={"category": ["B", "A"]},
)

g = ds.plot.scatter(
    y="u",
    x="v",
    hue="category",
)
Anything else we need to know?:
I think that this might be related to sorting at some point. If the variable by which I color is sorted alphabetically (["A", "B"] instead of ["B", "A"]), the color assignment is correct.
Not sure if this issue is related to #4126, but it looks different to me (the problem is not the legend, but the colors in the plot itself).
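A possible workaround, based only on the observation above that alphabetically sorted categories are colored correctly, is to reorder the dataset along "category" before plotting. The sketch below uses Dataset.sortby for that. The fixed seed and the name ds_sorted are additions for illustration, and this does not address whatever causes the mismatch inside the plotting code.

```python
import numpy as np
import xarray as xr

# Same construction as the example above; the seed is only there for reproducibility.
rng = np.random.RandomState(0)
u = rng.rand(50, 2) * np.array([1, 2])
v = rng.rand(50, 2) * np.array([1, 0.5])

ds = xr.Dataset(
    {
        "u": (("x", "category"), u),
        "v": (("x", "category"), v),
    },
    coords={"category": ["B", "A"]},
)

# Reorder along "category" so the labels become alphabetical: ["A", "B"].
# The data is reordered together with its labels.
ds_sorted = ds.sortby("category")

# Sanity check: "A" still holds the values that were originally in column 1.
a = ds_sorted.sel(category="A")
print(list(ds_sorted["category"].values))
print(float(a["u"].max()) <= 2.0, float(a["v"].max()) <= 0.5)
```

Calling ds_sorted.plot.scatter(x="v", y="u", hue="category") on the reordered dataset should then give the coloring described above for the sorted case, if the sorting hypothesis holds.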
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-122-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.0
pandas: 1.1.2
numpy: 1.17.5
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.26.0
distributed: 2.26.0
matplotlib: 3.3.2
cartopy: None
seaborn: 0.11.0
numbagg: None
pint: None
setuptools: 49.6.0.post20200814
pip: 20.2.3
conda: 4.8.3
pytest: 6.0.1
IPython: 7.18.1
sphinx: None