Commit 7c3e71d

GroupBy(multiple variables)
1 parent 231c2f3 commit 7c3e71d

1 file changed: +125 -0 lines changed

src/posts/multiple-groupers/index.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
---
title: 'Grouping by multiple arrays with Xarray'
date: '2023-07-18'
authors:
  - name: Deepak Cherian
    github: dcherian

summary: 'Xarray finally supports grouping by multiple arrays. 🎉'
---

## TLDR

Xarray now supports grouping by multiple variables ([docs](https://docs.xarray.dev/en/latest/user-guide/groupby.html#grouping-by-multiple-variables)). 🎉 😱 🤯 🥳 Try it out!

## How do I use it?

Install `xarray>=2024.08.0` and, optionally, [flox](https://flox.readthedocs.io/en/latest/) for better performance with reductions.

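If you are unsure which version is installed, a quick check along these lines may help. This is only a sketch: `parse_release` and `supports_multiple_groupers` are hypothetical helper names, and the comparison assumes Xarray's `YYYY.MM.patch` version scheme.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum release taken from the `xarray>=2024.08.0` requirement.
MINIMUM = (2024, 8, 0)


def parse_release(text):
    """Extract the leading (year, month, patch) integers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_multiple_groupers(installed):
    """True if the version string is at least the minimum release (hypothetical helper)."""
    return parse_release(installed) >= MINIMUM


try:
    installed = version("xarray")
except PackageNotFoundError:
    print("xarray is not installed")
else:
    status = "ok" if supports_multiple_groupers(installed) else "too old"
    print(installed, status)
```
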
## Simple example

Set up a multi-variable groupby by passing one [Grouper object](https://docs.xarray.dev/en/latest/user-guide/groupby.html#grouping-by-multiple-variables) per variable.

```python
import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

# One data variable along dimension "d", with two label coordinates to group by.
da = xr.DataArray(
    np.array([1, 2, 3, 0, 2, np.nan]),
    dims="d",
    coords=dict(
        labels1=("d", np.array(["a", "b", "c", "c", "b", "a"])),
        labels2=("d", np.array(["x", "y", "z", "z", "y", "x"])),
    ),
)

# Each keyword names a coordinate; each value says how to group it.
gb = da.groupby(labels1=UniqueGrouper(), labels2=UniqueGrouper())
gb
```

```
<DataArrayGroupBy, grouped over 2 grouper(s), 9 groups in total:
    'labels1': 3 groups with labels 'a', 'b', 'c'
    'labels2': 3 groups with labels 'x', 'y', 'z'>
```

Reductions work as usual:

```python
gb.mean()
```

```
<xarray.DataArray (labels1: 3, labels2: 3)> Size: 72B
array([[1. , nan, nan],
       [nan, 2. , nan],
       [nan, nan, 1.5]])
Coordinates:
  * labels1  (labels1) object 24B 'a' 'b' 'c'
  * labels2  (labels2) object 24B 'x' 'y' 'z'
```

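To see where those numbers come from, here is a minimal NumPy sketch of a two-variable grouped mean. It is not how Xarray or flox implement the reduction; `grouped_mean_2d` is a hypothetical helper that factorizes each label array, combines the codes into one index on the 3×3 grid of label pairs, and leaves `nan` wherever a pair never occurs.

```python
import numpy as np


def grouped_mean_2d(values, labels1, labels2):
    """Mean of `values` for every (labels1, labels2) pair; NaN where a pair is absent."""
    # Factorize: sorted unique labels plus an integer code for each element.
    uniq1, codes1 = np.unique(labels1, return_inverse=True)
    uniq2, codes2 = np.unique(labels2, return_inverse=True)
    shape = (uniq1.size, uniq2.size)
    # Combine both codes into a single flat index on the product grid.
    flat = np.ravel_multi_index((codes1, codes2), shape)
    # Skip NaN values, so a group's mean uses only its valid entries.
    valid = ~np.isnan(values)
    size = shape[0] * shape[1]
    sums = np.bincount(flat[valid], weights=values[valid], minlength=size)
    counts = np.bincount(flat[valid], minlength=size)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts  # 0 / 0 gives NaN for empty groups
    return uniq1, uniq2, means.reshape(shape)


values = np.array([1, 2, 3, 0, 2, np.nan])
rows, cols, means = grouped_mean_2d(
    values,
    np.array(["a", "b", "c", "c", "b", "a"]),
    np.array(["x", "y", "z", "z", "y", "x"]),
)
print(means)
```

Only the pairs `('a', 'x')`, `('b', 'y')`, and `('c', 'z')` occur in the data, which is why the other six of the 9 groups are `nan`.
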
So does `map`:

```python
gb.map(lambda x: x[0])
```

```
<xarray.DataArray (labels1: 3, labels2: 3)> Size: 72B
array([[ 1., nan, nan],
       [nan,  2., nan],
       [nan, nan,  3.]])
Coordinates:
  * labels1  (labels1) object 24B 'a' 'b' 'c'
  * labels2  (labels2) object 24B 'x' 'y' 'z'
```

## Multiple Groupers

Different grouper types can be combined: categorical grouping with `UniqueGrouper`,
binning with `BinGrouper`, and resampling with `TimeResampler`.

```python
from xarray.groupers import BinGrouper

ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)
# Bin `x` into (5, 15] and (15, 25], and group `letters` by unique value.
gb = ds.foo.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper())
gb
```

```
<DataArrayGroupBy, grouped over 2 grouper(s), 4 groups in total:
    'x_bins': 2 groups with labels (5,, 15], (15,, 25]
    'letters': 2 groups with labels 'a', 'b'>
```

```python
gb.mean()
```

```
<xarray.DataArray 'foo' (x_bins: 2, letters: 2, y: 3)> Size: 96B
array([[[ 0.,  1.,  2.],
        [nan, nan, nan]],

       [[nan, nan, nan],
        [ 3.,  4.,  5.]]])
Coordinates:
  * x_bins   (x_bins) object 16B (5, 15] (15, 25]
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
```
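
Only the rows at `x=10` and `x=20` contribute to this result, because `x=30` and `x=40` fall outside both bins. The sketch below illustrates that binning step with plain NumPy, assuming right-closed intervals as in the `(5, 15]` labels above. `bin_codes` is a hypothetical helper, not part of Xarray's API.

```python
import numpy as np


def bin_codes(x, bins):
    """Index of the right-closed interval (bins[i], bins[i+1]] holding each x, or -1."""
    x = np.asarray(x)
    bins = np.asarray(bins)
    # With right=True, digitize returns i such that bins[i-1] < x <= bins[i].
    codes = np.digitize(x, bins, right=True) - 1
    # Values at or below the first edge, or above the last edge, are in no bin.
    codes[(codes < 0) | (codes >= bins.size - 1)] = -1
    return codes


codes = bin_codes([10, 20, 30, 40], [5, 15, 25])
print(codes.tolist())  # [0, 1, -1, -1]
```

Pairing these codes with `letters` (`'a'`, `'b'`, `'b'`, `'a'`) leaves just two populated groups, `((5, 15], 'a')` and `((15, 25], 'b')`, matching the two non-`nan` rows in the output.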
