Commit db5ab03

revamp blog post
1 parent ed4fb14 commit db5ab03

3 files changed: +484 -27 lines changed

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+public/atom.xml
+public/rss.json
+public/rss.html
+public/rss.xml
+
 yarn.lock
 package-lock.json

src/posts/flexible-indexes/index.md

Lines changed: 28 additions & 27 deletions
@@ -1,62 +1,63 @@
 ---
 title: 'Xarray indexes: unleash the power of coordinates'
-date: '2023-08-07'
+date: '2025-06-05'
 authors:
   - name: Benoît Bovy
     github: benbovy
+  - name: Scott Henderson
+    github: scottyhq
 summary: 'It is now possible to take full advantage of coordinate data via Xarray explicit and flexible indexes'
 ---

-_TLDR: Xarray has been through a major refactoring of its internals that makes coordinate-based data selection and alignment (almost) fully customizable, via built-in and/or 3rd party indexes. It also addresses a good amount of long-standing issues with "dimension coordinates" implicitly backed by pandas (multi-)indexes._
+_TLDR: Xarray has been through a major refactoring of its internals that makes coordinate-based data selection and alignment more customizable, via built-in and/or third-party indexes! In this post we highlight a few examples that take advantage of this new superpower._

 ## Introduction

-[link to Joe's CZI blog post]
+Xarray is a large project that is constantly evolving to meet the needs of users and to stay relevant for novel data formats and use cases. One area of improvement identified in the [Development Roadmap](https://docs.xarray.dev/en/stable/roadmap.html#flexible-indexes) is the ability to add new coordinate indexing capabilities beyond the original `pandas.Index`. Let's look at a few examples to understand what is now possible!

-## The concept of "dimension coordinate" and its shortcomings
+TODO: Insert Benoit's awesome schematic from indexing sprint :)

-Some datasets could not be loaded with Xarray (dimension name and coordinate with same name but different dimensions)
+## Alternatives to pandas.Index

-Complicated workarounds (swap_dims, etc.)
+Generally useful index alternatives are already part of Xarray!

-Limited and/or challenging for data cubes representing arbitrary grids (curvilinear grids, unstructured meshes, etc.).
+### RangeIndex

-## Better index vs. coordinate separation
+By default, a `pandas.Index` computes all coordinate values and holds them in memory. For many 1-D coordinates it is more efficient to store only the start, stop, and step and to compute specific coordinate values on the fly. This is what `RangeIndex` accomplishes:

-Refactor index logic in `Index` classes. More easily maintainable. May help Pandas become optional dependency in the future? (cf. Xarray-lite).
+```python
+import xarray as xr
+from xarray.indexes import RangeIndex

-Also allowed to solve lots of issues with multi-indexes, for which each level has now its own real coordinate.
+index = RangeIndex.arange(0.0, 100_000, 0.1, dim='x')
+ds = xr.Dataset(coords=xr.Coordinates.from_xindex(index))
+ds
+```

-Dataset / DataArray section has now an "indexes" section.
+<RawHTML filePath='/posts/flexible-indexes/rangeindex-repr.html' />

-## Selection using non-dimension, 1-d coordinates

-Set an index for non-dimension coordinates! (No more swap_dims anymore or coordinate renaming)
+### IntervalIndex

-```python
-ds.set_xindex(“non_dim_coord”).sel(non_dim_coord=“something”)
-```
-
-## Alternatives to pandas.Index
+TODO: Not sure if this one is ready to highlight (https://github.com/pydata/xarray/pull/10296)

-E.g., Numpy index (much faster to build, much more expensive to query), Geometry index (xvec)

-Out-of-core index, etc.
+## Third-party custom Indexes

-...or no index at all! (Create dataset with no default index, `drop_indexes`)

-## Create custom indexes from arbitrary coordinates and dimensions
+### Xvec GeometryIndex

-Not limited to 1-dimensional coordinates, even more flexible!
+TODO: Highlight https://xvec.readthedocs.io/en/v0.2.0/generated/xvec.GeometryIndex.html

-RasterIndex, FunctionalIndex, etc.
+### RasterIndex

-See xarray discussion for examples
+TODO: Highlight https://github.com/dcherian/rasterix

 ## What’s next

-Still unfinished [link: indexes next steps GH issue], extension entry points, etc.
+While we're extremely excited about what can *already* be accomplished with the new indexing capabilities, there are plenty of exciting ideas for future work. If you're interested in getting involved, we recommend following [this GitHub Issue](https://github.com/pydata/xarray/issues/6293)!

 ## Acknowledgments

-CZI, Xarray core developers, etc.
+This work would not have been possible without technical input from the Xarray core team and community!
+Several developers received essential funding from a [CZI Essential Open Source Software for Science (EOSS) grant](https://xarray.dev/blog/czi-eoss-grant-conclusion) as well as NASA's Open Source Tools, Frameworks, and Libraries (OSTFL) grant 80NSSC22K0345.
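
The RangeIndex paragraph added to `index.md` in this commit describes storing only the start, stop, and step of a 1-D coordinate and computing values on demand. The sketch below illustrates that idea in plain Python, with no Xarray dependency. `LazyRange`, `value_at`, and `nearest_position` are hypothetical names chosen for illustration; this is not how Xarray's `RangeIndex` is implemented, only a minimal model of the concept under the assumption of a half-open interval and a positive step.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyRange:
    """Hypothetical stand-in for a range-style index: three numbers, no array."""

    start: float
    stop: float
    step: float  # assumed positive in this sketch

    def __len__(self) -> int:
        # Number of points in the half-open interval [start, stop).
        return max(0, math.ceil((self.stop - self.start) / self.step))

    def value_at(self, position: int) -> float:
        # Compute a single coordinate value on demand.
        if not 0 <= position < len(self):
            raise IndexError(position)
        return self.start + position * self.step

    def nearest_position(self, label: float) -> int:
        # Label-to-position lookup by arithmetic instead of searching an array,
        # clipped to the valid range of positions.
        position = round((label - self.start) / self.step)
        return min(max(position, 0), len(self) - 1)


# A step of 0.5 is exactly representable in floating point, which keeps
# the arithmetic in this example exact.
x = LazyRange(0.0, 100_000.0, 0.5)
size = len(x)
pos = x.nearest_position(1234.6)
value = x.value_at(pos)
```

The object holds three floats no matter how many points the range describes, whereas a materialized coordinate array grows linearly with its length. A real index also has to handle slicing, alignment, and floating-point tolerance, all of which this sketch leaves out.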
