Adding vectorized indexing docs #4711
@@ -1195,6 +1195,55 @@ def sel(
Dataset.sel
DataArray.isel

Examples
--------
>>> ds = xr.tutorial.open_dataset("air_temperature")
We usually prefer having example data generated by
Suggested change
The doctests / documentation fail because of the
Thanks for the tip. I've implemented your suggestion in
da = xr.DataArray(
np.arange(25).reshape(5, 5),
coords={"x": np.arange(5), "y": np.arange(5)},
dims=("x", "y")
)
tgt_x = xr.DataArray(np.linspace(0, 2, num=3), dims="points")
tgt_y = xr.DataArray(np.linspace(0, 2, num=3), dims="points")
da.isel(x=tgt_x, y=tgt_y)
Edit: yes, it seems
Working on building an example for
import numpy as np
import pandas as pd
import xarray as xr
ds = xr.Dataset(
data_vars=dict(
temperature=(["x", "y", "time"], np.arange(125).reshape(5, 5, 5))
),
coords=dict(
lon=(["x", "y"], np.arange(25).reshape(5, 5)),
lat=(["x", "y"], np.arange(25).reshape(5, 5)),
time=pd.date_range("2014-09-06", periods=5),
),
attrs=dict(description="Weather related data.")
)
# Define target latitude and longitude
tgt_x = xr.DataArray(np.arange(0, 5), dims="points")
tgt_y = xr.DataArray(np.arange(0, 5), dims="points")
# Dataset.sel
ds_sel = ds.sel(x=tgt_x, y=tgt_y, method='nearest')
However, I get the following error, which I don't understand. I assume my coordinate definitions are inconsistent?
The reason is that you define
This also means that you don't have to define
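For illustration, here is a minimal sketch of a variant of the Dataset example above that should run. It rests on an assumption that is not confirmed by the thread: that the error comes from `x` and `y` being dimensions without index coordinates, so that `sel(..., method='nearest')` has no labels to match against. The sketch gives `x` and `y` 1-D coordinates and drops the 2-D `lon`/`lat` coordinates for brevity; it is not necessarily the fix the reviewer had in mind.

```python
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    data_vars=dict(
        temperature=(["x", "y", "time"], np.arange(125).reshape(5, 5, 5)),
    ),
    coords=dict(
        # Assumption: 1-D index coordinates named after the dimensions,
        # so that label-based lookup with method="nearest" has an index to use.
        x=np.arange(5),
        y=np.arange(5),
        time=pd.date_range("2014-09-06", periods=5),
    ),
    attrs=dict(description="Weather related data."),
)

# Both indexers share the new dimension "points", so they are paired up
# element by element instead of forming a 5 x 5 grid.
tgt_x = xr.DataArray(np.arange(0, 5), dims="points")
tgt_y = xr.DataArray(np.arange(0, 5), dims="points")

ds_sel = ds.sel(x=tgt_x, y=tgt_y, method="nearest")
print(ds_sel["temperature"].dims)  # ('points', 'time')
```

The selected variable keeps `time` and replaces `x` and `y` with the shared `points` dimension.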

>>> ds
<xarray.Dataset>
Dimensions: (lat: 25, lon: 53, time: 2920)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float32 ...
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

>>> tgt_lat = xr.DataArray(np.linspace(40, 45, num=6), dims="points")
>>> tgt_lon = xr.DataArray(np.linspace(200, 205, num=6), dims="points")
>>> da = ds['air'].sel(lon=tgt_lon, lat=tgt_lat, method='nearest')
>>> da
<xarray.DataArray 'air' (time: 2920, points: 6)>
array([[284.6 , 284.6 , 283.19998, 283.19998, 280.19998, 280.19998],
[283.29 , 283.29 , 282.79 , 282.79 , 280.79 , 280.79 ],
[282. , 282. , 280.79 , 280.79 , 280. , 280. ],
...,
[282.49 , 282.49 , 281.29 , 281.29 , 280.49 , 280.49 ],
[282.09 , 282.09 , 280.38998, 280.38998, 279.49 , 279.49 ],
[282.09 , 282.09 , 280.59 , 280.59 , 279.19 , 279.19 ]],
dtype=float32)
Coordinates:
lat (points) float32 40.0 40.0 42.5 42.5 45.0 45.0
lon (points) float32 200.0 200.0 202.5 202.5 205.0 205.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Dimensions without coordinates: points
Attributes:
long_name: 4xDaily Air temperature at sigma level 995
units: degK
precision: 2
GRIB_id: 11
GRIB_name: TMP
var_desc: Air temperature
dataset: NMC Reanalysis
level_desc: Surface
statistic: Individual Obs
parent_stat: Other
actual_range: [185.16 322.1 ]
"""
ds = self._to_temp_dataset().sel(
indexers=indexers,
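The docstring example in this hunk relies on downloading the tutorial dataset. As a rough sketch of the same selection on generated data, the block below builds a synthetic grid with the same 2.5-degree spacing. The array `air`, its random values, the ascending latitude order, and the four daily time steps are all assumptions made for illustration; only the target points and the `sel` call mirror the docstring.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the tutorial grid (assumed layout, random values).
lat = np.arange(15.0, 77.5, 2.5)    # 25 latitudes: 15.0 ... 75.0
lon = np.arange(200.0, 332.5, 2.5)  # 53 longitudes: 200.0 ... 330.0
time = pd.date_range("2013-01-01", periods=4)

rng = np.random.default_rng(0)
air = xr.DataArray(
    rng.normal(size=(time.size, lat.size, lon.size)),
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="air",
)

# Same target points as in the docstring: six (lat, lon) pairs along "points".
tgt_lat = xr.DataArray(np.linspace(40, 45, num=6), dims="points")
tgt_lon = xr.DataArray(np.linspace(200, 205, num=6), dims="points")

# Each target is snapped to the nearest grid label, pair by pair.
pts = air.sel(lon=tgt_lon, lat=tgt_lat, method="nearest")
print(pts.dims)           # ('time', 'points')
print(pts["lat"].values)  # [40.  40.  42.5 42.5 45.  45. ]
```

As in the docstring output, the result has one value per time step and target point, and the `lat` and `lon` coordinates record which grid labels were actually selected.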
I did tell you to modify the vectorized indexing section, but while reading #3768 I noticed that we already document that in the "More advanced indexing" section. Given that it uses pointwise indexing as its only example, maybe we should rename it to "Pointwise indexing" to make this easier to find?
It seems like that section should be merged with "Vectorized Indexing", which could then be renamed to "Pointwise (or vectorized) Indexing"?
Agreed. Also, the term "vectorized indexing" is not really explained (unless I'm missing something, there's only a comment hinting at its meaning), so it might be good to explicitly do that somewhere. This sounds like a lot of work, though, so it might be better to do that in a different PR.
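To make the term concrete, here is a small sketch contrasting the two behaviours discussed in this thread. The names `outer`, `idx`, and `pointwise` are illustrative only. Plain lists index each dimension independently (orthogonal indexing), while `DataArray` indexers that share a dimension are paired element by element (vectorized, or pointwise, indexing).

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(25).reshape(5, 5), dims=("x", "y"))

# Orthogonal (outer) indexing: every x index is combined with every y index.
outer = da.isel(x=[0, 1, 2], y=[0, 1, 2])
print(outer.shape)  # (3, 3)

# Vectorized (pointwise) indexing: the indexers share the dimension "points",
# so the result picks the elements (0, 0), (1, 1) and (2, 2).
idx = xr.DataArray([0, 1, 2], dims="points")
pointwise = da.isel(x=idx, y=idx)
print(pointwise.values)  # [ 0  6 12]
```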
Interesting discussion. At this point, I don't think I have much to add here. So I concur that this should be addressed in a different PR.