Skip to content

Commit

Permalink
Improved resampling guide
Browse files Browse the repository at this point in the history
  • Loading branch information
philippjfr committed Jan 20, 2018
1 parent bfdab25 commit 5d8328d
Showing 1 changed file with 132 additions and 15 deletions.
147 changes: 132 additions & 15 deletions notebooks/user_guide/Resampling_Grids.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
"import geoviews as gv\n",
"import datashader as dsh\n",
"\n",
"hv.extension('bokeh')\n",
"hv.extension('bokeh', 'matplotlib')\n",
"\n",
"%opts QuadMesh Image [width=600 height=400 colorbar=True] Feature [apply_ranges=False]"
]
Expand All @@ -23,9 +23,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In geographical applications grids and meshes of different kinds are very common and for visualization and analysis it is often very important to be able to regrid them in different ways. Regridding can refer both to upsampling and downsampling a grid or mesh, which is achieved through interpolation and aggregation respectively.\n",
"In geographical applications grids and meshes of different kinds are very common and for visualization and analysis it is often very important to be able to resample them in different ways. Regridding can refer both to upsampling and downsampling a grid or mesh, which is achieved through interpolation and aggregation.\n",
"\n",
"Naive approaches to regridding treat the space as flat, which is often simpler but can also give incorrect results when working with a flat earth. In this user guide we will summarize how to work with different grid types including rectilinear, curvilinear grids and trimeshes. Additionally we will discuss different approaches to regridding working based on the assumption of a flat earth (using [datashader](http://datashader.org/)) and a spherical earth ([xESMF](http://xesmf.readthedocs.io/en/latest/Curvilinear_grid.html))."
"Naive approaches to regridding treat the space as flat, which is often simpler but can also give incorrect results when working with a spherical space such as the earth. In this user guide we will summarize how to work with different grid types including rectilinear, curvilinear grids and trimeshes. Additionally we will discuss different approaches to regridding working based on the assumption of a flat earth (using [datashader](http://datashader.org/)) and a spherical earth ([xESMF](http://xesmf.readthedocs.io/en/latest/Curvilinear_grid.html))."
]
},
{
Expand All @@ -48,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"ds = xr.tutorial.load_dataset('air_temperature').isel(time=slice(0, 5))\n",
"ds = xr.tutorial.load_dataset('air_temperature').isel(time=slice(0, 100))\n",
"# Ensure data is in range (-180, 180)\n",
"ds['lon'] = np.where(ds.lon>180, ds.lon-360, ds.lon)\n",
"ds"
Expand All @@ -69,15 +69,55 @@
},
"outputs": [],
"source": [
"gvds = gv.Dataset(ds)\n",
"gvds = gv.Dataset(ds).redim.range(air=(230, 300))\n",
"gvds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also easily plot this data by using the ``.to`` method to create set of ``Image`` elements:"
"We can also easily plot this data by using the ``.to`` method to group the data into a set of ``Image`` elements indexed by 'time':"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"images = gvds.to(gv.Image, ['lon', 'lat'], dynamic=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is important to note that if we look at the longitude coordinates above we can see that they are defined in the range (0, 360), while GeoViews generally expects it to be in the range (-180, 180). To correct this we have one of two options:\n",
"\n",
"1. We use the ``project_image`` operation to correct this.\n",
"\n",
"2. We explicitly adjust the coordinates."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Option 1\n",
"images = gv.operation.project_image(images, projection=ccrs.PlateCarree())\n",
"\n",
"# Option 2\n",
"# ds['lon'] = np.where(ds.lon>180, ds.lon-360, ds.lon)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are ready to display the data. Note that throughout this user guide we will be using Bokeh but we could easily switch to matplotlib if needed."
]
},
{
Expand All @@ -87,7 +127,6 @@
"outputs": [],
"source": [
"%opts Image (cmap='viridis') \n",
"images = gvds.to(gv.Image, ['lon', 'lat'], dynamic=True)\n",
"images * gv.feature.coastline"
]
},
Expand All @@ -98,14 +137,23 @@
"### Datashader"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"HoloViews provides high-level wrappers around the datashader library, which make it possible to quickly resample or aggregate datasets of different kinds. Datashader knows nothing about non-flat coordinate systems, but provides a very fast, parallelized regridding operation for rectilinear grids. Here we will import the ``regrid`` operation and pass it our stack of images from above. While this dataset is fairly small and regridding will actually upsample the image to match the dimensions of the plot, ``regrid`` can very quickly downsample very large datasets.\n",
"\n",
"One important thing to note about the resampling operations we will be working with in this user guide is that they are dynamic and linked to the plot dimensions and axis ranges. This means that whenever we zoom or pan the data will be resampled. If we want to disable this linked behavior and supply an explicit width and height we can disable the streams by passing ``streams=[]`` as a keyword argument."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from holoviews.operation.datashader import regrid\n",
"regrid(images, interpolation='linear') * gv.feature.coastline"
"regrid(images) * gv.feature.coastline"
]
},
{
Expand All @@ -115,14 +163,65 @@
"### xESMF"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The xESMF library is specifically designed to provide an easy way to correctly resample grids defined in geographic coordinate systems and differs significantly from the simpler approach used by datashader, which applies simple upsampling and downsampling. xESMF is a wrapper around the [ESMF regridding algorithms](https://www.earthsystemcog.org/projects/esmf/regridding), which compute an interpolation weight matrix which is applied to remap the values of the source grid onto the destination grid. \n",
"\n",
"In GeoViews these algorithms are made available via the simple ``weighted_regrid`` operation, which supports the different interpolation modes including: 'bilinear', 'nearest_s2d', 'nearest_d2s' and 'conservative'. Since generating the sparse weight matrix takes much longer than applying it the operation will cache the weight matrix on disk for later use, this optimization can be disabled via the ``reuse_weights`` parameter or customized by defining a custom ``file_pattern``."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from geoviews.operation.regrid import weighted_regrid\n",
"weighted_regrid(images, interpolation='nearest_s2d').redim.range(air=(230, 300)) * gv.feature.coastline"
"weighted_regrid(images) * gv.feature.coastline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Onto an existing grid"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The operation also allows us to define a target grid, which we can either define manually or by using a utility provided by the xESMF library. Here we will define a $2^{\\circ}x1^{\\circ}$ grid."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import xesmf as xe\n",
"grid = xe.util.grid_2d(-160, -35, 2, 15, 70, 2)\n",
"grid"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the grid has 2D coordinate arrays the regridded data will be wrapped in and displayed as a QuadMesh:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"target = gv.Dataset(grid, kdims=['lon', 'lat'])\n",
"weighted_regrid(images, target=target, streams=[])"
]
},
{
Expand Down Expand Up @@ -172,7 +271,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can plot this data directly as a ``gv.QuadMesh``, however this is generally quite slow, especially when we are working with bokeh where each grid point is rendered as a distinct polygon. We will therefore downsample the data by 4x along both dimensions:"
"Now we can plot this data directly as a ``gv.QuadMesh``, however this is generally quite slow, especially when we are working with bokeh where each grid point is rendered as a distinct polygon. We will therefore downsample the data by a factor of 3 along both dimensions:"
]
},
{
Expand All @@ -183,14 +282,32 @@
"source": [
"%opts Image QuadMesh (cmap='RdBu_r')\n",
"quadmeshes = gvds.to(gv.QuadMesh, ['xc', 'yc'], dynamic=True)\n",
"quadmeshes.map(lambda x: x.clone(x.data.Tair[::4, ::4]), gv.QuadMesh) * gv.feature.coastline"
"quadmeshes.map(lambda x: x.clone(x.data.Tair[::3, ::3]), gv.QuadMesh) * gv.feature.coastline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The problem is less severe when plotting using matplotlib but even then plotting can be fairly slow given a large enough mesh."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%output backend='matplotlib' size=300\n",
"%%opts QuadMesh (cmap='RdBu_r')\n",
"quadmeshes * gv.feature.coastline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we want to explore the whole dataset we therefore don't have much of a choice except for regridding it onto a rectilinear grid, which can be rendered much more efficiently. Once again we have the option of using the datashader based approach or the more correct xESMF based approach."
"If we want to explore a very large grid it therefore often makes sense to resample the data onto a rectilinear grid, which can be rendered much more efficiently. Once again we have the option of using the datashader based approach or the more correct xESMF based approach."
]
},
{
Expand All @@ -204,7 +321,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To regrid a ``QuadMesh`` using GeoViews we can import the ``rasterize`` operation. In the background the operation will convert the ``QuadMesh`` into a ``TriMesh``, which datashader understands. To optimize this conversion so it occurs only when aggregating an Image for the first time we can activate the ``precompute`` option. Additionally we have to define an aggregator, in this case to compute the mean ``Tair`` value in a pixel:"
"To regrid a ``QuadMesh`` using GeoViews we can import the ``rasterize`` operation. In the background the operation will convert the ``QuadMesh`` into a ``TriMesh``, which datashader understands. To optimize this conversion so it occurs only when aggregating the ``QuadMesh`` for the first time we can activate the ``precompute`` option. Additionally we have to define an aggregator, in this case to compute the mean ``Tair`` value in a pixel:"
]
},
{
Expand All @@ -228,7 +345,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The xESMF based approach stores generates a sparse weight matrix which weights each grid cell appropriately. Optionally this weight matrix can be cached to disk to speed up subsequent aggregation, to enable this optimization enable ``reuse_weights`` (and optionally define a custom filepattern)."
"Now we will once again use the xESMF based regridding for which we can still use the ``weighted_regrid`` operation, since it supports both rectilinear and curvilinear grids. Since the original data doesn't have a very high resolution we will also disable the ``streams`` linking the operation to the plot dimensions and axis ranges."
]
},
{
Expand All @@ -240,7 +357,7 @@
"outputs": [],
"source": [
"from geoviews.operation.regrid import weighted_regrid\n",
"weighted_regrid(quadmeshes, reuse_weights=True) * gv.feature.coastline"
"weighted_regrid(quadmeshes, streams=[]) * gv.feature.coastline"
]
},
{
Expand Down

0 comments on commit 5d8328d

Please sign in to comment.