Description
When running Sphinx to build the documentation, the build frequently times out while building the visualization gallery. Running
/usr/bin/time -v python -c 'import xarray as xr; xr.open_rasterio("https://github.com/mapbox/rasterio/raw/master/tests/data/RGB.byte.tif")'
reports that it takes at least 5 minutes (or times out after 10 minutes) if the file is opened from the URL. Subsequent calls use the cache, so the second rasterio example is fast.
If instead I download the file manually and then load it from disk, the whole notebook completes in about 10 seconds. Also, directly calling rasterio.open completes in a few seconds, so the bug should be in open_rasterio.
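To make the comparison above easier to repeat, something like the following sketch could be used. The names `time_call` and `compare_openers` are hypothetical helpers, not part of xarray or rasterio, and the sketch assumes an xarray version that still provides `xr.open_rasterio`.

```python
import time


def time_call(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_openers(source):
    """Time rasterio.open against xr.open_rasterio on the same source."""
    # imported lazily so that time_call can be used without these packages
    import rasterio
    import xarray as xr

    dataset, t_rasterio = time_call(rasterio.open, source)
    dataset.close()
    array, t_xarray = time_call(xr.open_rasterio, source)
    array.close()
    return {"rasterio.open": t_rasterio, "xr.open_rasterio": t_xarray}
```

Calling `compare_openers(url)` once with the remote URL and once with a local copy should show whether the slowdown is specific to remote access. Since later calls may hit a cache, the order of the two openers can affect the numbers, so a fresh process per measurement is probably safer.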
I do think we should try to fix this in the backend, but maybe we could also cache RGB.byte.tif in the same directory as the xarray.tutorial data and open the cached file in the gallery?
Edit: this is really flaky, I can't reliably reproduce this.
Edit2: for now, I'm using an extra cell containing
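    import pathlib
    import shutil

    import requests

    cache_dir = pathlib.Path.home() / ".xarray_tutorial_data"
    cache_dir.mkdir(parents=True, exist_ok=True)  # the cache directory may not exist yet
    path = cache_dir / "RGB.byte.tif"
    url = "https://github.com/mapbox/rasterio/raw/master/tests/data/RGB.byte.tif"

    # download only if the file is missing or a failed download left it empty
    if not path.exists() or path.stat().st_size == 0:
        # stream=True keeps the body unread so that r.raw can be copied to the file
        with requests.get(url, stream=True) as r, path.open(mode="wb") as f:
            if r.status_code == requests.codes.ok:
                shutil.copyfileobj(r.raw, f)
            else:
                print(f"download failed: {r.status_code}")
                r.raise_for_status()

    # from here on, the examples open the local copy instead of the remote file
    url = path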
and modifying both examples to use the new url.
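If both gallery examples should share the cached file, the cell could be turned into a small reusable helper. This is only a sketch: `cached_file` and `download` are hypothetical names, and it uses `urllib.request` from the standard library instead of requests so that it has no extra dependencies.

```python
import pathlib
import shutil
import urllib.request


def download(url, target):
    """Stream the body of `url` into the file `target`."""
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(url) as response, open(target, "wb") as f:
        shutil.copyfileobj(response, f)


def cached_file(url, cache_dir, fetch=download):
    """Return a local path for `url`, downloading it only when needed."""
    cache_dir = pathlib.Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / url.rsplit("/", 1)[-1]
    if not path.exists() or path.stat().st_size == 0:
        # write to a temporary name first so that an interrupted
        # download never leaves a partial file under the final name
        tmp = path.with_name(path.name + ".part")
        fetch(url, tmp)
        tmp.replace(path)
    return path
```

Each example would then start with something like `url = cached_file(url, pathlib.Path.home() / ".xarray_tutorial_data")`. The `fetch` parameter is there only so that the download step can be swapped out, for example in tests.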