Error downloading layer #1

Closed
tomjwebb opened this issue Jun 18, 2020 · 15 comments
Labels
wontfix This will not be worked on

Comments

@tomjwebb
Collaborator

Hey Anna, I'm trying to access some of the big seabed map layers and am hitting an error when running:

wfs_seabedgeneral <- emodnet_init_wfs_client(service = "seabed_habitats_general_datasets_and_products")
hab_dat <- emodnet_get_layers(wfs = wfs_seabedgeneral, layers = "eusm2019_subs_full")

Error message is:

Warning: Download of layer 'eusm2019_subs_full' failed: Error in readBin(content, character()): R character strings are limited to 2^31-1 bytes

Seems like some problem parsing the layer?
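
For context, the limit in the message is R's cap on a single character string, 2^31 - 1 bytes (just under 2 GiB). The lines below are only a rough sketch of that arithmetic. `fits_in_r_string()` is a hypothetical helper written for this illustration, not part of EMODnetWFS or ows4R, and the actual size of the server response is not known from this thread.

```r
# Largest number of bytes a single R character string can hold.
max_string_bytes <- .Machine$integer.max  # equals 2^31 - 1

# Hypothetical helper: could a payload of n_bytes be read into one string?
fits_in_r_string <- function(n_bytes) {
  n_bytes <= max_string_bytes
}

max_string_bytes / 1024^3        # just under 2 (GiB)
fits_in_r_string(589 * 1024^2)   # a 589 MB payload is below the cap
fits_in_r_string(3 * 1024^3)     # a 3 GiB payload is above the cap
```

If the whole response is read into one string before parsing, any layer whose serialized form exceeds that cap would fail in this way, regardless of available RAM.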

@annakrystalli
Collaborator

From Tom on Slack:

I even tried just downloading the thing (from http://gis.ices.dk/geonetwork/srv/eng/catalog.search#/metadata/01bf1f24-fdcd-4ee7-af8b-e62cf72fe2f9) so I could get something done, and that failed too…

@annakrystalli
Collaborator

So I did manage to download the data from this page:
https://www.emodnet-seabedhabitats.eu/access-data/download-data/?linkid=eusm2019_group

selecting Broad-scale seabed habitat map including confidence & EUNIS & MSFD classifications (updated 1st July 2019) for download

@annakrystalli
Collaborator

Will look into the WFS issue next

@annakrystalli
Collaborator

Got the same error as Tom using the same commands through RStudio 1.2.5019.

I also tried through the R console, in response to tidyverse/vroom#136, but still get the same error.

@LennertSchepers
Member

Temporary solution from @tomjwebb: download the full seabed habitat map as a zip, rather than using the WFS:

First, download the EMODnet broadscale habitat map - NB big download - 589MB zipped. Unzip and remove the zipped version.

# paths for the zipped download and the extracted folder
zip_path <- here::here("data", "raw_data", "eusm2019_model_and_confidence.zip")
out_dir <- here::here("data", "raw_data", "eusm2019_model_and_confidence")
# make sure the target folder exists before downloading
dir.create(dirname(zip_path), recursive = TRUE, showWarnings = FALSE)
download.file(
  url = "https://www.emodnet-seabedhabitats.eu/files/eusm2019_model_and_confidence.zip",
  destfile = zip_path,
  mode = "wb") # binary mode, so the zip is not corrupted on Windows
unzip(zip_path, exdir = out_dir)
# remove the zipped file
invisible(file.remove(zip_path))

see https://github.com/EMODnet/EMODnet-Biology-Benthic-Habitats-Occurrences-Traits/blob/master/analysis/benthic%20data%20habitat%20matching.Rmd#L187

@annakrystalli
Collaborator

I'm getting another issue downloading the abiotic_observations layer from the biology_occurrence_data server. This time it's causing a memory issue.

Error in gsub("<!--.*?-->", "", text): 'Calloc' could not allocate memory (18446744072424415232 of 4 bytes)
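
One possible reading of the huge element count in that message, offered as a guess rather than a diagnosis: it sits just below 2^64, which is what a negative number looks like when reinterpreted as an unsigned 64-bit size. If that negative number in turn came from a signed 32-bit overflow, the intended count would have been about 3.0e9, above the 2^31 - 1 limit. Both steps are assumptions; the sketch below only checks the arithmetic.

```r
# Work on the trailing digits only, so every value stays well below 2^53
# and double arithmetic is exact. The first 9 digits of both numbers match.
two_pow_64 <- "18446744073709551616"  # 2^64 written out
reported   <- "18446744072424415232"  # count from the error message
stopifnot(substr(two_pow_64, 1, 9) == substr(reported, 1, 9))

tail_value <- function(x) as.numeric(substr(x, 10, nchar(x)))

# Distance below 2^64: the magnitude of the negative number, if the count
# is a negative value reinterpreted as unsigned 64-bit (assumption).
wrapped_magnitude <- tail_value(two_pow_64) - tail_value(reported)

# If that negative value came from a signed 32-bit overflow (assumption),
# the intended count would be 2^32 minus the magnitude.
intended_count <- 2^32 - wrapped_magnitude

wrapped_magnitude                       # 1285136384
intended_count                          # 3009830912
intended_count > .Machine$integer.max   # TRUE
```

Under those assumptions, this would be the same "response larger than about 2 GiB" problem as the `readBin()` error, surfacing at a different step.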

@maelle
Collaborator

maelle commented Mar 25, 2022

I might be completely off here: would it be worth checking out tools like https://github.com/paleolimbot/geoarrow/ / https://wcjochem.github.io/sfarrow/ ?

@maelle added this to the rOpenSci submission milestone Apr 26, 2022

@maelle
Collaborator

maelle commented Mar 10, 2023

@salvafern how much of a blocker is this, in your opinion?

@maelle
Collaborator

maelle commented Mar 10, 2023

library("EMODnetWFS")
wfs_seabedgeneral <- emodnet_init_wfs_client(service = "seabed_habitats_general_datasets_and_products")
#> Loading ISO 19139 XML schemas...
#> Loading ISO 19115 codelists...
#> ✔ WFS client created successfully
#> ℹ Service: "https://ows.emodnet-seabedhabitats.eu/geoserver/emodnet_open/wfs"
#> ℹ Version: "2.0.0"
hab_dat <- emodnet_get_layers(wfs = wfs_seabedgeneral, layers = "eusm2019_subs_full")
#> Warning: Download of layer "eusm2019_subs_full" failed: Error in readBin(content,
#> character()): R character strings are limited to 2^31-1 bytes

Created on 2023-03-10 with reprex v2.0.2

@salvafern
Collaborator

Hi @maelle and all,

I don't think there is an obvious solution for this. WFS is not designed to handle large requests, IMO. If you apply a filter to the layer, it works (see reprex below). Depending on the scope of the study, requesting only the subset you need would be the best solution.

If we needed the full layer, the way to go would be pagination, but this does not seem to be fully functional in ows4R at the moment (see eblondel/ows4R#70).

I don't think this is blocking, though: EMODnetWFS works really well in most cases.

Maybe close as won't fix, and update the docs to say that large requests may fail and that it is better to request subsets?

Reprex:

library("EMODnetWFS")
wfs_seabedgeneral <- emodnet_init_wfs_client(service = "seabed_habitats_general_datasets_and_products")
#> Loading ISO 19139 XML schemas...
#> Loading ISO 19115 codelists...
#> Loading IANA mime types...
#> No encoding supplied: defaulting to UTF-8.
#> ✔ WFS client created successfully
#> ℹ Service: "https://ows.emodnet-seabedhabitats.eu/geoserver/emodnet_open/wfs"
#> ℹ Version: "2.0.0"

hab_dat <- emodnet_get_layers(wfs = wfs_seabedgeneral, 
                              layers = "eusm2019_subs_full", 
                              count = 1,
                              outputFormat = "json")
hab_dat
#> $eusm2019_subs_full
#> Simple feature collection with 1 feature and 7 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 2709119 ymin: 4211729 xmax: 2709162 ymax: 4211738
#> CRS:           3857
#>                          id substrate shape_length shape_area
#> 1 eusm2019_subs_full.852630   Unknown     90.35503   125.3248
#>                                         geom_200
#> 1 { "type": "MultiPolygon", "coordinates": [ ] }
#>                                         geom_400
#> 1 { "type": "MultiPolygon", "coordinates": [ ] }
#>                                         geom_800                       geometry
#> 1 { "type": "MultiPolygon", "coordinates": [ ] } MULTIPOLYGON (((2709127 421...

plot(sf::st_geometry(hab_dat[[1]]))

Created on 2023-03-10 by the reprex package (v2.0.1)

@annakrystalli
Collaborator

Hi all.

One thing to note is that there are examples in the docs of downloading batches of features: https://emodnet.github.io/EMODnetWFS/articles/request-params.html#return-blocks-of-features-from-specific-starting-point

Perhaps an option would be to add a check that fails if someone tries to download the whole bathymetry layer and point them to the docs?
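
To make the "batches of features" idea concrete, here is a minimal sketch of a paging loop. It assumes that `count` and `startIndex` are passed through `emodnet_get_layers()` to the WFS request, as the linked docs section suggests; `fetch_all_pages()`, `fetch_habitat_page()` and `demo_page()` are hypothetical helpers written for this illustration, and nothing here has been run against the real service.

```r
# Generic pager: keeps requesting blocks until a short or empty block arrives.
# `fetch_page` is any function(start, count) returning a data frame (or sf).
fetch_all_pages <- function(fetch_page, batch_size, max_pages = 1000) {
  stopifnot(batch_size >= 1)
  pages <- list()
  start <- 0
  for (i in seq_len(max_pages)) {
    page <- fetch_page(start = start, count = batch_size)
    if (is.null(page) || nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < batch_size) break  # last, partial block
    start <- start + batch_size
  }
  pages
}

# Hypothetical wrapper around the call from this thread (not run here).
# Assumes `count` and `startIndex` are forwarded to the WFS request.
fetch_habitat_page <- function(wfs, start, count) {
  EMODnetWFS::emodnet_get_layers(
    wfs = wfs,
    layers = "eusm2019_subs_full",
    count = count,
    startIndex = start,
    outputFormat = "json"
  )[[1]]
}

# Offline demonstration with a stand-in data source of 10 rows.
demo <- data.frame(id = 1:10)
demo_page <- function(start, count) {
  rows <- seq_len(nrow(demo))
  demo[rows > start & rows <= start + count, , drop = FALSE]
}
pages <- fetch_all_pages(demo_page, batch_size = 4)
vapply(pages, nrow, integer(1))  # 4 4 2
```

Against the real service, the pager could be called as `fetch_all_pages(function(start, count) fetch_habitat_page(wfs_seabedgeneral, start, count), batch_size = 1000)`, where the batch size is an arbitrary example value. Each response stays small, but the combined layer may still be too large to hold in memory, so processing or writing out each block as it arrives may be the safer design.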

@maelle
Collaborator

maelle commented Mar 31, 2023

related #143 (at least I think so? getting a data dump as an alternative)

@TJB197

TJB197 commented Nov 1, 2023

Hi, can you please give an example to show how to download a spatial subset of a layer, to avoid the OP issue.

@salvafern
Collaborator

> Hi, can you please give an example to show how to download a spatial subset of a layer, to avoid the OP issue.

Hi @TJB197 , you can request only a certain extent by providing a bounding box. You can also apply any geoserver spatial relationship function in the cql_filter argument.
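
As a rough illustration of the `cql_filter` route, the sketch below builds a GeoServer `BBOX()` predicate as a string. The geometry attribute name `geom`, the example coordinates (a box around the single feature in the reprex above, in the CRS it reports) and the helpers `cql_bbox()` and `get_habitat_subset()` are assumptions made for this example. The real attribute name and native CRS of the layer should be checked before use.

```r
# Build a CQL BBOX predicate. Numbers are formatted one by one so that large
# coordinates are not written in scientific notation.
cql_bbox <- function(geom_attr, xmin, ymin, xmax, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  fmt <- function(v) format(v, scientific = FALSE, trim = TRUE, digits = 15)
  sprintf("BBOX(%s, %s, %s, %s, %s)",
          geom_attr, fmt(xmin), fmt(ymin), fmt(xmax), fmt(ymax))
}

# Hypothetical wrapper (not run here): request only features in the box.
get_habitat_subset <- function(wfs, filter) {
  EMODnetWFS::emodnet_get_layers(
    wfs = wfs,
    layers = "eusm2019_subs_full",
    cql_filter = filter
  )
}

# "geom" is an assumed attribute name; coordinates are illustrative only.
filter <- cql_bbox("geom", 2709000, 4211000, 2710000, 4212000)
filter
```

With a WFS client created as earlier in the thread, the call would then be `get_habitat_subset(wfs_seabedgeneral, filter)`.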

@bart-v added the wontfix label Nov 6, 2023

@bart-v

bart-v commented Nov 6, 2023

If you download large amounts of data from any resource, you need to handle things completely differently.
Seems out of scope of this package.
Marking as closed for now

@bart-v closed this as not planned Nov 6, 2023

7 participants