-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
s2_dwithin()
and s2_dwithin_matrix()
are slow
#157
Comments
Worth noting that preselection using |
Reprex to play with: library(s2)
points <- s2_point(runif(1e4, -1, 1), runif(1e4, -1, 1), runif(1e4, -1, 1))
geog <- s2::as_s2_geography(s2::as_s2_lnglat(points))
countries <- s2_data_countries()
grid_items <- expand.grid(geog = geog, countries = countries)
system.time(result <- s2_dwithin_matrix(countries, geog, 1e6))
#> user system elapsed
#> 3.093 0.010 3.104
system.time(result <- s2_dwithin_matrix(geog, countries, 1e6))
#> user system elapsed
#> 2.995 0.005 3.001
system.time(
result <- s2_dwithin(
grid_items$geog,
grid_items$countries,
1e6
)
)
#> user system elapsed
#> 3.203 0.019 3.224
system.time(
result <- s2_dwithin(
grid_items$countries,
grid_items$geog,
1e6
)
)
#> user system elapsed
#> 3.115 0.020 3.208
xy <- geos::geos_make_point(s2::s2_x(geog), s2::s2_y(geog))
countries2 <- geos::as_geos_geometry(wk::as_wkb(countries))
wk::wk_crs(countries2) <- NULL
grid_items2 <- expand.grid(xy = xy, countries2 = countries2)
system.time(
result2 <- geos::geos_is_within_distance(
grid_items2$xy,
grid_items2$countries,
0.1
)
)
#> user system elapsed
#> 0.113 0.008 0.125
system.time(
result2 <- geos::geos_is_within_distance(
grid_items2$countries,
grid_items2$xy,
0.1
)
)
#> user system elapsed
#> 0.114 0.005 0.122
system.time(
result2 <- geos::geos_prepared_is_within_distance(
grid_items2$xy,
grid_items2$countries,
0.1
)
)
#> user system elapsed
#> 0.117 0.005 0.123
system.time(
result2 <- geos::geos_prepared_is_within_distance(
grid_items2$countries,
grid_items2$xy,
0.1
)
)
#> user system elapsed
#> 0.115 0.005 0.120 It looks like this should be much faster than it is....I bet we can bake the preselection by cell buffering straight into dwithin to make this better. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
As noted here: #125 (comment)
Where the matrix version is implemented: https://github.com/r-spatial/s2/blob/main/src/s2-matrix.cpp#L433-L448
I wonder if part of the performance hit here is because building a mutable shape index for a point over and over again has a lot of overhead, and this function probably most of its action on points. The
s2_cell()
vector class might be able to help with this; properly vectorizings2_latlng()
as a record-style vector and providing specific implementations is also probably helpful.It also may be that there is a more effective way to index
y
, or to use a single query and iterate through the edges (as opposed to creating a query for each pair of features). "iterating through the edges" for points is probably really fast (since each point is one edge); I don't know if this will be a performance hit for big polygons where there may be many edges from the same feature within some distance.The text was updated successfully, but these errors were encountered: