Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add C++ API for point_linestring_nearest_points #658

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Aug 20, 2022

This PR adds C++ API of point_linestring_nearest_points. The host counterpart of this function is shapely.ops.nearest_points. Multiple fields are written when called on this API, including the geometry and part id indicating the geometry that the nearest point is on. In the header only API, the output iterator accepts a zipped iterator made of 4 output iterators. In the column API, a tuple is returned.

The kernel is parallelized per geometry. Each thread computes the nearest point between a pair of multipoint and multilinestring.

Contribute to #646

isVoid and others added 30 commits June 14, 2022 10:52
…into feature/header_only_point_linestring_distance
Co-authored-by: Mark Harris <mharris@nvidia.com>
…b.com:isVoid/cuspatial into feature/header_only_point_linestring_distance
…into feature/header_only_point_linestring_distance
@isVoid isVoid requested review from harrism and thomcom and removed request for zhangjianting September 14, 2022 01:03
@isVoid
Copy link
Contributor Author

isVoid commented Sep 19, 2022

Hi, I noticed that cython does not yet support std::tuple due to lack of variadic template support. Therefore pairwise_point_linestring_nearest_points cannot return a tuple of the results. We'll need to construct a custom struct for the return type. Working on a change.

@isVoid
Copy link
Contributor Author

isVoid commented Sep 19, 2022

Hi, I noticed that cython does not yet support std::tuple due to lack of variadic template support. Therefore pairwise_point_linestring_nearest_points cannot return a tuple of the results. We'll need to construct a custom struct for the return type. Working on a change.

done.

@isVoid
Copy link
Contributor Author

isVoid commented Sep 19, 2022

rerun tests

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @isVoid !

cpp/include/cuspatial/detail/utility/linestring.cuh Outdated Show resolved Hide resolved
thrust::distance(points_geometry_offsets_first, points_geometry_offsets_last) - 1;
auto num_linestring_points = thrust::distance(linestring_points_first, linestring_points_last);

for (auto idx = threadIdx.x + blockIdx.x * blockDim.x; idx < num_pairs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing this quadruple nested loop, I see now how important it will be to add automatic geometry iterators that know how to do the nested indexing. You want to be able to just iterate over the pairs, and for each pair, just have a for_each over the points and a nested for_each over the points of the multilinestring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, abstracting different level of iteration is direly needed. Perhaps we can continue the discussion in #677 ?

cpp/include/cuspatial/point_linestring_nearest_points.hpp Outdated Show resolved Hide resolved
cpp/include/cuspatial/point_linestring_nearest_points.hpp Outdated Show resolved Hide resolved
* ```
*
* @param multipoint_geometry_offsets Beginning and ending indices for each multipoint
* @param points_xy Interleaved x, y-coordinate of points
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interleaved x and y doesn't match other cuspatial column-based APIs. We have always had the X/Y and Lon/Lat positions in separate columns, right?

Copy link
Contributor Author

@isVoid isVoid Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We started making this adaptation for #649. We could keep them separate but on the python side this would always require 2 scatter operations before we can call c++.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I was under the impression that if you are using the column-based API you are using dataframes, which tend to have a separate column for each field. What does GeoPandas do here?

Copy link
Contributor Author

@isVoid isVoid Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A GeoSeries in geopandas is an array of structures. There's currently a discussion over the format of xy-coordinates going on at geoarrow:
geoarrow/geoarrow#26

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We argued successfully to use xy interleaved coordinates for points to improve cache coherency and word size (coalesced memory access) availability for kernels when GeoArrow was designed. @isVoid has been in the process of converting our existing APIs to support GeoArrow more directly. I'll be following up in the above conversation today.

cpp/tests/utility/vector_equality.hpp Outdated Show resolved Hide resolved
@isVoid isVoid added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currenty a work in progress labels Sep 20, 2022
cpp/include/cuspatial/point_linestring_nearest_points.hpp Outdated Show resolved Hide resolved
* ```
*
* @param multipoint_geometry_offsets Beginning and ending indices for each multipoint
* @param points_xy Interleaved x, y-coordinate of points
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We argued successfully to use xy interleaved coordinates for points to improve cache coherency and word size (coalesced memory access) availability for kernels when GeoArrow was designed. @isVoid has been in the process of converting our existing APIs to support GeoArrow more directly. I'll be following up in the above conversation today.

@harrism
Copy link
Member

harrism commented Sep 21, 2022

rerun tests

@isVoid
Copy link
Contributor Author

isVoid commented Sep 21, 2022

Upon addressing this comment: #658 (comment), I introduced a bug overwrites nearest_points under every iteration. When I attempt to fix this bug, I realized that thrust::tuple still doesn't work well with structure binding. (The binding works, but the bounded variable is a thrust::detail::cons) that requires more unwrapping. I simply reverted this change.

for thrust::tuple min repro: https://godbolt.org/z/drTr4dx8z

isVoid and others added 2 commits September 21, 2022 16:45
Co-authored-by: H. Thomson Comer <thomcom@gmail.com>
@thomcom
Copy link
Contributor

thomcom commented Sep 22, 2022

I think this is ready to merge @isVoid? Congrats! :)

@isVoid
Copy link
Contributor Author

isVoid commented Sep 22, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit e9a2faa into rapidsai:branch-22.10 Sep 22, 2022
rapids-bot bot pushed a commit that referenced this pull request Sep 29, 2022
This PR adds python API for point-linestring nearest points. Closes #646 .
Depend on #658 #686 

This PR also moves `Feature_Enum` from io module to geometa module to make sure the coding is consistent and reused throughout the codebase.

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - H. Thomson Comer (https://github.com/thomcom)

URL: #681
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team cmake Related to CMake code or build configuration feature request New feature or request libcuspatial Relates to the cuSpatial C++ library non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants