-
Notifications
You must be signed in to change notification settings - Fork 154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor haversine_distance to a header-only API #477
Refactor haversine_distance to a header-only API #477
Conversation
@@ -0,0 +1,125 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be named haversine.cuh
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on your style. Thrust doesn't follow that convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, in the future we may want to enable both host and device execution of these algorithms, a la Thrust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought anything with CUDA C++ in it was supposed to either be .cu
or .cuh
? If someone includes this from a .cpp
file, GCC won't know what to do with the __device__
functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Thrust uses .h for all headers, and makes sure to protect everything so that it can be portable to different backends. I don't know if we want to go that direction. I just have a particular aversion to ".cuh" used in non-detail headers (not sure why, it's just ugly I guess).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally my rule is that .h/.hpp
files should be able to included in a host-only TU without issue.
I'd go with .cuh
unless we decide to make cuSpatial work on both CPU/GPU via Thrust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. It would be nice to support CPU and GPU backends but I'm not sure how feasible that is with the more complicated algorithms. Also it would have a large impact on tests, which currently use device vectors. So this is definitely future work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is experimenting with a new API/library design, some bigger pictures things to consider:
Do we want to provide per-algorithm control over the temporary allocations like Thrust? Or will cuSpatial continue to depend on RMM and a user can control temp allocations via per_device_resource?
Looking at the implementation, this algorithm could conceivably run on CPUs with the thrust::host
backend. Is cuSpatial interested in providing the flexibility to run on both CPU and GPU?
yeah, this is what I meant when I asked you if you thought it should take an exec policy argument. This algorithm doesn't allocate, so I removed the MR argument. But if it did, it could get the allocator from the exec_policy like Thrust does. |
That requires further discussion. We have not received any requirements for this yet that I know of. |
Co-authored-by: Jake Hemstad <jhemstad@nvidia.com> Co-authored-by: Paul Taylor <paul.e.taylor@me.com>
…ial into fea-header-only-haversine
I made a significant changed based on my experimentation with refactoring other APIs which use both LonLat and XY coordinates. I replaced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. 3 small nits in doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CMake approval
3. Longitude/Latitude data is passed as array of structures, using the `cuspatial::lonlat_2d` | ||
type (include/cuspatial/types.hpp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code doesn't currently tell me this. Maybe add a static_assert
to demonstrate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concepts would be so nice for this :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand, I already have that static assert.
static_assert(
std::conjunction_v<std::is_same<lonlat_2d<T>, Location>, std::is_same<lonlat_2d<T>, LocationB>>,
"Inputs must be cuspatial::lonlat_2d");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned the static asserts in the refactoring guide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading the code on lines33-43 and then reading on it describes the important things about this code...
"There are a few key points to notice...
Longitude/Latitude data is passed as array of structures, using the cuspatial::lonlat_2d
type"
My only point is that it's impossible to "notice" this point just from the code as is on 33-43.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I added this clarification to that point:
Longitude/Latitude data is passed as array of structures, using the `cuspatial::lonlat_2d`
type (include/cuspatial/types.hpp). This is enforced using a `static_assert` in the function
body (discussed later).
Thanks everyone for reviewing! @gpucibot merge |
@gpucibot merge |
Following #477, implements the header-only API for `cuspatial::lonlat_to_cartesian` and implements existing C++ API on top of it. Follows the refactoring guide introduced in #477. Note this branch is based on the branch for #477 so diff includes all changes from that PR as well. Once #477 is merged this PR will simplify a lot. Authors: - Mark Harris (https://github.com/harrism) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Michael Wang (https://github.com/isVoid) URL: #514
This is an exploration of what a refactoring of libcuspatial to a header-only API would look like. This PR refactors a single API,
haversine_distance
. The new API is iterator-based, and is completely independent of libcudf and libcudf_test.The design is based on
std
/Thrust algorithms. It is an iterator-based API, which enables flexibility to use "fancy" iterators such as transform iterators.There is no support for nulls. We didn't allow
cudf::column
inputs with NULLS in the cuDF-based API anyway.