Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor haversine_distance to a header-only API #477

Merged
merged 31 commits into from
Apr 28, 2022
Merged
Show file tree
Hide file tree
Changes from 29 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
314580d
Create header-only refactoring of cuspatial::haversine_distance
harrism Jan 19, 2022
0736356
Merge branch 'branch-22.04' into fea-header-only-haversine
harrism Jan 19, 2022
5830b66
Apply suggestions from code review
harrism Jan 20, 2022
4c16cd6
require RandomAccessIterator
harrism Jan 20, 2022
7580661
Merge branch 'fea-header-only-haversine' of github.com:harrism/cuspat…
harrism Jan 20, 2022
073e2d7
Convert haversine API to use AOS inputs.
harrism Feb 16, 2022
e37d61e
Revert cosmetic changes to top-level haversine.hpp
harrism Feb 16, 2022
9d8e3eb
Align location_2d and remove unused location_3d and coord_2d.
harrism Feb 16, 2022
7f2bbad
__device__ only
harrism Feb 16, 2022
2448677
"" --> <>
harrism Feb 16, 2022
0d954e0
Remove unused macro.
harrism Feb 16, 2022
7d100dc
Add refactoring guide.
harrism Mar 30, 2022
daa82a8
Add refactoring guide.
harrism Mar 30, 2022
c4ba1f7
Merge branch 'branch-22.04' into fea-header-only-haversine
harrism Mar 31, 2022
454d967
Add fancy iterator test
harrism Mar 31, 2022
a5dab4a
Merge branch 'branch-22.06' into fea-header-only-haversine
harrism Mar 31, 2022
08dfe95
.hpp->.cuh
harrism Mar 31, 2022
e373065
Add note about not making tests depend on libcudf_test
harrism Mar 31, 2022
3ae133e
gitignore
harrism Apr 5, 2022
f8947eb
Add missing include and @
harrism Apr 5, 2022
564cc4c
Don't hide the stream parameter in the detail layer.
harrism Apr 5, 2022
4a6976e
header cleanup
harrism Apr 5, 2022
c208144
Simplify coordinate types to a single vec_2d
harrism Apr 7, 2022
21db279
Clean up haversine_test.cu includes
harrism Apr 7, 2022
be1226c
type-safe vectors
harrism Apr 26, 2022
8134f12
Fix typo
harrism Apr 26, 2022
42f909e
vec_2d --> lonlat_2d in docs
harrism Apr 26, 2022
e5cb703
Merge branch 'branch-22.06' into fea-header-only-haversine
harrism Apr 26, 2022
0b78d87
Review suggestions
harrism Apr 27, 2022
17569c6
Clarified documentation / refactoring guide.
harrism Apr 27, 2022
d5ef3b7
style
harrism Apr 28, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -137,3 +137,9 @@ ENV/

# mypy
.mypy_cache/

# notebook/example generated files
notebooks/taxi2016.csv
notebooks/taxi2017.csv
notebooks/tzones_lonlat.json
notebooks/cu_taxi_zones.*
272 changes: 272 additions & 0 deletions cpp/doc/libcuspatial_refactoring_guide.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,272 @@
# cuSpatial C++ API Refactoring Guide

The original cuSpatial C++ API (libcuspatial) was designed to depend on RAPIDS libcudf and use
its core data types, especially `cudf::column`. For users who do not also use libcudf or other
RAPIDS APIS, depending on libcudf could be a big barrier to adoption of libcuspatial. libcudf is
a very large library and building it takes a lot of time.

Therefore, we are developing a standalone libcuspatial C++ API that does not depend on libcudf. This
is a header-only template API with an iterator-based interface. This has a number of advantages

1. With a header-only API, users can include and build exactly what they use.
2. With a templated API, the API can be flexible to support a variety of basic data types, such
as float and double for positional data, and different integer sizes for indices.
3. By templating on iterator types, cuSpatial algorithms can be fused with transformations of the
input data, by using "fancy" iterators. Examples include transform iterators and counting
iterators.
4. Memory resources only need to be part of APIs that allocate temporary intermediate storage.
Output storage is allocated outside the API and an output iterator is passed as an argument.

The main disadvantages of this type of API are

1. Header-only APIs can increase compilation time for code that depends on them.
2. Some users (especially our Python API) may prefer a cuDF-based API.

The good news is that by maintaining the existing libcudf-based C++ API as a layer above the header-
only libcuspatial API, we can avoid problem 1 and problem 2 for users of the legacy API.

## Example API

Following is an example iterator-based API for `cuspatial::haversine_distance`.
harrism marked this conversation as resolved.
Show resolved Hide resolved

```C++
template <class LonLatItA,
class LonLatItB,
class OutputIt,
class Location = typename std::iterator_traits<LonLatItA>::value_type,
class T = typename Location::value_type>
OutputIt haversine_distance(LonLatItA a_lonlat_first,
LonLatItA a_lonlat_last,
LonLatItB b_lonlat_first,
OutputIt distance_first,
T const radius = EARTH_RADIUS_KM,
rmm::cuda_stream_view stream = rmm::cuda_stream_default);
```

There are a few key points to notice.

1. The API is very similar to STL algorithms such as `std::transform`.
2. All array inputs and outputs are iterator type templates.
3. Longitude/Latitude data is passed as array of structures, using the `cuspatial::lonlat_2d`
type (include/cuspatial/types.hpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code doesn't currently tell me this. Maybe add a static_assert to demonstrate it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concepts would be so nice for this :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, I already have that static assert.

static_assert(
    std::conjunction_v<std::is_same<lonlat_2d<T>, Location>, std::is_same<lonlat_2d<T>, LocationB>>,
    "Inputs must be cuspatial::lonlat_2d");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned the static asserts in the refactoring guide.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the code on lines33-43 and then reading on it describes the important things about this code...

"There are a few key points to notice...

Longitude/Latitude data is passed as array of structures, using the cuspatial::lonlat_2d type"

My only point is that it's impossible to "notice" this point just from the code as is on 33-43.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I added this clarification to that point:

 Longitude/Latitude data is passed as array of structures, using the `cuspatial::lonlat_2d`
     type (include/cuspatial/types.hpp). This is enforced using a `static_assert` in the function
     body (discussed later).

4. The `Location` type is a template that is by default equal to the `value_type` of the input
iterators.
5. The floating point type is a template (`T`) that is by default equal to the `value_type` of
`Location`.
6. The iterator types for the two input ranges (A and B) are distinct templates. This is crucial
to enable composition of fancy iterators that may be different types for A and B.
7. The size of the input and output ranges in the example API are equal, so the start and end of
only the A range is provided (`a_lonlat_first` and `a_lonlat_last`). This mirrors STL APIs.
8. This API returns an iterator to the element past the last element written to the output. This
is inspired by `std::transform`, even though as with `transform`, many uses of
`haversine_distance` will not need this returned iterator.
9. All APIs that run CUDA device code (including Thrust algorithms) or allocate memory take a CUDA
stream on which to execute the device code and allocate memory.

## Example Documentation

Following is the (Doxygen) documentation for the above `cuspatial::haversine_distance`.

```C++
/**
* @brief Compute haversine distances between points in set A to the corresponding points in set B.
*
* Computes N haversine distances, where N is `std::distance(a_lon_first, a_lon_last)`.
* The distance for each `(a_lon[i], a_lat[i])` and `(b_lon[i], b_lat[i])` point pair is assigned to
* `distance_first[i]`. `distance_first` must be an iterator to output storage allocated for N
* distances.
harrism marked this conversation as resolved.
Show resolved Hide resolved
*
* https://en.wikipedia.org/wiki/Haversine_formula
*
* @param[in] a_lonlat_first: beginning of range of (longitude, latitude) locations in set A
* @param[in] a_lonlat_last: end of range of (longitude, latitude) locations in set A
* @param[in] b_lonlat_first: beginning of range of (longitude, latitude) locations in set B
* @param[out] distance_first: beginning of output range of haversine distances
* @param[in] radius: radius of the sphere on which the points reside. default: 6371.0
* (approximate radius of Earth in km)
* @param[in] stream: The CUDA stream on which to perform computations and allocate memory.
*
* All iterators must have the same floating-point `value_type`.
harrism marked this conversation as resolved.
Show resolved Hide resolved
*
* Computed distances will have the same units as `radius`.
*
* @tparam LonLatItA Iterator to input location set A. Must meet the requirements of
* [LegacyRandomAccessIterator][LinkLRAI] and be device-accessible.
* @tparam LonLatItB Iterator to input location set B. Must meet the requirements of
* [LegacyRandomAccessIterator][LinkLRAI] and be device-accessible.
* @tparam OutputIt Output iterator. Must meet the requirements of
* [LegacyRandomAccessIterator][LinkLRAI] and be device-accessible.
* @tparam Location The `value_type` of `LonLatItA` and `LonLatItB`. Must be
* `cuspatial::lonlat_2d<T>`.
* @tparam T The underlying coordinate type. Must be a floating-point type.
*
* @return Output iterator to the element past the last distance computed.
*
* [LinkLRAI]: https://en.cppreference.com/w/cpp/named_req/RandomAccessIterator
* "LegacyRandomAccessIterator"
*/
```

Key points:

1. Precisely and succinctly documents what the API computes, and provides references.
2. All parameters and all template parameters are documented.
3. States the C++ standard iterator concepts that must be implemented, and that iterators must be
device-accessible.
4. Documents requirements such as that the input iterators must have `value_type` of
`cuspatial::lonlat_2d<T>`, and that all iterators must have the same floating-point
`value_type`.
5. Documents the units of any inputs or outputs that have them.

## cuSpatial libcudf-based C++ API (legacy API)

This is the existing API, unchanged by refactoring. Here is the existing
`cuspatial::haversine_distance`:

```C++
std::unique_ptr<cudf::column> haversine_distance(
cudf::column_view const& a_lon,
cudf::column_view const& a_lat,
cudf::column_view const& b_lon,
cudf::column_view const& b_lat,
double const radius = EARTH_RADIUS_KM,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
```

key points:
1. All input data are `cudf::column_view`. This is a type-erased container so determining the
type of data must be done at run time.
2. All inputs are arrays of scalars. Longitude and latitude are separate.
3. The output is a returned `unique_ptr<cudf::column>`.
4. The output is allocated inside the function using the passed memory resource.
5. The public API does not take a stream. There is a `detail` version of the API that takes a
stream. This follows libcudf, and may change in the future.

## File Structure

For now, libcuspatial APIs should be defined in a header file in the
`cpp/include/cuspatial/experimental/` directory. Later, as we adopt the new API, we will rename
the `experimental` directory. The API header should be named after the API. In the example,
`haversine.hpp` defines the `cuspatial::haversine_distance` API.

The implementation must also be in a header, but should be in the `cuspatial/experimental/detail`
directory. The implementation should be included from the API definition file, at the end of the
file. Example:

```C++
... // declaration of API above this point
#include <cuspatial/experimental/detail/haversine.hpp>
```

## Namespaces

Public APIs are in the `cuspatial` namespace. Note that both the header-only API and the libcudf-
based API can live in the same namespace, because they are non-ambiguous (very different
parameters).

Implementation of the header-only API should be in a `cuspatial::detail` namespace.

## Implementation

The main implementation should be in detail headers.

### Header-only API Implementation

Because it is a statically typed API, the header-only implementation can be much simpler than the
libcudf-based API, which requires run-time type dispatching. In the case of `haversine_distance`, it is
a simple matter of a few static asserts and dynamic expectation checks, followed by a call to
`thrust::transform` with a custom transform functor.

```C++
template <class LonLatItA,
class LonLatItB,
class OutputIt,
class Location = typename std::iterator_traits<LonLatItA>::value_type,
class T = typename Location::value_type>
OutputIt haversine_distance(LonLatItA a_lonlat_first,
LonLatItA a_lonlat_last,
LonLatItB b_lonlat_first,
OutputIt distance_first,
T const radius,
rmm::cuda_stream_view stream)
{
using LocationB = typename std::iterator_traits<LonLatItB>::value_type;
static_assert(std::conjunction_v<std::is_same<lonlat_2d<T>, Location>,
std::is_same<lonlat_2d<T>, LocationB>>,
"Inputs must be cuspatial::lonlat_2d");
static_assert(
std::conjunction_v<std::is_floating_point<T>,
std::is_floating_point<typename LocationB::value_type>,
std::is_floating_point<typename std::iterator_traits<OutputIt>::value_type>>,
isVoid marked this conversation as resolved.
Show resolved Hide resolved
"Haversine distance supports only floating-point coordinates.");

CUSPATIAL_EXPECTS(radius > 0, "radius must be positive.");

return thrust::transform(rmm::exec_policy(stream),
a_lonlat_first,
a_lonlat_last,
b_lonlat_first,
distance_first,
detail::haversine_distance_functor<T>(radius));
}
```

Note that we `static_assert` that the types of the iterator inputs match documented expectations.
We also do a runtime check that the radius is positive. Finally we just call `thrust::transform`,
passing it an instance of `haversine_distance_functor`, which is a function of two `lonlat_2d`
inputs that implements the Haversine distance formula.

### libcudf-based API Implementation

The substance of the refactoring is making the libcudf-based API a wrapper around the header-only
API. This mostly involves replacing business logic implementation in the type-dispatched functor
with a call to the header-only API. We also need to convert disjoint latitude and longitude inputs
into `lonlat_2d<T>` structs. This is easily done using the `cuspatial::make_lonlat_iterator` utility
provided in `type_utils.hpp`.

So, to refactor the libcudf-based API, we remove the following code.

```C++
auto input_tuple = thrust::make_tuple(thrust::make_constant_iterator(static_cast<T>(radius)),
a_lon.begin<T>(),
a_lat.begin<T>(),
b_lon.begin<T>(),
b_lat.begin<T>());

auto input_iter = thrust::make_zip_iterator(input_tuple);

thrust::transform(rmm::exec_policy(stream),
input_iter,
input_iter + result->size(),
result->mutable_view().begin<T>(),
[] __device__(auto inputs) {
return calculate_haversine_distance(thrust::get<0>(inputs),
thrust::get<1>(inputs),
thrust::get<2>(inputs),
thrust::get<3>(inputs),
thrust::get<4>(inputs));
});
```

And replace it with the following code.

```C++
auto lonlat_a = cuspatial::make_lonlat_iterator(a_lon.begin<T>(), a_lat.begin<T>());
auto lonlat_b = cuspatial::make_lonlat_iterator(b_lon.begin<T>(), b_lat.begin<T>());

cuspatial::haversine_distance(lonlat_a,
lonlat_a + a_lon.size(),
lonlat_b,
static_cast<cudf::mutable_column_view>(*result).begin<T>(),
T{radius},
stream);
```

## Testing

Existing libcudf-based API tests can mostly be left alone. New tests should be added to exercise
the header-only API separately in case the libcudf-based API is removed.

Note that tests, like the header-only API, should not depend on libcudf or libcudf_test. The
cuDF-based API made the mistake of depending on libcudf_test, which results in breakages
of cuSpatial sometimes when libcudf_test changes.
5 changes: 3 additions & 2 deletions cpp/include/cuspatial/error.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,11 @@ struct logic_error : public std::logic_error {
struct cuda_error : public std::runtime_error {
cuda_error(std::string const& message) : std::runtime_error(message) {}
};

} // namespace cuspatial

#define STRINGIFY_DETAIL(x) #x
#define CUSPATIAL_STRINGIFY(x) STRINGIFY_DETAIL(x)
#define CUSPATIAL_STRINGIFY_DETAIL(x) #x
#define CUSPATIAL_STRINGIFY(x) CUSPATIAL_STRINGIFY_DETAIL(x)

/**---------------------------------------------------------------------------*
* @brief Macro for checking (pre-)conditions that throws an exception when
Expand Down
89 changes: 89 additions & 0 deletions cpp/include/cuspatial/experimental/detail/haversine.cuh
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include <cuspatial/constants.hpp>
#include <cuspatial/error.hpp>
#include <cuspatial/types.hpp>

#include <rmm/cuda_stream_view.hpp>
#include <rmm/exec_policy.hpp>

#include <thrust/transform.h>
#include <thrust/tuple.h>

#include <type_traits>

namespace cuspatial {

namespace detail {

template <typename T>
struct haversine_distance_functor {
haversine_distance_functor(T radius) : radius_(radius) {}

__device__ T operator()(lonlat_2d<T> point_a, lonlat_2d<T> point_b)
{
auto ax = point_a.x * DEGREE_TO_RADIAN;
auto ay = point_a.y * DEGREE_TO_RADIAN;
auto bx = point_b.x * DEGREE_TO_RADIAN;
auto by = point_b.y * DEGREE_TO_RADIAN;

// haversine formula
auto x = (bx - ax) / 2;
auto y = (by - ay) / 2;
auto sinysqrd = sin(y) * sin(y);
auto sinxsqrd = sin(x) * sin(x);
auto scale = cos(ay) * cos(by);

return 2 * radius_ * asin(sqrt(sinysqrd + sinxsqrd * scale));
}

T radius_{};
};

} // namespace detail

template <class LonLatItA, class LonLatItB, class OutputIt, class Location, class T>
OutputIt haversine_distance(LonLatItA a_lonlat_first,
LonLatItA a_lonlat_last,
LonLatItB b_lonlat_first,
OutputIt distance_first,
T const radius,
rmm::cuda_stream_view stream)
{
using LocationB = typename std::iterator_traits<LonLatItB>::value_type;
static_assert(
std::conjunction_v<std::is_same<lonlat_2d<T>, Location>, std::is_same<lonlat_2d<T>, LocationB>>,
"Inputs must be cuspatial::lonlat_2d");
static_assert(
std::conjunction_v<std::is_floating_point<T>,
std::is_floating_point<typename LocationB::value_type>,
std::is_floating_point<typename std::iterator_traits<OutputIt>::value_type>>,
"Haversine distance supports only floating-point coordinates.");

CUSPATIAL_EXPECTS(radius > 0, "radius must be positive.");

return thrust::transform(rmm::exec_policy(stream),
a_lonlat_first,
a_lonlat_last,
b_lonlat_first,
distance_first,
detail::haversine_distance_functor<T>(radius));
}

} // namespace cuspatial
Loading