Add TESTING.md and BENCHMARKING.md #672

Merged: 8 commits, Sep 27, 2022. Changes from 7 commits.
59 changes: 59 additions & 0 deletions cpp/doc/BENCHMARKING.md
@@ -0,0 +1,59 @@
# Unit Benchmarking in libcuspatial

Unit benchmarks in libcuspatial are written using [NVBench](https://github.com/NVIDIA/nvbench).
While some existing benchmarks are written using
[Google Benchmark](https://github.com/google/benchmark), new benchmarks should use NVBench.

The NVBench library is similar to Google Benchmark, but has several quality-of-life improvements
for GPU benchmarking, such as displaying the fraction of peak memory bandwidth achieved and
details about the GPU hardware.

Both NVBench and Google Benchmark provide many options for specifying the ranges of parameters to
benchmark, as well as for controlling the reported time unit, among other settings. Refer to
existing benchmarks in `cpp/benchmarks` to understand the options.
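
For illustration, here is a minimal sketch of an NVBench benchmark with one parameter axis. The
benchmark name, axis name, and axis values are placeholders, and the benchmark body is elided:

```c++
#include <nvbench/nvbench.cuh>

// Hypothetical benchmark; names and axis values are illustrative only.
static void example_bench(nvbench::state& state)
{
  auto const num_points = state.get_int64("NumPoints");

  // ... allocate and initialize device inputs of size num_points ...

  state.exec([&](nvbench::launch& launch) {
    // ... call the API under benchmark on launch.get_stream() ...
  });
}

NVBENCH_BENCH(example_bench).add_int64_axis("NumPoints", {100'000, 1'000'000, 10'000'000});
```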

## Directory and File Naming

The naming of unit benchmark directories and source files should be consistent with the feature
being benchmarked. For example, the benchmarks for APIs in `point_in_polygon.hpp` should live in
`cpp/benchmarks/point_in_polygon.cu`. Each feature (or set of related features) should have its own
benchmark source file named `<feature>.{cu,cpp}`.

In the interest of improving compile time, whenever possible, benchmark source files should be
`.cpp` files, because `nvcc` is slower than `gcc` at compiling host code. Note that
`thrust::device_vector` includes device code, and so must only be used in `.cu` files.
`rmm::device_uvector`, `rmm::device_buffer`, and the various `column_wrapper` types described in
[Testing](TESTING.md) can be used in `.cpp` files, and are therefore preferred over
`thrust::device_vector` in benchmark code.

**Contributor:** Does using NVBench preclude naming the file `.cpp`? The NVBench header is `.cuh`.

**Member Author:** Never thought about that. Asking.

**Member Author:** I think nvbench is `.cu` only...

**Member Author:** Removed this paragraph.


Benchmarking header-only APIs requires CUDA compilation, so it should be done in `.cu` files.

## CUDA Asynchrony and benchmark accuracy

CUDA computations and operations like copies are typically asynchronous with respect to host code,
so it is important to synchronize carefully to ensure that benchmark timing does not stop before
the feature being benchmarked has completed. An RAII helper class `cuda_event_timer` is provided
in `cpp/benchmarks/synchronization/synchronization.hpp` to help with this. This class can
optionally clear the GPU L2 cache to ensure that cache hits do not artificially inflate
performance across repeated iterations.
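
As a sketch, assuming a Google Benchmark-style loop (the constructor arguments shown follow the
common RAPIDS pattern of taking the benchmark state and an optional flush-L2 flag; check the
header for the exact signature and include path):

```c++
#include <benchmark/benchmark.h>

#include <synchronization/synchronization.hpp>  // include path is illustrative

static void bm_example(benchmark::State& state)
{
  // ... allocate and initialize device inputs once, outside the timed loop ...

  for (auto _ : state) {
    // Timing starts at construction and stops at destruction; `true`
    // requests an L2 cache flush before the timed region begins.
    cuda_event_timer timer(state, true);

    // ... launch the kernel or call the API being benchmarked ...
  }
}
BENCHMARK(bm_example);
```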

## Data generation

Random data generation functions for benchmark input are provided in
`cpp/benchmarks/utility/random.cuh`. Input data generation happens on the device.
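
The sketch below is not the actual utility from `random.cuh`; it only illustrates the on-device
generation pattern using Thrust, with a per-element stream offset so each point draws independent
values:

```c++
#include <cuda_runtime.h>

#include <thrust/device_vector.h>
#include <thrust/random.h>
#include <thrust/tabulate.h>

#include <cstddef>

// Illustrative functor that generates the i-th random point on the device.
struct random_point_generator {
  __device__ float2 operator()(std::size_t i) const
  {
    thrust::default_random_engine rng;
    thrust::uniform_real_distribution<float> dist(0.0f, 1.0f);
    rng.discard(2 * i);  // skip ahead so each element draws unique values
    float const x = dist(rng);
    float const y = dist(rng);
    return float2{x, y};
  }
};

thrust::device_vector<float2> make_random_points(std::size_t num_points)
{
  thrust::device_vector<float2> points(num_points);
  thrust::tabulate(points.begin(), points.end(), random_point_generator{});
  return points;
}
```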

## What should we benchmark?

In general, we should benchmark all features over a range of data sizes and types, so that we can
catch regressions across libcuspatial changes. However, running many benchmarks is expensive, so
ideally we should sample the parameter space in a way that gives good coverage without testing
exhaustively.

A rule of thumb is that we should benchmark with enough data to reach the point where the
algorithm saturates its bottleneck, whether that bottleneck is bandwidth or computation. Using
data sets larger than this point is generally not helpful, except in specific cases where doing so
exercises different code and can therefore uncover regressions that smaller benchmarks will not
(this should be rare).

**Contributor:** Do you think we should suggest here that benchmarks should only be written for public APIs?

Generally, we should benchmark public APIs. Detail functions and internal utilities should only be
benchmarked when regressions in them would be too difficult to detect from public API benchmarks.
112 changes: 112 additions & 0 deletions cpp/doc/TESTING.md
@@ -0,0 +1,112 @@
# Unit Testing in libcuspatial

Unit tests in libcuspatial are written using
[Google Test](https://github.com/google/googletest/blob/master/docs/primer.md).

## Best Practices: What Should We Test?

In general, we should test to make sure all code paths are covered. This is not always easy or
possible, but generally it means we test all supported combinations of algorithms and data types,
and the main iterator and container types supported by the algorithms. Here are some other
guidelines.

* Test public APIs. Try to ensure that public API tests result in 100% coverage of libcuspatial
code (including internal details and utilities).

* Test exceptional cases. For example, anything that causes the function to `throw`.

* Test boundary cases. For example, points that fall exactly on lines or boundaries.

* In general, empty input is not an error in libcuspatial; typically empty input results in empty
  output. Tests should verify this (see the sketch after this list).

* Most algorithms should have one or more tests exercising inputs with a large enough number of
rows to require launching multiple thread blocks, especially when values are ultimately
communicated between blocks (e.g. reductions). This is especially important for custom kernels
but also applies to Thrust and CUB algorithm calls with lambdas / functors.
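
For example, here is a minimal, self-contained sketch of an empty-input test. The algorithm is a
stand-in, not a libcuspatial API:

```c++
#include <gtest/gtest.h>

#include <algorithm>
#include <vector>

// Stand-in for a libcuspatial algorithm: scales each coordinate by 2.
std::vector<double> scale_coordinates(std::vector<double> const& input)
{
  std::vector<double> output(input.size());
  std::transform(
    input.begin(), input.end(), output.begin(), [](double v) { return 2.0 * v; });
  return output;
}

TEST(EmptyInputTest, EmptyInputProducesEmptyOutput)
{
  std::vector<double> empty_input{};
  auto const result = scale_coordinates(empty_input);
  EXPECT_TRUE(result.empty());  // empty input yields empty output, not an error
}
```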

## Header-only and Column-based API tests

libcuspatial currently has two C++ APIs. The column-based API uses libcudf data structures for
input and output; its tests live in `cpp/tests/` and can use libcudf features for constructing
columns and tables. The header-only API does not depend on libcudf at all, so tests of these
APIs should not include any libcudf headers. These tests currently live in `cpp/tests/experimental`.

Generally we test algorithms and business logic in the header-only API's unit tests. Column-based
API tests should only cover specifics of the column-based API, such as type handling,
input validation, and exceptions that are only thrown by that API.

## Directory and File Naming

The naming of unit test directories and source files should be consistent with the feature being
tested. For example, the tests for APIs in `point_in_polygon.hpp` should live in
`cuspatial/cpp/tests/point_in_polygon_test.cpp`. Each feature (or set of related features) should
have its own test source file named `<feature>_test.{cu,cpp}`.

In the interest of improving compile time, whenever possible, test source files should be `.cpp`
files, because `nvcc` is slower than `gcc` at compiling host code. Note that `thrust::device_vector`
includes device code, and so must only be used in `.cu` files. `rmm::device_uvector`,
`rmm::device_buffer`, and the various `column_wrapper` types described later can be used in `.cpp`
files, and are therefore preferred over `thrust::device_vector` in test code.

Testing header-only APIs requires CUDA compilation, so it should be done in `.cu` files.

## Base Fixture

All libcuspatial unit tests should make use of a GTest
["Test Fixture"](https://github.com/google/googletest/blob/master/docs/primer.md#test-fixtures-using-the-same-data-configuration-for-multiple-tests-same-data-multiple-tests).
Even if the fixture is empty, it should inherit from the base fixture `cuspatial::test::BaseFixture`
found in `cpp/tests/base_fixture.hpp`. This ensures that RMM is properly initialized and
finalized. `cuspatial::test::BaseFixture` already inherits from `testing::Test`, so it is not
necessary for your test fixtures to also inherit from `testing::Test`.

Example:

```c++
class MyTestFixture : public cuspatial::test::BaseFixture {...};
```

## Typed Tests

In general, libcuspatial features must work across all supported types (for cuspatial this
typically just means `float` and `double`). In order to automate the process of running
the same tests across multiple types, we use GTest's
[Typed Tests](https://github.com/google/googletest/blob/master/docs/advanced.md#typed-tests).
Typed tests allow you to write a test once and run it across a list of types.

For example:

```c++
// Fixture must be a template
template <typename T>
class TypedTestFixture : public cuspatial::test::BaseFixture {...};

// Standard GTest type list
using TestTypes = ::testing::Types<float, double>;

TYPED_TEST_SUITE(TypedTestFixture, TestTypes);

TYPED_TEST(TypedTestFixture, FirstTest)
{
  // Access the current type using `TypeParam`
  using T = TypeParam;
}
```

In this example, all tests using the `TypedTestFixture` fixture will run once for each type in the
list defined in `TestTypes` (`float, double`).

## Utilities

libcuspatial test utilities include `cuspatial::test::expect_vector_equivalent()` in
`cpp/tests/utility/vector_equality.hpp`. This function compares two containers using Google Test's
approximate matching for floating-point values. It can handle vectors of `cuspatial::vec_2d<T>`,
where `T` is `float` or `double`. It automatically copies data in device containers to host
containers before comparing, so you can pass it one host and one device vector, for example.

Example:

```c++
auto h_expected = std::vector<cuspatial::vec_2d<float>>{...}; // expected values

auto d_actual = rmm::device_vector<cuspatial::vec_2d<float>>{...}; // actual computed values

cuspatial::test::expect_vector_equivalent(h_expected, d_actual);
```

Before creating your own test utilities, check whether one already exists that does what you need.
If not, consider adding one, but make sure the utility is generic enough to be useful for other
tests and is not overly tailored to your specific testing need.