PCL RFC 0003: Unified API for Algorithms

Details

Title: PCL-RFC-0003: Unified API for Algorithms
Author: kunaltyagi
Gitter room: PointCloudLibrary/PCL-RFC-03

Motivation

The current design in several modules (filters, features, gpu) has several flaws:

Multiple classes with independent APIs and independent implementations for OpenMP, GPU, and CPU code
Inextensible model for
- adding support for thread_pools
- multi-GPU support
- using SIMD in OpenMP code

It is possible to use function overloading in C++ to achieve a "Unified API". This design is forward-compatible with the proposed API for executors (execution policies have been part of the standard since C++17, while executors themselves are still being standardized). The benefits are:

Adding a SIMD or OpenMP implementation of an algorithm automatically allows the other to use it
Ability to use thread_pools through the future model proposed for C++
Ability to encapsulate multi-GPU support and provide it alongside single-GPU support
The API remains static, allowing users to switch from CPU to OpenMP to GPU with minor changes

Detail

A prototype with OpenMP, SIMD and CPU versions can be found here. A simpler prototype is also available. Please try to change the compile flags (for SSE, AVX and OpenMP) to verify that the code adapts to different choices.

The basic details are:

A "tag" (empty struct) as the first parameter of a function call to enable overloading
Omitting the tag lets PCL choose the best option
Tags can inherit from one another, allowing overload resolution to choose the best option without run-time checks
A missing implementation for a tag raises a compile-time error
No use of macros beyond hiding an implementation on unsupported platforms. constexpr + static_assert or SFINAE machinery can be used to eliminate macros
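
The points above can be sketched in a few lines of C++. This is a minimal illustration and not the linked prototype: the namespace `sketch`, the tag names, the `sum` function, and the `has_sum` trait are all hypothetical. It assumes a `simd` tag that inherits from a `sequenced` tag, so that a call with `simd{}` falls back to the sequential overload when no SIMD overload exists.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {
namespace executor {
// Hypothetical tags: empty structs used only to select an overload.
struct sequenced {};
struct simd : sequenced {};    // no simd overload below, so it falls back to sequenced
struct openmp : sequenced {};
struct cuda {};                // unrelated tag with no overload: calls do not compile
using best_available = openmp; // what a tag-less call picks in this sketch
}  // namespace executor

struct Result {
  float value;
  std::string backend;  // records which overload ran
};

// Baseline implementation, also reached by any tag derived from sequenced.
inline Result sum(executor::sequenced, const std::vector<float>& v) {
  float acc = 0.f;
  for (float x : v) acc += x;
  return {acc, "sequenced"};
}

// OpenMP overload; the pragma is only active when OpenMP is enabled.
inline Result sum(executor::openmp, const std::vector<float>& v) {
  float acc = 0.f;
  const long n = static_cast<long>(v.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : acc)
#endif
  for (long i = 0; i < n; ++i) acc += v[i];
  return {acc, "openmp"};
}

// Tag-less overload: the library chooses the executor.
inline Result sum(const std::vector<float>& v) {
  return sum(executor::best_available{}, v);
}

// SFINAE check that an overload is reachable for a tag, without macros.
template <typename Tag, typename = void>
struct has_sum : std::false_type {};
template <typename Tag>
struct has_sum<Tag, decltype(void(sum(std::declval<Tag>(),
                                      std::declval<const std::vector<float>&>())))>
    : std::true_type {};

static_assert(has_sum<executor::simd>::value, "simd falls back to sequenced");
static_assert(!has_sum<executor::cuda>::value, "cuda has no implementation");
}  // namespace sketch
```

With these definitions, `sketch::sum(sketch::executor::simd{}, v)` resolves to the sequential overload at compile time, while `sketch::sum(sketch::executor::cuda{}, v)` is rejected by the compiler because no overload accepts that tag.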

For more details on the implementation of executors, please see:

Implementation of executors by champion of the executor proposal
C++17 API reference

Pros

More freedom for the user to choose the execution details
Allows the unified API to forward the executor to C++ algorithms
Allows reuse between SIMD, CPU and OpenMP code
Allows single-GPU code to not pay for overhead of multi-GPU implementation
Greater ease in testing
Extensible for user: Bring Your Own Executors (for unsupported tags in PCL)
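
One way the "Bring Your Own Executors" point could work is sketched below. All names are hypothetical: `lib` stands in for the library, and `user::chunked` is a tag the library knows nothing about. The sketch assumes that library algorithms call their building blocks unqualified, so that argument-dependent lookup can find an overload supplied in the user's namespace.

```cpp
#include <cstddef>
#include <vector>

namespace lib {  // stands in for the library; every name here is hypothetical
namespace executor {
struct sequenced {};
}  // namespace executor

inline float sum(executor::sequenced, const std::vector<float>& v) {
  float acc = 0.f;
  for (float x : v) acc += x;
  return acc;
}

// Generic algorithm: forwards whatever tag it receives. The unqualified call
// to sum lets argument-dependent lookup find overloads for user-defined tags.
template <typename Executor>
float mean(Executor exec, const std::vector<float>& v) {
  if (v.empty()) return 0.f;
  return sum(exec, v) / static_cast<float>(v.size());
}
}  // namespace lib

namespace user {
struct chunked {};      // user-defined executor tag, unknown to lib
int chunked_calls = 0;  // counts calls so the dispatch is observable

// User-supplied implementation: sums the two halves separately.
inline float sum(chunked, const std::vector<float>& v) {
  ++chunked_calls;
  const std::size_t mid = v.size() / 2;
  float first = 0.f, second = 0.f;
  for (std::size_t i = 0; i < mid; ++i) first += v[i];
  for (std::size_t i = mid; i < v.size(); ++i) second += v[i];
  return first + second;
}
}  // namespace user
```

Calling `lib::mean(user::chunked{}, v)` then runs the user's `sum` without any change to `lib`, and a tag with no matching overload still fails at compile time.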

Cons

None so far except API redesign

ABI/API Breakage

None in the beginning. The idea is to extend the current API, not break it
Complete break after deprecation

Effort Required

Minor to Medium

Migration Path

For implementation:

Add new functionality using executors
Original functionality will be rerouted to use the new implementation
- OpenMP: use temporary tags for redirection (no executors are available now; proposals are expected soon)
- CUDA: use temporary tags for redirection and, in the future, stream executors (TensorFlow has an implementation of stream executors for CUDA and OpenCL)
Deprecate original functionality
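
The rerouting step could look roughly like the sketch below. The classes `Estimator` and `EstimatorOMP` are hypothetical stand-ins, not PCL classes, and `compute` returns the name of the backend only so that the redirection is observable. The deprecation itself is shown as a comment, since the attribute would be added in the last step.

```cpp
#include <string>
#include <vector>

namespace sketch {
namespace executor {
// Hypothetical temporary tags used for redirection.
struct sequenced {};
struct openmp : sequenced {};
}  // namespace executor

// Step 1: new functionality, with the executor tag as the first argument.
class Estimator {
 public:
  std::string compute(executor::sequenced, const std::vector<float>& in,
                      float& out) const {
    out = 0.f;
    for (float x : in) out += x;
    return "sequenced";
  }

  std::string compute(executor::openmp, const std::vector<float>& in,
                      float& out) const {
    float acc = 0.f;
    const long n = static_cast<long>(in.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : acc)
#endif
    for (long i = 0; i < n; ++i) acc += in[i];
    out = acc;
    return "openmp";
  }
};

// Step 2: the original class keeps its signature but reroutes to the new
// implementation. Step 3 would mark it with
// [[deprecated("use Estimator::compute(executor::openmp{}, ...)")]].
class EstimatorOMP {
 public:
  std::string compute(const std::vector<float>& in, float& out) const {
    return impl_.compute(executor::openmp{}, in, out);
  }

 private:
  Estimator impl_;
};
}  // namespace sketch
```

Because the old class only forwards, both entry points share one implementation during the deprecation period, and existing callers keep compiling unchanged.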

For users:

On deprecation warnings, switch to the original class (for example, from NormalEstimationOMP to NormalEstimation) and pass an executor as the first argument

// current
pcl::NormalEstimationOMP<T1, T2> est;
est.setViewPoint(vx, vy, vz);
est.setInputCloud(cloud);
est.computePointNormal(*normal_cloud, indices, nx, ny, nz, curvature);

// proposed
pcl::NormalEstimation<T1, T2> est;
est.setViewPoint(vx, vy, vz);
est.setInputCloud(cloud);
est.computePointNormal(pcl::executor::openmp {}, *normal_cloud, indices, nx, ny, nz, curvature);
