Threading Building Blocks with GPU and NUMA
Use dynamic splitting algorithms for data-driven parallel tasks
Reference from:
Self-Tuning Query Scheduling for Analytical Workloads
https://15721.courses.cs.cmu.edu/spring2024/papers/08-scheduling/wagner-sigmod21.pdf
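To make the idea of dynamic splitting concrete, here is a minimal sketch using only the C++ standard library. It is not this project's scheduler and not the exact algorithm from the paper: the function name `dynamic_split_sum`, the initial chunk size, the chunk size limits, and the halve/double adaptation rule are all assumptions chosen for illustration. Workers claim chunks from a shared atomic cursor and adjust their chunk size so that each chunk takes roughly a target duration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper (not part of this library): sums `data` with
// `num_workers` threads that claim chunks dynamically from a shared cursor.
long long dynamic_split_sum(const std::vector<int>& data,
                            unsigned num_workers,
                            std::chrono::microseconds target) {
    assert(num_workers > 0);
    std::atomic<std::size_t> cursor{0};          // next unclaimed element
    std::vector<long long> partial(num_workers, 0);
    std::vector<std::thread> workers;

    for (unsigned w = 0; w < num_workers; ++w) {
        workers.emplace_back([&data, &cursor, &partial, target, w] {
            std::size_t chunk = 64;              // assumed initial chunk size
            long long local = 0;
            for (;;) {
                // Atomically claim the next `chunk` elements.
                const std::size_t begin = cursor.fetch_add(chunk);
                if (begin >= data.size()) break;
                const std::size_t end = std::min(begin + chunk, data.size());

                const auto t0 = std::chrono::steady_clock::now();
                local += std::accumulate(
                    data.begin() + static_cast<std::ptrdiff_t>(begin),
                    data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
                const auto elapsed = std::chrono::steady_clock::now() - t0;

                // Assumed adaptation rule: grow chunks that finish too
                // quickly, shrink chunks that take too long.
                if (elapsed < target / 2) {
                    chunk = std::min<std::size_t>(chunk * 2, std::size_t{1} << 20);
                } else if (elapsed > target * 2) {
                    chunk = std::max<std::size_t>(chunk / 2, 1);
                }
            }
            partial[w] = local;                  // each worker owns one slot
        });
    }
    for (auto& t : workers) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Because every element is claimed exactly once through `fetch_add`, the result does not depend on how the chunk sizes evolve; only the load balance does.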
Run sh build.sh to build and run the unit tests.
Run sh build.sh r to build in release mode.
- Make a task set
There are several cases:
- The input's iterators are known up front:
// define input
std::vector<int> input(1000, 1);
// operator
auto process_func = [](std::vector<int>::iterator begin,
std::vector<int>::iterator end,
int multiplier) -> int {
// ...
};
// finalizer
auto finalize_func =
[](std::vector<int>&& partial_results) -> int {
// ...
};
auto task_set =
make_task_set(input.begin(), input.end(), process_func, finalize_func, /* params */ 2);
// the result is available through a future
auto future = task_set->get_future();
// submit to scheduler...
int result = future.get();
- The input is a future (e.g. from a previous task set):
auto input_future = before_task_set->get_future();
// operator
auto process_func = [](std::vector<int>::iterator begin,
std::vector<int>::iterator end,
int multiplier) -> int {
// ...
};
// finalizer
auto finalize_func =
[](std::vector<int>&& partial_results) -> int {
// ...
};
auto task_set =
make_task_set(std::move(input_future), process_func, finalize_func, /* params */ 2);
// the result is available through a future
auto future = task_set->get_future();
// submit to scheduler...
int result = future.get();
- The function needs the begin or end of the whole input. If the input is already defined, just pass the iterators as parameters. If the input is a future, declare a context parameter like this:
auto input_future = before_task_set->get_future();
// operator
auto process_func = [](std::vector<int>::iterator begin,
std::vector<int>::iterator end,
std::pair<std::vector<int>::iterator, std::vector<int>::iterator> context,
int multiplier) -> int {
// ...
// offsets can now be computed from the context
};
// finalizer
auto finalize_func =
[](std::vector<int>&& partial_results) -> int {
// ...
};
// the nullptr placeholders are replaced by _begin / _end when the function is invoked
auto task_set =
make_task_set(std::move(input_future), process_func, finalize_func, /* params */ {nullptr, nullptr}, 2);
// the result is available through a future
auto future = task_set->get_future();
// submit to scheduler...
int result = future.get();
- If the function returns void, you can use VOID_FINALIZE as the finalizer. You still get a future (it always returns true) that tells you whether the task set has finished.
- If some parameters are futures or shared futures, just pass them as they are; the task set calls future.get() internally to obtain the values.
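The process/finalize pattern and the handling of future parameters described above can be approximated with the standard library alone. The sketch below is not the implementation of make_task_set: `resolve`, `run_chunked`, `demo`, and the fixed chunk size are hypothetical names and choices introduced only for illustration. It shows one plausible way a parameter that is a plain value, a `std::future`, or a `std::shared_future` can be unwrapped before the chunks are processed, and how the partial results reach the finalizer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helpers: plain values pass through, futures are unwrapped.
template <typename T>
T resolve(T value) { return value; }

template <typename T>
T resolve(std::future<T> f) { return f.get(); }

template <typename T>
T resolve(std::shared_future<T> f) { return f.get(); }

// Hypothetical stand-in for a task set: process fixed-size chunks
// asynchronously, then pass all partial results to the finalizer.
template <typename It, typename Process, typename Finalize, typename Param>
auto run_chunked(It begin, It end, std::size_t chunk,
                 Process process, Finalize finalize, Param param) {
    assert(chunk > 0);
    auto value = resolve(std::move(param));  // a plain value from here on
    using Partial = decltype(process(begin, end, value));

    std::vector<std::future<Partial>> pending;
    for (It it = begin; it != end;) {
        const auto remaining = static_cast<std::size_t>(std::distance(it, end));
        It next = std::next(
            it, static_cast<std::ptrdiff_t>(std::min(chunk, remaining)));
        pending.push_back(
            std::async(std::launch::async, process, it, next, value));
        it = next;
    }

    std::vector<Partial> partial_results;
    for (auto& f : pending) partial_results.push_back(f.get());
    return finalize(std::move(partial_results));
}

// Usage: the multiplier arrives as a future, as a parameter might in a
// chain of task sets.
int demo() {
    std::vector<int> data(1000, 1);
    auto process = [](std::vector<int>::iterator b,
                      std::vector<int>::iterator e,
                      int multiplier) -> int {
        return multiplier * std::accumulate(b, e, 0);
    };
    auto finalize = [](std::vector<int>&& partial) -> int {
        return std::accumulate(partial.begin(), partial.end(), 0);
    };
    std::future<int> multiplier =
        std::async(std::launch::deferred, [] { return 2; });
    return run_chunked(data.begin(), data.end(), 128,
                       process, finalize, std::move(multiplier));
}
```

Unwrapping the parameter once, before any chunk is launched, keeps the per-chunk function free of synchronization. A `std::future` has to be moved into the call, while a `std::shared_future` can be copied and reused across several calls.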