-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC: Implement HOST_UDF
aggregations
#17249
base: branch-25.02
Are you sure you want to change the base?
Changes from 2 commits
bba150c
04e2bda
5f7ab2b
7c9316a
57674e1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -791,6 +791,23 @@ void aggregate_result_functor::operator()<aggregation::MERGE_TDIGEST>(aggregatio | |
mr)); | ||
} | ||
|
||
template <> | ||
void aggregate_result_functor::operator()<aggregation::HOST_UDF>(aggregation const& agg) | ||
{ | ||
// TODO: Add a name string to the aggregation so that we can look up different host UDFs. | ||
if (cache.has_result(values, agg)) { return; } | ||
Comment on lines
+797
to
+798
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. suggestion: Why not ask the implementer of the host udf to provide hash and equality, like the other aggregations have? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Providing a name string should be enough for hashing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry I was wrong. Indeed, providing hash and equality operators seems to be the good way to go. I've added the interface for that in my latest idea: 57674e1 |
||
auto const udf_ptr = dynamic_cast<cudf::detail::host_udf_aggregation const&>(agg).host_udf_ptr; | ||
CUDF_EXPECTS(udf_ptr != nullptr, "errrrrrrrrr"); | ||
cache.add_result(values, | ||
agg, | ||
udf_ptr(get_grouped_values(), | ||
helper.group_offsets(stream), | ||
helper.group_labels(stream), | ||
helper.num_groups(stream), | ||
stream, | ||
mr)); | ||
} | ||
|
||
} // namespace detail | ||
|
||
// Sort-based groupby | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only passes group values (the first column parameter). It seems that we should better pass the group keys too, to have all the group information needed for generic computation.