-
Notifications
You must be signed in to change notification settings - Fork 974
Remove unnecessary synchronization (miss-sync) during Parquet reading (Part 4: vector_factories) #20120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: branch-25.12
Are you sure you want to change the base?
Remove unnecessary synchronization (miss-sync) during Parquet reading (Part 4: vector_factories) #20120
Conversation
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
This reverts commit c7ad2e8.
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
I’ve marked this as a draft to remind myself to run the script and count how many pageable copies this PR eliminates before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This links back to the draft PR for reference:
https://github.com/rapidsai/cudf/pull/18968/files#diff-b281f280563cbbee7c16afb29ef989d808476a355c9c36a8f4e27fc5dc2ca4fd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This links back to the draft PR for reference, but covering the full change in reduction.cuh
:
auto pinned_initial = cudf::detail::make_pinned_vector_async<OutputType>(1, stream); | ||
pinned_initial[0] = initial_value; | ||
using ScalarType = cudf::scalar_type_t<OutputType>; | ||
auto result = std::make_unique<ScalarType>(initial_value, true, stream, mr); | ||
auto result = std::make_unique<ScalarType>(pinned_initial[0], true, stream, mr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed on Slack: assign initial_value
to element zero of a pinned vector, effectively treating it like a pinned scalar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot most of the context here :(
are we passing the value by reference here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This links back to the draft PR for reference: https://github.com/rapidsai/cudf/pull/18968/files#diff-672b345ca18d5957ee39bd7802bbf67f454cb941ab78174fbab3dc1ddaa048b3
This is the only host-to-host copy in the Parquet component (I haven’t reviewed other code paths yet). So we’ve decided to handle it in-place rather than introducing a dedicated host-to-host copy utility function in the factory.
auto host_pinned_offsets = cudf::detail::make_host_vector<size_type>(offsets.size(), stream); | ||
auto host_pinned_buff_addrs = cudf::detail::make_host_vector<size_type*>(buff_addrs.size(), stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed on Slack, we’re using host_vector
to eliminate pageable copies by setting LIBCUDF_ALLOCATE_HOST_AS_PINNED_THRESHOLD
to a sufficiently high value.
I was wondering whether we should explicitly use make_pinned_vector(size, stream)
here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here what we actually want to do is change the type of the owner of offsets
and buff_addrs
to cudf::detail::host_vector
at the call site of write_final_offsets
, so we don't have to do a H2H copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s true—I lost track of the context a bit as well. Sorry for the delay; it’s been long enough :/
template <typename T> | ||
host_vector<T> make_pinned_vector(device_span<T const> source_data, rmm::cuda_stream_view stream) | ||
{ | ||
auto result = make_pinned_vector_async(source_data.size(), stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is supposed to be
auto result = make_pinned_vector_async(source_data.size(), stream); | |
auto result = make_pinned_vector_async(source_data, stream); |
auto pinned_initial = cudf::detail::make_pinned_vector_async<OutputType>(1, stream); | ||
pinned_initial[0] = initial_value; | ||
using ScalarType = cudf::scalar_type_t<OutputType>; | ||
auto result = std::make_unique<ScalarType>(initial_value, true, stream, mr); | ||
auto result = std::make_unique<ScalarType>(pinned_initial[0], true, stream, mr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot most of the context here :(
are we passing the value by reference here?
auto host_pinned_offsets = cudf::detail::make_host_vector<size_type>(offsets.size(), stream); | ||
auto host_pinned_buff_addrs = cudf::detail::make_host_vector<size_type*>(buff_addrs.size(), stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here what we actually want to do is change the type of the owner of offsets
and buff_addrs
to cudf::detail::host_vector
at the call site of write_final_offsets
, so we don't have to do a H2H copy.
Description
For issue #18967, this PR is the first part of merging the PR Draft #18968. In this PR, I added host-pinned vector construction in
vector_factories.hpp
. After a careful read-through, I’ve improved the comments in this file as well.(As discussed, I’ve also made manual changes to
reduction.cuh
andpage_data.cu
.)Checklist