Skip to content

Conversation

JigaoLuo
Copy link
Contributor

Description

For issue #18967, this PR is the first part of merging the PR Draft #18968. In this PR, I added host-pinned vector construction in vector_factories.hpp. After a careful read-through, I’ve improved the comments in this file as well.
(As discussed, I’ve also made manual changes to reduction.cuh and page_data.cu.)

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
This reverts commit c7ad2e8.
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
Signed-off-by: Jigao Luo <jigao.luo@outlook.com>
@JigaoLuo JigaoLuo requested a review from a team as a code owner September 26, 2025 10:58
Copy link

copy-pr-bot bot commented Sep 26, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Sep 26, 2025
@JigaoLuo JigaoLuo marked this pull request as draft September 26, 2025 11:00
@JigaoLuo
Copy link
Contributor Author

I’ve marked this as a draft to remind myself to run the script and count how many pageable copies this PR eliminates before merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This links back to the draft PR for reference, but covering the full change in reduction.cuh:

https://github.com/rapidsai/cudf/pull/18968/files#diff-d99740825ef0d2e73c3e8392d06ca11b229400d864913b4221f3f3626ad95f85

Comment on lines +69 to +72
auto pinned_initial = cudf::detail::make_pinned_vector_async<OutputType>(1, stream);
pinned_initial[0] = initial_value;
using ScalarType = cudf::scalar_type_t<OutputType>;
auto result = std::make_unique<ScalarType>(initial_value, true, stream, mr);
auto result = std::make_unique<ScalarType>(pinned_initial[0], true, stream, mr);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed on Slack: assign initial_value to element zero of a pinned vector, effectively treating it like a pinned scalar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot most of the context here :(
are we passing the value by reference here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This links back to the draft PR for reference: https://github.com/rapidsai/cudf/pull/18968/files#diff-672b345ca18d5957ee39bd7802bbf67f454cb941ab78174fbab3dc1ddaa048b3

This is the only host-to-host copy in the Parquet component (I haven’t reviewed other code paths yet). So we’ve decided to handle it in-place rather than introducing a dedicated host-to-host copy utility function in the factory.

Comment on lines +582 to +583
auto host_pinned_offsets = cudf::detail::make_host_vector<size_type>(offsets.size(), stream);
auto host_pinned_buff_addrs = cudf::detail::make_host_vector<size_type*>(buff_addrs.size(), stream);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on Slack, we’re using host_vector to eliminate pageable copies by setting LIBCUDF_ALLOCATE_HOST_AS_PINNED_THRESHOLD to a sufficiently high value.

I was wondering whether we should explicitly use make_pinned_vector(size, stream) here instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here what we actually want to do is change the type of the owner of offsets and buff_addrs to cudf::detail::host_vector at the call site of write_final_offsets, so we don't have to do a H2H copy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s true—I lost track of the context a bit as well. Sorry for the delay; it’s been long enough :/

template <typename T>
host_vector<T> make_pinned_vector(device_span<T const> source_data, rmm::cuda_stream_view stream)
{
auto result = make_pinned_vector_async(source_data.size(), stream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is supposed to be

Suggested change
auto result = make_pinned_vector_async(source_data.size(), stream);
auto result = make_pinned_vector_async(source_data, stream);

Comment on lines +69 to +72
auto pinned_initial = cudf::detail::make_pinned_vector_async<OutputType>(1, stream);
pinned_initial[0] = initial_value;
using ScalarType = cudf::scalar_type_t<OutputType>;
auto result = std::make_unique<ScalarType>(initial_value, true, stream, mr);
auto result = std::make_unique<ScalarType>(pinned_initial[0], true, stream, mr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot most of the context here :(
are we passing the value by reference here?

Comment on lines +582 to +583
auto host_pinned_offsets = cudf::detail::make_host_vector<size_type>(offsets.size(), stream);
auto host_pinned_buff_addrs = cudf::detail::make_host_vector<size_type*>(buff_addrs.size(), stream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here what we actually want to do is change the type of the owner of offsets and buff_addrs to cudf::detail::host_vector at the call site of write_final_offsets, so we don't have to do a H2H copy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants