Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding set_stream for column and column_buffer #12100

Open
wants to merge 5 commits into
base: branch-23.08
Choose a base branch
from

Conversation

hyperbolic2346
Copy link
Contributor

Description

As described in #12093, I ran into some issues with streams when I was doing some prototyping. I don't know if we are ready to do this change, but I wanted to put it up so people could see what it is we may need to do in the future in some form.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@hyperbolic2346 hyperbolic2346 added feature request New feature or request proposal Change current process or code 5 - DO NOT MERGE Hold off on merging; see PR for details labels Nov 9, 2022
@hyperbolic2346 hyperbolic2346 self-assigned this Nov 9, 2022
@hyperbolic2346 hyperbolic2346 requested a review from a team as a code owner November 9, 2022 03:22
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Nov 9, 2022
for (auto& child : _children) {
child->set_stream(stream);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer the function definition be put into cpp/src/column/column.cu instead of in the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has changed to a templated function, but I pulled out the child functions.

for (auto& child : children) {
child.set_stream(stream);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Move the function definition into cpp/src/io/utilities/column_buffer.cpp if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_stream() still seems to be defined in the header column_buffer.hpp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how that happened. VS Code shows the file without the definition, but the local filesystem still had it. Updated via my old friend, vi.

@ttnghia
Copy link
Contributor

ttnghia commented Nov 9, 2022

Why do we have this to change the underlying stream for these buffers? Can we construct these buffers with the desired stream from the beginning?

@nvdbaranec
Copy link
Contributor

nvdbaranec commented Nov 9, 2022

Why do we have this to change the underlying stream for these buffers? Can we construct these buffers with the desired stream from the beginning?

It's applicable for certain kinds of worker-stream type situations. The overlapped parquet work that Mike is doing uses two streams to overlap IO work and decoding work in a double-buffering kind of way. The decode step (including memory allocation) happens in a separate stream from the top level one that the reader is called with so you have to explicitly tell the underlying rmm device buffers that they belong to a different stream than the one they were created on.

@ttnghia
Copy link
Contributor

ttnghia commented May 19, 2023

Is this still relevant?

@hyperbolic2346
Copy link
Contributor Author

I believe so. This will become more and more of a thing as we expand our use of streams in cuDF. I'll update this PR.

@hyperbolic2346 hyperbolic2346 changed the base branch from branch-22.12 to branch-23.08 July 3, 2023 17:48
@hyperbolic2346 hyperbolic2346 requested a review from davidwendt July 3, 2023 20:23
@hyperbolic2346 hyperbolic2346 added 3 - Ready for Review Ready for review by team non-breaking Non-breaking change and removed 5 - DO NOT MERGE Hold off on merging; see PR for details labels Jul 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change proposal Change current process or code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants