-
Notifications
You must be signed in to change notification settings - Fork 920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding set_stream for column and column_buffer #12100
base: branch-23.08
Are you sure you want to change the base?
Adding set_stream for column and column_buffer #12100
Conversation
cpp/include/cudf/column/column.hpp
Outdated
for (auto& child : _children) { | ||
child->set_stream(stream); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would prefer the function definition be put into cpp/src/column/column.cu
instead of in the header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has changed to a templated function, but I pulled out the child functions.
for (auto& child : children) { | ||
child.set_stream(stream); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Move the function definition into cpp/src/io/utilities/column_buffer.cpp
if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_stream()
still seems to be defined in the header column_buffer.hpp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how that happened. VS Code shows the file without the definition, but the local filesystem still had it. Updated via my old friend, vi.
Why do we have this to change the underlying stream for these buffers? Can we construct these buffers with the desired stream from the beginning? |
It's applicable for certain kinds of worker-stream type situations. The overlapped parquet work that Mike is doing uses two streams to overlap IO work and decoding work in a double-buffering kind of way. The decode step (including memory allocation) happens in a separate stream from the top level one that the reader is called with so you have to explicitly tell the underlying rmm device buffers that they belong to a different stream than the one they were created on. |
Is this still relevant? |
I believe so. This will become more and more of a thing as we expand our use of streams in cuDF. I'll update this PR. |
…-column-buffer-stream
Description
As described in #12093, I ran into some issues with streams when I was doing some prototyping. I don't know if we are ready to do this change, but I wanted to put it up so people could see what it is we may need to do in the future in some form.
Checklist