[FEA] Make `device_vector` safer to use in multi-device setting #1527
Description
Is your feature request related to a problem? Please describe.
Since #1370, `device_buffer` is safe to use in a multi-device setting with respect to which device is active when the destructor runs. Arranging for the correct device to be active was always possible (and relatively straightforward) when no exceptions occurred, but doing so when exceptions are thrown was much more complicated.
We therefore added the `cuda_set_device_raii` helper object and stored the active device id in the `device_buffer` to ensure that the correct device is always active when calling allocate/deallocate functions.
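The pattern described above can be sketched roughly as follows. This is not RMM's implementation: `mock::get_device`/`mock::set_device` are stand-ins for `cudaGetDevice`/`cudaSetDevice` so the example runs without a GPU, and `set_device_guard`, `tracked_buffer` and `demo_exception_path` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for the CUDA runtime's notion of a "current device" (assumption:
// real code would call cudaGetDevice/cudaSetDevice instead).
namespace mock {
inline int& current_device() { static thread_local int dev = 0; return dev; }
inline int get_device() { return current_device(); }
inline void set_device(int id) { current_device() = id; }
}  // namespace mock

// Hypothetical guard in the spirit of cuda_set_device_raii: switch to the
// target device on construction, restore the previous one on destruction.
class set_device_guard {
 public:
  explicit set_device_guard(int target)
    : old_device_{mock::get_device()}, needs_reset_{target != old_device_}
  {
    if (needs_reset_) { mock::set_device(target); }
  }
  ~set_device_guard()
  {
    if (needs_reset_) { mock::set_device(old_device_); }
  }
  set_device_guard(set_device_guard const&)            = delete;
  set_device_guard& operator=(set_device_guard const&) = delete;

 private:
  int old_device_;
  bool needs_reset_;
};

// Hypothetical buffer that remembers the device active at construction and
// re-activates it for the duration of its destructor.
class tracked_buffer {
 public:
  tracked_buffer() : device_{mock::get_device()} {}
  ~tracked_buffer()
  {
    set_device_guard guard{device_};
    freed_on() = mock::get_device();  // record where "deallocation" happened
  }
  static int& freed_on() { static int d = -1; return d; }

 private:
  int device_;
};

// The caller switches to device 0 and then throws before switching back.
// Returns the device that was active while the buffer's destructor ran.
inline int demo_exception_path()
{
  mock::set_device(1);
  try {
    tracked_buffer buf;   // created while device 1 is active
    mock::set_device(0);  // caller changes device...
    throw std::runtime_error("boom");  // ...and unwinds without restoring it
  } catch (std::runtime_error const&) {
  }
  return tracked_buffer::freed_on();
}
```

In this sketch `demo_exception_path()` returns 1, the device the buffer was created on, and the active device afterwards is still 0, the one the caller had selected when the exception was thrown.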
In contrast, since `device_vector` is just an alias for `thrust::device_vector`, it still suffers from the old issue: the user must manually arrange for the correct device to be active when the destructor runs.
Describe the solution you'd like
#1523 documents this restriction, but it would be good if we could lift it. One way would be to store the active device in the thrust allocator wrapper that we use to interface RMM's memory resources with the thrust allocator model.
We would then use `cuda_set_device_raii` in all the allocate/deallocate functions.
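A minimal sketch of this idea is shown below. It is not RMM's thrust allocator wrapper: the allocator here is a plain standard-library-style allocator backed by host memory, the `mock` namespace stands in for the CUDA runtime's device calls, and `device_guard`, `device_aware_allocator` and `demo_vector_freed_on` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Stand-ins for cudaGetDevice/cudaSetDevice (assumption, so this runs on a host).
namespace mock {
inline int& current_device() { static thread_local int dev = 0; return dev; }
inline int get_device() { return current_device(); }
inline void set_device(int id) { current_device() = id; }
inline int& last_dealloc_device() { static int d = -1; return d; }
}  // namespace mock

// Hypothetical RAII guard: activate `target`, restore the old device on exit.
struct device_guard {
  explicit device_guard(int target) : old_{mock::get_device()}, reset_{target != old_}
  {
    if (reset_) { mock::set_device(target); }
  }
  ~device_guard()
  {
    if (reset_) { mock::set_device(old_); }
  }
  device_guard(device_guard const&)            = delete;
  device_guard& operator=(device_guard const&) = delete;
  int old_;
  bool reset_;
};

// Hypothetical allocator that stores the device active at construction and
// re-activates it inside every allocate/deallocate call.
template <typename T>
class device_aware_allocator {
 public:
  using value_type = T;

  device_aware_allocator() : device_{mock::get_device()} {}
  template <typename U>
  device_aware_allocator(device_aware_allocator<U> const& other) : device_{other.device()}
  {
  }

  T* allocate(std::size_t n)
  {
    device_guard guard{device_};
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) noexcept
  {
    device_guard guard{device_};
    mock::last_dealloc_device() = mock::get_device();  // record for the demo
    ::operator delete(p);
  }
  int device() const { return device_; }

 private:
  int device_;
};

template <typename T, typename U>
bool operator==(device_aware_allocator<T> const& a, device_aware_allocator<U> const& b)
{
  return a.device() == b.device();
}
template <typename T, typename U>
bool operator!=(device_aware_allocator<T> const& a, device_aware_allocator<U> const& b)
{
  return !(a == b);
}

// The vector is created on device 1 but destroyed while device 0 is active.
// Returns the device that was active during deallocation.
inline int demo_vector_freed_on()
{
  mock::set_device(1);
  {
    std::vector<int, device_aware_allocator<int>> v(16);
    mock::set_device(0);  // user forgets to switch back before `v` is destroyed
  }
  return mock::last_dealloc_device();
}
```

Here `demo_vector_freed_on()` returns 1 even though device 0 is active when the vector goes out of scope, which is the behaviour the request asks for. The container itself needs no changes because the device id travels with the allocator.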
This approach was discounted in #1370 since it produces more device switches than necessary in some circumstances (pushing the device switching as far out as possible was preferred), so there would be some overhead compared to using `device_buffer` (though hopefully small). Since `device_vector` isn't stream-ordered, there are already other disadvantages to using it, so the small performance cost is probably acceptable.
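To make the overhead concrete, the sketch below counts device switches for the two placements of the guard when the active device differs from the target. It is illustrative only: `mock` stands in for the CUDA runtime, and `device_guard`, `switches_with_inner_guards` and `switches_with_outer_guard` are hypothetical names. It says nothing about the actual cost of a switch on real hardware.

```cpp
#include <cassert>

// Stand-ins for the CUDA runtime that also count calls to set_device (assumption).
namespace mock {
inline int& current_device() { static thread_local int dev = 0; return dev; }
inline int& switch_count() { static int n = 0; return n; }
inline int get_device() { return current_device(); }
inline void set_device(int id)
{
  current_device() = id;
  ++switch_count();
}
}  // namespace mock

// Hypothetical RAII guard: activate `target`, restore the old device on exit.
struct device_guard {
  explicit device_guard(int target) : old_{mock::get_device()}, reset_{target != old_}
  {
    if (reset_) { mock::set_device(target); }
  }
  ~device_guard()
  {
    if (reset_) { mock::set_device(old_); }
  }
  device_guard(device_guard const&)            = delete;
  device_guard& operator=(device_guard const&) = delete;
  int old_;
  bool reset_;
};

// Guard inside every allocate/deallocate call: two switches per call.
inline int switches_with_inner_guards(int calls)
{
  mock::current_device() = 0;
  mock::switch_count()   = 0;
  for (int i = 0; i < calls; ++i) {
    device_guard guard{1};  // an allocate or deallocate call would go here
  }
  return mock::switch_count();
}

// Guard pushed out around the whole batch: two switches in total.
inline int switches_with_outer_guard(int calls)
{
  mock::current_device() = 0;
  mock::switch_count()   = 0;
  {
    device_guard guard{1};
    for (int i = 0; i < calls; ++i) {
      // allocate or deallocate calls would go here
    }
  }
  return mock::switch_count();
}
```

With eight calls, the per-call placement performs 16 switches and the outer placement performs 2. When the correct device is already active, neither placement switches at all, so the extra cost only appears in the mismatched case.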
Describe alternatives you've considered
Maintain the status quo, and eventually deprecate and then remove `device_vector`, since it is not stream-ordered anyway and we are trying to move away from that model.
Status: Done