Refactor logging_resource_adaptor to shared CCCL MR design#2246
Refactor logging_resource_adaptor to shared CCCL MR design#2246rockhowse merged 8 commits intorapidsai:stagingfrom
Conversation
Convert logging_resource_adaptor from a templated, header-only class to a non-templated class using cuda::mr::shared_resource for reference-counted ownership. Implementation is now compiled into librmm.so. Key changes: - Remove template parameter (was logging_resource_adaptor<Upstream>) - Remove device_memory_resource inheritance, use CCCL resource concepts - Add logging_resource_adaptor_impl in rmm::mr::detail namespace - Remove Upstream* constructor overloads (use device_async_resource_ref) - Store upstream as any_resource for owning semantics - Equality now compares both upstream AND logger - Class is now copyable with shared ownership semantics This is a breaking API change for C++ users.
Temporarily keep device_memory_resource as a base class so existing Python/Cython bindings continue to work. The legacy compatibility layer includes using declarations to resolve ambiguity between base class methods and private do_allocate/do_deallocate overrides that forward to the shared_resource implementation.
Remove template parameter from Cython declarations and usage since logging_resource_adaptor is no longer a class template.
| * @throws rmm::logic_error if `upstream == nullptr` | ||
| * This property declares that a `logging_resource_adaptor_impl` provides device accessible memory | ||
| */ | ||
| friend void get_property(logging_resource_adaptor_impl const&, |
There was a problem hiding this comment.
This should take the properties from the upstream. I can't remember exactly what utility cccl provides for that though
There was a problem hiding this comment.
Yes, that’s a good catch. Until now, RMM had to lie and say every resource (adaptor) was device accessible. Now that we are storing an any_resource we can actually query its properties at runtime and forward those through.
There was a problem hiding this comment.
This is tricky. We have to make a decision. Do we:
- Declare that all resources/adaptors are
cuda::mr::device_accessibleand hardcode that. This is the current strategy and it lets us avoid templates, lets us compile more intolibrmm.so, and is friendlier to Cython. - Forward
<Properties...>through the whole chain oflogging_resource_adaptor->impl->upstream_member and keep it templated and header-only. For Cython we'd still have to declare everything asdevice-accessible because I don't think it is in scope for us to have device and host and device-host versions of every Cython class (no template support).
I think there is an "option 3" which is that we declare no resource properties and just keep any_resource<> everywhere to make it friendly to any resource type (albeit losing the static knowledge of accessibility). I can't recall if this would be fully compatible with CCCL today or if it requires more work in CCCL -- I discussed this a bit with @pciolkosz but don't remember the status quo.
There was a problem hiding this comment.
Update: Option 3 isn't possible at the moment because we have many APIs requiring device_async_resource_ref which enforces the cuda::mr::device_accessible property. I think we kick this can down the road, at least until I'm able to get rid of device_memory_resource and move everything to CCCL resource (ref) classes rather than RMM wrappers.
I think option (1), declaring everything must be cuda::mr::device_accessible, is the best choice we have for the immediate term.
There was a problem hiding this comment.
Bradley and I chatted for a bit today and also came down on option 1 as the short term solution. We'd like to see better property support long-term, but only after device_memory_resource is history.
| * @param auto_flush If true, flushes the log for every (de)allocation | ||
| */ | ||
| logging_resource_adaptor_impl(std::shared_ptr<rapids_logger::logger> logger, | ||
| device_async_resource_ref upstream, |
There was a problem hiding this comment.
question: Should this accept a cuda::mr::any_resource to explicitly indicate that the person constructing this does not need to keep the upstream alive?
There was a problem hiding this comment.
The guidance I've been using is to always use ref types as parameters. That can always be reified into an owning type (any_resource, by copy or incrementing shared refcount).
There was a problem hiding this comment.
Sure, that works. I am wondering about the principle of least surprise. If I think about standard &-ref types:
// Caller knows that they are on the hook to keep `x` alive.
auto takes_ref(foo & x) {
return something_that_uses(x);
}
// Caller knows they are on the hook to implement copy
auto takes_value(foo x) {
return something_that_uses(x);
}
In contrast here:
// Caller must look at the implementation/documentation to know that they
// need not keep `mr` alive
auto logging_resource_adaptor(resource_ref mr, ...) {
}
I agree that functions can take _ref types as parameters and the body can decide to reify. But here we know we're going to reify so why not just advertise that that is happening?
There was a problem hiding this comment.
Ah, I forgot to mention the other catch: we still support device_memory_resource* inputs (currently the caller promises to keep the underlying resource alive) until that can be banished from RAPIDS. The device_async_resource_ref type supports that conversion implicitly, while any_resource won't.
I think it is worth evaluating a switch to any_resource in the future, though.
There was a problem hiding this comment.
I think switching will make sense once we have excised device_memory_resource.
| */ | ||
| class RMM_EXPORT logging_resource_adaptor | ||
| : public device_memory_resource, | ||
| public cuda::mr::shared_resource<detail::logging_resource_adaptor_impl> { |
There was a problem hiding this comment.
question: Is it possible to move the detail::logging_resource_adaptor_impl into a separate file and only forward-declare the struct name in this file.
If yes, then that is great because we can split the detail impl into a header that is never included in user-facing API code and hence hide the API (and ABI)
There was a problem hiding this comment.
Another question. This says "every logging_resource_adaptor is-a shared_resource<detail::logging_resource_adaptor_impl>".
Why is it not instead:
class logging_resource_adaptor {
...
private:
cuda::mr::shared_resource<detail::logging_resource_adaptor_impl> impl_;
};
i.e. "every logging_resource_adaptor has-a shared_resource<detail::logging_resource_adaptor_impl>" that provides the implementation.
There was a problem hiding this comment.
I'll try separating the impl class as you suggested. That would be nice.
is-a shared_resource (inheritance rather than composition) means we don't have to use PIMPL-like forwarding for the APIs that a shared_resource provides: allocation, deallocation, getting properties, copy constructors, etc.
There was a problem hiding this comment.
There is the concept of private inheritance which models more like a has-a pattern than an is-a pattern and is recommend by Scott Meyers Effective C++ [citation needed]
https://isocpp.org/wiki/faq/private-inheritance
There was a problem hiding this comment.
Does that mean that the impl class becomes part of the ABI?
There was a problem hiding this comment.
Private inheritance will make the inherited-from class part of the ABI, but in this case I think David is suggesting using private inheritance for shared_resource<detail::logging_resource_adaptor_impl>, in which case only shared_resource<detail::logging_resource_adaptor_impl> would be part of the ABI, not detail::logging_resource_adaptor_impl itself since it's hidden behind a pointer in shared_resource. In doing this you would lose the automatic availability of shared_resource methods that public inheritance currently gives you, but you can reexpose them with using declarations, which is probably the cleanest solution here.
There was a problem hiding this comment.
I see. That is possible. One downside would be that it becomes impossible to cast an adaptor to a shared_resource<T>, but perhaps that is desirable given that T (the impl class) is a private implementation detail. I think I might go that route!
There was a problem hiding this comment.
Note that this change means that some of the using declarations marked as only for the legacy layer will probably need to stay (but modified to be from shared_resource rather than device_memory_resource). Otherwise we won't have the the shared_resource APIs like allocate and deallocate available on the adaptor itself.
There was a problem hiding this comment.
Yes, that’s a pretty minor change and I don’t have any issue with doing that. Thanks for calling it out.
|
I retargeted this to a |
| * @param auto_flush If true, flushes the log for every (de)allocation | ||
| */ | ||
| logging_resource_adaptor_impl(std::shared_ptr<rapids_logger::logger> logger, | ||
| device_async_resource_ref upstream, |
There was a problem hiding this comment.
I think switching will make sense once we have excised device_memory_resource.
| * @throws rmm::logic_error if `upstream == nullptr` | ||
| * This property declares that a `logging_resource_adaptor_impl` provides device accessible memory | ||
| */ | ||
| friend void get_property(logging_resource_adaptor_impl const&, |
There was a problem hiding this comment.
Bradley and I chatted for a bit today and also came down on option 1 as the short term solution. We'd like to see better property support long-term, but only after device_memory_resource is history.
| */ | ||
| class RMM_EXPORT logging_resource_adaptor | ||
| : public device_memory_resource, | ||
| public cuda::mr::shared_resource<detail::logging_resource_adaptor_impl> { |
There was a problem hiding this comment.
Private inheritance will make the inherited-from class part of the ABI, but in this case I think David is suggesting using private inheritance for shared_resource<detail::logging_resource_adaptor_impl>, in which case only shared_resource<detail::logging_resource_adaptor_impl> would be part of the ABI, not detail::logging_resource_adaptor_impl itself since it's hidden behind a pointer in shared_resource. In doing this you would lose the automatic availability of shared_resource methods that public inheritance currently gives you, but you can reexpose them with using declarations, which is probably the cleanest solution here.
Convert logging_resource_adaptor from a templated, header-only class to a non-templated class using cuda::mr::shared_resource for reference-counted ownership. Implementation is now compiled into librmm.so. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Lawrence Mitchell (https://github.com/wence-) - David Wendt (https://github.com/davidwendt) URL: #2246
Summary
Convert
logging_resource_adaptorfrom a templated, header-only class to a non-templated class usingcuda::mr::shared_resourcefor reference-counted ownership. Implementation is now compiled intolibrmm.so.Thanks to @wence- and @davidwendt for input on the design. If this seems reasonable, I will make similar changes for all adaptors. This is breaking change, and I do not plan to have a deprecation period (there is not a good way to provide a transition for downstream users, though I will help refactor all the RAPIDS libraries using RMM).
This replaces the initial design from #2246. Part of #2011.
Key changes
logging_resource_adaptor<Upstream>)logging_resource_adaptor_implinrmm::mr::detailnamespaceUpstream*constructor overloads (usedevice_async_resource_ref)any_resourcefor owning semanticsdevice_memory_resourceinheritance for Python compatibilityBreaking API changes
logging_resource_adaptor<Upstream>→logging_resource_adaptorUpstream*overloads, usedevice_async_resource_refFuture changes
Once we migrate everything from using
device_memory_resourceto the CCCL MR design, we can drop the pieces of code labeled aslegacy device_memory_resource compatibility layer.