Skip to content

Comments

Refactor logging_resource_adaptor to shared CCCL MR design#2246

Merged
rockhowse merged 8 commits intorapidsai:stagingfrom
bdice:refactor-logging-adaptor
Feb 17, 2026
Merged

Refactor logging_resource_adaptor to shared CCCL MR design#2246
rockhowse merged 8 commits intorapidsai:stagingfrom
bdice:refactor-logging-adaptor

Conversation

@bdice
Copy link
Collaborator

@bdice bdice commented Feb 5, 2026

Summary

Convert logging_resource_adaptor from a templated, header-only class to a non-templated class using cuda::mr::shared_resource for reference-counted ownership. Implementation is now compiled into librmm.so.

Thanks to @wence- and @davidwendt for input on the design. If this seems reasonable, I will make similar changes for all adaptors. This is breaking change, and I do not plan to have a deprecation period (there is not a good way to provide a transition for downstream users, though I will help refactor all the RAPIDS libraries using RMM).

This replaces the initial design from #2246. Part of #2011.

Key changes

  • Remove template parameter (was logging_resource_adaptor<Upstream>)
  • Add logging_resource_adaptor_impl in rmm::mr::detail namespace
  • Remove Upstream* constructor overloads (use device_async_resource_ref)
  • Store upstream as any_resource for owning semantics
  • Equality now compares both upstream AND logger
  • Class is now copyable with shared ownership semantics
  • Temporarily keep device_memory_resource inheritance for Python compatibility

Breaking API changes

  1. Template removal: logging_resource_adaptor<Upstream>logging_resource_adaptor
  2. Constructor changes: Removed Upstream* overloads, use device_async_resource_ref
  3. Equality semantics: Now compares both upstream AND logger

Future changes

Once we migrate everything from using device_memory_resource to the CCCL MR design, we can drop the pieces of code labeled as legacy device_memory_resource compatibility layer.

bdice added 3 commits February 5, 2026 16:21
Convert logging_resource_adaptor from a templated, header-only class to a
non-templated class using cuda::mr::shared_resource for reference-counted
ownership. Implementation is now compiled into librmm.so.

Key changes:
- Remove template parameter (was logging_resource_adaptor<Upstream>)
- Remove device_memory_resource inheritance, use CCCL resource concepts
- Add logging_resource_adaptor_impl in rmm::mr::detail namespace
- Remove Upstream* constructor overloads (use device_async_resource_ref)
- Store upstream as any_resource for owning semantics
- Equality now compares both upstream AND logger
- Class is now copyable with shared ownership semantics

This is a breaking API change for C++ users.
Temporarily keep device_memory_resource as a base class so existing
Python/Cython bindings continue to work. The legacy compatibility layer
includes using declarations to resolve ambiguity between base class methods
and private do_allocate/do_deallocate overrides that forward to the
shared_resource implementation.
Remove template parameter from Cython declarations and usage since
logging_resource_adaptor is no longer a class template.
@bdice bdice requested review from a team as code owners February 5, 2026 22:59
@bdice bdice requested a review from vyasr February 5, 2026 22:59
@bdice bdice requested a review from shrshi February 5, 2026 22:59
@bdice bdice added feature request New feature or request breaking Breaking change labels Feb 5, 2026
@bdice bdice self-assigned this Feb 5, 2026
@bdice bdice moved this to In Progress in RMM Project Board Feb 5, 2026
* @throws rmm::logic_error if `upstream == nullptr`
* This property declares that a `logging_resource_adaptor_impl` provides device accessible memory
*/
friend void get_property(logging_resource_adaptor_impl const&,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should take the properties from the upstream. I can't remember exactly what utility cccl provides for that though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that’s a good catch. Until now, RMM had to lie and say every resource (adaptor) was device accessible. Now that we are storing an any_resource we can actually query its properties at runtime and forward those through.

Copy link
Collaborator Author

@bdice bdice Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tricky. We have to make a decision. Do we:

  1. Declare that all resources/adaptors are cuda::mr::device_accessible and hardcode that. This is the current strategy and it lets us avoid templates, lets us compile more into librmm.so, and is friendlier to Cython.
  2. Forward <Properties...> through the whole chain of logging_resource_adaptor -> impl -> upstream_ member and keep it templated and header-only. For Cython we'd still have to declare everything as device-accessible because I don't think it is in scope for us to have device and host and device-host versions of every Cython class (no template support).

I think there is an "option 3" which is that we declare no resource properties and just keep any_resource<> everywhere to make it friendly to any resource type (albeit losing the static knowledge of accessibility). I can't recall if this would be fully compatible with CCCL today or if it requires more work in CCCL -- I discussed this a bit with @pciolkosz but don't remember the status quo.

Copy link
Collaborator Author

@bdice bdice Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: Option 3 isn't possible at the moment because we have many APIs requiring device_async_resource_ref which enforces the cuda::mr::device_accessible property. I think we kick this can down the road, at least until I'm able to get rid of device_memory_resource and move everything to CCCL resource (ref) classes rather than RMM wrappers.

I think option (1), declaring everything must be cuda::mr::device_accessible, is the best choice we have for the immediate term.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bradley and I chatted for a bit today and also came down on option 1 as the short term solution. We'd like to see better property support long-term, but only after device_memory_resource is history.

@bdice bdice requested review from davidwendt and wence- February 9, 2026 17:42
Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more questions.

* @param auto_flush If true, flushes the log for every (de)allocation
*/
logging_resource_adaptor_impl(std::shared_ptr<rapids_logger::logger> logger,
device_async_resource_ref upstream,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Should this accept a cuda::mr::any_resource to explicitly indicate that the person constructing this does not need to keep the upstream alive?

Copy link
Collaborator Author

@bdice bdice Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guidance I've been using is to always use ref types as parameters. That can always be reified into an owning type (any_resource, by copy or incrementing shared refcount).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that works. I am wondering about the principle of least surprise. If I think about standard &-ref types:

// Caller knows that they are on the hook to keep `x` alive.
auto takes_ref(foo & x) {
    return something_that_uses(x);
}

// Caller knows they are on the hook to implement copy
auto takes_value(foo x) {
    return something_that_uses(x);
}

In contrast here:

// Caller must look at the implementation/documentation to know that they
// need not keep `mr` alive
auto logging_resource_adaptor(resource_ref mr, ...) {
}

I agree that functions can take _ref types as parameters and the body can decide to reify. But here we know we're going to reify so why not just advertise that that is happening?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I forgot to mention the other catch: we still support device_memory_resource* inputs (currently the caller promises to keep the underlying resource alive) until that can be banished from RAPIDS. The device_async_resource_ref type supports that conversion implicitly, while any_resource won't.

I think it is worth evaluating a switch to any_resource in the future, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think switching will make sense once we have excised device_memory_resource.

*/
class RMM_EXPORT logging_resource_adaptor
: public device_memory_resource,
public cuda::mr::shared_resource<detail::logging_resource_adaptor_impl> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Is it possible to move the detail::logging_resource_adaptor_impl into a separate file and only forward-declare the struct name in this file.

If yes, then that is great because we can split the detail impl into a header that is never included in user-facing API code and hence hide the API (and ABI)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question. This says "every logging_resource_adaptor is-a shared_resource<detail::logging_resource_adaptor_impl>".

Why is it not instead:

class logging_resource_adaptor {
...
private:
    cuda::mr::shared_resource<detail::logging_resource_adaptor_impl> impl_;
};

i.e. "every logging_resource_adaptor has-a shared_resource<detail::logging_resource_adaptor_impl>" that provides the implementation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try separating the impl class as you suggested. That would be nice.

is-a shared_resource (inheritance rather than composition) means we don't have to use PIMPL-like forwarding for the APIs that a shared_resource provides: allocation, deallocation, getting properties, copy constructors, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is the concept of private inheritance which models more like a has-a pattern than an is-a pattern and is recommend by Scott Meyers Effective C++ [citation needed]
https://isocpp.org/wiki/faq/private-inheritance

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean that the impl class becomes part of the ABI?

Copy link
Contributor

@vyasr vyasr Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Private inheritance will make the inherited-from class part of the ABI, but in this case I think David is suggesting using private inheritance for shared_resource<detail::logging_resource_adaptor_impl>, in which case only shared_resource<detail::logging_resource_adaptor_impl> would be part of the ABI, not detail::logging_resource_adaptor_impl itself since it's hidden behind a pointer in shared_resource. In doing this you would lose the automatic availability of shared_resource methods that public inheritance currently gives you, but you can reexpose them with using declarations, which is probably the cleanest solution here.

Copy link
Collaborator Author

@bdice bdice Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. That is possible. One downside would be that it becomes impossible to cast an adaptor to a shared_resource<T>, but perhaps that is desirable given that T (the impl class) is a private implementation detail. I think I might go that route!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a9ca317.

Copy link
Contributor

@vyasr vyasr Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this change means that some of the using declarations marked as only for the legacy layer will probably need to stay (but modified to be from shared_resource rather than device_memory_resource). Otherwise we won't have the the shared_resource APIs like allocate and deallocate available on the adaptor itself.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that’s a pretty minor change and I don’t have any issue with doing that. Thanks for calling it out.

@bdice bdice changed the base branch from main to staging February 9, 2026 21:10
@bdice
Copy link
Collaborator Author

bdice commented Feb 9, 2026

I retargeted this to a staging branch so that I can merge a bunch of breaking changes there and test them all with RAPIDS libraries before merging to main.

* @param auto_flush If true, flushes the log for every (de)allocation
*/
logging_resource_adaptor_impl(std::shared_ptr<rapids_logger::logger> logger,
device_async_resource_ref upstream,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think switching will make sense once we have excised device_memory_resource.

* @throws rmm::logic_error if `upstream == nullptr`
* This property declares that a `logging_resource_adaptor_impl` provides device accessible memory
*/
friend void get_property(logging_resource_adaptor_impl const&,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bradley and I chatted for a bit today and also came down on option 1 as the short term solution. We'd like to see better property support long-term, but only after device_memory_resource is history.

*/
class RMM_EXPORT logging_resource_adaptor
: public device_memory_resource,
public cuda::mr::shared_resource<detail::logging_resource_adaptor_impl> {
Copy link
Contributor

@vyasr vyasr Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Private inheritance will make the inherited-from class part of the ABI, but in this case I think David is suggesting using private inheritance for shared_resource<detail::logging_resource_adaptor_impl>, in which case only shared_resource<detail::logging_resource_adaptor_impl> would be part of the ABI, not detail::logging_resource_adaptor_impl itself since it's hidden behind a pointer in shared_resource. In doing this you would lose the automatic availability of shared_resource methods that public inheritance currently gives you, but you can reexpose them with using declarations, which is probably the cleanest solution here.

@rockhowse rockhowse merged commit adf8778 into rapidsai:staging Feb 17, 2026
65 of 66 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in RMM Project Board Feb 17, 2026
bdice added a commit that referenced this pull request Feb 20, 2026
Convert logging_resource_adaptor from a templated, header-only class to a non-templated class using cuda::mr::shared_resource for reference-counted ownership. Implementation is now compiled into librmm.so.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Lawrence Mitchell (https://github.com/wence-)
  - David Wendt (https://github.com/davidwendt)

URL: #2246
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking change feature request New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants