Custom spilling handler #12287
base: branch-23.04

Conversation
Codecov Report
Base: 85.69% // Head: 85.69% // No change to project coverage 👍

Additional details and impacted files:

@@            Coverage Diff             @@
##        branch-23.02   #12287   +/-   ##
=============================================
  Coverage      85.69%    85.69%
=============================================
  Files            155       155
  Lines          24798     24798
=============================================
  Hits           21251     21251
  Misses          3547      3547
=============================================

View full report at Codecov.
It might be possible to do this in a decorator that could replace cached_property. Would that be sufficient?
…On Fri, 2 Dec 2022 at 17:58, Ashwin Srinath commented on this pull request, in python/cudf/cudf/core/index.py:

        self._start, self._stop, self._step, dtype=self.dtype
    )
+   manager = get_global_manager()

I'm concerned about knowledge of the spilling manager leaking into other cuDF types like RangeIndex. Is there any way we can do this totally outside of RangeIndex?
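To make the decorator idea above concrete, here is a rough sketch of what a cached_property replacement could look like. Everything besides get_global_manager and the base functools.cached_property is an assumption on my part (in particular the handler signature and how it gets registered on the manager), not the PR's actual code:

```python
# Rough sketch only: a cached_property variant that, when spilling is enabled,
# registers a handler that frees the cached value instead of letting its buffer
# be copied to host. Registration details are assumptions, not the PR's API.
import functools

from cudf.core.buffer.spill_manager import get_global_manager


class spillable_cached_property(functools.cached_property):
    def __get__(self, instance, owner=None):
        ret = super().__get__(instance, owner)
        manager = get_global_manager()  # None when spilling is disabled
        if manager is not None and getattr(ret, "base_data", None) is not None:
            buf = ret.base_data
            name = self.attrname

            def handler() -> int:
                # Drop the cache entry; the buffer is freed once nothing else
                # references it. Return the number of bytes made available.
                instance.__dict__.pop(name, None)
                return buf.nbytes

            # A `_spill_handlers` dict appears in the PR diff; exposing it
            # directly on the manager like this is an assumption.
            manager._spill_handlers[buf] = handler
        return ret
```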
Yes, I think it would be a significant improvement.
I have created a decorator instead; do you think this is acceptable, @shwina?
Thanks! I do think this is an improvement. Some questions/thoughts:
No, I think we can generalize the decorator to support any type. Do you know if we are caching columns, series, arrays, or dataframes anywhere else? I am still seeing a difference between the memory usage of JIT-unspill and cudf-spilling, which could be because of caching. Note, JIT-unspill circumvents this issue because it uses
There might be some cases where it is a problem; I will investigate.
@shwina, I have generalized and renamed the decorator. Now it can be used everywhere with no ill effects.
if manager is None:
    return ret

buf = ret.base_data
What about the mask buffer of the Column, or, in the case of nested columns, its children?
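To illustrate why base_data alone is not the whole story, here is a quick check (using the private Series._column attribute purely for illustration):

```python
# Illustrative only: a nullable cuDF column is backed by a data buffer *and* a
# null-mask buffer, and nested (list/struct) columns additionally own children.
import cudf

col = cudf.Series([1, None, 3])._column
print(col.base_data)   # the data buffer handled via `ret.base_data`
print(col.base_mask)   # the mask buffer, which `ret.base_data` does not cover
print(col.children)    # empty here, non-empty for nested columns
```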
Good point.
I have simplified the code a bit. Instead of supporting any type, we now check that the instance is a RangeIndex. I haven't found any other use of cached_property where the cache can be the sole owner of a buffer, so I think we should limit the scope to RangeIndex for now.
Seems related to the question I asked above. I'm OK with not overengineering to support anything else, although it would be good to ensure that if we do need to support other types later it only requires adding new code and not restructuring the existing code (hence my comment above).
s = func(*args, **kwargs)
if s is not None:
    spilled += s
self._spill_handlers.pop(buf, None)
It seems a bit odd that the spill handler would be removed when the buffer is spilled. Is that always what we want?
Also, the usage makes it clear that these spill handlers are only appropriate for device-to-host spilling. Should we key the spill_handlers dict on both the buffer and the source/target of the spill?
OK, I see why it's removed; I guess it's being added every time the cached property is accessed.
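In other words, registration happens on every cache access and the handler is one-shot: it is popped when it fires. A minimal generic sketch of that pattern (plain Python, names are illustrative and not the PR's code):

```python
# Generic sketch of the register-on-access / pop-on-spill pattern discussed above.
spill_handlers: dict = {}


def on_cached_access(buf, handler) -> None:
    # Re-registered every time the cached property is read, so the entry is
    # refreshed even after an earlier spill popped it.
    spill_handlers[buf] = handler


def spill(buf) -> int:
    handler = spill_handlers.pop(buf, None)  # one-shot: removed once it fires
    if handler is not None:
        freed = handler()
        if freed is not None:
            return freed
    # Otherwise fall back to the normal device-to-host spill path.
    return 0
```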
return nbytes


class cached_property(functools.cached_property):
I would prefer to use a different name.
Maybe cached_property_no_spill or something.
cache_hit
or not isinstance(instance, cudf.RangeIndex)
or not isinstance(ret, cudf.core.column.NumericalColumn)
or ret.nullable
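For context, here is a hedged sketch of how a guard like this could sit inside the decorator's __get__. The surrounding structure and the cache_hit bookkeeping are my assumptions; only the condition itself is taken from the diff above:

```python
# Sketch only: limit spill-handler registration to the narrow case discussed
# above (a RangeIndex caching a non-nullable numerical column).
import functools

import cudf
from cudf.core.buffer.spill_manager import get_global_manager


class cached_property_no_spill(functools.cached_property):
    def __get__(self, instance, owner=None):
        cache_hit = self.attrname in getattr(instance, "__dict__", {})
        ret = super().__get__(instance, owner)
        if (
            cache_hit
            or not isinstance(instance, cudf.RangeIndex)
            or not isinstance(ret, cudf.core.column.NumericalColumn)
            or ret.nullable
        ):
            # Either the value was already cached or it is not the simple
            # single-buffer case we want to special-case.
            return ret
        manager = get_global_manager()
        if manager is None:
            # Spilling is disabled; behave like a plain cached_property.
            return ret
        # ... register a handler for ret.base_data with the manager here ...
        return ret
```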
Why is the nullable check needed?
It is just to avoid having to handle a mask.
return ret

manager = get_global_manager()
if manager is None:
Under what circumstances could this be None? Wouldn't the call to ret = super().__get__(instance, owner) trigger the creation of a SpillableBuffer (assuming this decorator is only applied to functions that return something that wraps a SpillableBuffer), which in turn would always lead to the creation of a global SpillManager if one isn't already initialized?
This is to handle the case where spilling is disabled. Notice that this decorator is also used when spilling is disabled.
# If `cached` is known outside of the cache, we cannot free any
# memory by clearing the cache. We know of three references:
# `instance.__dict__`, `cached`, and `sys.getrefcount`.
if sys.getrefcount(cached) > 3:
I find this line disconcerting, but I'm having trouble articulating exactly why, so I'm just going to try to explain my thought process and see if we can resolve my confusion.
At the point where we call buf.spill (and now the spill handler) we already know that the buffer is spillable. A buffer is spillable, according to our existing logic, if it hasn't been handed out to an external consumer and if there are no spill locks around it. If we get to this point and the number of references is greater than 3, doesn't that imply that we should just spill? It seems like at that point you're in the exact same scenario that you would be for any other SpillableBuffer where a future use could try to unspill it.
> If we get to this point and the number of references is greater than 3, doesn't that imply that we should just spill? It seems like at that point you're in the exact same scenario that you would be for any other SpillableBuffer where a future use could try to unspill it.

Correct, this is also what is done. By returning None, we are telling the spill manager to spill the buffer as usual.
We could move this responsibility to the handler always. It would then be up to the handler to spill the buffer?
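To make the reference-counting argument concrete, here is a small standalone CPython illustration (all names here are mine, not the PR's):

```python
# CPython-specific illustration of the `> 3` threshold discussed above:
# sys.getrefcount counts the __dict__ entry, the local variable, and the
# temporary reference created by the getrefcount call itself.
import sys


class Owner:
    pass


obj = Owner()
obj.__dict__["cached"] = bytearray(8)  # the cache holds the value
cached = obj.__dict__["cached"]        # a local reference, as in the decorator

print(sys.getrefcount(cached))  # 3 -> safe to free by clearing the cache

leaked = cached                 # someone outside the cache now holds the value
print(sys.getrefcount(cached))  # 4 -> clearing the cache would not free memory
```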
Retargeted to 23.04.
@madsbk is this still something that we want? If so, at this point should it still wait until after you've done the final merging of COW/spilling to avoid conflicting code?
Yes and yes :)
Note: the above unification is done now, so this work is unblocked if desired.
This PR enables objects to register a spilling handler for a specific spillable buffer. For now, this is used by RangeIndex to delete rather than spill its cached data.

Motivation

Normally, a RangeIndex, which consists of a start, stop, step, and a dtype, doesn't use any device memory, and spilling it should be a no-op. However, if the RangeIndex has been materialized, the spill manager might decide to spill the underlying buffer. This can degrade performance and increase memory usage significantly.

Checklist