Custom spilling handler #12287

madsbk · 2022-12-02T14:28:09Z

This PR enables objects to register a spilling handler for a specific spillable buffer. For now, this is used by RangeIndex to delete rather than spill its cached data.

Motivation

Normally, a RangeIndex, which consists of a start, stop, step, and a dtype, doesn’t use any device memory and spilling it should be a no-op. However, if the RangeIndex has been materialized, the spill manager might decide to spill the underlying buffer. This can degrade performance and increase memory usage significantly.
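To make the motivation concrete, here is a minimal sketch in plain Python (no cuDF). `ToyRangeIndex` and its `values` property are hypothetical stand-ins, and a Python list stands in for the device buffer. Since the cached data can always be rebuilt from start, stop, and step, dropping the cache loses nothing, whereas spilling would copy data that is cheap to recompute.

```python
import functools


class ToyRangeIndex:
    """Hypothetical stand-in for a range-like index (not cuDF's RangeIndex)."""

    def __init__(self, start, stop, step=1):
        self.start, self.stop, self.step = start, stop, step

    @functools.cached_property
    def values(self):
        # "Materialize" the range; in the real case this would allocate device memory.
        return list(range(self.start, self.stop, self.step))


idx = ToyRangeIndex(0, 10, 2)
first = idx.values  # cache miss: materializes and stores the result in idx.__dict__
assert "values" in idx.__dict__

# Deleting the cache entry drops the cache's reference; nothing is copied anywhere.
del idx.__dict__["values"]
assert "values" not in idx.__dict__

second = idx.values  # rebuilt on demand from start/stop/step
```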

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

codecov · 2022-12-02T16:51:00Z

Codecov Report

Base: 85.69% // Head: 85.69% // No change to project coverage 👍

Coverage data is based on head (9a5a814) compared to base (9a5a814).
Patch has no changes to coverable lines.

❗ Current head 9a5a814 differs from pull request most recent head dfb0983. Consider uploading reports for the commit dfb0983 to get more accurate results

Additional details and impacted files

@@              Coverage Diff              @@
##           branch-23.02   #12287   +/-   ##
=============================================
  Coverage         85.69%   85.69%           
=============================================
  Files               155      155           
  Lines             24798    24798           
=============================================
  Hits              21251    21251           
  Misses             3547     3547


python/cudf/cudf/core/index.py

madsbk · 2022-12-02T17:56:29Z

It might be possible to do in a decorator that could replace cached_property. Would that be sufficient?

On Fri, 2 Dec 2022 at 17.58, Ashwin Srinath wrote:

> In python/cudf/cudf/core/index.py:
>
>              self._start, self._stop, self._step, dtype=self.dtype
>          )
>     +        manager = get_global_manager()
>
> I'm concerned about knowledge of the spilling manager leaking into other cuDF types like RangeIndex. Is there any way we can do this totally outside of RangeIndex?

shwina · 2022-12-02T20:48:22Z

> It might be possible to do in a decorator that could replace cached_property. Would that be sufficient?

Yes, I think it would be a significant improvement

madsbk · 2022-12-06T17:05:38Z

> > It might be possible to do in a decorator that could replace cached_property. Would that be sufficient?
>
> Yes, I think it would be a significant improvement

I have created a decorator instead. Do you think this is acceptable, @shwina?

shwina · 2022-12-08T01:25:05Z

Thanks! I do think this is an improvement. Some questions/thoughts:

- cached_property_delete_column_when_spilled needs the property to return a Column, but what about cached properties that possibly return buffers or arrays? Would each of those need a custom cached_property specialization that does things slightly differently with the return value?
- Can we aim to make the changes to RangeIndex (and potentially other classes) even more minimal? For example:
  - Instead of using cached_property_delete_when_spilled, can we universally use a custom cached property in cuDF that knows about spilling? That is, everywhere that we currently use functools.cached_property, can we use cudf.utils.cached_property? Or are there situations where we don't want to delete cached device objects when spilled?
  - Instead of using a decorator, can we monkeypatch methods like RangeIndex._values when spilling is enabled? This allows RangeIndex to be truly untouched.

Admittedly, I don't think either of those is a fantastic suggestion; curious what you think?

madsbk · 2022-12-08T07:25:10Z

> cached_property_delete_column_when_spilled needs the property to return a Column, but what about cached properties that possibly return buffers or arrays? Would each of those need a custom cached_property specialization that does things slightly differently with the return value?

No, I think we can generalize the decorator to support any type. Do you know if we are caching columns, series, arrays, or dataframes anywhere else? I am still seeing a difference between the memory usage of JIT-unspill and cudf-spilling, which could be because of caching. Note that JIT-unspill circumvents this issue because it uses .serialize() for spilling, which doesn't serialize cached data.

> Instead of using cached_property_delete_when_spilled, can we universally use a custom cached property in cuDF that knows about spilling? That is, everywhere that we currently use functools.cached_property, can we use cudf.utils.cached_property? Or are there situations we don't want to delete cached device objects when spilled?

~~No, this should be doable.~~

There might be some cases where it is a problem, I will investigate.

madsbk · 2022-12-12T10:16:44Z

@shwina, I have generalized and renamed the decorator. Now, it can be used everywhere with no ill effect.

shwina · 2022-12-12T17:17:04Z

python/cudf/cudf/core/buffer/utils.py

+        if manager is None:
+            return ret
+
+        buf = ret.base_data


What about the mask buffer of the Column, or - in the case of nested columns - its children?

Good point.

I have simplified the code a bit. Instead of supporting any type, we now check that the instance is a RangeIndex. I haven't found any other use of cached_property where the cache can be the sole owner of a buffer, so I think we should limit the scope to RangeIndex for now.

vyasr · 2022-12-19T17:37:49Z

python/cudf/cudf/core/buffer/spill_manager.py

+                            s = func(*args, **kwargs)
+                            if s is not None:
+                                spilled += s
+                                self._spill_handlers.pop(buf, None)


It seems a bit odd that the spill handler would be removed when the buffer is spilled. Is that always what we want?

Also, the usage makes it clear that these spill handlers are only appropriate for device to host spilling. Should we key the spill_handlers dict on both the buffer and the source/target of the spill?

OK I see why it's removed, I guess it's being added every time the cached property is accessed.
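The handler protocol discussed in this thread can be sketched as a toy model in plain Python. Everything here is hypothetical: `ToyBuffer`, `ToySpillManager`, and the signature of `register_spill_handler` are assumptions for illustration, not cuDF's actual classes. What the sketch takes from the diff is only the contract: a handler returns the number of bytes it freed, or None to decline, and a handler that succeeds is removed.

```python
class ToyBuffer:
    """Hypothetical buffer that tracks only its size and spilled state."""

    def __init__(self, nbytes):
        self.nbytes = nbytes
        self.spilled = False

    def spill(self):
        self.spilled = True
        return self.nbytes


class ToySpillManager:
    """Hypothetical manager; not cuDF's SpillManager."""

    def __init__(self):
        # Maps a buffer to a callable returning bytes freed, or None to decline.
        self._spill_handlers = {}

    def register_spill_handler(self, buf, handler):
        self._spill_handlers[buf] = handler

    def spill_buffer(self, buf):
        handler = self._spill_handlers.get(buf)
        if handler is not None:
            freed = handler()
            if freed is not None:
                # The handler dealt with the buffer; handlers are one-shot.
                self._spill_handlers.pop(buf, None)
                return freed
        # No handler, or the handler declined: spill as usual.
        return buf.spill()


manager = ToySpillManager()
cache = {}
buf = ToyBuffer(64)
cache["values"] = buf


def delete_cached():
    # Delete the cache entry instead of spilling it.
    cache.pop("values", None)
    return buf.nbytes


manager.register_spill_handler(buf, delete_cached)
freed = manager.spill_buffer(buf)
```

In this toy model the handler would have to be registered again the next time the cache is repopulated, which matches the observation above that it is added each time the cached property is computed.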

vyasr · 2022-12-19T18:26:32Z

python/cudf/cudf/core/buffer/utils.py

+        if manager is None:
+            return ret
+
+        buf = ret.base_data


Seems related to the question I asked above. I'm OK with not overengineering to support anything else, although it would be good to ensure that if we do need to support other types later it only requires adding new code and not restructuring the existing code (hence my comment above).

vyasr · 2022-12-19T18:27:18Z

python/cudf/cudf/core/buffer/utils.py

+    return nbytes
+
+
+class cached_property(functools.cached_property):


I would prefer to use a different name.

Maybe cached_property_no_spill or something.

vyasr · 2022-12-19T18:37:59Z

python/cudf/cudf/core/buffer/utils.py

+            cache_hit
+            or not isinstance(instance, cudf.RangeIndex)
+            or not isinstance(ret, cudf.core.column.NumericalColumn)
+            or ret.nullable


Why is the nullable check needed?

It is just to avoid having to handle a mask

vyasr · 2022-12-19T18:41:01Z

python/cudf/cudf/core/buffer/utils.py

+            return ret
+
+        manager = get_global_manager()
+        if manager is None:


Under what circumstances could this be None? Wouldn't the call to ret = super().__get__(instance, owner) trigger the creation of a SpillableBuffer (assuming this decorator is only applied to functions that return something that wraps a SpillableBuffer), which in turn would always lead to the creation of a global SpillManager if one isn't already initialized?

This is to handle the case where spilling is disabled. Note that this decorator is also used when spilling is disabled.

vyasr · 2022-12-19T18:53:16Z

python/cudf/cudf/core/buffer/utils.py

+    # If `cached` is known outside of the cache, we cannot free any
+    # memory by clearing the cache. We know of three references:
+    # `instance.__dict__`, `cached`, and `sys.getrefcount`.
+    if sys.getrefcount(cached) > 3:


I find this line disconcerting, but I'm having trouble articulating exactly why so I'm just going to try to explain my thought process and see if we can resolve my confusion.

At the point where we call buf.spill (and now the spill handler) we already know that the buffer is spillable. A buffer is spillable according to our existing logic if it hasn't been handed out to an external consumer and if there are no spill locks around it. If we get to this point and the number of references is greater than 3, doesn't that imply that we should just spill? It seems like at that point you're in the exact same scenario that you would be for any other SpillableBuffer where a future use could try to unspill it.

> If we get to this point and the number of references is greater than 3, doesn't that imply that we should just spill? It seems like at that point you're in the exact same scenario that you would be for any other SpillableBuffer where a future use could try to unspill it.

Correct, this is also what is done. By returning None, we are telling the spill manager to spill the buffer as usual.
We could always move this responsibility to the handler. It would then be up to the handler to spill the buffer?
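The reference-count check can be illustrated in isolation. This is a sketch under stated assumptions: `Payload`, `Owner`, `cached_refcount`, and `try_delete` are hypothetical names, and instead of hard-coding the threshold of 3 from the diff (which depends on CPython counting details), the sketch measures a baseline count while the cache is the only holder and compares against it.

```python
import sys


class Payload:
    """Hypothetical stand-in for a cached column."""


class Owner:
    """Hypothetical object whose __dict__ plays the role of the property cache."""


def cached_refcount(instance, name):
    cached = instance.__dict__[name]
    # Counts all references to the cached object, including temporaries
    # created by this function itself.
    return sys.getrefcount(cached)


def try_delete(instance, name, baseline):
    """Delete the cache entry if only the cache holds it; otherwise return None."""
    if cached_refcount(instance, name) > baseline:
        return None  # known outside the cache: decline, so a regular spill happens
    del instance.__dict__[name]
    return True


owner = Owner()
owner.__dict__["values"] = Payload()

# Measured while the cache is the sole owner.
baseline = cached_refcount(owner, "values")

external = owner.__dict__["values"]  # someone outside the cache now holds it
shared = cached_refcount(owner, "values")
declined = try_delete(owner, "values", baseline)

external = None  # the outside reference goes away
deleted = try_delete(owner, "values", baseline)
```

Returning None in the shared case corresponds to the behavior described above, where the spill manager falls back to spilling the buffer as usual.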

shwina · 2023-01-26T16:44:59Z

Retargeted to 23.04

vyasr · 2024-01-19T16:45:21Z

@madsbk is this still something that we want? If so, at this point should it still wait until after you've done the final merging of COW/spilling to avoid conflicting code?

madsbk · 2024-01-22T07:13:10Z

> @madsbk is this still something that we want? If so, at this point should it still wait until after you've done the final merging of COW/spilling to avoid conflicting code?

Yes and yes :)
The materialization of RangeIndex is something we want to avoid, but let's wait until COW and spilling have been unified.

vyasr · 2024-05-21T18:26:52Z

Note, the above unification is done now so this work is unblocked if desired.

Impl. and use register_spill_handler

9baf521

madsbk added the improvement and non-breaking labels Dec 2, 2022

github-actions bot added the Python label Dec 2, 2022

madsbk marked this pull request as ready for review December 2, 2022 16:39

madsbk requested a review from a team as a code owner December 2, 2022 16:39

madsbk requested review from shwina and skirui-source December 2, 2022 16:39

shwina reviewed Dec 2, 2022

Impl. cached_property_delete_column_when_spilled

8df1ada

madsbk added 2 commits December 12, 2022 11:01

generalize cached_property

4a13c5e

clean up

4ef7d60

use buf.nbytes for the size

f109b71

shwina reviewed Dec 12, 2022

madsbk added 3 commits December 13, 2022 13:33

limit the scope to RangeIndex

5024d5a

only handle RangeIndex instances that returns a numerical column with no mask.

6359bed

Merge branch 'branch-23.02' of github.com:rapidsai/cudf into external_spilling

ac0801b

vyasr requested changes Dec 19, 2022

madsbk and others added 3 commits December 20, 2022 08:04

Clean up

8a44dc9

Co-authored-by: Vyas Ramasubramani <vyas.ramasubramani@gmail.com>

style

c58d854

Merge branch 'branch-23.02' into external_spilling

dfb0983

shwina changed the base branch from branch-23.02 to branch-23.04 January 26, 2023 16:44

madsbk added 2 commits February 27, 2023 16:53

Merge branch 'branch-23.04' of github.com:rapidsai/cudf into external_spilling

3bf949d

Merge branch 'external_spilling' of github.com:madsbk/cudf into external_spilling

156b3f2
