gh-127750: Fix singledispatchmethod caching (v2) #128648
import weakref  # see comment in singledispatch function
self._method_cache = weakref.WeakKeyDictionary()
def __set_name__(self, obj, name):
    self.attrname = name
Check cached_property.__set_name__, it has some more stuff in it - might be needed here as well.
Hmm. The additions there prevent something like this:
from dataclasses import dataclass
from functools import singledispatchmethod

@dataclass(frozen=True)
class A:
    value: int

    @singledispatchmethod
    def dispatch(self, x):
        return id(self)

    renamed_dispatch = dispatch  # allowed? if so, how should it behave
The corresponding test for the cached_property for this is in cpython/Lib/test/test_functools.py (line 3315 in 34e840f):
def test_reuse_different_names(self):
But on current main, renaming is allowed for the singledispatchmethod.
I am not sure what the desired behavior is here (and why).
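For reference, the cached_property restriction mentioned above can be reproduced with a minimal sketch like the one below. The class name `Reused` and the helper `try_rebind` are made up for illustration. The exception type seen at class creation depends on the Python version (older versions wrap errors from `__set_name__` in a RuntimeError), so the sketch catches both.

```python
from functools import cached_property


def try_rebind():
    """Bind one cached_property to two names and return the resulting error."""
    try:
        class Reused:  # hypothetical class, only for this demonstration
            @cached_property
            def a(self):
                return 1

            b = a  # second name for the same descriptor object
    except (TypeError, RuntimeError) as exc:
        # Newer Pythons raise the TypeError directly; older ones wrap it.
        return exc
    return None


error = try_rebind()
print(type(error).__name__, error)
```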
If this implementation is desirable, maybe later someone who knows more about this can comment.
As far as I know, the only reason cached properties can't be renamed is because the cache is keyed by the attribute's name.
Allowing a rebind would disconnect the cached property from its cached value.
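A small sketch of what "keyed by the attribute's name" means in practice: after the first access, the computed value lives in the instance `__dict__` under the property's name. The `Circle` class and its `calls` counter are hypothetical and exist only to make the caching visible.

```python
from functools import cached_property


class Circle:  # hypothetical example class
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often the property body runs

    @cached_property
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2


c = Circle(2)
cached_before = "area" in vars(c)  # nothing stored yet
first = c.area                     # computes and stores under the key "area"
second = c.area                    # served from the instance __dict__
cached_after = "area" in vars(c)
```

If the same descriptor were reachable under a second name, lookups through that name would not find the value stored under "area", which is the disconnect described above.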
Actually, I think you might want to either ignore renames or do something along these lines (ignoring error handling):
if self.attrname:
    cache[name] = cache.pop(self.attrname)
self.attrname = name
As far as I know, each binding shares the same instance of the descriptor, so as long as the cache key is constant, it should work no matter how many times it's been renamed.
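The claim that each binding shares one descriptor instance can be checked with a short sketch. `NameLogger` is a hypothetical descriptor (not part of functools) that only records the names passed to `__set_name__`.

```python
class NameLogger:
    """Hypothetical descriptor that records every name it is bound to."""

    def __init__(self):
        self.names = []

    def __set_name__(self, owner, name):
        # Called once per class attribute that refers to this object.
        self.names.append(name)

    def __get__(self, obj, cls=None):
        return self


class A:
    dispatch = NameLogger()
    renamed_dispatch = dispatch  # same object under a second name


same_object = A.__dict__["dispatch"] is A.__dict__["renamed_dispatch"]
names = A.__dict__["dispatch"].names
```

Because `__set_name__` runs once per name, a descriptor that simply overwrites `self.attrname` ends up with the last name it was bound to.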
> Allowing a rebind would disconnect the cached property from its cached value.
This is kind of the same situation.
If rename is allowed, then it would simply cache to the last attrname. Drawback is that there is a small risk for unused cached methods.
I think it might be most straightforward to copy+paste cached_property.__set_name__. It does seem a sensible restriction. It comes at the expense of flexibility, but personally, I have never run into that TypeError.
Also, it will be easier to address changes/improvements when 2 implementations that use the same caching approach are aligned.
if self._method_cache is not None:
    self._method_cache[obj] = _method
if cache is not None:
    cache[self.attrname] = _method
Doesn't it create a reference loop? obj refers to cache, cache refers to _method, _method refers to a cell which refers to obj.
Yes. But once there are no external references to the object obj any more, the garbage collector removes the objects. (The cache is on the object obj, not on the singledispatchmethod itself or the class.)
In the current main the caching is done on the singledispatchmethod, which keeps the generated methods alive.
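The loop discussed here can be illustrated with a sketch that does not use the PR's code at all. `Holder` and `make_method` are hypothetical stand-ins for an instance with a per-object cache and for the closure stored in it. The behavior shown assumes CPython's reference counting plus its cyclic garbage collector.

```python
import gc
import weakref


class Holder:
    """Hypothetical stand-in for an instance that caches a bound dispatcher."""

    def __init__(self):
        self.cache = {}


def make_method(obj):
    def _method(*args):
        return obj, args  # the closure cell holds a strong reference to obj

    return _method


gc.disable()  # make sure no automatic collection interferes with the demo
try:
    obj = Holder()
    obj.cache["dispatch"] = make_method(obj)  # obj -> cache -> _method -> cell -> obj
    probe = weakref.ref(obj)

    del obj
    alive_after_del = probe() is not None      # the cycle keeps the object alive

    gc.collect()
    alive_after_collect = probe() is not None  # the cycle collector frees it
finally:
    gc.enable()
```

So the object is reclaimed, but only when the cyclic collector runs, not as soon as the last external reference goes away.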
Yes, the current situation is worse: it creates strong references singledispatchmethod -> _method -> obj.
Relying on the garbage collection is not good. This particular loop can be broken by using a weak reference to obj instead of obj. But a reference from a bound method to the object should be strong, otherwise some code will not work (there was a similar issue with TemporaryFile).
I am not sure how much this optimization saves. Are there other ways to achieve the same speed up, without creating reference loops?
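To make the trade-off concrete, here is a rough sketch of the weak-reference variant and of why it is fragile. `Holder`, `get_method` and the ReferenceError raised below are assumptions for illustration, not the PR's implementation. The outcome relies on CPython freeing the temporary instance as soon as its reference count drops to zero.

```python
import weakref


class Holder:
    """Hypothetical instance that caches a closure holding only a weak reference."""

    def __init__(self):
        self.cache = {}

    def get_method(self):
        ref = weakref.ref(self)  # weak: no obj -> cache -> _method -> obj loop

        def _method():
            target = ref()
            if target is None:
                raise ReferenceError("instance was garbage collected")
            return target

        self.cache["dispatch"] = _method
        return _method


# The temporary Holder() is dropped right after get_method() returns,
# so the returned callable outlives the object it was bound to.
bound = Holder().get_method()
try:
    bound()
    outcome = "ok"
except ReferenceError:
    outcome = "dead"
```

An ordinary bound method would keep the instance alive in this pattern, which is the kind of breakage the comment warns about.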
Closing in favor of #130008
Version based on idea from @dg-pb in #127839. This version
__hash__/__eq__
Regression in Django with singledispatchmethod on models #127750
There is still a cache (stored on the object instances). Quick benchmark (windows, non-pgo):
(note that the alternative to this PR is not to keep main, but to revert #107148)