Replies: 8 comments 2 replies
-
I tried that change and measured the stats before and after. Some miss percentages went down significantly (e.g. 36.9% --> 22.9%), others increased a bit.
-
@markshannon has been discussing something similar offline. His design uses "exponential backoff" instead. As I understand it, we would use 4 bits of the counter as an exponent, and the other 12 as the counter value. Each time we un-specialize (or fail to specialize) and convert back to the adaptive form, we increment the exponent. Then we treat the maximum counter value as `1 << exponent`. So each time we convert back to the adaptive form, we cut the amount of churn in half.
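To make that concrete, here is a minimal sketch of how such a split counter could behave. The 4/12 layout handling, the thresholds, and all the names are my own assumptions for illustration, not Mark's actual design:

```c
#include <stdint.h>

typedef uint16_t backoff_counter;

#define VALUE_BITS   12
#define VALUE_MASK   ((1 << VALUE_BITS) - 1)
#define MAX_EXPONENT 12   /* keeps (1 << exponent) - 1 within the 12 value bits */

/* True once the countdown has hit zero and we should try to (re)specialize. */
static int
backoff_expired(backoff_counter c)
{
    return (c & VALUE_MASK) == 0;
}

/* One execution of the adaptive instruction: count down by one. */
static backoff_counter
backoff_tick(backoff_counter c)
{
    return backoff_expired(c) ? c : (backoff_counter)(c - 1);
}

/* Called when we un-specialize or fail to specialize: bump the exponent
 * (saturating) and restart the countdown at (1 << exponent) - 1, so each
 * failure doubles the wait before the next attempt. */
static backoff_counter
backoff_restart(backoff_counter c)
{
    uint16_t exponent = c >> VALUE_BITS;
    if (exponent < MAX_EXPONENT) {
        exponent++;
    }
    return (backoff_counter)((exponent << VALUE_BITS) | ((1 << exponent) - 1));
}
```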
-
I'm a little lost. If you're referring to the part where it dynamically sets attributes after `__init__`, that seems like an unusual pattern. Then again, this pattern happens in the real world, so I can't just dismiss it :). In this case shouldn't it de-optimize immediately and not bother specializing again? Seems like Cinder possibly does that.
-
Yes. Having said that, it still performs well and the misses are a very small fraction of the overall instruction count. Exponential back-off would be nice, but we don't want to back off for low miss rates. One scheme would be to increment the exponent every time we fail to specialize and reduce it (down to the minimum) when we specialize successfully. The problem is that we keep succeeding here, and then missing because different objects have different keys. Which makes me wonder: why does specialization succeed here at all?
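A sketch of that variant, reusing the split-counter layout (and `MAX_EXPONENT`) from the sketch above — the name and the exact policy are invented here: the exponent climbs on failure and decays on success, so a site with a low miss rate drifts back to a short backoff.

```c
#define MIN_EXPONENT 1

/* Hypothetical policy: failures lengthen the backoff, successes shorten
 * it again, both saturating at the limits. */
static uint16_t
adjust_exponent(uint16_t exponent, int specialized_ok)
{
    if (specialized_ok) {
        return exponent > MIN_EXPONENT ? exponent - 1 : MIN_EXPONENT;
    }
    return exponent < MAX_EXPONENT ? exponent + 1 : MAX_EXPONENT;
}
```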
-
Hmm, is it that the dicts share keys at specialization time? Should be easy to test: just add a check when specializing that the keys object is actually shared.
-
It seems one fundamental issue is that it's hard to tell, based on one call, whether a specialization will keep hitting. So can we base the decision on two consecutive calls? If it gets to the point where it's possible to specialize to LOAD_METHOD_WITH_DICT, we could store all the appropriate cache data, but not actually change the opcode. Then on the next call, if the stored data still matches, actually specialize. I'm not sure how much slower the adaptive opcode is than the specialized ones, so I don't know how much one extra adaptive pass would cost.
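As a sketch of what the two-call scheme might look like (the struct layout and all names here are hypothetical, not real CPython structures):

```c
#include <stdbool.h>

/* Hypothetical per-site cache: remembers the keys object seen on the
 * previous adaptive pass and whether specialization is pending. */
typedef struct {
    void *keys;     /* dict keys observed last time this site qualified */
    bool  pending;  /* true if we already deferred specialization once */
} lm_cache;

/* Returns true when the opcode should actually be rewritten to
 * LOAD_METHOD_WITH_DICT: the same keys object qualified on two
 * consecutive executions. */
static bool
should_specialize(lm_cache *cache, void *keys_now)
{
    if (cache->pending && cache->keys == keys_now) {
        return true;            /* stable across two calls: commit */
    }
    cache->pending = true;      /* record, stay adaptive one more call */
    cache->keys = keys_now;
    return false;
}
```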
-
I'm puzzled as to how `LOAD_METHOD_WITH_DICT` ever hits at all, given that it requires the instance dict's keys to match the keys cached at specialization time.
This can only happen if either:
1. the same instance shows up at that call site every time, or
2. the instances' dicts share a keys object.
Case 1 seems highly unlikely, so maybe we are seeing case 2 in the benchmarks where this works. If we were to allow shared-key dicts for non-managed classes (which might make sense) then case 2 would become much more common.
-
I would expect them to mainly share the class cached keys (which would be empty, but not the same as the common empty keys), or for the dict not to exist at all. Dicts are lazily allocated for a lot of objects. I am guessing here; I don't have any data.
-
99.7% of `LOAD_METHOD_WITH_DICT` instructions deoptimize in bm_go. All bm_go stats are here. Some `LOAD_METHOD_WITH_DICT` stats by benchmark are here.
It seems bm_chaos, bm_regex_compile, and bm_sympy make good use of this opcode, while bm_2to3, bm_go, and bm_tornado_http all have miss rates of 95% or more. Most other benchmarks have a miss rate of 16-17%, presumably from pyperf/pyperformance overhead.
Most of the bm_go deoptimization seems to occur at the last DEOPT_IF in the `LOAD_METHOD_WITH_DICT` handler in ceval.c.
From what I've gathered, most of the misses occur when loading the method `Square.find`, and `Square` instances get assigned extra attributes after `__init__` finishes. This maybe gives some insight into why the miss rate is what it is, but I don't know if there's any way to improve this. Is there some way to be more particular at specialization time?

I wonder if there's any benefit to adding a more dynamic `ADAPTIVE_CACHE_BACKOFF`. Instead of one `_Py_CODEUNIT` that always starts at 64, there could be two `uint8_t`: one for the current counter, and another for a counter start value that starts small (11 or something?) and then gets incremented at every miss, with a cap at 255.
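Something like this, say (the struct layout and names are just my sketch of the idea):

```c
#include <stdint.h>

/* Two bytes instead of one _Py_CODEUNIT: a countdown and the value it
 * restarts from, which grows linearly on every miss up to 255. */
typedef struct {
    uint8_t counter;  /* counts down; try (re)specializing when it hits 0 */
    uint8_t start;    /* restart value; starts small and grows on misses */
} adaptive_backoff;

#define INITIAL_START 11

static void
backoff_init(adaptive_backoff *b)
{
    b->start = INITIAL_START;
    b->counter = b->start;
}

/* On every cache miss, restart the countdown from a slightly larger
 * value, capped at 255. */
static void
backoff_on_miss(adaptive_backoff *b)
{
    if (b->start < UINT8_MAX) {
        b->start++;
    }
    b->counter = b->start;
}
```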