Perform reference count analysis in the bytecode compiler #308
Replies: 4 comments 1 reply
-
This will have to wait until #235 is complete |
Beta Was this translation helpful? Give feedback.
-
Just as an aside, as I am interested in whether it was ever considered: would there be value in making refcount operations explicit bytecode ops rather than being implicit? As I can imagine that many of them could be optimized away at the function level. Or is the cost of those operations so negligible relative to other things that making them require a pass of the eval loop would far outweigh any benefit? |
Beta Was this translation helpful? Give feedback.
-
The cost of the extra instruction will generally cost more than the refcount operation, but the idea of lowering the bytecode to make refcounting explicit has value. You can consider instructions with refcounting to be a form of superinstruction. So, one way to implement a refcount removal scheme would be for the front-end to emit low-level bytecodes, then the optimizer would then try to remove as many as possible. The assembler would combine the remnants into larger instructions where possible.
The optimizer would sink the
In cases where the optimizer could not remove the |
Beta Was this translation helpful? Give feedback.
-
Just as a quick reference: I did implement this optimization (i.e., reference count operation elimination through quickening) in my DLS'10 paper (https://www.unibw.de/ucsrl/pubs/dls10.pdf/view). The performance improved somewhat relative to the ECOOP'10 paper, which presented the original idea, but not applied to a lot of instructions. Although the performance wasn't a huge improvement, contrary to what I expected initially, this approach did lead me to the multi-level quickening (MLQ) work. MLQ effectively subsumes reference-count operation elimination by manipulating native-machine data objects, which need not be managed through automatic memory management. By way of unboxing, data locality also increases and interpreter instruction implementation can be cut down to a few machine instructions. Once interpreter instructions are at that performance level, type-based superinstructions are the final step in eliminating instruction dispatch costs. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Using some localized dataflow (or using the global dataflow if we implement #306),
we can track some refcounts in the compiler instead of the interpreter.
Consider
x = a + b + c
, which compiles as follows:Now annotate with reference count operations
Now consider the ideal sequence
That's 8 refcount operations reduced to 2 (or 1 with #306)
This is admittedly an extreme case, but there is clearly room for improvement.
There is some cost to checking the "decref bits" on the operations, but there is also a potential advantage in allowing us to refine the
refcount == 1
trick to only those cases where there is a decrement of the refcount.Beta Was this translation helpful? Give feedback.
All reactions