PEP 659: Specializing Adaptive Interpreter #1957
Added some wording suggestions.
pep-0659.rst
Outdated
==========

Python is widely acknowledged as slow.
Whilst Python will never attain the performance of low-level langauges like C, Fortran, or even Java,
Suggested change:
- Whilst Python will never attain the performance of low-level langauges like C, Fortran, or even Java,
+ Whilst Python will never attain the performance of low-level languages like C, Fortran, or even Java,
pep-0659.rst
Outdated
There have been several ways of doing this proposed in the academic literature,
but most attempt to optimize regions larger than a single bytecode [1]_ [2]_.
Using larger regions than a single instruction, requires code to handle deoptimization in the middle of a region.
Specialization at the level on individual bytecodes makes deoptimization trivial, as it cannot occur in the middle of a region.
Suggested change:
- Specialization at the level on individual bytecodes makes deoptimization trivial, as it cannot occur in the middle of a region.
+ Specialization at the level of individual bytecodes makes deoptimization trivial, as it cannot occur in the middle of a region.
pep-0659.rst
Outdated
Specialization at the level on individual bytecodes makes deoptimization trivial, as it cannot occur in the middle of a region.

By speculatively specializing individual bytecodes, we can gain significant performance improvements without anything but the most local,
and trivial to implement, de-optimizations.
Suggested change:
- and trivial to implement, de-optimizations.
+ and trivial to implement, deoptimizations.
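
For readers following the thread, here is a toy sketch of why deoptimization stays local when specialization is done one instruction at a time. It is not the PEP's implementation: the `Instr` class, the `ADD` and `ADD_INT` opcode names, and the `WARMUP` threshold are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

WARMUP = 2  # hypothetical warm-up threshold, not taken from the PEP


@dataclass
class Instr:
    op: str           # current opcode name; rewritten in place when quickened
    counter: int = 0  # executions seen by the generic form


def execute_add(instr, a, b):
    """Run one ADD-like instruction, specializing or deoptimizing in place."""
    if instr.op == "ADD_INT":
        # Guard: the specialized form is only valid for exact ints.
        if type(a) is int and type(b) is int:
            return int.__add__(a, b)
        # Guard failed: deoptimize this single instruction, then fall
        # through to the generic path. No other instruction is affected.
        instr.op = "ADD"
        instr.counter = 0
    # Generic path: count executions and specialize once warmed up.
    instr.counter += 1
    if instr.counter >= WARMUP and type(a) is int and type(b) is int:
        instr.op = "ADD_INT"
    return a + b


instr = Instr("ADD")
execute_add(instr, 1, 2)      # generic path, still warming up
execute_add(instr, 3, 4)      # warm: instruction is rewritten to ADD_INT
execute_add(instr, "a", "b")  # guard fails: rewritten back to ADD
```

The point of the sketch is that the guard and the fallback both live inside the one instruction, so there is no region state to unwind when the speculation turns out to be wrong.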
Quickening is the process of replacing slow instructions with faster variants.

Quickened code has number of advantages over the normal bytecode:
Suggested change:
- Quickened code has number of advantages over the normal bytecode:
+ Quickened code has a number of advantages over the normal bytecode:
pep-0659.rst
Outdated
* It can use super-instructions that span lines and take multiple operands.
* It does not need to handle tracing as it can fallback to the normal bytecode for that.

In order that tracing can supported, and quickening performed quickly, the quickened instruction format should match the normal
Suggested change:
- In order that tracing can supported, and quickening performed quickly, the quickened instruction format should match the normal
+ In order that tracing can be supported, and quickening performed quickly, the quickened instruction format should match the normal
pep-0659.rst
Outdated
To do this, an array of specialization data entries will be maintained alongside the new instruction array.
For instructions that need specialization data, the operand in the quickened array will serve as a partial index,
along with the offset of the instruction, to find the first specialization data entry for that instruction.
Each entry will be 8 bytes (for a 64 bit machine). The data in a entry, and the number of entries needed, will vary from instruction to instruction.
Suggested change:
- Each entry will be 8 bytes (for a 64 bit machine). The data in a entry, and the number of entries needed, will vary from instruction to instruction.
+ Each entry will be 8 bytes (for a 64 bit machine). The data in an entry, and the number of entries needed, will vary from instruction to instruction.
pep-0659.rst
Outdated
Quickened instructions will be stored in an array (it is neither necessary not desirable to store them in a Python object) with the same
format as the original bytecode. Ancillary data will be stored in a separate array.

Each instruction will use 0 or more data entries. Each instruction within family must have the same amount of data allocated, although some
Suggested change:
- Each instruction will use 0 or more data entries. Each instruction within family must have the same amount of data allocated, although some
+ Each instruction will use 0 or more data entries. Each instruction within a family must have the same amount of data allocated, although some
pep-0659.rst
Outdated
Each instruction will use 0 or more data entries. Each instruction within family must have the same amount of data allocated, although some
instructions may not use all of it. Instructions that connot be specialized, e.g. ``POP_TOP``, do not need any entries.
Experiments show that 25% to 30% of instructions can be usefully specialized.
Different families will need different amount of data, but most need 2 entries (16 bytes on a 64 bit machine).
Suggested change:
- Different families will need different amount of data, but most need 2 entries (16 bytes on a 64 bit machine).
+ Different families will need different amounts of data, but most need 2 entries (16 bytes on a 64 bit machine).
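
To put the figures in this hunk together, a back-of-the-envelope estimate of the specialization data size might look like the sketch below. It uses only the numbers quoted above (8-byte entries, 25% to 30% of instructions specializable) and assumes, for simplicity, that every specializable instruction uses exactly 2 entries, whereas the text only says "most" do. The function name `estimate_data_bytes` is made up for this example.

```python
ENTRY_SIZE = 8  # bytes per entry on a 64-bit machine, as stated in the text


def estimate_data_bytes(n_instructions, specializable_fraction,
                        entries_per_instruction=2):
    """Rough size of the specialization data array, in bytes.

    Simplifying assumption: every specializable instruction uses the same
    number of entries (2 by default); the draft says only that most do.
    """
    n_specialized = round(n_instructions * specializable_fraction)
    return n_specialized * entries_per_instruction * ENTRY_SIZE


low = estimate_data_bytes(1000, 0.25)   # 250 instructions * 16 bytes
high = estimate_data_bytes(1000, 0.30)  # 300 instructions * 16 bytes
print(low, high)
```

Under those assumptions, a code object with 1000 instructions would carry roughly 4000 to 4800 bytes of specialization data.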
pep-0659.rst
Outdated
This is an obvious candidate for specialization. For example, the call in ``len(x)`` is repesented as the bytecode ``CALL_FUNCTION 1``.
In this case we would always expect the object ``len`` to be the function. We probably don't want to specialize for ``len``
(although we might for ``type`` and ``isinstance``), but it would be beneficial to specialize for builtin functions taking a single argument.
A fast check that the underlying function is a builtin function taking a single argument (``METHOD_O``) would allow use to avoid a
Suggested change:
- A fast check that the underlying function is a builtin function taking a single argument (``METHOD_O``) would allow use to avoid a
+ A fast check that the underlying function is a builtin function taking a single argument (``METHOD_O``) would allow us to avoid a
pep-0659.rst
@python/organization-owners Spam x2 👆