gh-119786: move exception handling doc to InternalDocs #119815
Merged
81a0e13
rename index->README and move exception handling doc to new folder
iritkatriel cd7725f
add link to exception handling file from index
iritkatriel 7719234
edit exception handling
iritkatriel 023ad9d
whitespace
iritkatriel fc5d217
formatting
iritkatriel 3436cdf
formatting
iritkatriel 49af193
add explanation of reraise and lasti. fix typos
iritkatriel d299a3e
address review - clarify pseudo-instructions/intermediate code
iritkatriel e93a253
fix markup
iritkatriel a3da517
typo
iritkatriel
@@ -0,0 +1,201 @@
Description of exception handling
---------------------------------

Python uses a technique known as "zero-cost" exception handling, which
minimizes the cost of supporting exceptions. In the common case (where
no exception is raised) the cost is reduced to zero (or close to zero).
The cost of raising an exception is increased, but not by much.

The following code:

```
try:
    g(0)
except:
    res = "fail"
```

compiles into intermediate code like the following:

```
                 RESUME                   0

     1           SETUP_FINALLY            8 (to L1)

     2           LOAD_NAME                0 (g)
                 PUSH_NULL
                 LOAD_CONST               0 (0)
                 CALL                     1
                 POP_TOP
                 POP_BLOCK

    --   L1:     PUSH_EXC_INFO

     3           POP_TOP

     4           LOAD_CONST               1 ('fail')
                 STORE_NAME               1 (res)
```

`SETUP_FINALLY` and `POP_BLOCK` are pseudo-instructions. This means
that they can appear in intermediate code but they are not bytecode
instructions. `SETUP_FINALLY` specifies that henceforth, exceptions
are handled by the code at label L1. The `POP_BLOCK` instruction
reverses the effect of the last `SETUP` instruction, so that the
active exception handler reverts to what it was before.

`SETUP_FINALLY` and `POP_BLOCK` have no effect when no exceptions
are raised. The idea of zero-cost exception handling is to replace
these pseudo-instructions with metadata which is stored alongside the
bytecode, and which is inspected only when an exception occurs.
This metadata is the exception table, and it is stored in the code
object's `co_exceptiontable` field.

When the pseudo-instructions are translated into bytecode,
`SETUP_FINALLY` and `POP_BLOCK` are removed, and the exception
table is constructed, mapping each instruction to the exception
handler that covers it, if any. Instructions which are not
covered by any exception handler within the same code object's
bytecode do not appear in the exception table at all.

For the code object in our example above, the table has a single
entry specifying that all instructions that were between the
`SETUP_FINALLY` and the `POP_BLOCK` are covered by the exception
handler located at label `L1`.
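
The effect can be observed from Python itself. The following is a minimal sketch, assuming a CPython version that exposes `co_exceptiontable` on code objects (3.11 or later); the helper name `exception_table_bytes` is hypothetical and exists only for this illustration.

```python
# Sketch: compare the raw exception table of code with and without a try block.
# Assumption: CPython 3.11+; on other versions the attribute may be missing,
# in which case the helper returns None.

SOURCE = '''
try:
    g(0)
except:
    res = "fail"
'''


def exception_table_bytes(source):
    """Compile `source` and return its raw exception table, or None if unavailable."""
    code = compile(source, "<example>", "exec")
    return getattr(code, "co_exceptiontable", None)


table = exception_table_bytes(SOURCE)   # code with a try/except block
plain = exception_table_bytes("g(0)\n")  # the same call with no handler around it

print("with try/except:", table)
print("without handler:", plain)
```

On such a version, the first table should be non-empty and the second should be empty, since no instruction in the second snippet is covered by a handler.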

Handling Exceptions
-------------------

At runtime, when an exception occurs, the interpreter looks up
the offset of the current instruction in the exception table. If
it finds a handler, control flow transfers to it. Otherwise, the
exception bubbles up to the caller, and the caller's frame is
checked for a handler covering the `CALL` instruction. This
repeats until a handler is found or the topmost frame is reached.
If no handler is found, the program terminates. During unwinding,
the traceback is constructed as each frame is added to it.

Along with the location of an exception handler, each entry of the
exception table also contains the stack depth of the `try` instruction
and a boolean `lasti` value, which indicates whether the instruction
offset of the raising instruction should be pushed to the stack.

Handling an exception, once an exception table entry is found, consists
of the following steps:

1. pop values from the stack until it matches the stack depth for the handler.
2. if `lasti` is true, then push the offset that the exception was raised at.
3. push the exception to the stack.
4. jump to the target offset and resume execution.
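
The lookup and the four steps above can be sketched in plain Python. This is an illustrative model only, not the interpreter's code: the value stack is a list, the table is a list of already-decoded `(start, end, target, depth, lasti)` tuples, and the function names `find_handler` and `enter_handler` are hypothetical. The sketch also assumes that the first entry covering the offset is the right one.

```python
# Sketch of handler lookup and entry, modelled on the steps listed above.
# All names here are illustrative assumptions, not CPython internals.

def find_handler(table, offset):
    """Return the first entry whose [start, end) range covers offset, or None."""
    for entry in table:
        start, end = entry[0], entry[1]
        if start <= offset < end:
            return entry
    return None


def enter_handler(stack, entry, offset, exc):
    """Apply the four handling steps to `stack` and return the jump target."""
    _, _, target, depth, lasti = entry
    del stack[depth:]         # 1. pop values until the stack matches the depth
    if lasti:
        stack.append(offset)  # 2. push the offset the exception was raised at
    stack.append(exc)         # 3. push the exception
    return target             # 4. the caller resumes execution at the target


table = [(20, 28, 100, 1, True)]
stack = ["a", "b", "c"]
exc = ValueError("boom")

entry = find_handler(table, 24)
target = enter_handler(stack, entry, 24, exc)
print(target, stack)
```

An offset of 28 is not covered by this entry, because the end offset is exclusive; in that case `find_handler` returns `None`, which corresponds to the exception propagating to the caller.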


Reraising Exceptions and `lasti`
--------------------------------

The purpose of pushing `lasti` to the stack is for cases where an exception
needs to be re-raised, and be associated with the original instruction that
raised it. This happens, for example, at the end of a `finally` block, when
any in-flight exception needs to be propagated on. As the frame's instruction
pointer now points into the finally block, a `RERAISE` instruction
(with `oparg > 0`) sets it to the `lasti` value from the stack.

Format of the exception table
-----------------------------

Conceptually, the exception table consists of a sequence of 5-tuples:
```
1. `start-offset` (inclusive)
2. `end-offset` (exclusive)
3. `target`
4. `stack-depth`
5. `push-lasti` (boolean)
```

All offsets and lengths are in code units, not bytes.

We want the format to be compact, but quickly searchable.
For it to be compact, it needs to have variable sized entries so that we can store common (small) offsets compactly, but handle large offsets if needed.
For it to be searchable quickly, we need to support binary search, giving us O(log n) performance in all cases.
Binary search typically assumes fixed size entries, but that is not necessary, as long as we can identify the start of an entry.

It is worth noting that the size (end-start) is always smaller than the end, so we encode the entries as:
`start, size, target, depth, push-lasti`.

Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each code unit takes 2 bytes.
It also happens that depth is generally quite small.

So, we need to encode:
```
`start` (up to 30 bits)
`size` (up to 30 bits)
`target` (up to 30 bits)
`depth` (up to ~8 bits)
`lasti` (1 bit)
```

We need a marker for the start of the entry, so the first byte of an entry will have the most significant bit set.
Since the most significant bit is reserved for marking the start of an entry, we have 7 bits per byte to encode offsets.
The encoding is a standard varint encoding, but using only 7 bits of each byte instead of the usual 8: one extend bit and 6 bits of data.
The 8 bits of a byte are (msb left) SXdddddd, where S is the start bit and X is the extend bit, meaning that the next byte is required to extend the offset.
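
The start bit is what makes binary search possible over variable-sized entries: from any byte position, scanning backwards to the nearest byte with the most significant bit set finds the beginning of the enclosing entry. The sketch below illustrates only that scan. The helper name `find_entry_start` and the second entry in the sample table are made up for this example.

```python
# Sketch: locate the first byte of the entry containing a given byte position.
# Only the start bit (0x80) is inspected; no values are decoded here.

def find_entry_start(table, pos):
    """Scan backwards from pos to the nearest byte with the start bit set."""
    while not table[pos] & 0x80:
        pos -= 1
    return pos


# Two entries back to back. Each begins with a byte >= 128 (start bit set).
# The second entry (start 30, size 4, target 50, depth/lasti 2) is hypothetical.
table = bytes([148, 8, 65, 36, 6, 158, 4, 50, 2])

print(find_entry_start(table, 3))  # 0
print(find_entry_start(table, 7))  # 5
```

A binary search can therefore pick any midpoint byte, snap it back to an entry start, decode that entry's `start` value, and continue in the appropriate half.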

In addition, we combine `depth` and `lasti` into a single value, `((depth<<1)+lasti)`, before encoding.

For example, the exception entry:
```
`start`:  20
`end`:    28
`target`: 100
`depth`:  3
`lasti`:  False
```

is encoded by first converting to the more compact four value form:
```
`start`:         20
`size`:          8
`target`:        100
`(depth<<1)+lasti`: 6
```

which is then encoded as:
```
148 (MSB + 20 for start)
8   (size)
65  (Extend bit + 1)
36  (Remainder of target, 100 == (1<<6)+36)
6
```

for a total of five bytes.
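
The worked example can be reproduced with a small encoder. This is a sketch written for this document under the format described above (6 data bits per byte, most significant group first, extend bit `64`, start bit `128`); the function names `encode_varint` and `encode_entry` are hypothetical and this is not the compiler's own implementation.

```python
# Sketch of an encoder for one exception table entry, in code units.

def encode_varint(value, start=False):
    """Encode value as a list of bytes with 6 data bits each, MSB group first."""
    groups = [value & 63]
    value >>= 6
    while value:
        groups.append(value & 63)
        value >>= 6
    groups.reverse()
    # Every byte except the last carries the extend bit.
    out = [g | 64 for g in groups[:-1]] + [groups[-1]]
    if start:
        out[0] |= 128  # mark the first byte of the entry
    return out


def encode_entry(start, end, target, depth, lasti):
    """Encode (start, end, target, depth, lasti) as start, size, target, depth/lasti."""
    size = end - start
    return (
        encode_varint(start, start=True)
        + encode_varint(size)
        + encode_varint(target)
        + encode_varint((depth << 1) + int(lasti))
    )


print(encode_entry(20, 28, 100, 3, False))  # [148, 8, 65, 36, 6]
```

Only `target` needs two bytes here, because 100 does not fit in 6 bits while the other three values do.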


Script to parse the exception table
-----------------------------------

```
def parse_varint(iterator):
    # Each byte carries 6 data bits; bit 64 means another byte follows.
    b = next(iterator)
    val = b & 63
    while b&64:
        val <<= 6
        b = next(iterator)
        val |= b&63
    return val
```
```
def parse_exception_table(code):
    iterator = iter(code.co_exceptiontable)
    try:
        while True:
            # Offsets are stored in code units; multiply by 2 to get bytes.
            start = parse_varint(iterator)*2
            length = parse_varint(iterator)*2
            end = start + length - 2 # Present as inclusive, not exclusive
            target = parse_varint(iterator)*2
            dl = parse_varint(iterator)
            depth = dl >> 1
            lasti = bool(dl&1)
            yield start, end, target, depth, lasti
    except StopIteration:
        return
```
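
As a quick check, the script can be run on the five bytes from the worked example. The functions are repeated here only so that the snippet runs on its own, and `fake_code` is a hypothetical stand-in object that provides nothing but a `co_exceptiontable` attribute. Note that the script reports byte offsets (code units times two) and an inclusive end.

```python
from types import SimpleNamespace


def parse_varint(iterator):
    b = next(iterator)
    val = b & 63
    while b & 64:
        val <<= 6
        b = next(iterator)
        val |= b & 63
    return val


def parse_exception_table(code):
    iterator = iter(code.co_exceptiontable)
    try:
        while True:
            start = parse_varint(iterator) * 2
            length = parse_varint(iterator) * 2
            end = start + length - 2  # inclusive end, in bytes
            target = parse_varint(iterator) * 2
            dl = parse_varint(iterator)
            yield start, end, target, dl >> 1, bool(dl & 1)
    except StopIteration:
        return


# Stand-in for a code object holding the bytes from the worked example.
fake_code = SimpleNamespace(co_exceptiontable=bytes([148, 8, 65, 36, 6]))
entries = list(parse_exception_table(fake_code))
print(entries)  # [(40, 54, 200, 3, False)]
```

The entry with start 20, size 8 and target 100 in code units comes back as start 40, inclusive end 54 and target 200 in bytes.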