\begin{aosachapter}{A Python Interpreter Written in Python}{s:interpreter}{Allison Kaptur}
\emph{Allison is an engineer at Dropbox, where she helps maintain one of
the largest networks of Python clients in the world. Before Dropbox, she
was a facilitator at the Recurse Center, a writers' retreat for
programmers in New York. She's spoken at PyCon North America about
Python internals and loves weird bugs. She blogs at
\href{http://akaptur.com}{akaptur.com}.}
\aosasecti{Introduction}\label{introduction}
Byterun is a Python interpreter implemented in Python. Through my work
on Byterun, I was surprised and delighted to discover that the
fundamental structure of the Python interpreter fits easily into the
500-line size restriction. This chapter will walk through the structure
of the interpreter and give you enough context to explore it further.
The goal is not to explain everything there is to know about
interpreters --- like so many interesting areas of programming and
computer science, you could devote years to developing a deep
understanding of the topic.
Byterun was written by Ned Batchelder and myself, building on the work
of Paul Swartz. Its structure is similar to the primary implementation
of Python, CPython, so understanding Byterun will help you understand
interpreters in general and the CPython interpreter in particular. (If
you don't know which Python you're using, it's probably CPython.)
Despite its short length, Byterun is capable of running most simple
Python programs.
\aosasectii{A Python Interpreter}\label{a-python-interpreter}
Before we begin, let's narrow down what we mean by ``a Python
interpreter''. The word ``interpreter'' can be used in a variety of
different ways when discussing Python. Sometimes interpreter refers to
the Python REPL, the interactive prompt you get by typing
\texttt{python} at the command line. Sometimes people use ``the Python
interpreter'' more or less interchangeably with ``Python'' to talk about
executing Python code from start to finish. In this chapter,
``interpreter'' has a more narrow meaning: it's the last step in the
process of executing a Python program.
Before the interpreter takes over, Python performs three other steps:
lexing, parsing, and compiling. Together, these steps transform the
programmer's source code from lines of text into structured \emph{code
objects} containing instructions that the interpreter can understand.
The interpreter's job is to take these code objects and follow the
instructions.
You may be surprised to hear that compiling is a step in executing
Python code at all. Python is often called an ``interpreted'' language
like Ruby or Perl, as opposed to a ``compiled'' language like C or Rust.
However, this terminology isn't as precise as it may seem. Most
interpreted languages, including Python, do involve a compilation step.
The reason Python is called ``interpreted'' is that the compilation step
does relatively less work (and the interpreter does relatively more)
than in a compiled language. As we'll see later in the chapter, the
Python compiler has much less information about the behavior of a
program than a C compiler does.
\aosasectii{A Python Python
Interpreter}\label{a-python-python-interpreter}
Byterun is a Python interpreter written in Python. This may strike you
as odd, but it's no more odd than writing a C compiler in C. (Indeed,
the widely used C compiler gcc is written in C.) You could write a
Python interpreter in almost any language.
Writing a Python interpreter in Python has both advantages and
disadvantages. The biggest disadvantage is speed: executing code via
Byterun is much slower than executing it in CPython, where the
interpreter is written in C and carefully optimized. However, Byterun
was designed originally as a learning exercise, so speed is not
important to us. The biggest advantage to using Python is that we can
more easily implement \emph{just} the interpreter, and not the rest of
the Python run-time, particularly the object system. For example,
Byterun can fall back to ``real'' Python when it needs to create a
class. Another advantage is that Byterun is easy to understand, partly
because it's written in a high-level language (Python!) that many people
find easy to read. (We also exclude interpreter optimizations in Byterun
--- once again favoring clarity and simplicity over speed.)
\aosasecti{Building an Interpreter}\label{building-an-interpreter}
Before we start to look at the code of Byterun, we need some
higher-level context on the structure of the interpreter. How does the
Python interpreter work?
The Python interpreter is a \emph{virtual machine}, meaning that it is
software that emulates a physical computer. This particular virtual
machine is a stack machine: it manipulates several stacks to perform its
operations (as contrasted with a register machine, which writes to and
reads from particular memory locations).
The Python interpreter is a \emph{bytecode interpreter}: its input is
instruction sets called \emph{bytecode}. When you write Python, the
lexer, parser, and compiler generate code objects for the interpreter to
operate on. Each code object contains a set of instructions to be
executed --- that's the bytecode --- plus other information that the
interpreter will need. Bytecode is an \emph{intermediate representation}
of Python code: it expresses the source code that you wrote in a way the
interpreter can understand. It's analogous to the way that assembly
language serves as an intermediate representation between C code and a
piece of hardware.
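We can see a code object without even defining a function, by asking Python's built-in \texttt{compile} to do the lexing, parsing, and compiling for us. (This is just a quick sketch for exploration; the exact bytes printed will vary between Python versions.)

```python
# compile source text into a code object, without running it
code = compile("7 + 5", "<demo>", "eval")

print(type(code))            # a code object
print(code.co_consts)        # the constants the bytecode refers to
print(list(code.co_code))    # the bytecode itself, as a list of numbers
```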
\aosasectii{A Tiny Interpreter}\label{a-tiny-interpreter}
To make this concrete, let's start with a very minimal interpreter. This
interpreter can only add numbers, and it understands just three
instructions. All code it can execute consists of these three
instructions in different combinations. The three instructions are
these:
\begin{aosaitemize}
\item
\texttt{LOAD\_VALUE}
\item
\texttt{ADD\_TWO\_VALUES}
\item
\texttt{PRINT\_ANSWER}
\end{aosaitemize}
Since we're not concerned with the lexer, parser, and compiler in this
chapter, it doesn't matter how the instruction sets are produced. You
can imagine writing \texttt{7 + 5} and having a compiler emit a
combination of these three instructions. Or, if you have the right
compiler, you can write Lisp syntax that's turned into the same
combination of instructions. The interpreter doesn't care. All that
matters is that our interpreter is given a well-formed arrangement of
the instructions.
Suppose that
\begin{verbatim}
7 + 5
\end{verbatim}
produces this instruction set:
\begin{verbatim}
    what_to_execute = {
        "instructions": [("LOAD_VALUE", 0),  # the first number
                         ("LOAD_VALUE", 1),  # the second number
                         ("ADD_TWO_VALUES", None),
                         ("PRINT_ANSWER", None)],
        "numbers": [7, 5] }
\end{verbatim}
The Python interpreter is a \emph{stack machine}, so it must manipulate
stacks to add two numbers (\aosafigref{500l.interpreter.stackmachine}.)
The interpreter will begin by executing the first instruction,
\texttt{LOAD\_VALUE}, and pushing the first number onto the stack. Next
it will push the second number onto the stack. For the third
instruction, \texttt{ADD\_TWO\_VALUES}, it will pop both numbers off,
add them together, and push the result onto the stack. Finally, it will
pop the answer back off the stack and print it.
\aosafigure[240pt]{interpreter-images/interpreter-stack.png}{A stack machine}{500l.interpreter.stackmachine}
The \texttt{LOAD\_VALUE} instruction tells the interpreter to push a
number on to the stack, but the instruction alone doesn't specify which
number. Each instruction needs an extra piece of information, telling
the interpreter where to find the number to load. So our instruction set
has two pieces: the instructions themselves, plus a list of constants
the instructions will need. (In Python, what we're calling
``instructions'' is the bytecode, and the entire ``what to execute''
object above is the \emph{code object}.)
Why not just put the numbers directly in the instructions? Imagine if we
were adding strings together instead of numbers. We wouldn't want to
have the strings stuffed in with the instructions, since they could be
arbitrarily large. This design also means we can have just one copy of
each object that we need, so for example to add \texttt{7 + 7},
\texttt{"numbers"} could be just \texttt{{[}7{]}}.
You may be wondering why instructions other than
\texttt{ADD\_TWO\_VALUES} were needed at all. Indeed, for the simple
case of adding two numbers, the example is a little contrived. However,
this instruction is a building block for more complex programs. For
example, with just the instructions we've defined so far, we can already
add together three values --- or any number of values --- given the
right set of these instructions. The stack provides a clean way to keep
track of the state of the interpreter, and it will support more
complexity as we go along.
Now let's start to write the interpreter itself. The interpreter object
has a stack, which we'll represent with a list. The object also has a
method describing how to execute each instruction. For example, for
\texttt{LOAD\_VALUE}, the interpreter will push the value onto the
stack.
\begin{verbatim}
    class Interpreter:
        def __init__(self):
            self.stack = []

        def LOAD_VALUE(self, number):
            self.stack.append(number)

        def PRINT_ANSWER(self):
            answer = self.stack.pop()
            print(answer)

        def ADD_TWO_VALUES(self):
            first_num = self.stack.pop()
            second_num = self.stack.pop()
            total = first_num + second_num
            self.stack.append(total)
\end{verbatim}
These three functions implement the three instructions our interpreter
understands. The interpreter needs one more piece: a way to tie
everything together and actually execute it. This method,
\texttt{run\_code}, takes the \texttt{what\_to\_execute} dictionary
defined above as an argument. It loops over each instruction, processes
the arguments to that instruction if there are any, and then calls the
corresponding method on the interpreter object.
\begin{verbatim}
        def run_code(self, what_to_execute):
            instructions = what_to_execute["instructions"]
            numbers = what_to_execute["numbers"]
            for each_step in instructions:
                instruction, argument = each_step
                if instruction == "LOAD_VALUE":
                    number = numbers[argument]
                    self.LOAD_VALUE(number)
                elif instruction == "ADD_TWO_VALUES":
                    self.ADD_TWO_VALUES()
                elif instruction == "PRINT_ANSWER":
                    self.PRINT_ANSWER()
\end{verbatim}
To test it out, we can create an instance of the object and then call
the \texttt{run\_code} method with the instruction set for adding 7 + 5
defined above.
\begin{verbatim}
interpreter = Interpreter()
interpreter.run_code(what_to_execute)
\end{verbatim}
Sure enough, it prints the answer: 12.
Although this interpreter is quite limited, this process is almost
exactly how the real Python interpreter adds numbers. There are a couple
of things to note even in this small example.
First of all, some instructions need arguments. In real Python bytecode,
about half of instructions have arguments. The arguments are packed in
with the instructions, much like in our example. Notice that the
arguments to the \emph{instructions} are different from the arguments to
the methods that are called.
Second, notice that the instruction for \texttt{ADD\_TWO\_VALUES} did
not require any arguments. Instead, the values to be added together were
popped off the interpreter's stack. This is the defining feature of a
stack-based interpreter.
Remember that given valid instruction sets, without any changes to our
interpreter, we can add more than two numbers at a time. Consider the
instruction set below. What do you expect to happen? If you had a
friendly compiler, what code could you write to generate this
instruction set?
\begin{verbatim}
what_to_execute = {
"instructions": [("LOAD_VALUE", 0),
("LOAD_VALUE", 1),
("ADD_TWO_VALUES", None),
("LOAD_VALUE", 2),
("ADD_TWO_VALUES", None),
("PRINT_ANSWER", None)],
"numbers": [7, 5, 8] }
\end{verbatim}
At this point, we can begin to see how this structure is extensible: we
can add methods on the interpreter object that describe many more
operations (as long as we have a compiler to hand us well-formed
instruction sets).
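For instance, supporting multiplication only takes one more method that follows the same pattern. (\texttt{MULTIPLY\_TWO\_VALUES} is a made-up instruction name for this sketch, not part of the chapter's instruction set; the class is pared down to just what the demo needs.)

```python
class Interpreter:
    def __init__(self):
        self.stack = []

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    # a hypothetical new instruction:
    # pop two values off the stack and push their product
    def MULTIPLY_TWO_VALUES(self):
        first_num = self.stack.pop()
        second_num = self.stack.pop()
        self.stack.append(first_num * second_num)

interpreter = Interpreter()
interpreter.LOAD_VALUE(7)
interpreter.LOAD_VALUE(5)
interpreter.MULTIPLY_TWO_VALUES()
print(interpreter.stack)   # [35]
```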
\aosasectiii{Variables}\label{variables}
Next let's add variables to our interpreter. Variables require an
instruction for storing the value of a variable, \texttt{STORE\_NAME};
an instruction for retrieving it, \texttt{LOAD\_NAME}; and a mapping
from variable names to values. For now, we'll ignore namespaces and
scoping, so we can store the variable mapping on the interpreter object
itself. Finally, we'll have to make sure that \texttt{what\_to\_execute}
has a list of the variable names, in addition to its list of constants.
\begin{verbatim}
    >>> def s():
    ...     a = 1
    ...     b = 2
    ...     print(a + b)

    # a friendly compiler transforms `s` into:
    what_to_execute = {
        "instructions": [("LOAD_VALUE", 0),
                         ("STORE_NAME", 0),
                         ("LOAD_VALUE", 1),
                         ("STORE_NAME", 1),
                         ("LOAD_NAME", 0),
                         ("LOAD_NAME", 1),
                         ("ADD_TWO_VALUES", None),
                         ("PRINT_ANSWER", None)],
        "numbers": [1, 2],
        "names":   ["a", "b"] }
\end{verbatim}
Our new implementation is below. To keep track of what names are bound
to what values, we'll add an \texttt{environment} dictionary to the
\texttt{\_\_init\_\_} method. We'll also add \texttt{STORE\_NAME} and
\texttt{LOAD\_NAME}. These methods first look up the variable name in
question and then use the dictionary to store or retrieve its value.
The arguments to an instruction can now mean two different things: They
can either be an index into the ``numbers'' list, or they can be an
index into the ``names'' list. The interpreter knows which it should be
by checking what instruction it's executing. We'll break out this logic
--- and the mapping of instructions to what their arguments mean ---
into a separate method.
\begin{verbatim}
    class Interpreter:
        def __init__(self):
            self.stack = []
            self.environment = {}

        def STORE_NAME(self, name):
            val = self.stack.pop()
            self.environment[name] = val

        def LOAD_NAME(self, name):
            val = self.environment[name]
            self.stack.append(val)

        def parse_argument(self, instruction, argument, what_to_execute):
            """ Understand what the argument to each instruction means."""
            numbers = ["LOAD_VALUE"]
            names = ["LOAD_NAME", "STORE_NAME"]

            if instruction in numbers:
                argument = what_to_execute["numbers"][argument]
            elif instruction in names:
                argument = what_to_execute["names"][argument]

            return argument

        def run_code(self, what_to_execute):
            instructions = what_to_execute["instructions"]
            for each_step in instructions:
                instruction, argument = each_step
                argument = self.parse_argument(instruction, argument, what_to_execute)

                if instruction == "LOAD_VALUE":
                    self.LOAD_VALUE(argument)
                elif instruction == "ADD_TWO_VALUES":
                    self.ADD_TWO_VALUES()
                elif instruction == "PRINT_ANSWER":
                    self.PRINT_ANSWER()
                elif instruction == "STORE_NAME":
                    self.STORE_NAME(argument)
                elif instruction == "LOAD_NAME":
                    self.LOAD_NAME(argument)
\end{verbatim}
Even with just five instructions, the \texttt{run\_code} method is
starting to get tedious. If we kept this structure, we'd need one branch
of the \texttt{if} statement for each instruction. Here, we can make use
of Python's dynamic method lookup. We'll always define a method called
\texttt{FOO} to execute the instruction called \texttt{FOO}, so we can
use Python's \texttt{getattr} function to look up the method on the fly
instead of using the big \texttt{if} statement. The \texttt{run\_code}
method then looks like this:
\begin{verbatim}
        def execute(self, what_to_execute):
            instructions = what_to_execute["instructions"]
            for each_step in instructions:
                instruction, argument = each_step
                argument = self.parse_argument(instruction, argument, what_to_execute)
                bytecode_method = getattr(self, instruction)
                if argument is None:
                    bytecode_method()
                else:
                    bytecode_method(argument)
\end{verbatim}
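Assembled in one place, the toy interpreter is short enough to run end to end. The following is a condensed sketch of the class built up above (the method bodies are compressed but behave the same way); feeding it the \texttt{what\_to\_execute} for \texttt{s()} prints 3:

```python
class Interpreter:
    def __init__(self):
        self.stack = []
        self.environment = {}

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def PRINT_ANSWER(self):
        print(self.stack.pop())

    def ADD_TWO_VALUES(self):
        self.stack.append(self.stack.pop() + self.stack.pop())

    def STORE_NAME(self, name):
        self.environment[name] = self.stack.pop()

    def LOAD_NAME(self, name):
        self.stack.append(self.environment[name])

    def parse_argument(self, instruction, argument, what_to_execute):
        # decide whether the argument indexes "numbers" or "names"
        if instruction == "LOAD_VALUE":
            return what_to_execute["numbers"][argument]
        if instruction in ("LOAD_NAME", "STORE_NAME"):
            return what_to_execute["names"][argument]
        return argument

    def execute(self, what_to_execute):
        for instruction, argument in what_to_execute["instructions"]:
            argument = self.parse_argument(instruction, argument, what_to_execute)
            bytecode_method = getattr(self, instruction)
            if argument is None:
                bytecode_method()
            else:
                bytecode_method(argument)

what_to_execute = {
    "instructions": [("LOAD_VALUE", 0), ("STORE_NAME", 0),
                     ("LOAD_VALUE", 1), ("STORE_NAME", 1),
                     ("LOAD_NAME", 0), ("LOAD_NAME", 1),
                     ("ADD_TWO_VALUES", None), ("PRINT_ANSWER", None)],
    "numbers": [1, 2],
    "names": ["a", "b"]}

interpreter = Interpreter()
interpreter.execute(what_to_execute)   # prints 3
```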
\aosasecti{Real Python Bytecode}\label{real-python-bytecode}
At this point, we'll abandon our toy instruction sets and switch to real
Python bytecode. The structure of bytecode is similar to our toy
interpreter's verbose instruction sets, except that it uses one byte
instead of a long name to identify each instruction. To understand this
structure, we'll walk through the bytecode of a short function. Consider
the example below:
\begin{verbatim}
    >>> def cond():
    ...     x = 3
    ...     if x < 5:
    ...         return 'yes'
    ...     else:
    ...         return 'no'
    ...
\end{verbatim}
Python exposes a boatload of its internals at run time, and we can
access them right from the REPL. For the function object \texttt{cond},
\texttt{cond.\_\_code\_\_} is the code object associated with it, and
\texttt{cond.\_\_code\_\_.co\_code} is the bytecode. There's almost
never a good reason to use these attributes directly when you're writing
Python code, but they do allow us to get up to all sorts of mischief ---
and to look at the internals in order to understand them.
\begin{verbatim}
>>> cond.__code__.co_code # the bytecode as raw bytes
b'd\x01\x00}\x00\x00|\x00\x00d\x02\x00k\x00\x00r\x16\x00d\x03\x00Sd\x04\x00Sd\x00
\x00S'
>>> list(cond.__code__.co_code) # the bytecode as numbers
[100, 1, 0, 125, 0, 0, 124, 0, 0, 100, 2, 0, 107, 0, 0, 114, 22, 0, 100, 3, 0, 83,
100, 4, 0, 83, 100, 0, 0, 83]
\end{verbatim}
When we just print the bytecode, it looks unintelligible --- all we can
tell is that it's a series of bytes. Luckily, there's a powerful tool we
can use to understand it: the \texttt{dis} module in the Python standard
library.
\texttt{dis} is a bytecode disassembler. A disassembler takes low-level
code that is written for machines, like assembly code or bytecode, and
prints it in a human-readable way. When we run \texttt{dis.dis}, it
outputs an explanation of the bytecode it is passed.
\begin{verbatim}
    >>> dis.dis(cond)
      2           0 LOAD_CONST               1 (3)
                  3 STORE_FAST               0 (x)

      3           6 LOAD_FAST                0 (x)
                  9 LOAD_CONST               2 (5)
                 12 COMPARE_OP               0 (<)
                 15 POP_JUMP_IF_FALSE       22

      4          18 LOAD_CONST               3 ('yes')
                 21 RETURN_VALUE

      6     >>   22 LOAD_CONST               4 ('no')
                 25 RETURN_VALUE
                 26 LOAD_CONST               0 (None)
                 29 RETURN_VALUE
\end{verbatim}
The first column shows the line numbers in our Python source code. The
second column is an index into the bytecode, telling us that the
\texttt{LOAD\_CONST} instruction appears at position zero. The third
column is the instruction itself, mapped to its human-readable name. The
fourth column, when present, is the argument to that instruction. The
fifth column, when present, is a hint about what the argument means.
Consider the first few bytes of this bytecode: {[}100, 1, 0, 125, 0,
0{]}. These six bytes represent two instructions with their arguments.
We can use \texttt{dis.opname}, a mapping from bytes to intelligible
strings, to find out what instructions 100 and 125 map to:
\begin{verbatim}
>>> dis.opname[100]
'LOAD_CONST'
>>> dis.opname[125]
'STORE_FAST'
\end{verbatim}
The second and third bytes --- 1, 0 --- are arguments to
\texttt{LOAD\_CONST}, while the fifth and sixth bytes --- 0, 0 --- are
arguments to \texttt{STORE\_FAST}. Just like in our toy example,
\texttt{LOAD\_CONST} needs to know where to find its constant to load,
and \texttt{STORE\_FAST} needs to find the name to store. (Python's
\texttt{LOAD\_CONST} is the same as our toy interpreter's
\texttt{LOAD\_VALUE}, and \texttt{LOAD\_FAST} is the same as
\texttt{LOAD\_NAME}.) So these six bytes represent the first line of
code, \texttt{x = 3}. (Why use two bytes for each argument? If Python
used just one byte to locate constants and names instead of two, you
could only have 256 names/constants associated with a single code
object. Using two bytes, you can have up to 256 squared, or 65,536.)
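(A caveat: CPython's bytecode format changes between versions. Since Python 3.6, every instruction occupies two bytes total --- one opcode byte and one argument byte --- so the offsets shown above will look different on a modern interpreter. The \texttt{dis} module's \texttt{get\_instructions} offers a version-independent way to iterate over instructions without decoding the raw bytes yourself:)

```python
import dis

def cond():
    x = 3
    if x < 5:
        return 'yes'
    else:
        return 'no'

# Instruction objects abstract over the raw byte layout
for inst in dis.get_instructions(cond):
    print(inst.offset, inst.opname, inst.argval)
```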
\aosasectii{Conditionals and Loops}\label{conditionals-and-loops}
So far, the interpreter has executed code simply by stepping through the
instructions one by one. This is a problem; often, we want to execute
certain instructions many times, or skip them under certain conditions.
To allow us to write loops and if statements in our code, the
interpreter must be able to jump around in the instruction set. In a
sense, Python handles loops and conditionals with \texttt{GOTO}
statements in the bytecode! Look at the disassembly of the function
\texttt{cond} again:
\begin{verbatim}
    >>> dis.dis(cond)
      2           0 LOAD_CONST               1 (3)
                  3 STORE_FAST               0 (x)

      3           6 LOAD_FAST                0 (x)
                  9 LOAD_CONST               2 (5)
                 12 COMPARE_OP               0 (<)
                 15 POP_JUMP_IF_FALSE       22

      4          18 LOAD_CONST               3 ('yes')
                 21 RETURN_VALUE

      6     >>   22 LOAD_CONST               4 ('no')
                 25 RETURN_VALUE
                 26 LOAD_CONST               0 (None)
                 29 RETURN_VALUE
\end{verbatim}
The conditional \texttt{if x \textless{} 5} on line 3 of the code is
compiled into four instructions: \texttt{LOAD\_FAST},
\texttt{LOAD\_CONST}, \texttt{COMPARE\_OP}, and
\texttt{POP\_JUMP\_IF\_FALSE}. \texttt{x \textless{} 5} generates code
to load \texttt{x}, load 5, and compare the two values. The instruction
\texttt{POP\_JUMP\_IF\_FALSE} is responsible for implementing the
\texttt{if}. This instruction will pop the top value off the
interpreter's stack. If the value is true, then nothing happens. (The
value can be ``truthy'' --- it doesn't have to be the literal
\texttt{True} object.) If the value is false, then the interpreter will
jump to another instruction.
The instruction to land on is called the jump target, and it's provided
as the argument to the \texttt{POP\_JUMP} instruction. Here, the jump
target is 22. The instruction at index 22 is \texttt{LOAD\_CONST} on
line 6. (\texttt{dis} marks jump targets with
\texttt{\textgreater{}\textgreater{}}.) If the result of
\texttt{x \textless{} 5} is False, then the interpreter will jump
straight to line 6 (\texttt{return "no"}), skipping line 4
(\texttt{return "yes"}). Thus, the interpreter uses jump instructions to
selectively skip over parts of the instruction set.
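The jump itself is simple to implement once the interpreter tracks its position explicitly, with an instruction pointer instead of a plain \texttt{for} loop. In this sketch, \texttt{LESS\_THAN} and index-based jump targets are inventions for illustration, not real Python bytecode:

```python
def run(instructions):
    """Run a list of (opname, arg) pairs; jump targets are instruction indices."""
    stack = []
    pc = 0                            # the instruction pointer
    while pc < len(instructions):
        op, arg = instructions[pc]
        pc += 1                       # by default, fall through to the next instruction
        if op == "LOAD_VALUE":
            stack.append(arg)
        elif op == "LESS_THAN":
            right, left = stack.pop(), stack.pop()
            stack.append(left < right)
        elif op == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                pc = arg              # the jump: just move the pointer
        elif op == "RETURN_VALUE":
            return stack.pop()

# equivalent to: x = 3; 'yes' if x < 5 else 'no'
program = [
    ("LOAD_VALUE", 3),
    ("LOAD_VALUE", 5),
    ("LESS_THAN", None),
    ("POP_JUMP_IF_FALSE", 6),   # jump target: instruction index 6
    ("LOAD_VALUE", "yes"),
    ("RETURN_VALUE", None),
    ("LOAD_VALUE", "no"),       # >> index 6
    ("RETURN_VALUE", None),
]
print(run(program))   # prints 'yes', since 3 < 5
```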
Python loops also rely on jumping. In the bytecode below, notice that
the line \texttt{while x \textless{} 5} generates almost identical
bytecode to \texttt{if x \textless{} 5}. In both cases, the comparison
is calculated and then \texttt{POP\_JUMP\_IF\_FALSE} controls which
instruction is executed next. At the end of line 4 --- the end of the
loop's body --- the instruction \texttt{JUMP\_ABSOLUTE} always sends the
interpreter back to instruction 9 at the top of the loop. When
\texttt{x \textless{} 5} becomes false, \texttt{POP\_JUMP\_IF\_FALSE} jumps
the interpreter past the end of the loop, to instruction 34.
\begin{verbatim}
    >>> def loop():
    ...     x = 1
    ...     while x < 5:
    ...         x = x + 1
    ...     return x
    ...
    >>> dis.dis(loop)
      2           0 LOAD_CONST               1 (1)
                  3 STORE_FAST               0 (x)

      3           6 SETUP_LOOP              26 (to 35)
            >>    9 LOAD_FAST                0 (x)
                 12 LOAD_CONST               2 (5)
                 15 COMPARE_OP               0 (<)
                 18 POP_JUMP_IF_FALSE       34

      4          21 LOAD_FAST                0 (x)
                 24 LOAD_CONST               1 (1)
                 27 BINARY_ADD
                 28 STORE_FAST               0 (x)
                 31 JUMP_ABSOLUTE            9
            >>   34 POP_BLOCK

      5     >>   35 LOAD_FAST                0 (x)
                 38 RETURN_VALUE
\end{verbatim}
\aosasectii{Explore Bytecode}\label{explore-bytecode}
I encourage you to try running \texttt{dis.dis} on functions you write.
Some interesting questions to explore are:
\begin{aosaitemize}
\item
What's the difference between a for loop and a while loop to the
Python interpreter?
\item
How can you write different functions that generate identical
bytecode?
\item
How does \texttt{elif} work? What about list comprehensions?
\end{aosaitemize}
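As a starting point for the second question, recall that the bytecode stores only \emph{indices} into the names and constants lists, so two functions that differ only in their variable names compile to exactly the same bytes:

```python
def f(x):
    return x + 1

def g(y):
    return y + 1

# the variable names live in co_varnames, not in the bytecode itself
print(f.__code__.co_code == g.__code__.co_code)   # True
```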
\aosasecti{Frames}\label{frames}
So far, we've learned that the Python virtual machine is a stack
machine. It steps and jumps through instructions, pushing and popping
values on and off a stack. There are still some gaps in our mental
model, though. In the examples above, the last instruction is
\texttt{RETURN\_VALUE}, which corresponds to the \texttt{return}
statement in the code. But where does the instruction return to?
To answer this question, we must add one additional layer of complexity:
the frame. A frame is a collection of information and context for a
chunk of code. Frames are created and destroyed on the fly as your
Python code executes. There's one frame corresponding to each
\emph{call} of a function --- so while each frame has one code object
associated with it, a code object can have many frames. If you had a
function that called itself recursively ten times, you'd have eleven
frames --- one for each level of recursion and one for the module you
started from. In general, there's a frame for each scope in a Python
program. For example, each module, each function call, and each class
definition has a frame.
Frames live on the \emph{call stack}, a completely different stack from
the one we've been discussing so far. (The call stack is the stack
you're most familiar with already --- you've seen it printed out in the
tracebacks of exceptions. Each line in a traceback starting with ``File
`program.py', line 10'' corresponds to one frame on the call stack.) The
stack we've been examining --- the one the interpreter is manipulating
while it executes bytecode --- we'll call the \emph{data stack}. There's
also a third stack, called the \emph{block stack}. Blocks are used for
certain kinds of control flow, particularly looping and exception
handling. Each frame on the call stack has its own data stack and block
stack.
Let's make this concrete with an example. Suppose the Python interpreter
is currently executing the line marked 3 below. The interpreter is in
the middle of a call to \texttt{foo}, which is in turn calling
\texttt{bar}. The diagram shows a schematic of the call stack of frames,
the block stacks, and the data stacks. (This code is written like a REPL
session, so we've first defined the needed functions.) At the moment
we're interested in, the interpreter is executing \texttt{foo()}, at the
bottom, which then reaches in to the body of \texttt{foo} and then up
into \texttt{bar}.
\begin{verbatim}
    >>> def bar(y):
    ...     z = y + 3             # <--- (3) ... and the interpreter is here.
    ...     return z
    ...
    >>> def foo():
    ...     a = 1
    ...     b = 2
    ...     return a + bar(b)     # <--- (2) ... which is returning a call to bar ...
    ...
    >>> foo()                     # <--- (1) We're in the middle of a call to foo ...
    3
\end{verbatim}
\aosafigure[240pt]{interpreter-images/interpreter-callstack.png}{The call stack}{500l.interpreter.callstack}
At this point, the interpreter is in the middle of the function call to
\texttt{bar}. There are three frames on the call stack: one for the
module level, one for the function \texttt{foo}, and one for
\texttt{bar} (\aosafigref{500l.interpreter.callstack}.) Once
\texttt{bar} returns, the frame associated with it is popped off the
call stack and discarded.
The bytecode instruction \texttt{RETURN\_VALUE} tells the interpreter to
pass a value between frames. First it will pop the top value off the
data stack of the top frame on the call stack. Then it pops the entire
frame off the call stack and throws it away. Finally, the value is
pushed onto the data stack on the next frame down.
When Ned Batchelder and I were working on Byterun, for a long time we
had a significant error in our implementation. Instead of having one
data stack on each frame, we had just one data stack on the entire
virtual machine. We had dozens of tests made up of little snippets of
Python code which we ran through Byterun and through the real Python
interpreter to make sure the same thing happened in both interpreters.
Nearly all of these tests were passing. The only thing we couldn't get
working was generators. Finally, reading the CPython code more
carefully, we realized the mistake\footnote{My thanks to Michael
Arntzenius for his insight on this bug.}. Moving a data stack onto
each frame fixed the problem.
Looking back on this bug, I was amazed at how little of Python relied on
each frame having a different data stack. Nearly all operations in the
Python interpreter carefully clean up the data stack, so the fact that
the frames were sharing the same stack didn't matter. In the example
above, when \texttt{bar} finishes executing, it'll leave its data stack
empty. Even if \texttt{foo} shared the same stack, the values would be
lower down. However, with generators, a key feature is the ability to
pause a frame, return to some other frame, and then return to the
generator frame later and have it be in exactly the same state that you
left it.
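Real CPython generators make this concrete: a paused generator keeps its
frame, and that frame's local state, alive between calls to
\texttt{next}:

\begin{verbatim}
def counter():
    total = 0
    while True:
        total += 1
        yield total

g = counter()
next(g)    # run to the first yield, then pause the frame
print(g.gi_frame.f_locals['total'])   # 1 -- the paused frame's state survives
next(g)
print(g.gi_frame.f_locals['total'])   # 2
\end{verbatim}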
\aosasecti{Byterun}\label{byterun}
We now have enough context about the Python interpreter to begin
examining Byterun.
There are four kinds of objects in Byterun:
\begin{aosaitemize}
\item
A \texttt{VirtualMachine} class, which manages the highest-level
structure, particularly the call stack of frames, and contains a
mapping of instructions to operations. This is a more complex version
of the \texttt{Interpreter} object above.
\item
A \texttt{Frame} class. Every \texttt{Frame} instance has one code
object and manages a few other necessary bits of state, particularly
the global and local namespaces, a reference to the calling frame, and
the last bytecode instruction executed.
\item
A \texttt{Function} class, which will be used in place of real Python
functions. Recall that calling a function creates a new frame in the
interpreter. We implement \texttt{Function} so that we control the
creation of new \texttt{Frame} objects.
\item
A \texttt{Block} class, which just wraps the three attributes of
blocks. (The details of blocks aren't central to the Python
interpreter, so we won't spend much time on them, but they're included
here so that Byterun can run real Python code.)
\end{aosaitemize}
\aosasectii{The \texttt{VirtualMachine}
Class}\label{the-virtualmachine-class}
Only one instance of \texttt{VirtualMachine} will be created each time
the program is run, because we only have one Python interpreter.
\texttt{VirtualMachine} stores the call stack, the exception state, and
return values while they're being passed between frames. The entry point
for executing code is the method \texttt{run\_code}, which takes a
compiled code object as an argument. It starts by setting up and running
a frame. This frame may create other frames; the call stack will grow
and shrink as the program executes. When the first frame eventually
returns, execution is finished.
\begin{verbatim}
class VirtualMachineError(Exception):
pass
class VirtualMachine(object):
def __init__(self):
self.frames = [] # The call stack of frames.
self.frame = None # The current frame.
self.return_value = None
self.last_exception = None
def run_code(self, code, global_names=None, local_names=None):
""" An entry point to execute code using the virtual machine."""
frame = self.make_frame(code, global_names=global_names,
local_names=local_names)
self.run_frame(frame)
\end{verbatim}
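\texttt{run\_code} expects the same kind of compiled code object that
CPython's built-in \texttt{exec} consumes, so we can build test inputs
with \texttt{compile}. (A usage sketch; here it is CPython's own virtual
machine doing the running, as a reference point.)

\begin{verbatim}
source = "result = max(3, 7)"
code = compile(source, "<example>", "exec")  # source -> code object
namespace = {}
exec(code, namespace)       # CPython's own VM executes the code object
print(namespace["result"])  # 7
\end{verbatim}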
\aosasectii{The \texttt{Frame} Class}\label{the-frame-class}
Next we'll write the \texttt{Frame} object. The frame is a collection of
attributes with no methods. As mentioned above, the attributes include
the code object created by the compiler; the local, global, and builtin
namespaces; a reference to the previous frame; a data stack; a block
stack; and the last instruction executed. (We have to do a little extra
work to get to the builtin namespace because Python treats this
namespace differently in different modules; this detail is not important
to the virtual machine.)
\begin{verbatim}
class Frame(object):
def __init__(self, code_obj, global_names, local_names, prev_frame):
self.code_obj = code_obj
self.global_names = global_names
self.local_names = local_names
self.prev_frame = prev_frame
self.stack = []
if prev_frame:
self.builtin_names = prev_frame.builtin_names
else:
self.builtin_names = local_names['__builtins__']
if hasattr(self.builtin_names, '__dict__'):
self.builtin_names = self.builtin_names.__dict__
self.last_instruction = 0
self.block_stack = []
\end{verbatim}
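Our \texttt{Frame} mirrors the attributes CPython exposes on its own
frame objects, which you can inspect from running code:
\texttt{f\_code} corresponds to our \texttt{code\_obj},
\texttt{f\_back} to \texttt{prev\_frame}, and \texttt{f\_lasti} to
\texttt{last\_instruction}. (The helper \texttt{peek} below is just for
illustration.)

\begin{verbatim}
import sys

def peek():
    f = sys._getframe()    # the frame currently executing peek
    return f.f_code.co_name, f.f_back is not None, f.f_lasti >= 0

print(peek())   # ('peek', True, True)
\end{verbatim}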
Next, we'll add frame manipulation to the virtual machine. There are
three helper functions for frames: one to create new frames (which is
responsible for sorting out the namespaces for the new frame) and one
each to push and pop frames on and off the frame stack. A fourth
function, \texttt{run\_frame}, does the main work of executing a frame.
We'll come back to this soon.
\begin{verbatim}
class VirtualMachine(object):
[... snip ...]
# Frame manipulation
def make_frame(self, code, callargs={}, global_names=None, local_names=None):
        if global_names is not None and local_names is None:
            local_names = global_names
elif self.frames:
global_names = self.frame.global_names
local_names = {}
else:
global_names = local_names = {
'__builtins__': __builtins__,
'__name__': '__main__',
'__doc__': None,
'__package__': None,
}
local_names.update(callargs)
frame = Frame(code, global_names, local_names, self.frame)
return frame
def push_frame(self, frame):
self.frames.append(frame)
self.frame = frame
def pop_frame(self):
self.frames.pop()
if self.frames:
self.frame = self.frames[-1]
else:
self.frame = None
    def run_frame(self, frame):
        # we'll come back to this shortly
        pass
\end{verbatim}
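The push/pop pair maintains a simple invariant: \texttt{self.frame} is
always the top of \texttt{self.frames}, or \texttt{None} when the stack
is empty. A stripped-down stand-in (\texttt{MiniVM} is a hypothetical
name) shows the behavior:

\begin{verbatim}
class MiniVM:
    # a stripped-down stand-in for VirtualMachine's frame-stack logic
    def __init__(self):
        self.frames = []
        self.frame = None
    def push_frame(self, frame):
        self.frames.append(frame)
        self.frame = frame
    def pop_frame(self):
        self.frames.pop()
        self.frame = self.frames[-1] if self.frames else None

vm = MiniVM()
vm.push_frame("module")
vm.push_frame("foo")
vm.pop_frame()          # foo returns
print(vm.frame)         # module
\end{verbatim}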
\aosasectii{The \texttt{Function} Class}\label{the-function-class}
The implementation of the \texttt{Function} object is somewhat twisty,
and most of the details aren't critical to understanding the
interpreter. The important thing to notice is that calling a function
--- invoking the \texttt{\_\_call\_\_} method --- creates a new
\texttt{Frame} object and starts running it.
\begin{verbatim}
class Function(object):
"""
Create a realistic function object, defining the things the interpreter expects.
"""
__slots__ = [
'func_code', 'func_name', 'func_defaults', 'func_globals',
'func_locals', 'func_dict', 'func_closure',
'__name__', '__dict__', '__doc__',
'_vm', '_func',
]
def __init__(self, name, code, globs, defaults, closure, vm):
"""You don't need to follow this closely to understand the interpreter."""
self._vm = vm
self.func_code = code
self.func_name = self.__name__ = name or code.co_name
self.func_defaults = tuple(defaults)
self.func_globals = globs
        self.func_locals = self._vm.frame.local_names
self.__dict__ = {}
self.func_closure = closure
self.__doc__ = code.co_consts[0] if code.co_consts else None
# Sometimes, we need a real Python function. This is for that.
kw = {
'argdefs': self.func_defaults,
}
if closure:
kw['closure'] = tuple(make_cell(0) for _ in closure)
self._func = types.FunctionType(code, globs, **kw)
def __call__(self, *args, **kwargs):
"""When calling a Function, make a new frame and run it."""
callargs = inspect.getcallargs(self._func, *args, **kwargs)
# Use callargs to provide a mapping of arguments: values to pass into the new
# frame.
frame = self._vm.make_frame(
self.func_code, callargs, self.func_globals, {}
)
return self._vm.run_frame(frame)
def make_cell(value):
"""Create a real Python closure and grab a cell."""
# Thanks to Alex Gaynor for help with this bit of twistiness.
fn = (lambda x: lambda: x)(value)
return fn.__closure__[0]
\end{verbatim}
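Two of the twisty pieces can be exercised on their own:
\texttt{make\_cell} really does produce a CPython cell object, and
\texttt{inspect.getcallargs} computes the argument-name-to-value mapping
that seeds the new frame's local namespace:

\begin{verbatim}
import inspect

def make_cell(value):
    """Create a real Python closure and grab a cell."""
    fn = (lambda x: lambda: x)(value)
    return fn.__closure__[0]

print(make_cell(42).cell_contents)   # 42

def f(a, b=10):
    pass

print(inspect.getcallargs(f, 1))     # {'a': 1, 'b': 10}
\end{verbatim}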
Next, back on the \texttt{VirtualMachine} object, we'll add some helper
methods for data stack manipulation. The bytecodes that manipulate the
stack always operate on the current frame's data stack. This will make
our implementations of \texttt{POP\_TOP}, \texttt{LOAD\_FAST}, and all
the other instructions that touch the stack more readable.
\begin{verbatim}
class VirtualMachine(object):
[... snip ...]
# Data stack manipulation
def top(self):
return self.frame.stack[-1]
def pop(self):
return self.frame.stack.pop()
def push(self, *vals):
self.frame.stack.extend(vals)
def popn(self, n):
"""Pop a number of values from the value stack.
A list of `n` values is returned, the deepest value first.
"""
if n:
ret = self.frame.stack[-n:]
self.frame.stack[-n:] = []
return ret
else:
return []
\end{verbatim}
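The slicing in \texttt{popn} is worth tracing once: the popped values
come back deepest-first, which is the order instructions like
\texttt{BUILD\_LIST} expect:

\begin{verbatim}
stack = ['bottom', 1, 2, 3]
n = 2
ret = stack[-n:]      # the top n values, deepest value first
stack[-n:] = []       # remove them from the stack
print(ret)            # [2, 3]
print(stack)          # ['bottom', 1]
\end{verbatim}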
Before we get to running a frame, we need two more methods.
The first, \texttt{parse\_byte\_and\_args}, takes a bytecode, checks if
it has arguments, and parses the arguments if so. This method also
updates the frame's attribute \texttt{last\_instruction}, a reference to
the last instruction executed. A single instruction is one byte long if
it doesn't have an argument, and three bytes if it does have an
argument; the last two bytes are the argument. (This is the bytecode
format used through Python 3.5; from Python 3.6 on, every instruction
occupies two bytes.) The meaning of the
argument to each instruction depends on which instruction it is. For
example, as mentioned above, for \texttt{POP\_JUMP\_IF\_FALSE}, the
argument to the instruction is the jump target. For
\texttt{BUILD\_LIST}, the argument is the number of elements in the
list. For \texttt{LOAD\_CONST}, it's an index into the list of
constants.
Some instructions use simple numbers as their arguments. For others, the
virtual machine has to do a little work to discover what the arguments
mean. The \texttt{dis} module in the standard library exposes cheat
sheets explaining which arguments mean what, which makes our code more
compact. For example, the list \texttt{dis.hasname} tells us
that the arguments to \texttt{LOAD\_NAME}, \texttt{IMPORT\_NAME},
\texttt{LOAD\_GLOBAL}, and nine other instructions have the same
meaning: for these instructions, the argument represents an index into
the list of names on the code object.
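These cheat sheets are ordinary Python lists of opcode numbers, so we
can check membership directly:

\begin{verbatim}
import dis

load_name = dis.opmap["LOAD_NAME"]    # numeric opcode for LOAD_NAME
print(dis.opname[load_name])          # 'LOAD_NAME'
print(load_name in dis.hasname)       # True: its argument indexes co_names
print(dis.opmap["LOAD_CONST"] in dis.hasconst)   # True
\end{verbatim}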
\begin{verbatim}
class VirtualMachine(object):
[... snip ...]
def parse_byte_and_args(self):
f = self.frame
opoffset = f.last_instruction
byteCode = f.code_obj.co_code[opoffset]
f.last_instruction += 1
byte_name = dis.opname[byteCode]
if byteCode >= dis.HAVE_ARGUMENT:
# index into the bytecode
arg = f.code_obj.co_code[f.last_instruction:f.last_instruction+2]
f.last_instruction += 2 # advance the instruction pointer
arg_val = arg[0] + (arg[1] * 256)
if byteCode in dis.hasconst: # Look up a constant
arg = f.code_obj.co_consts[arg_val]
elif byteCode in dis.hasname: # Look up a name
arg = f.code_obj.co_names[arg_val]
elif byteCode in dis.haslocal: # Look up a local name
arg = f.code_obj.co_varnames[arg_val]
elif byteCode in dis.hasjrel: # Calculate a relative jump
arg = f.last_instruction + arg_val
else:
arg = arg_val
argument = [arg]
else:
argument = []
return byte_name, argument
\end{verbatim}
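We can sanity-check a hand-rolled parser against the \texttt{dis}
module, which performs the same decoding. (The exact instruction layout
varies by Python version, so we only look at which names appear.)

\begin{verbatim}
import dis

code = compile("x = 5", "<example>", "exec")
names = [ins.opname for ins in dis.get_instructions(code)]
print("LOAD_CONST" in names)   # True: 5 is loaded from co_consts
print("STORE_NAME" in names)   # True: x is stored via co_names
\end{verbatim}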
The next method is \texttt{dispatch}, which looks up the operations for
a given instruction and executes them. In the CPython interpreter, this
dispatch is done with a giant switch statement that spans 1,500 lines!
Luckily, since we're writing Python, we can be more compact. We'll
define a method for each byte name and then use \texttt{getattr} to look
it up. Like in the toy interpreter above, if our instruction is named
\texttt{FOO\_BAR}, the corresponding method would be named
\texttt{byte\_FOO\_BAR}. For the moment, we'll leave the content of
these methods as a black box. Each bytecode method will return either
\texttt{None} or a string, called \texttt{why}, which is an extra piece
of state the interpreter needs in some cases. These return values of the
individual instruction methods are used only as internal indicators of
interpreter state --- don't confuse these with return values from
executing frames.
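The \texttt{getattr} lookup itself is simple. A stripped-down sketch of
the naming convention (\texttt{MiniDispatcher} and its two methods are
illustrative, not Byterun's real instruction set):

\begin{verbatim}
class MiniDispatcher:
    def byte_LOAD_CONST(self, const):
        return None           # no special unwinding needed

    def byte_BREAK_LOOP(self):
        return 'break'        # a `why` signal for the block stack

    def dispatch(self, byte_name, argument):
        fn = getattr(self, 'byte_%s' % byte_name, None)
        if fn is None:
            raise ValueError('unsupported instruction: %s' % byte_name)
        return fn(*argument)

d = MiniDispatcher()
print(d.dispatch('LOAD_CONST', [5]))   # None
print(d.dispatch('BREAK_LOOP', []))    # break
\end{verbatim}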
\begin{verbatim}
class VirtualMachine(object):
[... snip ...]
def dispatch(self, byte_name, argument):
""" Dispatch by bytename to the corresponding methods.
Exceptions are caught and set on the virtual machine."""
# When later unwinding the block stack,
# we need to keep track of why we are doing it.
why = None
try:
bytecode_fn = getattr(self, 'byte_%s' % byte_name, None)
if bytecode_fn is None: