@@ -18,59 +18,79 @@ gcc -g opcodes.cpp -o opcodes
18
18
gdb opcodes -batch -ex "set disassembly-flavor intel" -ex "disass /r opcodes" > opcodes.intel
19
19
20
20
# Parse disassembly and generate code
21
- cat opcodes.intel | dotnet run > ../amd64InstrDecode.h
21
+ # Build as a separate step so it will display build errors, if any.
22
+ ../../../../../../dotnet.sh build
23
+ cat opcodes.intel | ../../../../../../dotnet.sh run > new_amd64InstrDecode.h
22
24
```
23
25
26
+ After checking it, copy the generated `new_amd64InstrDecode.h` to `../amd64InstrDecode.h`.
27
+
28
+ This process can be run using the ` createTables.sh ` script in this directory.
29
+
24
30
## Technical design
25
31
26
- ` amd64InstrDecode.h ` 's primary purpose is to provide a reliable
27
- and accurately mechanism to implement
28
- ` Amd64 NativeWalker::DecodeInstructionForPatchSkip(..)` .
32
+ The primary purpose of ` amd64InstrDecode.h ` is to provide a reliable
33
+ and accurate mechanism to implement the ` amd64 `
34
+ `NativeWalker::DecodeInstructionForPatchSkip(..)` function.
29
35
30
36
This function needs to be able to decode an arbitrary ` amd64 `
31
37
instruction. The decoder currently must be able to identify:
32
38
33
- - Whether the instruction includes an instruction pointer relative memory access
39
+ - Whether the instruction includes an instruction pointer relative memory access (RIP relative addressing)
34
40
- The location of the memory displacement within the instruction
35
41
- The instruction length in bytes
36
42
- The size of the memory operation in bytes
37
43
38
- To get this right is complicated, because the ` amd64 ` instruction set is
39
- complicated.
44
+ Getting this right is difficult because the `amd64` instruction set is complex.
45
+
46
+ A high level view of the ` amd64 ` instruction set can be seen by looking at:
47
+
48
+ ` AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions `
49
+ ` Section 1.1 Instruction Encoding Overview `
50
+ ` Figure 1-1. Instruction Encoding Syntax `
51
+
52
+ also:
53
+
54
+ ` Intel(R) 64 and IA-32 Architectures Software Developer's Manual `
55
+ ` Volume 2: Instruction Set Reference, A-Z `
56
+ ` Chapter 2 Instruction Format `
40
57
41
- A high level view of the ` amd64 ` instruction set can be seen by looking at
42
- `AMD64 Architecture Programmer’s
43
- Manual Volume 3:
44
- General-Purpose and System Instructions`
45
- ` Section 1.1 Instruction Encoding Overview `
46
- ` Figure 1-1. Instruction Encoding Syntax `
58
+ Also useful in the manuals:
59
+
60
+ AMD: ` Appendix A: Opcode and Operand Encodings `
61
+ Intel: ` Volume 2, Appendix A: Opcode Map `
47
62
48
63
The general behavior of each instruction can be modified by many of the
49
- bytes in the 1-15 byte instruction.
64
+ bytes in the 1-15 byte instruction (15 bytes is the maximum instruction length).
50
65
51
66
This set of files generates a metadata table by extracting the data from
52
67
sample instruction disassembly.
53
68
54
- The process entails
69
+ The process entails:
55
70
- Generating a necessary set of instructions
56
71
- Generating parsable disassembly for the instructions
57
72
- Parsing the disassembly
73
+ - Generating the tables
58
74
59
75
### Generating a necessary set of instructions
60
76
77
+ Which set of possible instruction encodings is needed to extract the information
78
+ needed in the tables?
79
+
61
80
#### The necessary set
62
81
63
- - All instruction forms which use instruction pointer relative memory accesses.
82
+ - All instruction forms which use RIP relative memory accesses.
64
83
- All combinations of modifier bits which affect the instruction form
65
84
- presence and/or size of the memory access
66
85
- size or presence of immediates
86
+ - vector size bits
67
87
68
- So with modrm.mod = 0, modrm.rm = 0x5 (instruction pointer relative memory access)
88
+ So with modrm.mod = 0, modrm.rm = 0x5 (RIP relative memory access)
69
89
we need all combinations of:
70
90
- ` opcodemap `
71
91
- ` opcode `
72
92
- ` modrm.reg `
73
- - ` pp ` , ` W ` , ` L `
93
+ - ` pp ` , ` W ` , ` L ` , ` L'L `
74
94
- Some combinations of ` vvvv `
75
95
- Optional prefixes: ` repe ` , ` repne ` , ` opSize `
76
96
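As a rough illustration of the `modrm` constraint above, the following Python sketch enumerates the eight ModRM bytes that have `modrm.mod = 0` and `modrm.rm = 0x5`, one for each `modrm.reg` value. The function names are hypothetical and are not taken from the generator; only the field values come from the text.

```python
# Hypothetical helpers; only the mod/rm values come from the README text.
RIP_RELATIVE_MOD = 0b00   # modrm.mod
RIP_RELATIVE_RM = 0b101   # modrm.rm


def rip_relative_modrm_bytes():
    """Return the ModRM bytes with mod = 0 and rm = 5, one per modrm.reg value."""
    return [(RIP_RELATIVE_MOD << 6) | (reg << 3) | RIP_RELATIVE_RM
            for reg in range(8)]


def is_rip_relative(modrm):
    """True when a ModRM byte selects RIP relative addressing in 64-bit mode."""
    return (modrm >> 6) == RIP_RELATIVE_MOD and (modrm & 0b111) == RIP_RELATIVE_RM


print([hex(b) for b in rip_relative_modrm_bytes()])
```

Each of these bytes would then be combined with every opcode map, opcode, and prefix combination in the list above.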
@@ -80,7 +100,7 @@ We will iterate through all the necessary set. Many of these combinations
80
100
will lead to invalid/undefined encodings. This will cause the disassembler
81
101
to give up and mark the disassembly as bad.
82
102
83
- The disassemble will then resume trying to disassemble at the next boundary.
103
+ The disassembler will then resume disassembling at the next boundary.
84
104
85
105
To make sure the disassembler attempts to disassemble every instruction,
86
106
we need to make sure the preceding instruction is always valid and terminates
@@ -97,8 +117,7 @@ instruction.
97
117
98
118
Using a fixed suffix makes disassembly parsing simpler.
99
119
100
- After the modrm byte, the generated instructions always include a
101
- ` postamble ` ,
120
+ After the modrm byte, the generated instructions always include a `postamble`:
102
121
103
122
``` C++
104
123
const char* postamble = "0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59,\n";
@@ -109,14 +128,14 @@ This meets the padding consistency needs.
109
128
#### Ordering
110
129
111
130
As a convenience to the parser the encoded instructions are logically
112
- ordered. The ordering is generally, but can vary slightly depending on
131
+ ordered. The ordering is generally as follows , but can vary slightly depending on
113
132
the needs of the particular opcode map:
114
133
115
134
- map
116
135
- opcode
117
136
- pp & some prefixes
118
137
- modrm.reg
119
- - W, L, vvvv
138
+ - W, L, L'L, vvvv
120
139
121
140
This is to keep related instructions grouped together.
122
141
@@ -160,6 +179,7 @@ their sizes. For instance:
160
179
- "OWORD PTR [rip+0x53525150]"
161
180
- "XMMWORD PTR [rip+0x53525150]"
162
181
- "YMMWORD PTR [rip+0x53525150]"
182
+ - "ZMMWORD PTR [rip+0x53525150]"
163
183
- "FWORD PTR [rip+0x53525150]"
164
184
- "TBYTE PTR [rip+0x53525150]"
165
185
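The operand strings above carry two pieces of information: a size keyword and a displacement. The displacement reads `0x53525150` because the first four postamble bytes (`0x50, 0x51, 0x52, 0x53`) are consumed as a little-endian 32-bit value. The Python sketch below shows one way such a string could be decoded. The `PTR_SIZES` table and the function name are assumptions made for illustration, not the generator's actual code.

```python
import re

# Assumed mapping from Intel-syntax size keywords to sizes in bytes.
PTR_SIZES = {
    "BYTE": 1, "WORD": 2, "DWORD": 4, "QWORD": 8,
    "OWORD": 16, "XMMWORD": 16, "YMMWORD": 32, "ZMMWORD": 64,
    "FWORD": 6, "TBYTE": 10,
}

RIP_OPERAND = re.compile(r"(\w+) PTR \[rip\+0x([0-9a-f]+)\]")


def parse_rip_operand(text):
    """Return (size_in_bytes, displacement), or None if text has no RIP relative operand."""
    match = RIP_OPERAND.search(text)
    if match is None:
        return None
    # Unknown keywords yield a size of None rather than a guess.
    return PTR_SIZES.get(match.group(1)), int(match.group(2), 16)


# The first four postamble bytes, read as a little-endian displacement.
displacement = int.from_bytes(bytes([0x50, 0x51, 0x52, 0x53]), "little")
print(hex(displacement))
print(parse_rip_operand("ZMMWORD PTR [rip+0x53525150]"))
```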
@@ -177,7 +197,7 @@ gdb opcodes -batch -ex "set disassembly-flavor intel" -ex "disass /r opcodes" >
177
197
178
198
#### Alternative disassemblers
179
199
180
- It seems ` objdump ` could provide similar results. Untested, the parser may need to
200
+ It seems ` objdump ` could provide similar results. This is untested. The parser may need to
181
201
be modified for subtle differences.
182
202
``` bash
183
203
objdump -D -M intel --insn-width=15 -j .data opcodes
@@ -186,27 +206,28 @@ objdump -D -M intel -b --insn-width=15 -j .data opcodes
186
206
The lldb parser aborts parsing when it encounters a bad instruction. It
187
207
might be usable with additional python scripts.
188
208
189
- Windows disassembler may also work. Not attempted .
209
+ A Windows disassembler may also work. This has not been tried.
190
210
191
211
### Parsing the disassembly
212
+
192
213
``` bash
193
214
# Parse disassembly and generate code
194
215
cat opcodes.intel | dotnet run > ../amd64InstrDecode.h
195
216
```
217
+
196
218
#### Finding relevant disassembly lines
197
219
198
220
We are not interested in all lines in the disassembly. The disassembler's
199
- stray comments, recovery and our padding introduce lines we need to ignore.
221
+ stray comments, recovery, and our padding introduce lines we need to ignore.
200
222
201
223
We filter out and ignore non-disassembly lines using a ` Regex ` for a
202
224
disassembly line.
203
225
204
226
We expect the generated instruction samples to be in a group. The first
205
- instruction in the group is the only one we are interested in. This is
206
- the one we are interested in.
227
+ instruction in the group is the only one we are interested in.
207
228
208
229
The group is terminated by a pair of instructions. The first terminal
209
- instruction must have ` 0x58 ` as the last byte in it encoding. The final
230
+ instruction must have ` 0x58 ` as the last byte in its encoding. The final
210
231
terminal instruction must be a ` 0x59\tpop ` .
211
232
212
233
We continue parsing the first line of each group.
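The grouping rule above can be sketched in Python as follows. The regular expression assumes that each `disass /r` line has the shape address, `<+offset>:`, a tab, the raw bytes in hex, a tab, and the instruction text. That shape, the sample lines, and all names here are assumptions for illustration; the real parser is the `dotnet` tool and may differ.

```python
import re

# Assumed line shape: "   0x... <+N>:\t<hex bytes>\t<instruction text>"
DISASM_LINE = re.compile(
    r"^\s*0x[0-9a-f]+ <\+\d+>:\t(?P<bytes>(?:[0-9a-f]{2} ?)+)\t(?P<text>.*)$")


def first_of_each_group(lines):
    """Yield (raw_bytes, text) for the first instruction of each terminated group."""
    group = []
    for line in lines:
        match = DISASM_LINE.match(line)
        if match is None:
            continue  # not a disassembly line: headers, comments, other noise
        raw = match.group("bytes").split()
        text = match.group("text")
        group.append((raw, text))
        # Terminator: an instruction whose last byte is 0x58, then a lone `59 pop`.
        if (len(group) >= 2 and raw == ["59"] and text.startswith("pop")
                and group[-2][0][-1] == "58"):
            yield group[0]
            group = []


# Made-up sample in the assumed format: one generated instruction plus padding.
SAMPLE = [
    "Dump of assembler code for function opcodes:",
    "   0x0000000000004010 <+0>:\t00 05 50 51 52 53\tadd    BYTE PTR [rip+0x53525150],al",
    "   0x0000000000004016 <+6>:\t54\tpush   rsp",
    "   0x0000000000004017 <+7>:\t55\tpush   rbp",
    "   0x0000000000004018 <+8>:\t56\tpush   rsi",
    "   0x0000000000004019 <+9>:\t57\tpush   rdi",
    "   0x000000000000401a <+10>:\t58\tpop    rax",
    "   0x000000000000401b <+11>:\t59\tpop    rcx",
]

for raw, text in first_of_each_group(SAMPLE):
    print(" ".join(raw), "->", text)
```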
@@ -216,7 +237,7 @@ We continue parsing the first line of each group.
216
237
Many encodings are not valid. For ` gdb ` , these instructions are marked
217
238
` (bad) ` . We filter and ignore these.
218
239
219
- #### Parsing the disassambly for each instruction sample
240
+ #### Parsing the disassembly for each instruction sample
220
241
221
242
For each sample, we need to calculate the important properties:
222
243
- mnemonic
@@ -226,9 +247,9 @@ For each sample, we need to calculate the important properties:
226
247
- map
227
248
- opcode position
228
249
- Encoding Flags
229
- - pp, W, L, prefix and encoding flags
250
+ - pp, W, L, L'L, prefix and encoding flags
230
251
- ` SuffixFlags `
231
- - presence of instruction relative accesses
252
+ - presence of RIP relative accesses
232
253
- size of operation
233
254
- position in the list of operands
234
255
- number of immediate bytes
@@ -242,37 +263,32 @@ unknown sizes.
242
263
243
264
#### ` opCodeExt `
244
265
245
- To facilitate identifying sets of instructions, the creates an ` opCodeExt ` .
266
+ To facilitate identifying sets of instructions, the tool creates an ` opCodeExt ` .
246
267
247
268
For the ` Primary ` map this is simply the encoded opcode from the instruction
248
269
shifted left by 4 bits.
249
270
250
- For the 3D Now ` NOW3D ` map this is simply the encoded immediate from the
251
- instruction shifted left by 4 bits.
252
-
253
- For the ` Secondary ` ` F38 ` , and ` F39 ` maps this is the encoded opcode from
254
- the instruction shifted left by 4 bits orred with a synthetic ` pp ` . The
271
+ For the `Secondary`, `F38`, and `F3A` maps this is the encoded opcode from
272
+ the instruction shifted left by 4 bits or'ed with a synthetic ` pp ` . The
255
273
synthetic ` pp ` is constructed to match the rules of
256
274
` Table 1-22. VEX/XOP.pp Encoding ` from the
257
- `AMD64 Architecture Programmer’s
258
- Manual Volume 3:
259
- General-Purpose and System Instructions`. For the case where the opSize
260
- 0x66 prefix is present with a ` rep* ` prefix, the ` rep* ` prefix is used
261
- to encode ` pp ` .
275
+ ` AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions ` .
276
+ For the case where the opSize 0x66 prefix is present with a ` rep* ` prefix, the ` rep* ` prefix is used
277
+ to encode ` pp ` .
262
278
263
- For the ` VEX* ` and ` XOP* ` maps this is the encoded opcode from
264
- the instruction shifted left by 4 bits orred with ` pp ` .
279
+ For the ` VEX* ` maps this is the encoded opcode from
280
+ the instruction shifted left by 4 bits or'ed with ` pp ` .
265
281
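A minimal Python sketch of the `opCodeExt` computation described above, assuming the standard VEX/XOP `pp` values (0 for no prefix, 1 for `0x66`, 2 for `0xF3`, 3 for `0xF2`). The function names are hypothetical. The order in which `0xF3` and `0xF2` are checked when both are present is an arbitrary choice made here and is not specified by the text.

```python
# VEX/XOP.pp values: 0 = no prefix, 1 = 0x66, 2 = 0xF3, 3 = 0xF2.
PP_FOR_PREFIX = {0x66: 1, 0xF3: 2, 0xF2: 3}


def synthetic_pp(prefixes):
    """Derive pp from legacy prefix bytes; a rep* prefix takes priority over 0x66."""
    for prefix in (0xF3, 0xF2, 0x66):
        if prefix in prefixes:
            return PP_FOR_PREFIX[prefix]
    return 0


def opcode_ext(opcode, pp=0):
    """Shift the opcode left by 4 bits and OR in pp (0 for the Primary map)."""
    return (opcode << 4) | pp


print(hex(opcode_ext(0x10)))                              # Primary map, no pp
print(hex(opcode_ext(0x10, synthetic_pp({0x66, 0xF3}))))  # rep* wins over 0x66
```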
266
282
#### Identifying sets of instructions
267
283
268
- For most instructions, the opCodeExt will uniquely identify the instruction.
284
+ For most instructions, the ` opCodeExt ` will uniquely identify the instruction.
269
285
270
- For many instructions, ` modrm.reg ` is used to uniquely identify the instruction.
286
+ For many instructions, ` modrm.reg ` is used to help uniquely identify the instruction.
271
287
These instructions typically change mnemonic and behavior as `modrm.reg`
272
288
changes. These become problematic when the form of these instructions varies.
273
289
274
- For a few other instructions the ` L ` , ` W ` , ` vvvv ` value may the instruction
275
- change behavior. Usually these do not change mnemonic.
290
+ For a few other instructions the ` L ` , ` W ` , ` vvvv ` values may change the instruction
291
+ behavior. Usually these do not change the mnemonic.
276
292
277
293
The set of instructions is therefore usually grouped by the opcode map and
278
294
` opCodeExt ` generated above. For these a change in ` opCodeExt ` or ` map `
@@ -291,7 +307,9 @@ set based on the encoding flags. These are the `sometimesFlags`
291
307
` sometimesFlags ` , check each rule by calling ` TestHypothesis ` . This
292
308
determines if the rule corresponds to the set of observations.
293
309
294
- Encode the rule as a string.
310
+ Encode the rule as a string. The rule might encode that the ` W ` bit or ` L `
311
+ bit causes a different memory/immediate behavior for the particular
312
+ ` <map, opCodeExt> ` entry.
295
313
296
314
Add the rule to the set of all observed rules.
297
315
Add the set's rule with comment to a dictionary.
@@ -328,7 +346,7 @@ samples, is restricted to a max compilation/link unit size. Early drafts
328
346
were generating more instructions, and couldn't be compiled.
329
347
330
348
However, there is no restriction that all the samples must come from
331
- single executable. These could easily be separated by opcode map...
349
+ a single executable. These could easily be separated by opcode map.
332
350
333
351
## Risks
334
352
@@ -342,7 +360,7 @@ Until a reasonably featured disassembler is created, the new instruction
342
360
set cannot be supported by this methodology.
343
361
344
362
The previous methodology of manually encoding these new instruction sets
345
- would still be possible....
363
+ would still be possible.
346
364
347
365
### Disassembler errors
348
366
@@ -351,6 +369,7 @@ of the disassembler may have disassembly bugs. Using newer disassemblers
351
369
would mitigate this to some extent.
352
370
353
371
### Bugs
372
+
354
373
- Inadequate samples. Are there other bits which modify instruction
355
374
behavior which we missed?
356
375
- Parser/Table generator implementation bugs. Does the parser do what it
@@ -368,4 +387,4 @@ Regenerate and compare.
368
387
369
388
### New debugger feature requires more metadata
370
389
371
- Add new feature code, regenerate
390
+ Add new feature code, regenerate.