-
Notifications
You must be signed in to change notification settings - Fork 169
/
Copy pathpe_file_format.tex.bak
457 lines (381 loc) · 17.9 KB
/
pe_file_format.tex.bak
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
\subsection{Overview and Headers}
\begin{frame}
\frametitle{Portable Executable File Format}
\begin{block}{History}
Microsoft based the PE file format on the Unix COFF file format. As such it is sometimes referred to as PE/COFF.
\end{block}
\begin{itemize}
\item \textit{Portable} in PE means
\pedbullet{Supports both 32-bit and 64-bit}
\pedbullet{Supports MIPS, DEC Alpha, PowerPC and ARM}
\item Files with .exe extensions are PE
\item Dynamic Link Libraries (DLLs) are PE
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{PE Format Layout}
\begin{center}
\includegraphics<1>[height=7cm]{images/pe_format/PE_Format.pdf}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{DOS and NT Headers Overview}
\begin{itemize}
\item These headers contain the very basic information to process \emph{PE} files
\item A \emph{PE} file begins with the \emph{DOS} stub, usually responsible for the \emph{
ÒThis program cannot be run in DOS modeÓ} message as well as location the \emph{PE} headers
\item The PE headers contain the bulk of the information about the PE file
\item Different sets of headers will be present depending on the type of data the PE file represents (an executable, a DLL, an object file)
\end{itemize}
\end{frame}
\begin{frame}[t]
\frametitle{DOS and NT Headers View}
\begin{columns}[T]
\begin{column}{6cm}
\begin{itemize}
\item PE files start with the DOS Header.
\begin{itemize} \item e\_magic = \emph{4D5Ah} \alert{MZ} \end{itemize}
\item NT headers comprise the FILE and the OPTIONAL headers.
\begin{itemize} \item Signature = \emph{5045h} \alert{PE} \end{itemize}
\end{itemize}
\end{column}
\begin{column}{7cm}
\includegraphics<1>[width=6cm]{images/pe_format/PE-DOS_Header.pdf}%
\end{column}
\end{columns}
\end{frame}
\begin{frame}
\frametitle{NT Headers: File Header View}
\begin{center}
\pgfimage[width=11cm]{images/pe_format/PE-FILE_Header}
\end{center}
\end{frame}
\begin{frame}
\frametitle{NT Headers: File Header}
\begin{itemize}
\item It's the first of the \emph{NT Headers} and \emph{File Header} follows immediately after the \emph{PE Signature}
\item Contains some interesting fields
\begin{itemize}
\item \emph{Machine} indicates the target architecure for this file
\item \emph{NumberOfSections}, the number of sections in the \emph{PE file}. This value is needed when exploring the section headers
\item \emph{TimeDateStamp} is not of a critical importance, but some malware actually seems not to zero it so it might give some insight on the approximate release time... can be easily faked too
\item \emph{SizeOfOptionalHeader} is an important element. Provides the exact size of the \emph{Optional Header} which is needed in order to properly parse the \emph{PE} file
\end{itemize}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{NT Headers: Optional Header View}
\begin{center}
\pgfimage[height=7cm]{images/pe_format/PE-OPTIONAL_Header}
\end{center}
\end{frame}
\begin{frame}
\frametitle{NT Headers: Optional Header (1)}
\begin{itemize}
\item \emph{Magic} = \emph{10Bh} (PE32+ \emph{0x20b})
\item \emph{AddressOfEntryPoint} is where execution of the executable code will begin (\alert{it's possible for other code within the executable to gain control before the entry point})
\item \emph{ImageBase}. All relative address are based on this one. It's also usually possible to find the \emph{PE} header of the executable at this address in memory (unless it has been intentionally deleted)
\item \emph{SectionAlignment} is the alignment of the sections in memory
\item \emph{FileAlignment} is the alignment on disk
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{NT Headers: Optional Header (2)}
\begin{itemize}
\item \emph{Operating system related fields} containing version specific information
\item \emph{NumberOfRvaAndSizes} is the number of directory entries in the following array. Depending on how many there are the size of the \emph{Optional Header} will vary, something that some tools sometimes forget (assuming a constant default size)
\item \emph{DataDirectory} is an array of structures pointing to additional information such as the \emph{Imports} and \emph{Exports} tables.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Section Header View}
\begin{center}
\pgfimage[width=10cm]{images/pe_format/PE-SECTION_Header}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Section Header}
\begin{itemize}
\item \emph{VirtualSize} is the size of the section once loaded in memory (can be bigger than \emph{SizeOfRawData}, in that case it's zero padded)
\item \emph{VirtualAddress} is the address of the section in memory, relative to the \emph{ImageBase}
\item \emph{SizeOfRawData} is the size of the section on disk (can be bigger than \emph{VirtualSize} due that its size is rounded at a \emph{FileAlignment} multiple)
\item \emph{PointerToRawData} is the offset within the file to the contents to be loaded in memory (\emph{should} be a multiple of \emph{VirtualSize})
\item \emph{Characteristics} contains flags with information such as whether the section can be executed, read, written into, etc.
\end{itemize}
\end{frame}
\subsection{Interactive Walkthrough}
\begin{frame}
\frametitle{DOS Header (1)}
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_01}
\end{center}
\end{frame}
\begin{frame}
\frametitle{DOS Header (2)}
\framezoom<1><2>[border](2.6cm,0cm)(7cm,3cm)
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_01A}
\end{center}
\end{frame}
\begin{frame}
\frametitle{NT Headers (1)}
\framezoom<1><2>[border](2.45cm,.5cm)(6.8cm,3cm)
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_02}
\end{center}
\end{frame}
\begin{frame}
\frametitle{NT Headers (2)}
\framezoom<1><2>[border](2.45cm,0.4cm)(7.3cm,3cm)
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_03}
\end{center}
\end{frame}
\begin{frame}
\frametitle{NT Headers (3)}
\framezoom<1><2>[border](2.28cm,.6cm)(8cm,3cm)
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_03A}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Optional Header (1)}
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_04}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Optional Header (2)}
\framezoom<1><2>[border](1.3cm,2.67cm)(7cm,3cm)
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_05}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Directories (1)}
\begin{center}
\pgfimage[height=8cm]{images/pe_file_inspection/slide_06}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Directories (2)}
\framezoom<1><2>[border](1.33cm,2.5cm)(7.9cm,4.6cm)
\framezoom<1><3>[border](1cm,2.88cm)(2.7cm,3.5cm)
\framezoom<1><4>[border](4.1cm,2.7cm)(2.5cm,3.2cm)
\begin{center}
\pgfimage[height=7cm]{images/pe_file_inspection/slide_07}
\end{center}%
\end{frame}
\begin{frame}
\frametitle{Section Headers (1)}
\begin{center}
\pgfimage[height=7cm]{images/pe_file_inspection/slide_08}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Section Headers (2)}
\begin{center}
\pgfimage[height=7cm]{images/pe_file_inspection/slide_09}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Section Headers (3)}
\framezoom<1><2>[border](2.67cm,3.76cm)(5.6cm,3.3cm)
\begin{center}
\pgfimage[height=7cm]{images/pe_file_inspection/slide_10}
\end{center}
\end{frame}
\subsection{Import/Export Address Tables}
\begin{frame}
\frametitle{Overview of the Import Address Table}
\begin{itemize}
\item The primary function of the \emph{Import Table} is to provide enough information to the loader to locate the API functions and other symbols needed by the executable
\item It also provides us with a summary of the range of actions used by the executable
\item Therefore hiding/obfuscating the \emph{Import Address Table} (\emph{IAT}) is a common technique in order to deprive analysts of a quick outlook
\item The \emph{IAT} can be rebuilt by different packers/obfucators with varying degrees of complexity
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The Import Address Table Structures View}
\begin{center}
\includegraphics<1>[width=11cm]{images/pe_format/PE-Imports.pdf}
\end{center}
\end{frame}
\begin{frame}
\frametitle{The Import Address Table Structures Commented}
\begin{block}{}
The \emph{Import Address Table} information is distributed among three different structures. Repeated as necessary to describe the composing elements
\end{block}
\begin{itemize}
\item The \emph{IMAGE\_IMPORT\_DESCRIPTOR} contains information about the \emph{DLL} containing the symbols to import
\item The \emph{IMAGE\_THUNK\_DATA} contains information about the specific symbol imported
\emph And finally, \emph{IMAGE\_IMPORT\_BY\_NAME} contains the name of the imported symbol if it's imported by name and not by ordinal alone
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_IMPORT\_DESCRIPTOR Structure}
\begin{center}
\includegraphics<1>[width=11cm]{images/pe_format/PE-Image_Import_Descriptor.pdf}
\end{center}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_IMPORT\_DESCRIPTOR Structure Commented}
\begin{block}{}
The \emph{IMAGE\_IMPORT\_DESCRIPTOR} contains information about the \emph{DLL} containing the symbols to import
\end{block}
\begin{itemize}
\item \emph{OriginalFirstThunk} contains the relative address of the import table (a \emph{NULL} terminated array of \emph{IMAGE\_THUNK\_DATA} structures) containing the symbols to be imported
\item \emph{Name} is the relative address of the name of the \emph{DLL} from which to import the symbols
\item \emph{FirstThunk} is normally identical to \emph{OriginalFirstThunk} except after the imports have been resolved, when it will contain the addressses of the symbols
\item If the imports are bound the \emph{TimeDateStamp} field will contain the timestamp of the referred DLL
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_THUNK\_DATA Structure}
\begin{center}
\includegraphics<1>[width=11cm]{images/pe_format/PE-Image_Thunk_Data.pdf}
\end{center}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_THUNK\_DATA Structure Commented}
\begin{block}{}
The \emph{IMAGE\_THUNK\_DATA} contains information about the specific symbol imported
\end{block}
\begin{itemize}
\item \emph{ForwarderString} is not really used, \emph{Microsoft}'s docs no longer even mention this field
\item \emph{Function} points to the data for the imported symbol when the image is bound or it has been resolved
\item \emph{Ordinal} of the symbol to import
\item \emph{AddressOfData} points to a \emph{IMAGE\_IMPORT\_BY\_NAME} structure with the name of the symbol to import
\item The symbol is either imported by ordinal or name, this is indicated by the most significant bit, if set indicates the entry should be imported by ordinal and by name otherwise
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_IMPORT\_BY\_NAME Structure Commented}
\begin{block}{}
\emph{IMAGE\_IMPORT\_BY\_NAME} contains the name of the symbol to import
\end{block}
\begin{itemize}
\item \emph{Hint} is an index into the exported symbols table of the imported \emph{DLL}. Its purpose is to speed up load, if the symbol at that index matches the name then a sequential lookup can be skipped
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The Import Address Table}
\begin{block}{}
Executables wanting to hide imported symbols can resort to a large number of tricks. Usually they will resolve the imported symbols themselves and thus the \emph{IAT} will appear nearly empty. Some of the most popular ways of building the \emph{IAT} are
\end{block}
\begin{itemize}
\item Manually going through the \emph{LoadLibrary}, \emph{GetProcAddress} sequence for all symbols
\item Looking them up through hashes of their names
\item Looking them up through signatures of their code
\end{itemize}
Once mapped, they can be integrated into the binary through
\begin{itemize}
\item Peculiar jump tables
\item Skipping the \emph{DLL} function's entry point. It confuses import rebuilding techniques \emph{searching for} or \emph{hooking at} known \emph{DLL} function entry points
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The Export Table}
\begin{block}{}
Executable files such as applications and \emph{DLL}s can export symbols for other components to import
\end{block}
\begin{itemize}
\item Both EXEs and DLLs can export symbols although EXEs rarely do so
\item DLLs need to export symbols in order for other executables to import them
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The Export Table Structures}
\begin{center}
\includegraphics<1>[width=11cm]{images/pe_format/PE-Exports.pdf}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_EXPORT\_DIRECTORY Structure Commented}
\begin{block}{}
The \emph{IMAGE\_EXPORT\_DIRECTORY} contains the information and pointers to code, the name and the ordinal for all exported symbols in the executable
\end{block}
\begin{itemize}
\item \emph{Characteristics}, this field should always be 0 according to the specification
\item \emph{TimeDateStamp}, \emph{MajorVersion} and \emph{MinorVersion} are self explanatory
\item \emph{Name} points to a relative address containing the name of the \emph{DLL}
\item \emph{NumberOfFunctions} is the number of pointers in the \emph{AddressOfFunctions} table
\item \emph{NumberOfNames} is the number of entries in the \emph{AddresOfNames} table
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The IMAGE\_EXPORT\_DIRECTORY Structure Commented}
\begin{block}{}
\emph{AddressOfNames} and \emph{AddressOfNameOrdinals} run parallel, having an entry for each exported symbol, the name can be \emph{NULL}, the ordinal in \emph{AddressOfNameOrdinals} is used to find the symbol's data by indexing with it \emph{AddressOfFunctions}
\end{block}
\begin{itemize}
\item \emph{AddressOfFunctions} points to an array of pointers containing the exported entries, it's indexed by an ordinal
\item \emph{AddressOfNames} points to an array of pointers containing the names of the exported entries
\item \emph{AddressOfNameOrdinals} points to an array of ordinals used to find the address of the exported data
\end{itemize}
\end{frame}
\subsection{Updated PE32+ and Usage Examples}
\begin{frame}
\frametitle{The Updated PE32+}
\begin{itemize}
\item The PE format has been expanded by Microsoft to accomodate for 64-bit architectures
\item While on some other aspects the executables have changed, most of the PE headers remain largely untouched
\item As a rule of thumb fields involving absolute addresses have been expanded to 8 bytes to accomodate for the 64-bit wide address space
\item Fields containing RVAs remain as 4 bytes as the maximum image size is limited to 2GB and only 31-bits are necessary to address it all with relative addresses
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Updated Fields in the Optional Header}
\begin{itemize}
\item Optional header's \emph{Magic} number is in PE32+ \emph{0x20b}
\item \emph{BaseOfData} has been removed from the Optional Header
\item The following are now 64-bit wide \emph{ImageBase, SizeOfStackReserve, SizeOfStackCommit, SizeOfHeapReserve, SizeOfHeapCommit}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Other Updated Fields}
\begin{itemize}
\item The following items have been updated in the IMAGE\_TLS\_DIRECTORY structure: \emph{StartAddressOfRawData, EndAddressOfRawData, AddressOfIndex, AddressOfCallBacks}
\item And the following in the IMAGE\_LOAD\_CONFIG\_DIRECTORY: \emph{SecurityCookie, SEHandlerTable, SEHandlerCount}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Curiosities of PE32+}
\begin{itemize}
\item Due to the new way of handling exceptions, the Exception Directory is supposed to contain most of the functions of the binary, more specifically, the "non-leaf" ones
\item This enables tools to readily know most of the functions within a binary without having to rely on discovery through disassembly
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{The Tiny PE challenge}
Solareclipse took on the \emph{Tiny PE} challenge set by Dil Dabah about creating the smallest valid PE file. The result was:
\begin{itemize}
\item The smallest possible PE file: 97 bytes
\item The smallest possible PE file on Windows 2000: 133 bytes
\item The smallest possible PE file that downloads and executes a file over WebDAV: 133 bytes
\end{itemize}
\pedref{tinype}
\end{frame}
\begin{frame}
\frametitle{The Tiny PE challenge}
In order to achieve such small sizes the following steps were taken:
\begin{itemize}
\item Decreasing file alignment
\item Removing the DOS stub
\item Removing data directories
\item Merging the section header within the Optional Header
\item Merging the import table within the Optional Header
\item Merging the IAT and DLL name
\item Reusing the storage provided by non-used fields, only two matter in the DOS header
\item Reducing the Optional Header size by truncating it at the minimum necessary length
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Additional Resources}
\begin{block}<1->{Microsoft Portable Executable and Common Object File Format Specification}
\pedref{MSPECOFF}
\end{block}
\begin{block}<1->{Portable Executable File Format Ð A Reverse Engineer View}
\pedref{CBJ-PE}
\end{block}
\end{frame}