iferret-logging-new - Tracing and logging version of QEMU
This contains the modified version of QEMU (branched from iFerret
many moons ago) that performs extensive logging. Thanks to the
integration of Tim & Michael's op-switching patch, it's not too
slow.
There are scripts called startqemu{vnc,linux,haiku,osx}.sh in this
directory that will boot QEMU with the appropriate image and load
the snapshot called "introprog", which I usually have in a state
where the program I want to trace is ready to run. The scripts take
one argument, which is the name to use for the log file.
This directory also includes code to read the log files generated by
iFerret. oiferret (created by Makefile-oiferret) reads in the log
and prints the entries to stdout. Using Makefile-iferretso, you
can also build a shared library that provides access to the logfiles
from Python. The resulting .so file is symlinked into the dynslicer
directory, which also has a ctypes-based Python wrapper for the
library.
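The ctypes wrapper follows the usual pattern: load the shared object,
declare each function's argument and return types, then call it like a
normal Python function. The symbols exported by iferret.so aren't
listed here, so this sketch uses libc's abs as a stand-in just to show
the pattern:

```python
import ctypes

# For the real thing this would be ctypes.CDLL("./iferret.so");
# here the C library (visible via the current process) stands in.
libc = ctypes.CDLL(None)

# Declare the prototype so ctypes converts and checks types for us.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

print(libc.abs(-42))  # prints 42
```

The same argtypes/restype declarations are what let a wrapper hand
back structured log entries instead of raw pointers.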
Finally, make_iferret_code.pl has been updated so that it also
outputs a python module that contains the current op enumeration.
This, too, is symlinked into the dynslicer directory.
dynslicer - Slicing and translation code
This is where the bulk of the magic happens. The main code, which
lives in newslice.py, does roughly the following:
1. Load the trace from a binary file (using iferretpy.py)
2. Find the inputs and outputs
3. Filter out interrupts
4. Replace malloc-related functions with summaries
5. Perform some QEMU-specific fixups on the trace (e.g.,
splitting TBs that have internal jumps).
6. Do an initial slice on the trace with respect to the output
buffer.
7. Examine the CFG, and do a slice on any control-flow
instructions that were observed to go more than one
direction.
8. Ensure that if an instruction is marked in *any* instance of
a TB, it is marked in *all* instances of that TB. Do
additional slicing to pull in the dependencies of these newly
added instructions, and iterate until we hit a fixed point.
9. Read the userland memory (by looking at LD instructions), and
store it in a dictionary, so that modules that require static
data (for example, a static string) can still function
correctly.
10. Translate the code, using the translations defined in
translate_uop.py.
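The core of steps 6-8 is a backward slice driven by per-op def/use
sets. A toy version over a straight-line trace (op names and trace
contents invented for illustration; the real algorithm additionally
iterates over TB instances until a fixed point, as step 8 describes):

```python
# Toy trace: each entry is (defs, uses) for one executed instruction.
trace = [
    ({"a"}, {"in"}),      # 0: a = f(in)
    ({"b"}, {"a"}),       # 1: b = g(a)
    ({"c"}, {"in"}),      # 2: c = h(in)   -- dead w.r.t. the output
    ({"out"}, {"b"}),     # 3: out = b
]

def backward_slice(trace, criterion):
    marked = set()
    live = set(criterion)          # locations we still need defined
    for i in range(len(trace) - 1, -1, -1):
        defs, uses = trace[i]
        if defs & live:            # this entry defines something we need
            marked.add(i)
            live -= defs
            live |= uses           # now we need its inputs instead
    return marked

print(sorted(backward_slice(trace, {"out"})))  # prints [0, 1, 3]
```

Entry 2 is dropped because nothing on the path to the output buffer
depends on it -- exactly what slicing with respect to the output buffer
is meant to achieve.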
This main script is supported by several auxiliary modules:
qemu_data.py : data flow models for the slicing algorithm. These
are stored in a dictionary keyed by the name of the op. The
values are a pair of functions that, given an op's arguments,
will return the set of defs and uses, respectively. It also
contains some convenience functions like is_jcc and
is_dynjump.
translate_uop.py : translations used to turn QEMU micro-ops into
Python. These, too, are stored in a dictionary keyed by the
op name, and the values are functions that take the arguments
to the op.
qemu_trace.py : data structures for working with a trace. The
main thing in here is the TraceEntry data structure, which
has the same interface as the log entries stored by iferret.so,
so the two can be mixed in the same trace.
summary_functions.py : replacements for malloc-related
functions. These typically require the current value of ESP
(so that data flow can be implemented for them), so
newslice.py determines it from the trace and passes it to
the summary function generator to get back the actual
instance of the summary.
iferretpy.py : wrapper for iferret.so. The trace returned by
this module's load_trace is a Python object that wraps the
underlying C array. It implements all the methods needed to
make it appear as a mutable sequence to Python, so you can
modify the trace using standard Python list operations.
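The two dictionaries described above (qemu_data.py's data-flow models
and translate_uop.py's translations) are easiest to see side by side.
A toy version, with op names and argument layouts invented for
illustration rather than taken from the real tables:

```python
# Data-flow models: op name -> (defs_fn, uses_fn), each mapping the
# op's argument list to a set of locations (qemu_data.py style).
models = {
    "movl_T0_im": (lambda args: {"T0"}, lambda args: set()),
    "addl_T0_T1": (lambda args: {"T0"}, lambda args: {"T0", "T1"}),
}

# Translations: op name -> function from the same argument list to a
# line of Python source (translate_uop.py style).
translations = {
    "movl_T0_im": lambda args: "T0 = %d" % args[0],
    "addl_T0_T1": lambda args: "T0 = (T0 + T1) & 0xffffffff",
}

ops = [("movl_T0_im", [35]), ("addl_T0_T1", [])]

# The slicer consults the models to track dependencies...
defs_fn, uses_fn = models["addl_T0_T1"]
assert uses_fn([]) == {"T0", "T1"}

# ...and the translator emits runnable Python for the marked ops.
env = {"T1": 7}
for name, args in ops:
    exec(translations[name](args), env)
print(env["T0"])  # prints 42
```

Keying both tables by op name is what keeps slicing and translation in
lockstep: any op the slicer can reason about, the translator can emit.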
newslice.py takes one required argument, the base filename of the
trace, and one optional switch that specifies the OS that the trace
was generated under (-o). This is currently only used to determine
the locations of mallocs to replace; if it is not set via the
command line, xpsp2 is assumed. Valid values are currently:
[xpsp2, haiku, osx, linux]
The output of newslice.py is a .pkl file (serialized Python
objects), which consists of a tuple of two dictionaries:
transdict: the translated code blocks, keyed by starting EIP
userland: static data read from userland
This .pkl file can be fed into the newmicrodo plugin under
Volatility to run the translated code (with the -m option).
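The on-disk layout is a plain pickle of that two-dictionary tuple. A
sketch with made-up contents (the real keys come from the sliced
trace):

```python
import pickle

# Hypothetical contents: code blocks keyed by starting EIP, plus
# static userland bytes keyed by address.
transdict = {0x401000: "T0 = (T0 + 1) & 0xffffffff"}
userland = {0x402000: b"hello\x00"}

with open("slice.pkl", "wb") as f:
    pickle.dump((transdict, userland), f)

# This is the shape newmicrodo expects to load back.
with open("slice.pkl", "rb") as f:
    td, ul = pickle.load(f)
print(hex(min(td)), ul[0x402000])  # prints 0x401000 b'hello\x00'
```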
Volatility-1.3_Beta - A version of Volatility for running generated code
The main feature of this directory is the newmicrodo.py plugin,
symlinked from the dynslicer directory. It allows running translated
uOps created by newslice.py. The plugin is currently set up to use
flat (dd-style) memory images, along with a register dump from QEMU.
Register dumps for the various OSes (Haiku, Linux, Windows, and
OS X) are available in this directory (with names ending in .env).
See the help for newmicrodo for more options.
The only non-obvious feature of newmicrodo is its support for output
decoding. This feature (which corresponds to the -i switch) allows
the user to pass in a Python function (as a string) that will be
applied to the output data to decode it. The string is decoded as a
Python string before being executed, so you can include escape
characters (such as tabs and newlines). An example that decodes the
buffer as a list of DWORDs:
'def f(x):\n\tprint unpack("<%dI" % (len(x)/4), x)'
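(That example is Python 2; in Python 3 print is a function, but the
mechanism -- a function defined in a string, exec'd and applied to the
output bytes -- is the same. Exactly how newmicrodo names things
internally is an assumption in this sketch.)

```python
from struct import unpack

# Python 3 spelling of the DWORD-list decoder string.
src = 'def f(x):\n\tprint(unpack("<%dI" % (len(x) // 4), x))'

ns = {"unpack": unpack}   # assume the exec'd string can see unpack
exec(src, ns)             # defines ns["f"]
ns["f"](b"\x2a\x00\x00\x00\x01\x00\x00\x00")  # prints (42, 1)
```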
This version of Volatility has one modification that was needed to
support newmicrodo.py: write support has been added to the address
space classes. This was necessary to get the appropriate layering
for having copy-on-write at the physical address space layer.
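Copy-on-write at the physical layer amounts to an address space whose
writes land in an overlay while reads fall through to the unmodified
base image. The class and method names below are illustrative, not
Volatility's actual address-space API:

```python
class CowAddressSpace:
    """Reads hit the overlay first, then the read-only base image;
    writes only ever touch the overlay, leaving the base pristine."""
    def __init__(self, base):
        self.base = base          # bytes-like, never modified
        self.overlay = {}         # addr -> byte value

    def read(self, addr, length):
        return bytes(self.overlay.get(a, self.base[a])
                     for a in range(addr, addr + length))

    def write(self, addr, data):
        for off, b in enumerate(data):
            self.overlay[addr + off] = b

phys = CowAddressSpace(b"\x00" * 16)
phys.write(4, b"\xde\xad")
print(phys.read(3, 4), phys.base[4])  # prints b'\x00\xde\xad\x00' 0
```

Layering this at the physical level means every higher address space
(virtual mappings, profiles) sees the written values for free, while
the memory image on disk stays untouched.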
iflogs/traces - Saved traces for various introspections
You'll note a mix of two formats here -- the newer binary log file
format and the older .trc/.trace format (text-based). Most of these
are fairly useless -- the .trc files are no longer supported
(loading and parsing them took too long), and the binary logfiles
become unreadable whenever a change is made to the logfile format.
So really only the most recent traces are likely to be useful. Once
this gets more stable, we can go back and regenerate the older
traces so they're all in a ready-to-analyze format.
scripts - Miscellaneous scripts I find useful
editpkl.sh : allows editing the translated code in .pkl files. Very
useful for adding debugging statements in the generated code.
mkiso.sh : generates an ISO of a directory that can be loaded by the
QEMU guest
introprog - Introspection programs to run in the guest
These are separated into one directory per OS we've tried. The
programs all use vm_mark_buf_{in,out} to signal to QEMU that the
trace is beginning / ending, and to give the location of inputs and
outputs.
These two macros are defined in vmnotify.h, which exists in two
flavors: the Windows version, which uses assembly macros in MSVC
format, and the *nix version, which uses gcc-style inline assembly.
The programs themselves are quite simple, and should be
self-explanatory.
images - Hard drive and memory images used in training
The hard drive images are in qcow2 format. The memory images are in
raw dd format, suitable for use with Volatility.