-
Notifications
You must be signed in to change notification settings - Fork 2.1k
/
EXTERNAL
221 lines (166 loc) · 10.3 KB
/
EXTERNAL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
Defining an external mode.
To define an external cracking mode you need to create a configuration
file section called [List.External:MODE], where MODE is any name that
you assign to the mode. The section should contain some functions
programmed in a C-like language. John will compile and use the
functions if you enable this cracking mode via the command line.
External functions.
The following functions are currently used by John:
init() called at startup, should initialize global variables
filter() called for each word to be tried, can filter some words out
generate() called to generate words, when no other cracking modes used
restore() called when restoring an interrupted session
new() called when starting each base word (hybrid cracking mode)
next() called to iterate during the hybrid cracking (see hybrid cracking)
All of them are of type "void", with no arguments, and should use the
global variable "word" (pre-defined as "int word[]"), except for init()
which is called before "word" is initialized. The variable "word"
contains the current candidate password to be tried, one character in
each array element, terminated with a zero.
The functions, if defined, should do the following with "word":
* filter() can modify the word, or zero out "word[0]" to skip it;
* generate() should set "word" to the next word to be tried, or zero out
"word[0]" when cracking is complete (this will cause John to terminate);
* restore() should set global variables to continue from the "word". If
there are no global variables needing restoring, an empty stub function
should be provided, or John will refuse to resume a session.
You can use an external mode on its own or with some other cracking
mode, in which case only init() and filter() will be used (and only
filter() will be required). Using an external filter is compatible with
all the other cracking modes and with the "--make-charset" command line
option.
It is recommended that you don't use filter() or at least don't filter
too many words out when using an external mode with your own generate().
It is better to modify generate() not to generate words that would get
filtered out.
Pre-defined variables.
Besides the "word" variable documented above, there are numerous more
variables, all of type "int":
"abort" and "status": When set to 1 by an external mode, these cause the
current cracking session to be aborted or the status line to be
displayed (just like on a keypress), respectively. These actions are
taken after having tested at least all of the candidate passwords that
were in external mode's "word" so far. In other words, the actions may
be delayed in order to process any buffered candidate passwords. Setting
"status" to 2 results in printing of detailed multi-line status.
"cipher_limit" contains the maximum password length in bytes, either from
the format definition or from --stdout[=LENGTH]. This variable should
not be changed by the external mode. Instead, it can be used to stop
generating new candidates should the password length get larger than
"cipher_limit".
"req_minlen" (requested min. length) and "req_maxlen" (requested max.
length), reflects the --min-length=N and --max-length=N options if used,
and are otherwise 0.
"session_start_time" is equivalent of time(NULL) at start, ie. the current
time expressed in seconds after Jan 1 1970. This variable is initialized
once, not updated. On a session resume you will end up getting a new value
though.
"utf32" can be set from an external mode, indicating that the word is made
of UTF-32 characters. It will then be re-encoded to whatever target encoding
is in use.
"target_utf8" can be read by an external mode and indicates whether the
target encoding is UTF-8 or not (some modes may take that into account when
considering output length). For examples of use of this and the "utf32"
variable, see eg. Repeats32 external mode in run/repeats32.conf.
"hybrid_total" and "hybrid_resume" are for hybrid external modes: If it is
easy for the script to compute total words which will be produced for a
base-word, and to adjust itself at restore() call to get back to where things
were left off, then these variables should be used. Within new() the total
count should be assigned to "hybrid_total" before the function returns.
Within restore(), hybrid_total should be set, AND hybrid_resume is the number
of iterations which were done in the prior run. restore() should use this to
change whatever is required within the script's global state, so that the next
call to next() will produce the correct next word. If the script can not
easily know how many candidate a word produces, or adjust its global state to
start from some random location, then the script should NOT set "hybrid_total"
or "hybrid_total". In this case, the cracker will call new() with the word,
and then call next() hybrid_resume times (throwing away those words), and then
start processing. The code was written this way, so that the entire code of
new() and next() did not have to be put into restore() in this type of script.
The language.
The compiler supports a C-like language, which includes only a basic
subset of C and differs from C in multiple ways. The supported keywords
are: break, continue, else, if, int, return, void, while.
You can define functions to be called by John (the ones described
above), define global and local variables (including one-dimensional
arrays), use all the integer operations supported in C, and use C and
C++ comments.
The following C features or traits are missing from John's compiler:
* short-circuit evaluation (instead, full expressions are currently
evaluated, with any side effects they might have);
* function calls (any functions defined as part of an external mode can
only be called by John core, but not by other external mode functions);
* "do" and "for" loops (there's only "while");
* data type declarations other than "int", one-dimensional arrays of
"int", and "void";
* "static" local variables (as it happens, all local variables are
currently static by default);
* pointers (array name alone refers to the first element);
* type casting (there's no use for it when we only have "int" anyway);
* comma operator;
* probably a lot more...
You can find some external mode examples in the default configuration
file supplied with John.
Limited portability, and undefined behavior.
The "int" data type is currently implemented in John using the system's
native C "int" data type. Thus, its size can vary, but on systems
supported by John it can be assumed to be at least 32 bits (and usually
is exactly 32 bits). Also, being a signed integer type, its behavior on
overflow is undefined (but in practice will usually be two's complement
wraparound).
Literal character constants (given in single quotes, such as 'a') are
treated as "unsigned char" cast to "int", thus producing values in the
range 0 to 255. We intend to preserve this behavior in future versions.
However, in pre-1.8.0 versions, literal character constants with the 8th
bit set could appear as negative "int" values.
While the current implementation and all older versions of John so far
consistently lack short-circuit evaluation, treat local variables as
"static", and treat array name alone as referring to the first element,
these properties might change in the future.
Limited robustness, and insecurity.
While the current implementation is meant to be reasonably robust for
use, it lacks array bounds checking. It is external mode authors' and
users' responsibility to ensure array sizes and indices are correct.
There's currently no compile-time checking that expressions being
assigned to or modified are valid lvalues. Attempting to assign to or
modify an expression that isn't a valid lvalue results in John crashing.
In general, external mode programs (and John configuration files in
general) are treated as trusted input, much like you would treat a C
file in John source tree. A malicious external mode can use out of
bounds array accesses to trigger arbitrary machine code execution.
External Hybrid Scripting.
This is a newer additional mode. It is somewhat like generate, however
instead of fully 'generating' words, the Hybrid mode is given a word
and then called repeatedly to morph that original word. To do this,
there were 2 additional functions created, and 2 additional ext_global
variables added to aid in session resuming.
Hybrid Functions:
* new() This function is called at the start of each NEW word to be worked on.
The script should do whatever is needed to setup its state. This may be
counting letters in the base word, saving off the base word, or many other
things. The only time that new() is NOT called, is upon a restore() call
when the script knows how to easily restore. In that case restore() should
have things setup so that the call to next() returns the right morph of
the base-word. So new() is not called, which would setup the script to
start over on the word. If the format DOES know how to restore itself
to an arbitrary location, then the 'hybrid_total' ext_global variable
should be set before new() function returns.
* next() This function is called 1 or more times. It should do whatever
transformation is needed, and the word[] array should contain a null
terminated representation of the transformed word. When all iterations
have been completed for a specific word, next() should set word[0]=0 as a
signal to the cracker that this base-word has been completed.
* restore() This function can be written to do nothing. This should ONLY be
done, if there is no way to easily count total candidates or to 'seek' to an
arbitrary location withing the morphing results of the base-word. The cracker
code will run a brute force method to get to the restore location in this
case. However, if there is an easier way to know how many total candidates
are created for a base-word, and to properly adjust global data so that the
next call to next() produces the right word (as do all subsquent calls to
next()), then this function must do that, AND set 'hybrid_total' to the
total ORIGINAL number of candidates generated from the base-word (not just
how many more are to come). The cracker uses this value as a check that
the script actually is setup correctly, and if so, cracker will not call
new()/next()/next().... and know that the script restored itself properly.
Following the above rules and functions, a external-hybrid script can do
pretty much anything required to transform/morph/mangle words.