-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added dynamic-compiler RDP format when script builds incorrectly #3568
added dynamic-compiler RDP format when script builds incorrectly #3568
Conversation
Cool! I will try this out and give feedback, just don't hold your breath cause I've got all sorts of things going on around here! |
Looks like circle-CI is now working again , yea! |
It was actually not that bad (other than figuring out how to test the script generated properly). The 'format' itself, is simply a structure to marshal data, and then the already built RDP code within dynamic_compiler. The code looks deceptively a lot like dynamic (where it actually runs the script), but it is very different. But the final running simply boils down to an array of function pointers which are called one after another. The biggest difference, is in this new code, there simply is a stack which has variables pushed onto it, the stack is popped (possibly multiple times) to get the variables, then operation is done, and the results are pushed back on to the stack. in dynamic, I simply have 2 variables. But here, since the stack is involved, it can handle near unlimited count of variables, but at a slower rate, since each gets marshaled into and out of the stack for usage/storage. |
I will test this with some known dynamic expressions which were failing earlier. This is awesome 👍 |
src/dynamic_compiler.c
Outdated
@@ -2432,6 +2479,9 @@ char *dynamic_compile_split(char *ct) { | |||
int dynamic_assign_script_to_format(DC_HANDLE H, struct fmt_main *pFmt) { | |||
int i; | |||
|
|||
/* assume comipler success. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numerous small errors (mostly signed/unsigned char*), but this one also. All fixed, and are in 2nd push.
The new format should handle any failing expressions. It has no optimizations, and is slow due to marshal code at each step, but it is fast enough to get the job done, AND if the expression IS a common expression (i.e. used a lot), then we can see about working them into a real format, whether it be with a dynamic script, or a fat format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo still here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be fixed next time I check things in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 39ad0fe
NOTE, even though it is passing the checks, do not merge it yet. There are still issues. I am getting get_key(0) failures on my AVX2 builds. There is some settings that I have to make, so that the SIMD buffers are not used. I have not figured all of that out yet.
The 256/256 AVX2 8x1 is totally wrong. But I think both of those issues are trivial, and should easily be fixable from within the dynamic_compiler linkage function (where we determine that the RDP code is needed). |
BTW I have some faint memory of .pot file issues with this format. We should look at them too if possible. |
14fd1fb should be much closer to working. The code was failing when MD5_X2 was set (that is usual). This is why it was working like a champ on my VC test system. There, MD5_X2 is not set, so it was cracking all. But with X2 set, the only every other crypt key was being used, and the 2nd usage overwrote the first (not what we want at all). |
Here are examples showing the speed difference. The reason the 2nd one works, is because the compiler optimizes out the sha1($s) into the 'salt' (i.e. it is computed only at load time). But this shows the difference in speed. NOTE, I still see a bug, where the 2nd one (which actually does a dynamic script), still lists that it will use the RDP. This should be a trivial bug to find.
|
Unprintable characters in pot file after cracking a dynamic_25 hash (#2437) perhaps? |
On 12/28/2018 3:06 AM, Dhiru Kholia wrote:
BTW I have some faint memory of .pot file issues with this format.
We should look at them too if possible.
#2437 <#2437> perhaps.
That is a 'core' dynamic' issue, not a dyna-compiler issue. The problem
here stems from difficulty in dealing with inline $ characters (and a
few other things). I really need to step back on that one, probably
pull out a bunch of hack code, write it correctly, and be done. There
has been about 10 bugs over the years on this, and each time, symptoms
were found and fixed. What that has left is quite a few competing
hacks, some which cause negative issues in other places. But no, this
issue does not need to be addressed as part of the dyna-RDP format
|
Here's some strangeness (prior to this PR):
Dynamic 6000? Where did that come from?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Visual review
src/dynamic_compiler.c
Outdated
@@ -2432,6 +2479,9 @@ char *dynamic_compile_split(char *ct) { | |||
int dynamic_assign_script_to_format(DC_HANDLE H, struct fmt_main *pFmt) { | |||
int i; | |||
|
|||
/* assume comipler success. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo still here
Using this branch
|
dynamic_6000 is the 'hidden' dynamic format value used in the internal script for dyna compiler. BUT I am not sure why (or how) it made its way into the .pot file. The dynamic=md5($p) string is what 'should' have made it into the .pot file. NOW, this 'is' a format which loads data from the internal 'library', so that may cause different (incorrect) behavior. I will investigate why this is happening and fix it. |
NOTE, I am adding a new flag, I have thought of a few optimizations I could add to the RDP format for some formats. One big thing, is to parse and find common expressions, eliminating wasted hash calls. Those can not be added to the dynamic script, but within the RDP, since we are not limited to just having the 2 variables of dynamic. So something like md5(md5($p).md5($s.$p).md5($p)) we could store off the md5($p) into a 'local' variable, pre-hash the salt, and then simply do: Now that I have RDP working well, I will work to clean things up, and we can get this new format checked in and then I will start to work on optimizations. I am not 100% sure we need to wait on an optimized version prior to jumbo2. If things get speed improved prior to jumbo2, good, if not, I vote it is not a show stopper. |
|$ echo '900150983cd24fb0d6963f7d28e17f72' > test.in |
|$ ../run/john test.in -form:dynamic='md5($p)' Loaded 1 password hash
(dynamic=md5($p) [256/256 AVX2 8x3]) Proceeding with single,
rules:Wordlist Press 'q' or Ctrl-C to abort, almost any other key for
status Proceeding with wordlist:../run/password.lst, rules:Wordlist
abc (?) 1g 0:00:00:00 DONE 2/3 (2018-12-28 16:11) 100.0g/s 336000p/s
336000c/s 336000C/s 123456..Geronimo Use the "--show
--format=dynamic=md5($p)" options to display all of the cracked
passwords reliably Session completed $ cat ../run/john.pot
$dynamic_6000$900150983cd24fb0d6963f7d28e17f72:abc |
Dynamic 6000? Where did that come from?
|$ ../run/john test.in -show 0 password hashes cracked, 2 left |
I see the same results as you (the dynamic_6000 signature in john.pot).
however, with a removed .pot file and when run this way:
../run/john test.in -form='dynamic=md5($p),nolib'
this is the result:
cat ../run/john.pot
@dynamic=md5($p),nolib@900150983cd24fb0d6963f7d28e17f72:abc
So this bug has something to do with a format 'coming' from the
pre-built library, and is certainly a bug I will fix now. also, a tiny
issue (possible issue), is that the ,nolib probably should not be there
either. I will look at that also. Things like ,nolib ,debug ,rdp
should not be part of the signature. They are flags and not variables.
the variables (such as ,saltlen=8 or ,c1=junk_text ) should be there,
since they are 'variables'
|
…ded for dyna-lib. expanded buffers in the compiler.
This is fixed in latest push. Something changed in core john startup, and common_init() was not being called, but 'only' for the hashes in the dynamic-compiler-lib. The md5($p) is one of them in the compiler lib. Yes, I also found this when adding (and testing) the ,rdp switch. That switch 'forces' usage of the RDP code. So I tried it on some compiler-lib formats, only to find they were failing badly. The problem was the binary() function was returning a null'd out buffer, since the atoi16 stuff was not initialized. Thus the results were never matching a null binary buffer. |
Fixed the john.pot output of things from the dynamic-compiler-lib (such as md5($p) ) when run on 'bare' hashes. The prepare within dynamic-compiler was not handling raw hashes properly here, because it did not know the raw hash length. I had to add that to the compiler lib structure. They all are 16 bytes now (all are MD5 hashes), but I did not want to force ONLY md5 hashes to be available in the lib. So the length of the hash had to be added to the lib structure. This was not an easy bug to find. The worst thing, is that 'most' of the functionality did work even with the dynamic_6000 signature being used. However, all formats within the lib would have 'appeared' to have cracked the hash when using the -show function, even if that hash was not right. Now, the proper @dynamic=md5($s)@01234...abc:password signature will be output into the john.pot file, so there is no mis-interpretation in the -show command. |
…rrors, per reviews
…at stored keys in input buffers
I think we can merge this now, right? Remaining issues are either of other scope or can be done later. |
This is a terrific improvement! 👍 |
Yes, this was a very good place to merge. It is working well. The only change I have done past this, was to get 1 hash to work in the script format (a utf16), but it is still lacking, AND not really a generic fix. There are simply too many 1-offs failing the script, the code emit part gets confused VERY easily. That is why the RDP format is so important. It is slower, yes, BUT it overcomes the deficiencies of the code emission part of dyna-compiler, and allows it to be fixed/upgraded/replaced over time, without having users failing to be able to complete their work. |
Here is working version. This simply detects a failed script at linkage time, emits a warning, sets a flag. Then within dynamic, if that flag lists script failure, then a RDP function within the dynamic compiler code is called, on each password to actually perform the work, since the dynamic code would not work.
Note, the dynamic compiler can still be optimized to handle more expressions. But at some point of expression complexity, the dynamic format CAN NOT be used successfully. That is the point where the RDP parsing of the expression will take over and complete the work for the expression.
This work is for #3562