Use restore_tester_fmt.c to find and correct restore issues #2082
Here is what rules does (currently):
Here is the .rec file:
So it appears that under rules, only the rule# gets restored, and the run starts at the first input word of that rule. So we need to save not only the rule# but also the word# for restore to work 'correctly'.
Rules bug (listed above), found and fixed. Still more testing to do, but that one WAS a bug, and it is corrected. NOTE, the restore_tester format kicks ass for finding these issues, and then knowing they are correctly fixed!!!!
Found another bug in wordlist resume. If the .rec file was made with a version that has mem_map enabled and is restored on a build that does not have mem_map, then the offset was set to 0, so the new run does not restore properly. I think I can fix this, but I am not sure it is worth it. NOTE, if the .rec was made on a non-mem_map build, then it restores properly on a non-mem_map build (or on a mem_map build, for that matter). Fixed it here: 62310e8. Now it does not matter whether the build was mem_map or not. Either can resume the other's .rec files properly.
I believe wordlist is fixed. If others want to see if there are other restore things that could be 'fixed', then by all means try (-stdin, -pipe ??)
I will have to enhance the restore_tester format for testing salted hashes. I think I will make a 2nd format for this, so that any issues between pure-raw and salted running with 1 salt do not get in the way.
Here are the multi-salt restore issues (wordlist and rules, but I do not think the mode will matter). Note, I switched MAX/MIN candidates to 3.
Note, the correct resume location is
I have multi-salt restore working, but it will (may) require a new REC-VERSION (v5). This 'could' be handled in the same manner as I have done the other 'extra' blocks: I could keep REC_VERSION=4 and add a block like slt-v1 to the end of the file, where %x is a salt idx. Currently, I have added the salt idx to the bottom of the 'normal' block of restore stuff, and read it from there. But it would not have to be done in this manner; the way I did it for the other 'hybrid' modes would work just fine. I am asking (@magnumripper and @frank-dittrich mostly) which way they would like to see this done. I could easily add this as a slt-v1 block, and then later, if solar adds this data into a REC5 model, we could switch our logic to use that (but keep the slt-v1 processing, so any .rec files already created could still be read). But I really want this to be put into code. Working with a LARGE number of salts in some pretty slow formats means there is a lot of replicated work any time you have to shut down.
When testing wordlist mode, be sure to also try adding EDIT: it looks like you already did so.
This really should be discussed on john-dev before doing anything. I presume the problem exists in john proper? Solar may want to fix this, and in that case we should do nothing (except possibly test ideas for him).
I don't think storing the salt idx in the .rec file is the way to go. Also, I don't think it is a good idea to create .rec files that are incompatible with the master branch.
This is an important caveat. The current behavior fully supports stopping a job and then resuming it when there are far fewer salts left (from cracking in other sessions).
Not needed. As far as I know there is absolutely no difference between an unsalted format, and a salted one that has only one unique salt loaded.
Don't we have an option to completely test one block of password candidates before stopping a session? Apparently
It's not even an option; this is just how it works, off the top of my head. Unless you brutally kill john with e.g.
See issue #1638 and your commit:
Sure, the --max-run-time option is special. Is that what we're testing here? We shouldn't use it, or it should be used with
@magnumripper If you mean the behavior for sessions interrupted by pressing q or ctrl-c, then why is there a need to store the salt index in the .rec file at all?
As far as I know there is no need, but I'm not sure I understand how @jfoug tested it.
Likewise, if you press 'q' or 'ctrl-c' twice, all bets are off. We shouldn't fix "too early restore" caused by that. We should just ensure that the restore doesn't miss anything, EVER.
Solar has mentioned this, and made a statement about 'IF I ever choose to fix it'. I have done so, keeping the REC4 format. I simply append a slt-v1 block if there is a salt index (i.e. it is not 0). It would be very trivial to switch this to a REC5 structure if we later have that; all the other logic would need no adjustments. The only adjustment is where to grab the 'salt_idx' value.
You should revert this and use a topic branch instead. And you should not merge that topic branch ever. Leave that to me.
Agreed:
Don't get me wrong, you might be on to something. Or perhaps not. But you are too trigger-happy for sure, with major core changes like this. Using topic branches, you are totally free to commit just anything for testing and reviewing. Even better, you are free to
We have a default
However, we don't want to perform a SHA-1 every time we switch salt. Yet we can't wait until "needed", because a hard kill should ideally resume perfectly too (the AM will nearly always be a hard kill). I'm not sure we need any fix at all, except possibly adding an option to actually process all salts before a soft abort. Given that, you'd simply run a sane number of salts at a time and there'd be no problem.
Perhaps simply using the existing "salt hash" is good enough? It will home in on the right neighborhood even if that bucket contains several salts. And the salt hash is readily available.
This type of salt_compare() function is a no-go (from 7z):

static int salt_compare(const void *x, const void *y)
{
	const struct custom_salt *s1 = x;
	const struct custom_salt *s2 = y;

	return (s1->NumCyclesPower - s2->NumCyclesPower);
}

This is due to undefined order when NumCyclesPower is the same within a group. So the above needs to be something like:

static int salt_compare(const void *x, const void *y)
{
	const struct custom_salt *s1 = x;
	const struct custom_salt *s2 = y;

	if (s1->NumCyclesPower != s2->NumCyclesPower)
		return (s1->NumCyclesPower - s2->NumCyclesPower);
	return /* some comparison that is repeatable */;
}

Dynamic format will also need adjustment. It sorts solely based upon salt length (that is a REQUIREMENT for dynamic due to implementation issues), but a strcmp/memcmp will need to be added for equal lengths, so that the end result is 100% repeatable. Again, we have to store some indication of the salt being worked on (that may be the salt-hash AND some other exact hash of the salt). Then upon restart, we scan for the salt-hash, and if not found, we start over. Then we scan the salts for the right value, and if not found, start at the first one of that salt hash. If an exact match is found, then we know that we need to start there. That should work universally, even if it has to fall back and redo salts (because the salt is no longer there, and was cracked somewhere else). It is likely that the 'matching id' could be computed at salt load time and stored, so no run-time recompute is needed.
It wasn't a problem until you started to use the salt_compare() function for something new.
This should have ALL salts in deterministic order (BUT does this not happen if that

Now, I should also comment on the salt sort. Basically, it needs to provide a fully useful comparison function ready for qsort. The ONLY way to have qsort be deterministic (if there are NO dupes) is to never return 0 UNLESS the values match. Thus, if there are no matching salts, a properly written qsort should never see a 0 return from the int (*cmp)(const void *, const void *) function. NOTE, there are busted implementations of qsort (they still work, just not optimally) where the qsort function will compare a value against itself. So it is not 100% out of the realm to return 0, BUT it should only be done if the salts are actually 100% identical.

Also, I think salt_cmp is called during loader, to see if a salt is the same, or does loader simply do a memcmp() call? The salt_cmp() should really behave like memcmp. It can be 'smart' about it, but if one salt is not 100% equal to the other, then 0 should not be returned.

The salt_cmp_num was easy to fix. I just passed in the salts, pre-called dyna_salt_init, and then used the dyna_salt_cmp function.
We may have to change this in loader: simply always sort, OR if not sorted, then no salt-resume logic is added for the current session (unless the format has a fmt.salt_compare function).

int always;

always = cfg_get_bool(SECTION_OPTIONS, NULL, "AlwaysSortSalts", 1);
if (db->salt_count < 2 || (!fmt_salt_compare && !always))
	return;
// sorting happens later in the ldr function

@magnumripper what do you think? NOTE, if the order of the input file is not changed, and there is no sorting, then I believe the salt order would still be deterministic. This may not be 100% correct, since I think the salt_hash buckets are created, and if some buckets are removed, it would change the order of the buckets. NOTE, I still do not have my mind fully around all the intricacies of how salts are stored, with or without salt_hash functions.
I think your focus is completely off. Only a handful of formats have their own salt sort, and we can ignore this new resume enhancement for them. You should focus on the 400+ other formats.
No, wait... we NEED a default salt sort that is deterministic, that is what you say, right? The ones that currently (optionally) are sorted by "most used first" should instead default to some deterministic order.
Absolutely. That is what I have right now, in the 2nd commit to this branch. I have also 'fixed' the other formats to properly sort (dynamic and the two 7z's, which should get unified code). I have also looked at the salt_hash, and it appears to play no part at all once we get to the cracker loop (which is where we have to 'skip' salts to resume). I made a 3rd test format that also added a salt_hash; the salt hash was simply the 4th byte (minus '0'). Note that the salts are in deterministic salt order. They do not give a crap about the salt hash. I am pretty sure that was only used during the loader code, reading salts and striking them off from the .pot file.
This really should be a very easy solution, now that I am fully seeing the nuances of the salt running and am able to get salts into deterministic order all the time (note, I am 'assuming' that I now only need to build a universal salt hash, and then on restore search for it). I will likely make this something computed at load time, then simply stored into the .rec file, and later used to search for the thing. Very easy, and it should have no overhead during runtime (other than writing a few extra bytes into the .rec file). If I was not going out on a date for who knows how long (doing a cycle ride with no destination, lol), I would probably get this done today. But I will almost certainly have this branch working correctly by the start of next week.
I believe this covers everything. There is an MD5 hash done at final salt load time (after stripped/sorted salts are loaded back into the final usable array). Before starting a salt, a pointer in the status record is pointed to this MD5; NOTE, during runtime, setting this pointer is the only CPU hit. At .rec file writing time, the pointer from status is read, a 32-byte ascii buffer is built, and it is written to the file. No hashing needed. At recover time, a static buffer is built from the hash saved in the prior .rec file, and a flag is set in the status object saying we should resume the salt, if we can find it. In the cracker loop, if that flag is set, we enter a loop that clears the flag and then searches for the salt (based upon the MD5). If we find it, we skip our salt ahead to that salt record. If not, we simply revert to the normal start-at-first-salt logic.

Boom, I think this is it! I see no holes in this logic. It is all based upon getting those salts into the same deterministic sorted order, so that when we find our MD5, we are 'golden'. Unfortunately a linear search is needed, but it is not 'too' bad. I put the MD5 buffers into code as a uint32_t[4] array, so most of the walk is simply assigning our salt ptr to s->next and then checking if s->md5[0] == recover->md5[0], so it is all just pointer walking and 32-bit int dereference.
Fuk! CI failed :( I gotta see why, grr! Well, that's not good. But should be easy to fix, lol
Crap, I can't use this format for core. Core does not know about FMT_NOT_EXACT. A somewhat easy (partial) fix, so it is in there. Well, it is obvious that core does not restore prior salt work. I was pretty sure Solar had stated this fact before.
How do we test single mode? Any ideas? I think we may have to build a version which does a printf of every candidate (or use the restore_tester format), and then have a list of words in order. Then re-run single mode, breaking out and restarting several times, and make sure that the exact same words were tested, and only tested one time. Can anyone think of an easier way to do this?
For single mode, you can have multiple "users" with the same hash. John will try candidates based on all the "user names".
I've had the idea to allow
But just printing the candidates will not work, because single mode generates individual candidates per hash, based on user name etc., so you would have to print the hashes as well.
I have not started on this one yet, but I think the restore_tester will be ideal. It will list candidates and the hash they are being tested against at the same time. Ideally, I just have to use the salted variety and make sure each hash has a unique salt. One thing single will not resume properly is trying found words against other candidates. But we have the ability to replay the .pot file to handle that.
I think this slow shutdown has to do with rules. What takes long is when there are a LOT of rules: john will NOT exit until the current loaded salt/words block has been run against ALL rules. It must have been seen early on that rules needed either to be forced to finish, or to have some form of sub-block restore code added. It looks like solar (or possibly even earlier, with Cracker Jack) chose to take the easy route and let the rules complete. However, this is probably not as bad as the resume of salts. For really slow (especially salted) formats, you cannot use rules much anyway. When it takes days (or weeks) to burn through all the salts on a load of data for a hash, I assure you that rules are the last thing anyone worries about, lol. So it may be OK to have rules 'complete' and not worry about sub-resume logic at all. Yes, there will be times where you hit 'q' and it takes a while to exit. How long depends upon how slow the format is, and how willing you are to have an excruciatingly slow search.
Does that (ctrl-c taking ages) really happen with CPU formats, or is it just with GPU formats with huge max_keys_per_crypt?
I'm pretty sure it happens with CPU formats. @jfoug is probably right.
It's within the cracker loop. Cracker is not going to exit until the rules are exhausted. However, that logic (which I am GLAD about) was not done for the multi-salts. So, what this implies is that there is NO additional logic needed to get -rules[=anything] restoring properly. It always should be good. But multi-salts, yes, there is need for resume there. NOTE, I have not tested this (in the multi-salt pull request), so I need to test multi-salted hashes using -rules. That may have failures, because john may toss us to the next block of words. That would not be a logic error in the multi-salt logic, but simply something that may have to change. The restore_tester_fmt will help to easily shine the light on exactly what the functionality is. It may be that the current logic is working fine, and that the save/resume does not go to the next words until the salt is done, and 'only' pauses until all rules have spun through. But that is still to be seen, in the actual running of the executable.
This fixes the bug in non-hybrid mask mode (it was restarting with the candidate completed just before the save):

$ git diff mask.c
diff --git a/src/mask.c b/src/mask.c
index b78a04c..f9e86f5 100644
--- a/src/mask.c
+++ b/src/mask.c
@@ -1769,7 +1769,8 @@ void mask_save_state(FILE *file)
int mask_restore_state(FILE *file)
{
int i, d;
- unsigned char uc;
+ unsigned cu;
+ int fixed = 0;
unsigned long long ull;
int fail = !(options.flags & FLG_MASK_STACKED);
@@ -1799,9 +1800,24 @@ int mask_restore_state(FILE *file)
return fail;
}
+ /* vc and mingw can not handle %hhu and blow the stack
+ and fail to restart properly */
for (i = 0; i < cpu_mask_ctx.count; i++)
- if (fscanf(file, "%hhu\n", &uc) == 1)
- cpu_mask_ctx.ranges[i].iter = uc;
+ if (fscanf(file, "%u\n", &cu) == 1) {
+ cpu_mask_ctx.ranges[i].iter = cu;
+ /*
+ * Code was starting off with the last value. So increment
+ * things by 1 so we start at the next mask value.
+ */
+ if (!fixed) {
+ if (++cpu_mask_ctx.ranges[i].iter <
+ cpu_mask_ctx.ranges[i].count) {
+ fixed = 1;
+ } else {
+ cpu_mask_ctx.ranges[i].iter = 0;
+ }
+ }
+ }
else
return fail;
         restored = 1;

This increments to the next value. I have tested with many 'q' and ^c runs vs single runs, and the end result with the above change is correct. I am not 100% sure this is the best way to fix this, BUT it does fix it. I am waiting to check this in, to let @magnumripper or @solardiz have a look and see if there is a better way.
Rexgen 1.4.0 with proper resume is now checked into bleeding.
Just throwing this up here: Hashcat now has an alternate quit option 'c' (checkpoint) that is the same as 'q' (quit), except the former waits until e.g. all keys and salts are processed. It might take ages, but it can also be very rewarding when resuming. I had a look at implementing it, but it will be very tricky with OMP signaling and stuff.
Things I know about. Add more as they are seen/fixed, since some of these should get done in core also.
Some of these could be handled as 'hybrid' type blocks added to the end of a .rec file (I think). Things like the rules and multi-salts redoing work could be addressed in that manner. The 'normal' V4 processing of the .rec file gets us back to the right words, and then the appended hybrid extra block provides additional information on how far to skip ahead inside the sub-work done on the word or set of words, so that prior completed work is not replicated.
The restore_tester_fmt is great for determining exactly what was processed, written to .pot files, etc., so that we know 100% that we restart at the proper location, not before, and CERTAINLY not later in the word stream.