JtR next generation #4260
Comments
Comment on improving multi-thread scalability for fast hashes within the current formats interface or with minor changes to it: #5435 (comment)
I would add an idea about pulling third-party code into john: it would be nice to keep it structured so that updates from upstream can be applied. I guess [...] An alternative that seems nicer: [...]
For pascal strings, there is the sds library. It is interesting because it holds both [...]
From attempts to hack on code quickly during contests, I would say that the format interface is too detailed for many simple formats, so there is a lot of boilerplate code in simple cases. The same applies to OpenMP: support for it inside a format adds code that complicates reading. New interfaces should reduce the code in formats and take care of services such as parallelization. Some formats really do need the flexibility of the current interface, so I guess there might be some pluggable adapters for different interfaces.
I have a few considerations about multi-threading. Formats use global variables and are not compatible with threading as is. But there are the following variants (and probably some more that I did not think about):
Some formats use global arrays for their data, which gives a small speed-up. But I think it works only with code inlined into [...]
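As a rough illustration of the "pluggable adapter" idea from this comment, here is a minimal sketch, not the actual JtR formats interface. It assumes a simple format supplies only a per-candidate hash function, while a generic adapter owns the key buffers and the (optionally OpenMP) loop. All names (`simple_format`, `simple_adapter`, `adapter_set_key`, `adapter_crypt_all`) are hypothetical, and the FNV-1a toy hash merely stands in for a real algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIMPLE_MAX_KEYS 64
#define SIMPLE_MAX_KEY_LEN 32
#define SIMPLE_MAX_DIGEST 16

/* Hypothetical minimal interface: this is all a simple format provides. */
typedef struct {
    const char *label;
    size_t digest_size; /* bytes written by hash_one, <= SIMPLE_MAX_DIGEST */
    void (*hash_one)(const char *key, size_t len, unsigned char *out);
} simple_format;

/* State owned by the adapter rather than by format-level globals. */
typedef struct {
    const simple_format *fmt;
    char keys[SIMPLE_MAX_KEYS][SIMPLE_MAX_KEY_LEN + 1];
    size_t lens[SIMPLE_MAX_KEYS];
    unsigned char out[SIMPLE_MAX_KEYS][SIMPLE_MAX_DIGEST];
} simple_adapter;

/* Store one candidate; overlong keys are truncated in this sketch. */
static void adapter_set_key(simple_adapter *a, int index,
                            const char *key, size_t len)
{
    assert(index >= 0 && index < SIMPLE_MAX_KEYS);
    if (len > SIMPLE_MAX_KEY_LEN)
        len = SIMPLE_MAX_KEY_LEN;
    memcpy(a->keys[index], key, len);
    a->keys[index][len] = '\0';
    a->lens[index] = len;
}

/* The only place that knows about parallelization. */
static int adapter_crypt_all(simple_adapter *a, int count)
{
    int i;
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < count; i++)
        a->fmt->hash_one(a->keys[i], a->lens[i], a->out[i]);
    return count;
}

/* Toy "format": 32-bit FNV-1a, written out big-endian. */
static void toy_fnv1a(const char *key, size_t len, unsigned char *out)
{
    uint32_t h = 2166136261u;
    size_t i;
    for (i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= 16777619u;
    }
    out[0] = (unsigned char)(h >> 24);
    out[1] = (unsigned char)(h >> 16);
    out[2] = (unsigned char)(h >> 8);
    out[3] = (unsigned char)h;
}

static const simple_format toy_format = { "toy-fnv1a", 4, toy_fnv1a };
```

With this split the format itself contains no loop and no OpenMP pragma. A format that needs the full flexibility of the current interface would simply not use such an adapter.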
Another idea: parser objects that pack specific code such as tag+hex, with parameters like the actual tag and the length of the hex part. A format could then say: raw hash as 64 hex digits, lowercase only, with required tag "$...$", postprocess the binary with this callback. There would be a function to parse hex hashes using such parameters, and other formats would declare use of a custom parser. That way we would be able to implement more common parsers and switch formats onto them gradually.
Postprocessing like reversing rounds is the thing that makes parsers inseparable from hashing code. Another interesting problem is that some groups of formats like cpu+opencl have different [...]
I have a prototype of a flexible dynamic parser where each parser is described by a string. The string is parsed into a tree, and there is an interpreter for this tree. I don't know the speed though. Strings would allow parsers to be defined in [...] (Trees are needed to handle variants of forms. Another approach is a bytecode-like array with jumps. Jumps are a bit harder to populate from macros.)
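The parameterized tag+hex parser described at the start of this comment might look roughly like the sketch below. It is not the prototype mentioned above and not existing JtR code. The struct `hex_parser_spec`, the function `parse_tagged_hex`, and the `$toy$` tag are all made up for illustration, and the byte-reversal callback only stands in for real postprocessing such as reversing rounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical declarative description of a "tag + hex" hash encoding. */
typedef struct {
    const char *tag;    /* required prefix; "" means no tag */
    size_t hex_len;     /* exact number of hex digits after the tag */
    int lowercase_only; /* nonzero: reject 'A'..'F' */
    void (*postprocess)(unsigned char *bin, size_t len); /* may be NULL */
} hex_parser_spec;

/* Value of one hex digit, or -1 if the character is not acceptable. */
static int hex_digit(char c, int lowercase_only)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (!lowercase_only && c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Generic parser driven by the spec. Returns 1 and fills bin with
 * hex_len / 2 bytes on success, 0 if the input does not match the spec
 * (bin may then be partially written).
 */
static int parse_tagged_hex(const hex_parser_spec *spec,
                            const char *ciphertext, unsigned char *bin)
{
    size_t tag_len = strlen(spec->tag);
    const char *hex;
    size_t i;

    assert(spec->hex_len % 2 == 0); /* spec must describe whole bytes */
    if (strncmp(ciphertext, spec->tag, tag_len) != 0)
        return 0;
    hex = ciphertext + tag_len;
    if (strlen(hex) != spec->hex_len)
        return 0;
    for (i = 0; i < spec->hex_len; i += 2) {
        int hi = hex_digit(hex[i], spec->lowercase_only);
        int lo = hex_digit(hex[i + 1], spec->lowercase_only);
        if (hi < 0 || lo < 0)
            return 0;
        bin[i / 2] = (unsigned char)((hi << 4) | lo);
    }
    if (spec->postprocess)
        spec->postprocess(bin, spec->hex_len / 2);
    return 1;
}

/* Toy postprocessing step: reverse the byte order in place. */
static void toy_reverse_bytes(unsigned char *bin, size_t len)
{
    size_t i;
    for (i = 0; i < len / 2; i++) {
        unsigned char t = bin[i];
        bin[i] = bin[len - 1 - i];
        bin[len - 1 - i] = t;
    }
}

/* A "format" now only declares its encoding instead of hand-writing it. */
static const hex_parser_spec toy_spec = { "$toy$", 8, 1, toy_reverse_bytes };
```

Under these assumptions, `parse_tagged_hex(&toy_spec, "$toy$deadbeef", bin)` succeeds and leaves the reversed bytes in `bin`, while an uppercase or untagged input is rejected. Keeping the callback as a plain function pointer is what lets the hashing code stay in control of its own postprocessing.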
Hm, [...]
Another idea: length-based queues or even full data-driven branching (codenamed "continuous feeding"). Writing down #5534 reminded me of that. Some aspects of the idea were discussed on john-dev.
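One possible reading of "length-based queues", sketched under my own assumptions rather than taken from the john-dev discussion: candidates are bucketed by length, and a bucket is handed to the hashing code only when it is full, so every batch has a uniform key length. The names (`length_queues`, `lq_push`, `lq_flush_all`) and the tiny sizes are hypothetical, and `demo_process` is just a stand-in consumer that counts what it receives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 15 /* longest candidate handled by this sketch */
#define BATCH 4    /* candidates per uniform-length batch */

/* Consumer of one batch in which every key has the same length. */
typedef void (*batch_fn)(size_t len, char keys[][MAX_LEN + 1],
                         int count, void *ctx);

/* One small queue per candidate length. */
typedef struct {
    char keys[MAX_LEN + 1][BATCH][MAX_LEN + 1];
    int fill[MAX_LEN + 1];
    batch_fn process;
    void *ctx;
} length_queues;

static void lq_init(length_queues *q, batch_fn process, void *ctx)
{
    memset(q->fill, 0, sizeof(q->fill));
    q->process = process;
    q->ctx = ctx;
}

/* Hand the pending keys of one length to the consumer, if any. */
static void lq_flush_one(length_queues *q, size_t len)
{
    if (q->fill[len] > 0) {
        q->process(len, q->keys[len], q->fill[len], q->ctx);
        q->fill[len] = 0;
    }
}

/* Queue a candidate; a full queue is processed immediately. */
static void lq_push(length_queues *q, const char *key)
{
    size_t len = strlen(key);
    assert(len <= MAX_LEN);
    memcpy(q->keys[len][q->fill[len]], key, len + 1);
    if (++q->fill[len] == BATCH)
        lq_flush_one(q, len);
}

/* Process the partially filled queues, e.g. at the end of input. */
static void lq_flush_all(length_queues *q)
{
    size_t len;
    for (len = 0; len <= MAX_LEN; len++)
        lq_flush_one(q, len);
}

/* Demo consumer: counts batches and keys, and flags mixed lengths. */
typedef struct {
    int batches;
    int keys;
    int mixed;
} demo_stats;

static void demo_process(size_t len, char keys[][MAX_LEN + 1],
                         int count, void *ctx)
{
    demo_stats *s = ctx;
    int i;
    s->batches++;
    s->keys += count;
    for (i = 0; i < count; i++)
        if (strlen(keys[i]) != len)
            s->mixed++; /* would indicate a non-uniform batch */
}
```

The trade-off in this sketch is that candidates are no longer processed in generation order, and rarely seen lengths sit in their queue until a final flush.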
While I'm not sure we'll ever actually manage to give birth to the "next generation" re-write of JtR, I thought we should at least have an issue here for brainstorming. Here are some of my thoughts, quoted (and edited) from some other place nearby:
We'd do a total re-write but heavily re-use existing code (after careful considerations on a case-by-case basis). We'd not add a single line of code until (re-)defining lots of things, like code style (tho' actually the code style could be not to have one [except we'd mostly keep the current one for core, as in non-plugs]), plug-in interfaces (we should have mode plugs as well), source tree structure and so on.
Some must-haves (IMHO):
[...] -stdout) are plug-ins. We should strive to make the -stdout option as close to just any format as possible, from a code point of view.
[...] -stdin) are plug-ins.
[...] set_key(). Perhaps also (in lots of places but not everywhere) use a "pascal string" struct for strings in many parts of core.
Optional features:
I think the concept of "format tags" and aliases should be a core thing.