multibit2john.py should try not to output a "hash" for wrong input #5243

Open
solardiz opened this issue Mar 7, 2023 · 7 comments

Comments

@solardiz
Member

solardiz commented Mar 7, 2023

Here are some assorted notes about multibit2john.py (and a bit about the corresponding format).

multibit2john.py was introduced by @kholia in #2548. There's an attached file with sample wallets in there. We do not have those in the john-samples repo yet, so we need to add them there.

There are 3 kinds of those wallets/keys supported:

  1. $multibit$1 are essentially openssl enc with MD5 and AES-256. Indeed, the sample from btcrecover (btcrecover/btcrecover/test/test-wallets/multibit-wallet.key, which by the way we don't have in john-samples) is crackable with password btcr-test-password by either multibit2john.py + john or openssl2john.py + john (with correspondingly different formats). However, the latter also produces a flood of false positives. Maybe there's room for improvement/unification based on this understanding.

  2. For almost any unidentified file format, multibit2john.py happily produces a $multibit$2 hash. The only sanity check is based on filename, and it does not stop the program even if the filename doesn't contain the expected substrings - it merely prints "Make sure that this is a MultiBit HD wallet!", which isn't even clearly worded as a warning. We ought to do better, but looking at the 3 sample files in @kholia's archive referenced above (one of which is the same as btcrecover/btcrecover/test/test-wallets/multibithd-v0.5.0/mbhd.wallet.aes) there doesn't appear to be a signature we could check for. The best idea I have is to require that the file size be a multiple of 16 (AES block size) and maybe that it's also in a reasonable range (the samples are all around 25K, but maybe that's a baseline size for a nearly empty wallet and it grows with use?)

  3. We use btcrecover-derived code for $multibit$3, including a pre-generated protobuf parser in protobuf/wallet_pb2.py. I wonder if we should sync this with upstream once in a while. This file is now btcrecover/bitcoinj_pb2.py upstream, so we could take it from there and adopt the rename, too. Its content changed quite a bit, but that could be a result of its regeneration with a newer compiler.
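
The size-based sanity check suggested in item 2 above could be sketched roughly as follows. This is only an illustration: the function names and both bounds are hypothetical and are not present in multibit2john.py. The samples only tell us that wallets of about 25K exist, so the range is kept deliberately loose.

```python
import os

AES_BLOCK_SIZE = 16

# Hypothetical bounds: the known samples are all around 25K, but real
# wallets may grow with use, so these limits are guesses, not known facts.
MIN_SIZE = 1024
MAX_SIZE = 16 * 1024 * 1024


def plausible_mbhd_size(size, min_size=MIN_SIZE, max_size=MAX_SIZE):
    """Return True if size is a multiple of the AES block size and within range."""
    if size % AES_BLOCK_SIZE != 0:
        return False
    return min_size <= size <= max_size


def plausible_mbhd_file(path):
    """Apply the size check to a file on disk."""
    return plausible_mbhd_size(os.path.getsize(path))
```

A script could refuse to print a $multibit$2 "hash" when this returns False. Whether to fail hard or only warn is a separate decision.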

@solardiz
Member Author

solardiz commented Mar 7, 2023

> a pre-generated protobuf parser in protobuf/wallet_pb2.py. I wonder if we should sync this with upstream once in a while. This file is now btcrecover/bitcoinj_pb2.py upstream

Our version has a try/except:

import sys
_b=sys.version_info[0]<3 and (lambda x:x) or (lambda x:x.encode('latin1'))
try:
    from google.protobuf import descriptor as _descriptor
    from google.protobuf import message as _message
    from google.protobuf import reflection as _reflection
    from google.protobuf import symbol_database as _symbol_database
    from google.protobuf import descriptor_pb2
except ImportError:
    sys.stderr.write("Install the missing protobuf package, use 'pip install protobuf' command to do so.\n")
    sys.exit(1)
# @@protoc_insertion_point(imports)

Upstream's currently does not:

from google.protobuf import descriptor as _descriptor
from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
# @@protoc_insertion_point(imports)

I wonder if this was an edit by @kholia or if the old version had it that way, and more importantly whether we want to be adding a try/except when sync'ing with upstream.

I suppose it's nice to recommend pip install protobuf instead of simply failing, but OTOH this brings up root vs. non-root, etc.

btcrecover deals with this by shipping a requirements.txt file and recommending sudo apt install python3-pip and pip3 install -r requirements.txt in its docs/INSTALL.md. I actually used pip3 install --user -r requirements.txt as a user, which worked. Our project is not so heavily dependent on Python and even less on its extra modules, yet we may still want to make it simpler to figure out which modules to install and how to get all of our Python scripts working.

@solardiz
Member Author

solardiz commented Mar 7, 2023

> a pre-generated protobuf parser in protobuf/wallet_pb2.py. I wonder if we should sync this with upstream once in a while. This file is now btcrecover/bitcoinj_pb2.py upstream

Another observation is that our version at least tries to support both Python 2 and 3, while the new one is Python 3 only. So maybe we don't want to update it, unless and until there's a good reason to. It could be tricky to get the protobuf module installed for Python 2 these days, though.

@solardiz
Member Author

solardiz commented Mar 9, 2023

> The best idea I have is to require that the file size be a multiple of 16 (AES block size) and maybe that it's also in a reasonable range

We could also require that the file looks high-entropy. Do we have any shared Python code we could reuse/invoke for this?
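
For reference, a byte-level Shannon entropy estimate is short enough to write from scratch. The sketch below is not taken from any existing shared code in the tree; the function names and the 7.9 bits/byte threshold are assumptions chosen for illustration. The achievable estimate depends on input length: for uniformly random data of about 25K it comes out very close to 8, but for inputs of only about 1K it is noticeably lower, so a fixed threshold would need tuning.

```python
import math
from collections import Counter

# Hypothetical threshold in bits per byte; would need tuning on real wallets.
ENTROPY_THRESHOLD = 7.9


def shannon_entropy(data):
    """Return the Shannon entropy estimate of a bytes object, in bits per byte."""
    if not data:
        return 0.0
    total = float(len(data))
    # bytearray() yields integers on both Python 2 and Python 3
    counts = Counter(bytearray(data))
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())


def looks_high_entropy(data, threshold=ENTROPY_THRESHOLD):
    """Return True if data looks like ciphertext rather than structured content."""
    return shannon_entropy(data) >= threshold
```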

solardiz added this to the Potentially 2.0.0 milestone Mar 9, 2023
solardiz changed the title from "multibit2john.py notes" to "multibit2john.py outputs a "hash" even for wrong input" Mar 9, 2023
solardiz changed the title from "multibit2john.py outputs a "hash" even for wrong input" to "multibit2john.py should try not to output a "hash" for wrong input" Mar 9, 2023
@magnumripper
Member

> We could also require that the file looks high-entropy. Do we have any shared Python code we could reuse/invoke for this?

Hashcat has one mentioned in hashcat/hashcat#3637 just now.

@solardiz
Member Author

> Hashcat has one mentioned in hashcat/hashcat#3637 just now.

The entropy testing code they mention is in the kernels, so not in Python. Should be easy to translate or write our own.

I've also been thinking of simply importing the Python zlib module and trying compression. For valid $multibit$2 wallets, the compressed size should be greater than uncompressed, so that would be an easy test without us having to introduce an arbitrary threshold. However, the drawback is dependency on the zlib module - or is it installed along with Python itself these days? We could make it a soft dependency (use a try/except), but then we'd partially have the original problem - for some users "hashes" would be generated for too many other files (with size that is a multiple of 16, if we add that check as mandatory at least).
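
A minimal sketch of the compression test as a soft dependency might look like this. The function name is hypothetical, and returning None when zlib is unavailable is just one possible way to let the caller decide between skipping the check and refusing to proceed.

```python
try:
    import zlib
except ImportError:
    # zlib ships with the standard library, but some custom Python builds
    # lack it, hence the soft dependency.
    zlib = None


def is_incompressible(data):
    """Return True if zlib output is larger than the input, None if zlib is missing."""
    if zlib is None:
        return None
    # Encrypted data should not compress, so the zlib framing overhead
    # makes the output strictly larger than the input.
    return len(zlib.compress(data, 9)) > len(data)
```

Note that an empty input also counts as incompressible here, so this test only makes sense together with a minimum-size check.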

@solardiz
Member Author

> $multibit$1 are essentially openssl enc with MD5 and AES-256.

Unfortunately, there's a lot of potential for tool misuse and user confusion here: if multibit2john.py is run on other openssl enc output files, including other kinds of wallets that use this kind of encryption, it will also produce a $multibit$1 "hash". That "hash" may be uncrackable (false negatives) if the file was produced with a newer openssl enc (so with SHA-256 instead of MD5) or if the pre-encryption file format is so different that we do not detect its successful decryption as a Multibit wallet. hashcat/hashcat#3876 talks about two examples of a Strongcoin wallet where one uses MD5 and is similar enough to Multibit to be crackable by our code as-is, and the other uses SHA-256 and so is uncrackable (but then that other one is artificial and not a real Strongcoin wallet, so I don't know if real Strongcoin wallets with this "problem" exist or not).

There's little we can do about it. Print a warning maybe?
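
Such a warning could be sketched as below. Everything here is an assumption for illustration: the function names are hypothetical, the detection (looking for the "Salted__" header that openssl enc writes when a salt is used, either raw or base64-encoded) is not necessarily how multibit2john.py recognizes these files, and the warning wording is only a suggestion.

```python
import base64
import sys

OPENSSL_MAGIC = b"Salted__"  # header written by "openssl enc" when a salt is used


def is_openssl_salted(data):
    """Return True if data, raw or base64-encoded, starts with the OpenSSL salt header."""
    if data.startswith(OPENSSL_MAGIC):
        return True
    try:
        # 16 base64 characters decode to 12 bytes, enough to cover the 8-byte magic
        head = base64.b64decode(b"".join(data[:64].split())[:16])
    except (TypeError, ValueError):
        return False
    return head.startswith(OPENSSL_MAGIC)


def warn_if_generic_openssl(data, filename):
    """Print a warning to stderr for openssl enc style input; return True if warned."""
    if not is_openssl_salted(data):
        return False
    sys.stderr.write(
        "Warning: %s looks like generic 'openssl enc' output. The resulting "
        "hash is only crackable if this is a MultiBit-like wallet encrypted "
        "with an MD5-based key derivation.\n" % filename)
    return True
```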

@ghost

ghost commented Oct 20, 2023

"However, the latter also produces a flood of false positives. Maybe there's room for improvement/unification based on this understanding."
That is because the multibit format checks that the decrypted output is base58 and starts with certain characters. The openssl format, I'm assuming, checks for any printable character, which leads to false positives. There is no way to improve that unless you know something about the plaintext, and any such check would be algorithm-specific, as it is for multibit.
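
To illustrate why a base58 check rejects more wrong candidates than a printable-character check, here is a rough sketch. It does not reproduce the actual checks in either format: the function names are hypothetical, the required leading characters are left out because they are not specified here, and the false-positive estimate assumes a wrong key yields uniformly random bytes.

```python
# The Bitcoin base58 alphabet excludes 0, O, I and l
BASE58_ALPHABET = frozenset(bytearray(
    b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"))


def is_base58(chunk):
    """Return True if every byte of a non-empty chunk is a base58 character."""
    return len(chunk) > 0 and all(c in BASE58_ALPHABET for c in bytearray(chunk))


def is_printable_ascii(chunk):
    """Return True if every byte of a non-empty chunk is printable ASCII."""
    return len(chunk) > 0 and all(0x20 <= c <= 0x7e for c in bytearray(chunk))


def false_positive_rate(alphabet_size, length):
    """Chance that `length` uniformly random bytes all fall within the alphabet."""
    return (alphabet_size / 256.0) ** length
```

With 58 acceptable byte values instead of 95, the chance of a random block passing drops by several orders of magnitude over a 16-byte block, which is consistent with the difference in false positives described above.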

"(but then that other one is artificial and not a real Strongcoin wallet, so I don't know if real Strongcoin wallets with this "problem" exist or not)."
They don't exist. Strongcoin only uses MD5 with aes-256-cbc. The website mentioned there missed that and just encrypted a file using SHA-256.

I personally think the format is fine as is, but I guess an entropy check could reduce a few user errors, although I can't imagine it's a prevalent thing.
