-
Notifications
You must be signed in to change notification settings - Fork 58
Description
The way that DictionaryMatcher#match works isn't consistent, or at least one could make an argument that it isn't. It gives too high entropy to some passwords that are just "worse" l33t-substituted versions of some other password. For example
The password "passw0rd" will be matched directly to a word found in the dictionary "passwords". In that dictionary "passw0rd" is at rank 411 so it is given an entropy of ~8.68.
The password "p4s5w0rd" will not be matched directly to a word found in the dictionary "passwords". Instead the match-method will try to use l33t-substitutions and one of those substitutions is the word "password". That word is then looked up in the dictionaries and found at rank 2 in the "passwords" dictionary. Thus, the password is given an entropy of ~3.32.
The "easy" fix for this might be to not short circuit the match-method with "continue" in the for-loops, but I'm not sure how that would affect other parts of the system and it will increase the running time of estimating a password.
Note that I'm not saying that "p4s5w0rd" is more secure than "passw0rd" and should get a higher score. What I'm saying is that its estimated entropy shouldn't be lower, since both strings are essentially a l33t-substitution on the word "password" - it's just that one of the strings is a common password while the other one isn't.
In general, if two strings can both be transformed into the same string and that transformed string has a lower entropy than either of the non-transformed strings, then both the non-transformed strings should get the entropy value of the transformed string.