Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expand_message_xmd injectivity #236

Closed
chris-wood opened this issue Apr 7, 2020 · 8 comments · Fixed by #241
Closed

expand_message_xmd injectivity #236

chris-wood opened this issue Apr 7, 2020 · 8 comments · Fixed by #241

Comments

@chris-wood
Copy link
Collaborator

chris-wood commented Apr 7, 2020

As pointed out by David Benjamin, Step 6 of expand_message_xmd is not injective! Here are two inputs that will produce the same output (b_0 = H(Z_pad + [0, 16, 0, 4, 0, 16, 0, 0])):

expand_message_xmd(msg=[], DST=[0, 16, 0, 0], len_in_bytes=16)
expand_message_xmd(msg=[0, 16, 0, 4], DST=[], len_in_bytes=16)

We should probably address this, though it's not clear what is the best way.

@jedisct1
Copy link
Contributor

jedisct1 commented Apr 8, 2020

H(Z_pad || l_i_b_str || I2OSP(0, 1) || DST_prime || msg)?

@davidben
Copy link

davidben commented Apr 8, 2020

(Out of curiosity, any reason not to use HKDF? Different requirements?)

Does DST_prime need to be in a matching location between b_0 and b_i computations? Otherwise the I2OSP(i, 1) iteration number doesn't line up quite right. I assumed that was why the tail end of all the hash invocations matched.

To that end, if we're lining everything up at the tail end, I think it'd work to just redefine DST_prime to DST || I2OSP(len(DST), 1).

NB: I'm only guessing as to the requirements and am probably making wrong assumptions. I've only just started looking at this draft and have no context as to why any of it looks the way that it does.

@davidben
Copy link

davidben commented Apr 8, 2020

expand_message_xof seems to have the same issue in the computation of msg_prime. I think the same fix to DST_prime addresses it.

It probably doesn't matter here, but this construction also has inconsistent block alignment of msg in expand_message_xmd. It sometimes follows a block-size prefix and sometimes a hash-sized prefix. (But it is injective with the DST_prime fix to align it all at the end rather than the start.) For comparison, HMAC (and thus HKDF-Extract) and the AD constructions in AES-GCM, AES-GCM-SIV, and ChaCha20-Poly1305 are all sensitive to the block size, with the AEADs even padding AD and ciphertext up so they both consistently start at block boundaries. (Then they add both lengths to be injective.) This is handy if, e.g., you've got iovecs floating around.

But this probably doesn't matter here. Fiddling with unaligned or partial blocks is negligible compared to the cost of field operations and presumably you aren't hashing bulk data.

(Ignore this; I missed that msg was not in the b_i calculations, only b_0.)

@kwantam
Copy link
Collaborator

kwantam commented Apr 9, 2020

(Out of curiosity, any reason not to use HKDF? Different requirements?)

This is a great question. We discussed at some length in #202.

Does DST_prime need to be in a matching location between b_0 and b_i computations?

No, the alignment doesn't really matter per the CDMP05 analysis.

To that end, if we're lining everything up at the tail end, I think it'd work to just redefine DST_prime to DST || I2OSP(len(DST), 1).

This is a great suggestion! Thank you.

@kwantam
Copy link
Collaborator

kwantam commented Apr 9, 2020

H(Z_pad || l_i_b_str || I2OSP(0, 1) || DST_prime || msg)?

Unfortunately, this doesn't work. We're enforcing domain separation with a trailing DST, and this would break that.

@davidben
Copy link

davidben commented Apr 9, 2020

(I clearly didn't phrase "Does DST_prime need to be in a matching location between b_0 and b_i computations?" very well. That was confirming whether DST needed to be consistently trailing, because the #236 (comment) suggestion broke this properly. Sounds like the answer is yes.)

@kwantam
Copy link
Collaborator

kwantam commented Apr 9, 2020

It probably doesn't matter here, but this construction also has inconsistent block alignment of msg in expand_message_xmd

Wait, I don't get it. msg is only an input for the b_0 computation, and it's always exactly one block offset from the start, no? Do you mean block offset from the end? Or are you talking about the offset of b_0 in the b_i's? Sorry to be slow...

It occurs to me: we would have avoided the injectivity issue entirely if we'd encoded the length of the message in the hash. I suppose we might still consider doing this if we change the input for b_0 anyway (though we'd then have to decide what the longest msg we can support is... I worry that the embedded folks might complain if we add another 8 bytes to the input...)

@davidben
Copy link

davidben commented Apr 9, 2020

Wait, I don't get it. msg is only an input for the b_0 computation, and it's always exactly one block offset from the start, no? Do you mean block offset from the end? Or are you talking about the offset of b_0 in the b_i's? Sorry to be slow...

Er, yes. Ignore me, I can't read and missed that b_i's don't have msg in them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants