expand_message_xmd injectivity #236

chris-wood · 2020-04-07T20:47:49Z

As pointed out by David Benjamin, Step 6 of expand_message_xmd is not injective! Here are two inputs that will produce the same output (b_0 = H(Z_pad + [0, 16, 0, 4, 0, 16, 0, 0])):

expand_message_xmd(msg=[], DST=[0, 16, 0, 0], len_in_bytes=16)
expand_message_xmd(msg=[0, 16, 0, 4], DST=[], len_in_bytes=16)

We should probably address this, though it's not clear what is the best way.

jedisct1 · 2020-04-08T08:02:24Z

H(Z_pad || l_i_b_str || I2OSP(0, 1) || DST_prime || msg)?

davidben · 2020-04-08T17:43:54Z

(Out of curiosity, any reason not to use HKDF? Different requirements?)

Does DST_prime need to be in a matching location between b_0 and b_i computations? Otherwise the I2OSP(i, 1) iteration number doesn't line up quite right. I assumed that was why the tail end of all the hash invocations matched.

To that end, if we're lining everything up at the tail end, I think it'd work to just redefine DST_prime to DST || I2OSP(len(DST), 1).

NB: I'm only guessing as to the requirements and am probably making wrong assumptions. I've only just started looking at this draft and have no context as to why any of it looks the way that it does.

davidben · 2020-04-08T18:07:48Z

expand_message_xof seems to have the same issue in the computation of msg_prime. I think the same fix to DST_prime addresses it.

It probably doesn't matter here, but this construction also has inconsistent block alignment of msg in expand_message_xmd. It sometimes follows a block-size prefix and sometimes a hash-sized prefix. (But it is injective with the DST_prime fix to align it all at the end rather than the start.) For comparison, HMAC (and thus HKDF-Extract) and the AD constructions in AES-GCM, AES-GCM-SIV, and ChaCha20-Poly1305 are all sensitive to the block size, with the AEADs even padding AD and ciphertext up so they both consistently start at block boundaries. (Then they add both lengths to be injective.) This is handy if, e.g., you've got iovecs floating around.

~~But this probably doesn't matter here. Fiddling with unaligned or partial blocks is negligible compared to the cost of field operations and presumably you aren't hashing bulk data.~~

(Ignore this; I missed that msg was not in the b_i calculations, only b_0.)

kwantam · 2020-04-09T00:27:34Z

(Out of curiosity, any reason not to use HKDF? Different requirements?)

This is a great question. We discussed at some length in #202.

Does DST_prime need to be in a matching location between b_0 and b_i computations?

No, the alignment doesn't really matter per the CDMP05 analysis.

To that end, if we're lining everything up at the tail end, I think it'd work to just redefine DST_prime to DST || I2OSP(len(DST), 1).

This is a great suggestion! Thank you.

kwantam · 2020-04-09T00:32:43Z

H(Z_pad || l_i_b_str || I2OSP(0, 1) || DST_prime || msg)?

Unfortunately, this doesn't work. We're enforcing domain separation with a trailing DST, and this would break that.

davidben · 2020-04-09T00:40:35Z

(I clearly didn't phrase "Does DST_prime need to be in a matching location between b_0 and b_i computations?" very well. That was confirming whether DST needed to be consistently trailing, because the #236 (comment) suggestion broke this properly. Sounds like the answer is yes.)

kwantam · 2020-04-09T00:44:53Z

It probably doesn't matter here, but this construction also has inconsistent block alignment of msg in expand_message_xmd

Wait, I don't get it. msg is only an input for the b_0 computation, and it's always exactly one block offset from the start, no? Do you mean block offset from the end? Or are you talking about the offset of b_0 in the b_i's? Sorry to be slow...

It occurs to me: we would have avoided the injectivity issue entirely if we'd encoded the length of the message in the hash. I suppose we might still consider doing this if we change the input for b_0 anyway (though we'd then have to decide what the longest msg we can support is... I worry that the embedded folks might complain if we add another 8 bytes to the input...)

davidben · 2020-04-09T00:47:08Z

Wait, I don't get it. msg is only an input for the b_0 computation, and it's always exactly one block offset from the start, no? Do you mean block offset from the end? Or are you talking about the offset of b_0 in the b_i's? Sorry to be slow...

Er, yes. Ignore me, I can't read and missed that b_i's don't have msg in them.

this closes cfrg#236

kwantam added a commit to kwantam/draft-irtf-cfrg-hash-to-curve that referenced this issue Apr 10, 2020

change DST encoding per @davidben's recommendation

c4e3249

this closes cfrg#236

kwantam mentioned this issue Apr 10, 2020

convert DST encoding to suffix free #241

Merged

chris-wood closed this as completed in #241 Apr 16, 2020

davidben mentioned this issue Sep 25, 2020

hashInput in Finalize and VerifyFinalize missing I2OSP calls cfrg/draft-irtf-cfrg-voprf#146

Closed

BasileiosKal mentioned this issue May 29, 2022

Alternative hash_to_scalar without an xof decentralized-identity/bbs-signature#151

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

expand_message_xmd injectivity #236

expand_message_xmd injectivity #236

chris-wood commented Apr 7, 2020 •

edited

Loading

jedisct1 commented Apr 8, 2020

davidben commented Apr 8, 2020

davidben commented Apr 8, 2020 •

edited

Loading

kwantam commented Apr 9, 2020 •

edited

Loading

kwantam commented Apr 9, 2020

davidben commented Apr 9, 2020

kwantam commented Apr 9, 2020

davidben commented Apr 9, 2020

expand_message_xmd injectivity #236

expand_message_xmd injectivity #236

Comments

chris-wood commented Apr 7, 2020 • edited Loading

jedisct1 commented Apr 8, 2020

davidben commented Apr 8, 2020

davidben commented Apr 8, 2020 • edited Loading

kwantam commented Apr 9, 2020 • edited Loading

kwantam commented Apr 9, 2020

davidben commented Apr 9, 2020

kwantam commented Apr 9, 2020

davidben commented Apr 9, 2020

chris-wood commented Apr 7, 2020 •

edited

Loading

davidben commented Apr 8, 2020 •

edited

Loading

kwantam commented Apr 9, 2020 •

edited

Loading