Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hash-to-base issues #202

Closed
kwantam opened this issue Jan 22, 2020 · 22 comments
Closed

hash-to-base issues #202

kwantam opened this issue Jan 22, 2020 · 22 comments

Comments

@kwantam
Copy link
Collaborator

kwantam commented Jan 22, 2020

EDIT: Please see this comment for a summary of the proposed changes.


Sorry to reopen this can of worms, but I think there are a couple reasons to worry about the current version of hash-to-base.

  1. @BjoernMHaase points out that, for embedded systems, hashing can be much more expensive than field or curve operations. The reason is, hashing is done on the main (sometimes 8-bit!) processor, whereas field and curve operations use a specialized co-processor. The result is that, in his CPace implementation, he doesn't want to use our version of hash-to-base.

    It seems reasonable to expect this to be an issue for several classes of hash-to-curve applications. I think we should be responsive to it, and come up with a simplified version of hash-to-base that's safe in this context.

  2. @reyzin points out that in the VRF draft they were very careful to make sure that all invocations of SHA-2 are domain separated, but our use of H in hash-to-base is incompatible with this. In fact, we sat down and talked through it yesterday, and there does not appear to be a safe and cheap way of domain separating uses of H outside of hash-to-base from its uses inside. The only way to do it appears to be to select a random, 32-byte string (e.g., by hashing a domain separation tag) and prepending that value to every use of H outside of hash-to-base.

    This is really unsatisfying: first, it adds an extra invocation of the compression function, which is bad in resource-constrained contexts (per (1), above). Second, it means that domain separation requirements "escape" the hash-to-curve draft, in the sense that implementors need to understand how hash-to-curve uses H in order to safely reuse H in upper-layer protocols---an abstraction violation.

  3. We've heard from several people that they want to use SHA3, and our version of hash-to-base is overdesigned for this purpose. I don't care so much about overdesign in itself---it would be great if we could have a one size fits all version of hash-to-base---but given issues 1 and 2, we have an opportunity to revisit.

I'm working on a couple proposals to fix this, one based on KMAC (or perhaps SHAKE) and one that's designed to be safe (including in the sense of (2) above) and efficient for Merkle-Damgaard functions (but should also be safe and efficient for SHA-3). For now I'm not proposing a specific course of action, just giving early warning that this is an issue we should think carefully about.

@BjoernMHaase
Copy link

BjoernMHaase commented Jan 22, 2020 via email

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 22, 2020

Thanks for the comments, Björn! A few quick responses below.

  • The suggestions you make for protecting against side channels are excellent, but probably are out of scope for this draft. Of course, anyone is welcome to implement randomized projective coordinates, randomized field element representations, etc.! But probably a separate draft giving guidance on these strategies would be more useful than trying to put them into this draft---especially since these countermeasures are mostly generic, not hash-to-curve--specific.

  • Extracting with a hash function and then expanding with a PRF is a perfectly reasonable strategy in general. It seems worthwhile for us to suggest this as an alternative to the approach or approaches that the document specifies, but I can't promise that we will give a detailed specification of such an approach.

I will keep thinking about this.

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 22, 2020

Here's a skeletal proposal for a simplified approach to hash-to-field.

At a high level, this proposal splits hash-to-field into two steps:

  1. Generate a pseudorandom byte string based on the message and domain separation tag.

  2. Interpret this byte string as one or more elements of F = GF(p^m).

Note that this is a departure from the current approach: in the above, hash-to-field can return multiple field elements in one call. This means that both hash-to-curve and encode-to-curve will call hash-to-field exactly once, but will request either one (encode-) or two (hash-) field elements.

There are three reasons for this change: first, it minimizes the number of hash function invocations, which is one of the explicit goals of this redesign. Second, it simplifies drop-in use of any hash function in the SHA-2 or SHA-3 family, meaning that we do not have to specify a different hash-to-field function to use, say, SHAKE128. Third, it makes drop-in use of block cipher--based expansion easy (though, per my prior comment, I do not necessarily advocate specifying here).

Here's the new proposed hash-to-field function:

hash_to_field(msg, count)

Parameters:
- DST, a domain separation tag.
- F, a finite field of characteristic p and order q = p^m.
- p, the characteristic of F.
- m, the extension degree of F, m >= 1.
- L = ceil((ceil(log2(p)) + k) / 8), where k is the security parameter.
- expand_message, a function that takes a message, DST, and number of
  bytes and outputs that number of pseudorandom bytes.

Input:
- msg is the message to hash.
- count is the number of elements of F to output.

Outputs:
- (u_0, ..., u_(count - 1)), count field elements.

Steps:
1. prb_length = count * m * L
2. pseudo_random_bytes = expand_message(msg, DST, count * m * L)
3. for i in (0, ..., count - 1):
4.   for j in (0, ..., m - 1):
5.     elm_offset = L * (j + i * m)
6.     tv = pseudo_random_bytes[elm_offset : elm_offset + L]
7.     e_i = OS2IP(tv) mod p
8.   u_i = (e_0, ..., e_(m - 1))
9. return (u_0, ..., u_(count - 1))

In subsequent comments I'll discuss options for the expand_message function.

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 22, 2020

Now we just have to specify an expand_message function.

We can easily build one based on SHAKE128:

expand_message_shake128(msg, DST, len_in_bytes)

Input:
- msg, an octet string
- DST, an octet string
- len_in_bytes, length of requested output in octets

Output:
- pseudo_random_bytes, an octet string

Steps:
1. msg_prime = DST || I2OSP(len_in_bytes, 2) || msg
2. return SHAKE128(msg_prime, 8 * len_in_bytes)

(Note that the length argument to SHAKE128, per FIPS 202, is in bits.)

SHAKE256 is analogous, of course.

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 22, 2020

How about SHA-2? Here we have to be careful, since SHA-2 is a Merkle-Damgaard construction and is thus not itself indifferentiable from a random oracle, even if we assume that the underlying compression function is a random oracle. (This relates to well-known length extension attacks, multi-collision attacks, etc., on M-D constructions.)

Note, however, that the value expand_message returns to hash_to_field isn't exposed to hash_to_field's caller. Instead, it's cut into (log(p)+k)-bit chunks, which are reduced mod p. We can use this fact to our advantage!

In particular, while I haven't yet written down a full proof, it looks very likely that we can build on existing analyses of chop-MD (1, 2, 3, i.e., constructions like SHA512/256) to show that reducing a (log(p)+k)-bit integer mod p suffices for indifferentiability. Intuitively, the reason is that the same k extra bits we're using to get a near-uniform element of GF(p) also suffice to prevent length extensions, etc.

Here's the proposed function:

EDIT: removing this version, because it's not quite strong enough. I'll post a new one below.

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 22, 2020

EDIT: I vote not to specify this function in the hash-to-curve draft. Please regard this comment as spitballing only.


If we really wanted to specify something based on a ctr-mode cipher, we could do something like this:

expand_message_ctr(msg, DST, len_in_bytes)

Parameters:
- E, a block cipher encryption function taking kE-bit keys and bE-bit blocks.
- bE, the block size of E.
- kE, the key length of E.
- H, a hash function that outputs at least kE + bE + k bits, for security parameter k.

Input and output: same as above

Steps:
1.  ell = (len_in_bytes * 8) / bE
2.  ABORT if ell > 255
3.  msg_prime = H(DST || I2OSP(ell, 1) || msg)
4.  eKey = first kE bits of msg_prime
5.  eCtr = next kB bits of msg_prime
6.  ctr = OS2IP(eCtr)
7.  for i in (0, ..., ell - 1):
8.    ctr = ctr + 1
9.    b_i = E(eKey, I2OSP(ctr, kB / 8))
10. pseudo_random_bytes = b_0 || ... || b_(ell - 1)
11. return pseudo_random_bytes[0 : len_in_bytes]

Of course, it would be totally reasonable to replace the block cipher with a good stream cipher...

Since we're throwing away k bits of H's output, it's safe to use a Merkle-Damgaard hash function. When using SHA-3, we can relax the requirement on H's output size to just kE + bE.

For collision resistance, we require that kE + bE > 2 * k. Note that by this definition AES-256 is only appropriate for k = 192, not for k = 256 (because the AES block size is small). There are ways to handle this (e.g., XOR the output of two independently-seeded AES-CTR PRGs).

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 23, 2020

OK, here's an improved version of expand_message_md along with a very brief sketch of the security argument.

For this construction, we require a hash function H whose output size in bits is at least 2 * k, for k the security parameter.

expand_message_md(msg, DST, len_in_bytes)

Parameters:
- H, a Merkle-Damgaard or Sponge-based hash function.
- obs, the output block size of H in bytes.
- k_in_bytes, ceil(k / 8) for k the security parameter, e.g,
  for k = 128, k_in_bytes = 16.

Input:
- msg, an octet string.
- DST, an octet string.
- len_in_bytes, the length of the requested output in octets.

Output:
- pseudo_random_bytes, an octet string.

Steps:
1. ell = ceil((len_in_bytes + k_in_bytes) / obs)
2. ABORT if ell > 255
3. b_0 = H(DST || I2OSP(0, 1) || I2OSP(ell, 1) || msg)
4. for i in (1, ..., ell - 1):
5.   b_i = H(DST || I2OSP(i, 1) || b_(i - 1))
6. b_0_chopped = first (obs - k_in_bytes) bytes of b_0
7. pseudo_random_bytes = b_0_chopped || b_1 || ... || b_(ell - 1)
8. return pseudo_random_bytes[0 : len_in_bytes]

In principle one can skip chopping b_0 for functions in the SHA-3 family (or, more generally, for H with a reasonable indifferentiability proof). In practice, it's not clear to me whether the extra complexity is worthwhile. Thoughts?


Security argument (sketch)

We argue security by composing existing security arguments for chop-MD and NMAC.

In particular, chop-MD that cuts off k bits of a hash whose output is at least 2 * k bits is indifferentiable from a random oracle with distinguishing advantage roughly Q / 2^-k, for an adversary making Q queries, paraphrasing DFL12. Thus, the output b_0_chopped is indifferentiable from a RO with the same advantage.

We can view outputs b_1 through b_(ell - 1) in two ways. Either we can regard them as a variant of the NMAC / HMAC^f instantiation due to Coron et al. [CDMP05, Theorem 3.5], or we can simply notice that the length of inputs to H for all of these outputs is fixed and all calls to H are prefix-free (since they start DST || 0, DST || 1, ...). In either case, since the output size of H is at least 2 * k, we have from CDMP05 that the distinguishing advantage is roughly Q^2 / 2^(2 * k).

Next, we can leverage the composability of indifferentiability to show that the concatenation of independent random oracle outputs is indifferentiable from a random oracle. The simulator is the trivial one.

Finally, we need one more indifferentiability argument, namely, that chopping the "big" RO output into (log p + k)-bit chunks and reducing each one mod p is indifferentiable. Once again the simulator here is quite simple: for each field element, there is an equivalence class of roughly 2^k bit strings; the simulator simply picks one of these. This is indifferentiable for essentially the same reason that reducing a (log p + k)-bit value mod p gives a field element with statistical difference at most 2^-k from uniform.

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 23, 2020

OK, having worked through the above and thought carefully about this, I propose the following:

  1. Replace hash_to_base in the current spec with hash_to_field as defined above.
  2. Specify expand_message_shake128 / 256 and expand_message_md, but not expand_message_ctr. The reason is that we don't want to encourage people to use another primitive, and we don't want to take on the burden of proving security for the _ctr construction.
  3. Update the suite naming scheme to include a field that specifies the expand_message variant and underlying hash function.

There's one dangling question for the _md construction: do we chop for SHA-3, or only for SHA-2? This is a question of slightly simpler spec vs. slightly more efficient implementation. My inclination is towards the simpler spec, but reasonable people can obviously disagree... EDIT: My vote is simplicity of specification. Always chop b0, even for SHA-3.

cc @chris-wood @armfazh @grittygrease @samscott89

@BjoernMHaase
Copy link

BjoernMHaase commented Jan 23, 2020 via email

@BjoernMHaase
Copy link

BjoernMHaase commented Jan 23, 2020 via email

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 23, 2020

In case that the hash function output block length alone is sufficiently large for generating the required len_in_bytes amount of PRF bytes: Would you consider it acceptable to just drop the "chopping" operation for b0 and use all of the bytes of the b0 output? One might be sparing one additional invocation of the hash. Heuristically, I think that one then might want to add the len_in_bytes field to the initial hashing operation yielding b0, e.g. before the message. This way "length extension type" attacks are not manageble and using the message with different lengths results always in different expanded messages, even if the length is changed only by a small amount that does not modify the number of required blocks.

Unfortunately, it is not safe to drop the chop operation for b0.

It might be safe to do this if we prepended msg with its length, e.g.:

b_0 = H(DST || I2OSP(0, 1) || I2OSP(ell, 1) || I2OSP(len(msg), 8) || msg)

(But note I am only saying it might---I have not thought about it carefully enough to be sure.)

There are two reasons I didn't want to go this way: first, it's never obvious how long a field to make this. Surely 8 bytes is enough! But probably 4 isn't, since we could imagine hashing a couple gigabytes in some weird corner case. So now we need a pretty long field.

Second, the issue with needing to prepend len(msg) to msg is that it means we have to know the length of the message before we start hashing it! This is probably true in most usages, but it's not obviously true all the time.

(My question relates to the case of X25519, which is often used in conjunction with Ed25519 and SHA512, even if the security parameters are not matching. An encode2curve operation for X25519 could possibly share all of field arithmetics and hash function implementation with Ed25519. In this case only one single hash invocation of SHA512 would be required.)

Right! But for X25519 or Ed25519, we have log2(p) = 255, k = 128, so this scheme would use 384 bits to generate one element of GF(p), and 512 - 128 = 384, so in fact even without modification you can use a single SHA512 invocation. And in the hash-to-curve (rather than encode-to-curve) case, you would need 2 SHA-512 invocations even if you didn't chop b_0.

(Note that this isn't strictly true in the case of X25519, since we can get very close to uniform field elements by reducing 256 bits mod p since p is so close to 2^255. But I don't think we should add special cases for p of special forms, because it will complicate things considerably.)

@BjoernMHaase
Copy link

BjoernMHaase commented Jan 23, 2020 via email

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 23, 2020

If your suggestion becomes consensus in the Hash2Curve team, I'd be modifying the CPace and AuCPace drafts to use the hash2curve default constructions. (e.g. maybe with the tweak of using SHA512 together with Curve25519).

Cool!

By the way, the VRF draft also wants to use Curve25519 with SHA-512 (for the same reason as you, I think---because Ed25519 uses this combination), so I plan to propose defining a suite for this use-case.

@burdges
Copy link

burdges commented Jan 27, 2020

Did you guys ever assess https://eprint.iacr.org/2019/1294.pdf by Koshelev Dmitrii?

@kwantam
Copy link
Collaborator Author

kwantam commented Jan 27, 2020

Did you guys ever assess https://eprint.iacr.org/2019/1294.pdf by Koshelev Dmitrii?

Unless I'm misunderstanding, the above pertains to a new mapping, not to hash-to-base. So probably it would be better to discuss in its own issue.

First impression: I doubt we'd include this map given that we've just removed other "special-purpose" maps (see #198), and this is another map that's restricted to a family of curves whose practical interest seems limited. But if this doesn't seem right---that is, if there are interesting reasons to have a map specifically for the j=1728 case---then please open a new issue about it and I'd be glad to chat more.

@peteroupc
Copy link

Has the following discussion been considered? I believe some of the discussion there may be relevant to the hash_to_base function. sipa/bips#195

@kwantam
Copy link
Collaborator Author

kwantam commented Feb 3, 2020

Has the following discussion been considered? I believe some of the discussion there may be relevant to the hash_to_base function. sipa/bips#195

Thanks for the pointer.

In the context of hash-to-curve, the answer is that there's no public or private inputs---that depends on the invoking protocol. Because of this, it looks to me like there's no way to handle this generically in hash-to-curve: this is a question for the higher-level protocols that invoke h2c.

That seem to imply, however, that it might be worthwhile to add some words in the Security Considerations section discussing this, so that invoking protocols are suitably cautious.


As a second way of answering: h2c currently tries to help readers avoid timing leaks. Power side channels are not really in scope---and, related to what I've said above, it's not clear to me that they could ever be in scope in a generic way that defends all invoking protocols against attack. This also seems to imply the need for a pointer in Security Considerations, I think.

@BjoernMHaase
Copy link

BjoernMHaase commented Feb 3, 2020 via email

@secunets
Copy link

Great discussion. Following closely to learn as much as possible and contribute at the right time.

kwantam added a commit to kwantam/draft-irtf-cfrg-hash-to-curve that referenced this issue Feb 24, 2020
This commit incorporates the XOF and MD designs from cfrg#202.
kwantam added a commit to kwantam/draft-irtf-cfrg-hash-to-curve that referenced this issue Feb 25, 2020
This commit incorporates the XOF and MD designs from cfrg#202.
@grittygrease
Copy link
Collaborator

@kwantam In the spirit of code re-use, would it make sense to allow implementers who have a bigger toolbox than just cryptographic hashes and/xof functions to use these tools? For example, HKDF is available in many languages and libraries already. Perhaps we could allow HKDF to be used with the expand_message_xof function? It would be nice to be able to avoid implementing all of expand_message_xmd if HKDF is available.

@jedisct1
Copy link
Contributor

jedisct1 commented Apr 1, 2020

Do we really want even more options?

HKDF is indeed widely available (or, if not, trivial to implement) and could be the only option for all suites.

@kwantam
Copy link
Collaborator Author

kwantam commented Apr 3, 2020

Perhaps we could allow HKDF to be used with the expand_message_xof function? It would be nice to be able to avoid implementing all of expand_message_xmd if HKDF is available.

It would be pretty easy to define expand_message_hkdf() if we really want people to be able to use it. But:

HKDF is indeed widely available (or, if not, trivial to implement) and could be the only option for all suites.

Probably not, because of the issues with HKDF described in the first post.

I'm most worried about the fact that it makes strict domain separation impossible for the underlying hash. Moreover, it would make our discussion of domain separation in the document messier, because we'd have to explain that a lot of what we say doesn't apply to expand_message_hkdf. Or we'll have to explicitly assume (not at all unreasonably) that H-inside-HKDF and H-outside-HKDF are already sufficiently separated, and the fact that further separation is impractical is OK.

In any case, I suppose if we wanted to define it, it'd look something like this:

def expand_message_hkdf(msg, DST, len_in_bytes):
   prk = HKDF_Extract(DST, msg || I2OSP(0, 1))
   pseudo_random_bytes = HKDF_Expand(prk, "expand_message_hkdf", len_in_bytes)
   return pseudo_random_bytes

(The extra zero byte appended to msg makes the indifferentiability proof go through.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants