Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possibly reduce the amount of hash functions used in the code #2

Closed
fjarri opened this issue Jul 28, 2020 · 5 comments · Fixed by #36
Closed

Possibly reduce the amount of hash functions used in the code #2

fjarri opened this issue Jul 28, 2020 · 5 comments · Fixed by #36
Labels
enhancement New feature or request scoping Scoping required for further action

Comments

@fjarri
Copy link
Contributor

fjarri commented Jul 28, 2020

In the Python implementation we have, in different places, SHA256, Blake2b and Keccak. In this implementation some occurrences of Blake2b are replaced with SHA3 temporarily because of digest length requirements (e.g. in hash_to_scalar()). Could we use a single hash function everywhere?

@fjarri fjarri added enhancement New feature or request cryptography Needs attention of someone who knows what they're doing labels Aug 5, 2020
@tuxxy
Copy link

tuxxy commented Aug 12, 2020

Absolutely we can do this. Problematically, there's some places we may need to keep as keccak or sha256 to ensure compatibility with Ethereum, but that needs re-scoping.

I'm going to add the scoping label to this and leave the actionable item in this issue to identify which hash function calls are required for compatibility.

Be aware that we should likely also update pyUmbral at the same time, and subsequently release on nucypher to ensure compatibility. I'd put this at a relatively high priority should we want these changes before a mainnet release.

@tuxxy tuxxy added the scoping Scoping required for further action label Aug 12, 2020
@vepkenez
Copy link
Contributor

vepkenez commented Dec 4, 2020

After I spent so much time finding a useable and compatible blake2b library for JS 😢😢😢😢

@fjarri
Copy link
Contributor Author

fjarri commented Mar 2, 2021

From the discussion: use SHA-256 everywhere. Seems like unsafe_hash_to_point() can use it as well.

@fjarri fjarri removed the cryptography Needs attention of someone who knows what they're doing label Mar 3, 2021
@cygnusv
Copy link
Member

cygnusv commented Mar 3, 2021

To be clear, we will use SHA-256 "everywhere", but in some cases we can't use it directly; for example, when hashing to a scalar, where we will use a more complex construction which internally uses SHA256 (but no other hashing function).

After reading the security discussion in the hash-to-curve standard, it's clear that SHA256 alone could introduce a small bias when used modulo p directly, so it needs to be "expanded".

For the particular case of hashing to a scalar, we need to use the hash_to_field algorithm described in the standard. Internally, and for the particular combination of SHA256 and secp256k1, this uses a construction called expand_message_xmd which outputs 48-byte digests. See here some test vectors for expand_message_xmd instantiated with SHA256.

So, in essence, every time that we were using hash_to_curvebn in pyumbral, we would use now hash_to_field from the standard. Note that hash_to_field natively supports domain separation tags (#38), so we don't need to add them in an ad-hoc manner as we were doing in pyumbral.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 3, 2021

Hm, I'm confused here. hash_to_field is not an exact analogue of PyUmbral's hash_to_curvebn. The former hashes to the domain of curve coordinates, the latter to the domain of scalars. So they do the same thing, true, but security requirements may be different.

What was hash_to_curvebn in PyUmbral is now ScalarDigest, and uses Scalar::from_digest() from the backend library (k256), which is implemented as modulo reduction of the digest without expansion. Is that a potential problem? If it is, I would prefer to fix it in k256 and not here (may not be even possible to fix it here if k256 does not expose enough internals).

If I understand the implementation of hash_to_curvebn() in PyUmbral, it does the same as what is done here:

  • the DST is added just as another input to the digest
  • the digest is taken modulo the order without any expansion

The only difference I noticed is that it is guaranteed to return a non-zero value. Is that important?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request scoping Scoping required for further action
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants