Skip to content

secp256k1: Optimize NAF conversion. #2695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 12, 2021

Conversation

davecgh
Copy link
Member

@davecgh davecgh commented Aug 3, 2021

This is rebased on #2690.

This significantly optimizes the NAF conversion code by rewriting it to avoid all heap allocations as well as switch to an O(1) algorithm.

It also reworks the NAF tests to actually test the produced encoding adheres to the requires of NAF encoding as well as to make them more consistent with modern practices in the code.

Specifically, the following properties are asserted:

  • There must not be a leading zero byte and the number of bytes used to encode the negative portion must not exceed the positive portion
  • There must not be any adjacent non-zero digits
  • The negative and positive portions must sum back to the original value

The existing tests only ensure the positive and negative portions sum back up to the original value.

The updated tests also revealed that the existing naf function doesn't strip leading zeroes, so that logic has been added. That behavior is not a problem in terms of correctness given how it is used in the code since extra leading zeroes don't change the resulting value, but it could potentially cause more work to be done than necessary and having leading zeroes technically doesn't conform to NAF encoding.

Finally, it slightly reworks the scalar multiplication logic to improve its readability by making use of zero values and a single mask instead of shifting every individual byte of the positive and negative portions of k1 and k2.

The following benchmark shows a before and after comparison of the NAF conversion as well as how that translates to scalar multiplication and signature verification:

name        old time/op      new time/op     delta
----------------------------------------------------------------------
NAF          1.16µs ± 1%     0.08µs ± 1%     -93.02%  (p=0.008 n=5+5)
ScalarMult    138µs ± 1%      135µs ± 0%      -1.77%  (p=0.016 n=5+4)
SigVerify     164µs ± 0%      162µs ± 0%      -0.98%  (p=0.008 n=5+5)

name         old alloc/op    new alloc/op    delta
----------------------------------------------------------------------
NAF           96.0B ± 0%       0.0B          -100.00%  (p=0.008 n=5+5)
ScalarMult     816B ± 0%       720B ± 0%      -11.76%  (p=0.008 n=5+5)
SigVerify    1.54kB ± 0%     1.44kB ± 0%      -6.25%   (p=0.008 n=5+5)

name         old allocs/op   new allocs/op   delta
----------------------------------------------------------------------
NAF          2.00 ± 0%       0.00            -100.00%  (p=0.008 n=5+5)
ScalarMult   15.0 ± 0%       11.0 ± 0%        -26.67%  (p=0.008 n=5+5)
SigVerify    32.0 ± 0%       28.0 ± 0%        -12.50%  (p=0.008 n=5+5)

@davecgh davecgh added this to the 1.7.0 milestone Aug 3, 2021
@davecgh davecgh force-pushed the secp256k1_optimize_naf branch from 58b7304 to e3f1ce2 Compare August 3, 2021 22:43
@davecgh davecgh force-pushed the secp256k1_optimize_naf branch from e3f1ce2 to 1c44f93 Compare August 5, 2021 14:01
// Ensure the result doesn't have any adjacent non-zero digits.
gotPos := new(big.Int).SetBytes(pos)
gotNeg := new(big.Int).SetBytes(neg)
posOrNeg := new(big.Int).Or(gotPos, gotNeg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not immediately obvious to me why checking like this would work (e.g. couldn't maybe bit gotPos[10] = 1 and gotNeg[11] = 1 and thus this test would erroneously fail)?

But looking at the NAF definition using "signed-bit" representation makes it more clear (from my copy of Handbook of Elliptic and Hyperelliptic Curve Cryptography, p. 151):

The [signed-binary] representation is said to be in non-adjecent form, NAF for short if nᵢnᵢ₊₁ = 0
...
[...] the NAF of n = 478 is equal to 1000̅₁000̅₁0

So or'ing the two separate slices (which represent the positive and negative bits respectively) works here.

Copy link
Member Author

@davecgh davecgh Aug 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, NAF is specifically the unique representation where there are no adjacent (side-by-side) non-zero digits, where those digits are from the set {0,±1}. In other words, there will always be a zero on either side of every digit (where those digits are +1 or -1).

I suspect it's probably easier to change the notation slightly to say that p represents the +1 signed digit and n represents the -1 signed digit, and 0 represents the 0 signed digit. Every number in NAF will always be some combination of ...{n,p}0{n,p}0{n,p}0....

Taking the n = 478 case as an example, the NAF in this notation is p000n000n0. That is 2^9 - 2^5 - 2 = 512 - 32 - 2 = 478.

When split up into the positive and negative arrays, it then would be:

pos: 1000000000  <--- all of the `p` (aka `+1`) digits have a 1 bit set here
neg: 0000100010  <--- all of the `n` (aka `-1`) digits have a 1 bit set here

Thus, taking their bitwise or will end up with bits set in the positions where there is either a p (aka +1) or an n (aka -1) which means that any two adjacent bits being set implies there were two adjacent non-zero signed digits, which is a violation of the encoding.

@matheusd
Copy link
Member

matheusd commented Aug 5, 2021

Commit msg for Improve scalar mult readability has "might not otherwise by obvious"

@davecgh davecgh force-pushed the secp256k1_optimize_naf branch from 1c44f93 to 2b2aec3 Compare August 5, 2021 18:22
@davecgh
Copy link
Member Author

davecgh commented Aug 5, 2021

Commit msg for Improve scalar mult readability has "might not otherwise by obvious"

Corrected.

@davecgh davecgh force-pushed the secp256k1_optimize_naf branch from 2b2aec3 to a5be2b2 Compare August 6, 2021 17:52
@davecgh
Copy link
Member Author

davecgh commented Aug 6, 2021

I added an additional edge-condition test case which results in a 257-bit NAF encoding since I noticed that wasn't explicitly tested.

@davecgh davecgh force-pushed the secp256k1_optimize_naf branch from a5be2b2 to b309ba4 Compare August 11, 2021 19:21
This slightly reworks the scalar multiplication logic to improve its
readability by making use of zero values and a single mask instead of
shifting every individual byte of the positive and negative portions of
k1 and k2.

It also improves the comments to clarify the intention of the code which
might not otherwise be obvious given the bit twiddling nature of it.

The reduction in the number of shifts in the double and add loop is also
technically an optimization, but the calculation is by far dominated by
the ec operations, so saving a few nanoseconds on shifts is so negigible
that it's in the margin of error in benchmarks.
This reworks the NAF tests to actually test the produced encoding
adheres to the requires of NAF encoding as well as to make them more
consistent with modern practices in the code.

Specifically, the following properties are asserted:

- There must not be a leading zero byte and the number of bytes used to
  encode the negative portion must not exceed the positive portion
- There must not be any adjacent non-zero digits
- The negative and positive portions must sum back to the original value

The existing tests only ensure the positive and negative portions sum
back up to the original value.

The updated tests also revealed that the existing naf function doesn't
strip leading zeroes, so that logic has been added.  That behavior is
not a problem in terms of correctness given how it is used in the code
since extra leading zeroes don't change the resulting value, but it
could potentially cause more work to be done than necessary and having
leading zeroes technically doesn't conform to NAF encoding.
This modifies the NAF benchmark to ensure it is only benchmarking the
NAF function itself instead of including the big int to bytes
conversion.
This significantly optimizes the NAF conversion code by rewriting it to
avoid all heap allocations as well as switch to an O(1) algorithm.

The following benchmark shows a before and after comparison of the NAF
conversion as well as how that translates to scalar multiplication and
signature verification:

name        old time/op      new time/op     delta
----------------------------------------------------------------------
NAF          1.16µs ± 1%     0.08µs ± 1%     -93.02%  (p=0.008 n=5+5)
ScalarMult    138µs ± 1%      135µs ± 0%      -1.77%  (p=0.016 n=5+4)
SigVerify     164µs ± 0%      162µs ± 0%      -0.98%  (p=0.008 n=5+5)

name         old alloc/op    new alloc/op    delta
----------------------------------------------------------------------
NAF           96.0B ± 0%       0.0B          -100.00%  (p=0.008 n=5+5)
ScalarMult     816B ± 0%       720B ± 0%      -11.76%  (p=0.008 n=5+5)
SigVerify    1.54kB ± 0%     1.44kB ± 0%      -6.25%   (p=0.008 n=5+5)

name         old allocs/op   new allocs/op   delta
----------------------------------------------------------------------
NAF          2.00 ± 0%       0.00            -100.00%  (p=0.008 n=5+5)
ScalarMult   15.0 ± 0%       11.0 ± 0%        -26.67%  (p=0.008 n=5+5)
SigVerify    32.0 ± 0%       28.0 ± 0%        -12.50%  (p=0.008 n=5+5)
@davecgh davecgh force-pushed the secp256k1_optimize_naf branch from b309ba4 to 2994f0f Compare August 12, 2021 10:02
@davecgh davecgh merged commit 2994f0f into decred:master Aug 12, 2021
@davecgh davecgh deleted the secp256k1_optimize_naf branch August 12, 2021 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants