Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve DLN proof verification performance for large signing groups #203

Merged
merged 8 commits into from
Sep 23, 2022

Conversation

pdyraga
Copy link
Contributor

@pdyraga pdyraga commented Aug 23, 2022

Verification of discrete logarithm proofs is the most expensive part of threshold ECDSA key generation for large groups. In round 2 of key generation, the local party needs to verify proofs received from all other parties. The cost of a call to dlnProof.Verify measured on Darwin/arm64 Apple M1 max is 341758528 ns/op - see BenchmarkDlnProof_Verify. There are two proofs that need to be verified during the key generation so assuming there are two cores available exclusively for this work, the verification of 100 messages takes about 35 seconds. For a group size of 1000, the verification takes about 350 seconds. The verification is performed in separate goroutines with no control over the number of goroutines created. When executing the protocol locally, during the development, for a group size of 100, 100*99*2 = 19 800 goroutines for DLN proof verification are created more or less at the same time. Even such a powerful CPU as Apple M1 Max struggles with computing the proofs - it takes more than 16 minutes on all available cores and all other processes are starved.

To optimize the code to allow it to be executed for larger groups, the number of goroutines created at the same time for DLN proof verification is throttled so that all other processes are not perpetually denied necessary CPU time to perform their work. This is achieved by introducing the DlnProofVerifier that limits the concurrency level, by default to the number of CPUs (cores) available.

The throttling logic has no real impact on the DLN proof verification performance for small groups:

BenchmarkDlnProof_Verify-10    	           3	 343508611 ns/op	 1777336 B/op	  3803 allocs/op
BenchmarkDlnVerifier_VerifyProof1-10       3	 343131722 ns/op	 1857218 B/op	  4319 allocs/op
BenchmarkDlnVerifier_VerifyProof2-10       3	 342200917 ns/op	 1858941 B/op	  4319 allocs/op

Here is a two-minute profiling sample when executing locally key generation for a group size of 100 on the master branch:

image

The DLN proof verification is one of the most expensive parts of the key
generation protocol. This benchmark allows to check how expensive the
Validate call is.
Control the concurrency level when verifying DLN proofs

Verification of discrete logarithm proofs is the most expensive part of
threshold ECDSA key generation for large groups. In round 2 of key generation,
the local party needs to verify proofs received from all other parties. The cost
of a call to `dlnProof.Verify` measured on Darwin/arm64 Apple M1 max is
341758528 ns/op - see `BenchmarkDLNProofVerification`. There are two proofs that
need to be verified during the key generation so assuming there are two cores
available exclusively for this work, the verification of 100 messages takes
about 35 seconds. For a group size of 1000, the verification takes about 350
seconds. The verification is performed in separate goroutines with no control
over the number of goroutines created. When executing the protocol locally,
during the development, for a group size of 100, 100*99*2 = 19 800 goroutines
for DLN proof verification are created more or less at the same time. Even such
a powerful CPU as Apple M1 Max struggles with computing the proofs - it takes
more than 16 minutes on all available cores and all other processes are starved.

To optimize the code to allow it to be executed for larger groups, the number of
goroutines created at the same time for DLN proof verification is throttled so
that all other processes are not perpetually denied necessary CPU time to
perform their work. This is achieved by introducing the `DlnProofVerifier` that
limits the concurrency level, by default to the number of CPUs (cores) available.
The benchmarks are promising and shows the the validator does not add any
significant overhead over the DLN verification itself:

BenchmarkDlnProof_Verify-10                3	 342581417 ns/op	1766010 B/op	3790 allocs/op
BenchmarkDlnVerifier_VerifyProof1-10       3	 342741028 ns/op	1859093 B/op	4320 allocs/op
BenchmarkDlnVerifier_VerifyProof2-10       3	 341878361 ns/op	1851984 B/op	4311 allocs/op
The concurrency defaults to `GOMAXPROCS` and can be updated with a call to
`SetConcurrency`. The concurrency level is applied to the pre-params generator
and DLN proof validator. Since there are two optional values now when
constructing parameters, instead of passing safe prime gen timeout as the last
value of `NewParameters`, all expected parameters should be configured with
`Set*` functions.
DLN proof verification is the most computationally expensive operation of the
protocol when working in large groups. DLN verifier allows to throttle the
number of goroutines verifying the proofs at the same time so that other
processes do not get starved. DLN verifier is already applied to key generation
protocol. Here, it is getting applied to resharing as well.
The concurrency level is now available in all rounds constructing the
verifier and the optional concurrency feature is never used.
@pdyraga pdyraga force-pushed the dln-proof-performance branch from e60a601 to 4ba50f5 Compare August 25, 2022 14:46
pdyraga added a commit to keep-network/keep-core that referenced this pull request Aug 26, 2022
The fix have been proposed in bnb-chain/tss-lib#203
and is not yet merged. The version is temporarily updated to the branch to
unblock the work.

The version will have to be updated once the proposed PR gets merged.

The explanation of the fix from tss-lib below:

Verification of discrete logarithm proofs is the most expensive part of
threshold ECDSA key generation for large groups. In round 2 of key generation,
the local party needs to verify proofs received from all other parties. The cost
of a call to `dlnProof.Verify` measured on Darwin/arm64 Apple M1 max is
341758528 ns/op - see `BenchmarkDLNProofVerification`. There are two proofs that
need to be verified during the key generation so assuming there are two cores
available exclusively for this work, the verification of 100 messages takes
about 35 seconds. For a group size of 1000, the verification takes about 350
seconds. The verification is performed in separate goroutines with no control
over the number of goroutines created. When executing the protocol locally,
during the development, for a group size of 100, 100*99*2 = 19 800 goroutines
for DLN proof verification are created more or less at the same time. Even such
a powerful CPU as Apple M1 Max struggles with computing the proofs - it takes
more than 16 minutes on all available cores and all other processes are starved.

To optimize the code to allow it to be executed for larger groups, the number of
goroutines created at the same time for DLN proof verification is throttled so
that all other processes are not perpetually denied necessary CPU time to
perform their work. This is achieved by introducing the `DlnProofVerifier` that
limits the concurrency level, by default to the number of CPUs (cores) available.
`tss.NewParameters` is not validating provided values and I did not want to make
a breaking change there. Instead, I added a comment to `SetConcurrency` and
added a panic in the DLN proof verifier ensuring the protocol fails with a clear
message instead of hanging. This is aligned with how
`keygen.GeneratePreParamsWithContext` deals with an invalid value for
`optionalConcurrency` param.
@pdyraga pdyraga force-pushed the dln-proof-performance branch from c4ba3f1 to 73ee955 Compare August 26, 2022 12:16
@pdyraga pdyraga marked this pull request as ready for review August 26, 2022 12:46
@yycen
Copy link
Contributor

yycen commented Sep 1, 2022

Looks good and thanks for sharing the profiling results! I will have a test and read code soon.

@pdyraga
Copy link
Contributor Author

pdyraga commented Sep 22, 2022

Hey @yycen, I wanted to bump up this PR. TBTC v2 chaosnet starts next week and it would be great to ship this code. Currently we are attached to a fork, and we would love to switch back to the upstream.

@yycen yycen merged commit 14e70f2 into bnb-chain:master Sep 23, 2022
pdyraga added a commit to keep-network/keep-core that referenced this pull request Sep 25, 2022
The new version released has an improvements from
bnb-chain/tss-lib#203 we need to generate groups
with 100 signing members locally for development.
@pdyraga pdyraga deleted the dln-proof-performance branch September 25, 2022 14:43
tomaszslabon added a commit to keep-network/keep-core that referenced this pull request Sep 26, 2022
Updated bnb-chain/tss-lib dependency to v1.3.5

The new version released has an improvement from bnb-chain/tss-lib#203 we need
to generate groups with 100 signing members locally for development.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants