-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto/merkle: hashing ~10% faster and ~60% fewer allocs #351
Conversation
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
// returns tmhash(0x01 || left || right) | ||
func innerHash(left []byte, right []byte) []byte { | ||
return tmhash.Sum(append(innerPrefix, append(left, right...)...)) | ||
} | ||
|
||
func innerHashOpt(s hash.Hash, left []byte, right []byte) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main idea is to avoid appending and re-use a provided hash.Hash
.
@@ -35,6 +35,7 @@ func TestKeyPath(t *testing.T) { | |||
|
|||
res, err := KeyPathToKeys(path.String()) | |||
require.Nil(t, err) | |||
require.Equal(t, len(keys), len(res)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop below could possibly PASS even if res
was longer.
Codecov Report
@@ Coverage Diff @@
## master #351 +/- ##
==========================================
- Coverage 61.98% 61.81% -0.17%
==========================================
Files 263 262 -1
Lines 22931 22927 -4
==========================================
- Hits 14213 14172 -41
- Misses 7207 7253 +46
+ Partials 1511 1502 -9
|
Looking now at I didn't want to cascade the change to more places until having some extra feedback if this feels correct. I can extend this PR to remove the previous functions if that sounds ok. |
Wow, thanks for doing this (ref #272 / tendermint/tendermint#2186) and for even providing proper benchmarks! I'll do a proper review shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change LGTM, glad the reset API is getting used =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!!
we should see if it can be upstreamed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thanks again for your contribution 💙 This is cool! And it even reminded me that we recently changes something in another repo in a non-optimal way (reverted here: celestiaorg/nmt#40).
I didn't want to cascade the change to more places until having some extra feedback if this feels correct. I can extend this PR to remove the previous functions if that sounds ok.
Yeah, this feels correct. The change almost looks good as is. The only thing that needs addressing before merging: the simd version is currently added in go mod but not used.
Related to this, it is not clear if the benchmarks improved with the current state or did you actually use the simd version when you ran those?
It might be worth noting that we use this merkle tree only for tiny inputs. The only lazyledger-specific place will be computing the data hash from the rows/columns. Hence the actual impact of these changes will be marginal but we should still merge them! The merkle tree we use for actual data is in this repo: https://github.com/lazyledger/nmt
There is plenty of room for improvements in that one too.
BTW: Alternatively, you could also open this PR on tendermint directly.
@marbar3778 signaled that the team there might merge your changes as well.
To make the turnaround faster on the vanilla tendermint repo, I would suggest splitting the changes into two PRs: 1) the totally uncontroversial use of hash.Hash and reset and 2) replacing standard sha2 with the simd version.
@liamsi, I prefer going the route of leaving SIMD out of this PR as to prove the impact of reducing allocs. As to be totally sure about the benchstat result without SIMD, I've run it again:
My laptop is a bit noisy, but this proves that the gains are mostly from reducing allocs. The test data is also quite tiny as to have much impact from SIMD instructions, so makes sense. I agree on leaving the dependency change to another PR. |
"math/bits" | ||
) | ||
|
||
// HashFromByteSlices computes a Merkle tree where the leaves are the byte slice, | ||
// in the provided order. It follows RFC-6962. | ||
func HashFromByteSlices(items [][]byte) []byte { | ||
return hashFromByteSlices(sha256.New(), items) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to myself: at some point, we need to search for all occurrences like this and make sure they use a central variable / constant instead (just in case we decide to switch to another hash function instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again 👍🏼
## Description Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ### Benchmarking: ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm
## Description Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ### Benchmarking: ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm
Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm
* crypto/merkle: optimize merkle tree hashing (#6513) Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm * update pending log Co-authored-by: Marko <marbar3778@yahoo.com>
adaptation/backport of celestiaorg#351
#351) * Rename package back to tendermint/tendermint and use replace to point to the current repo * Rename package back to tendermint/tendermint in the proto files and generated files * Rename package back to tendermint/tendermint in the mocks * Remove useless replace * Apply suggestions from code review * Apply suggestions from code review
* crypto/merkle: optimize merkle tree hashing (#6513) Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm * update pending log Co-authored-by: Marko <marbar3778@yahoo.com>
* crypto/merkle: optimize merkle tree hashing (#6513) Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm * update pending log Co-authored-by: Marko <marbar3778@yahoo.com> (cherry picked from commit 8e787f0)
…25) (#26) * crypto/merkle: optimize merkle tree hashing (#6513) (#9446) * crypto/merkle: optimize merkle tree hashing (#6513) Upstream celestiaorg/celestia-core#351 to optimize merkle tree hashing ``` benchmark old ns/op new ns/op delta BenchmarkHashAlternatives/recursive-8 22914 21949 -4.21% BenchmarkHashAlternatives/iterative-8 21634 21939 +1.41% benchmark old allocs new allocs delta BenchmarkHashAlternatives/recursive-8 398 200 -49.75% BenchmarkHashAlternatives/iterative-8 399 301 -24.56% benchmark old bytes new bytes delta BenchmarkHashAlternatives/recursive-8 19088 6496 -65.97% BenchmarkHashAlternatives/iterative-8 21776 13984 -35.78% ``` cc @odeke-em @cuonglm * update pending log Co-authored-by: Marko <marbar3778@yahoo.com> (cherry picked from commit 8e787f0) * add changelog (cherry picked from commit 9e015e0) * add back rootHash nil check (cherry picked from commit 4b602f8) * update changelog (cherry picked from commit 6b1d05f) --------- Co-authored-by: JayT106 <JayT106@users.noreply.github.com> Co-authored-by: Adam Tucker <adamleetucker@outlook.com>
This PR makes some optimizations that produce the following results compared to the
master
implementation:The gist of the change is basically avoiding allocs, and some nit gain by using a SIMD sha256 implementation. The latter could also be considered for the
crypto/tmhash
package.There's a further optimization possible that might reduce allocs even more with pooling, but the main bottleneck is sha256 calc as expected so maybe isn't worth it.
I kept the original
leftHash
andinnerHash
methods since they're used in other places, but looks to me can be removed and used the ones I introduced which will avoid the same kind in allocations in other places producing some performance gains in other packages.