Skip to content

Commit

Permalink
perf(blockstore): Add LRU caches to blockstore operations used in con…
Browse files Browse the repository at this point in the history
…… (backport #76) (#81)

* perf(blockstore): Add LRU caches to blockstore operations used in consensus (backport cometbft#3003) (cometbft#3083)

Closes cometbft#2844

We are seeing that the blockstore loading operations get used in hot
loops within gossip routines, and queryMaj23 routines. This PR reduces
that overhead using an LRU cache.

The LRU cache does have a mutex on every get, but the time with the LRU
cache is 9x faster than without (before even adding in DB overheads),
due to the proto unmarshalling saved. We could imagine a setup where we
avoided a lock there entirely. I don't think this is worth right now,
since the new code is 9x faster, and these mostly appear in catchup code
which should not be highly contended for across peers at the same time.

With the new benchmark I added:
OLD:
```
BenchmarkRepeatedLoadSeenCommit-12         24447             54691 ns/op           46495 B/op        319 allocs/op
```
NEW:
```
BenchmarkRepeatedLoadSeenCommit-12        224131              6401 ns/op            8320 B/op          2 allocs/op
```

It turns out these gossip routines don't need mutative copies, so we
could optimize out the large allocation in the future if we want.

1 hour cpu profile that shows this appearing in prod:

![image](https://github.com/cometbft/cometbft/assets/6440154/5a7e0f02-8385-4c01-aa6a-dba2a2bf376d)

The state machine execution time here for context is 92 seconds. So this
is adding up in system load (and GC! The GC load is 52GB, the entire
trace is 200GB, with other parts being optimized down from recent PRs)

---

- [ ] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request cometbft#3003 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
(cherry picked from commit 0c10bd5)

* Add Changelog

(cherry picked from commit 4594f29)

# Conflicts:
#	CHANGELOG.md

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Dev Ojha <dojha@berkeley.edu>
Co-authored-by: PaddyMc <paddymchale@hotmail.com>
  • Loading branch information
3 people authored May 28, 2024
1 parent 610d5cb commit 8d09b54
Show file tree
Hide file tree
Showing 8 changed files with 86 additions and 5 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
- [`blockstore`] Use LRU caches in blockstore, significiantly improving consensus gossip routine performance
([\#3003](https://github.com/cometbft/cometbft/issues/3003)
1 change: 1 addition & 0 deletions .golangci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,7 @@ linters-settings:
- github.com/google
- github.com/gorilla/websocket
- github.com/informalsystems/tm-load-test/pkg/loadtest
- github.com/hashicorp/golang-lru/v2
- github.com/lib/pq
- github.com/libp2p/go-buffer-pool
- github.com/Masterminds/semver/v3
Expand Down
3 changes: 1 addition & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,13 @@
* [#73](https://github.com/osmosis-labs/cometbft/pull/73) perf(consensus/blockexec): Add simplistic block validation cache
* [#74](https://github.com/osmosis-labs/cometbft/pull/74) perf(consensus): Minor speedup to mark late vote metrics
* [#75](https://github.com/osmosis-labs/cometbft/pull/75) perf(p2p): 4% speedup to readMsg by removing one allocation
* [#76](https://github.com/osmosis-labs/cometbft/pull/76) perf(consensus): Add LRU caches for blockstore operations used in gossip

## v0.37.4-v25-osmo-3

* [#61](https://github.com/osmosis-labs/cometbft/pull/61) refactor(p2p/connection): Slight refactor to sendManyPackets that helps highlight performance improvements (backport #2953) (#2978)
* [#62](https://github.com/osmosis-labs/cometbft/pull/62) perf(consensus/blockstore): Remove validate basic call from LoadBlock
* [#71](https://github.com/osmosis-labs/cometbft/pull/71) perf(consensus): Make mempool update async from Commit
* [#73](https://github.com/osmosis-labs/cometbft/pull/73) perf(consensus/blockexec): Add simplistic block validation cache
* [#74](https://github.com/osmosis-labs/cometbft/pull/74) perf(consensus): Minor speedup to mark late vote metrics

## v0.37.4-v25-osmo-2 & v0.37.4-v24-osmo-5

Expand Down
1 change: 1 addition & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ require (
github.com/cosmos/gogoproto v1.4.2
github.com/go-git/go-git/v5 v5.11.0
github.com/golang/protobuf v1.5.3
github.com/hashicorp/golang-lru/v2 v2.0.7
github.com/oasisprotocol/curve25519-voi v0.0.0-20220708102147-0a8a51822cae
github.com/vektra/mockery/v2 v2.14.0
golang.org/x/sync v0.3.0
Expand Down
2 changes: 2 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -494,6 +494,8 @@ github.com/hashicorp/go-version v1.6.0 h1:feTTfFNnjP967rlCxM/I9g701jU+RN74YKx2mO
github.com/hashicorp/go-version v1.6.0/go.mod h1:fltr4n8CU8Ke44wwGCBoEymUuxUHl09ZGVZPK5anwXA=
github.com/hashicorp/golang-lru v0.5.0/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8=
github.com/hashicorp/golang-lru v0.5.1/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8=
github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k=
github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM=
github.com/hashicorp/hcl v1.0.0 h1:0Anlzjpi4vEasTeNFn2mLJgTSwt0+6sfsiTG8qcWGx4=
github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ=
github.com/hexops/gotextdiff v1.0.3 h1:gitA9+qJrrTCsiCl7+kh75nPqQt1cx4ZkudSTLoUqJM=
Expand Down
37 changes: 37 additions & 0 deletions store/bench_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
package store

import (
"bytes"
"testing"

"github.com/stretchr/testify/require"

"github.com/cometbft/cometbft/internal/test"
"github.com/cometbft/cometbft/libs/log"
"github.com/cometbft/cometbft/types"
cmttime "github.com/cometbft/cometbft/types/time"
)

// TestLoadBlockExtendedCommit tests loading the extended commit for a previously
// saved block. The load method should return nil when only a commit was saved and
// return the extended commit otherwise.
func BenchmarkRepeatedLoadSeenCommitSameBlock(b *testing.B) {
state, bs, cleanup := makeStateAndBlockStore(log.NewTMLogger(new(bytes.Buffer)))
defer cleanup()
h := bs.Height() + 1
block := state.MakeBlock(h, test.MakeNTxs(h, 10), new(types.Commit), nil, state.Validators.GetProposer().Address)
seenCommit := makeTestCommit(block.Header.Height, cmttime.Now())
ps, err := block.MakePartSet(types.BlockPartSizeBytes)
require.NoError(b, err)
bs.SaveBlock(block, ps, seenCommit)

// sanity check
res := bs.LoadSeenCommit(block.Height)
require.Equal(b, seenCommit, res)

b.ResetTimer()
for i := 0; i < b.N; i++ {
res := bs.LoadSeenCommit(block.Height)
require.NotNil(b, res)
}
}
36 changes: 33 additions & 3 deletions store/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ import (
"fmt"
"strconv"

lru "github.com/hashicorp/golang-lru/v2"

dbm "github.com/cometbft/cometbft-db"
"github.com/cosmos/gogoproto/proto"

Expand Down Expand Up @@ -41,17 +43,35 @@ type BlockStore struct {
mtx cmtsync.RWMutex
base int64
height int64

seenCommitCache *lru.Cache[int64, *types.Commit]
blockCommitCache *lru.Cache[int64, *types.Commit]
}

// NewBlockStore returns a new BlockStore with the given DB,
// initialized to the last height that was committed to the DB.
func NewBlockStore(db dbm.DB) *BlockStore {
bs := LoadBlockStoreState(db)
return &BlockStore{
bStore := &BlockStore{
base: bs.Base,
height: bs.Height,
db: db,
}
bStore.addCaches()
return bStore
}

func (bs *BlockStore) addCaches() {
var err error
// err can only occur if the argument is non-positive, so is impossible in context.
bs.blockCommitCache, err = lru.New[int64, *types.Commit](100)
if err != nil {
panic(err)
}
bs.seenCommitCache, err = lru.New[int64, *types.Commit](100)
if err != nil {
panic(err)
}
}

func (bs *BlockStore) IsEmpty() bool {
Expand Down Expand Up @@ -224,6 +244,10 @@ func (bs *BlockStore) LoadBlockMetaByHash(hash []byte) *types.BlockMeta {
// and it comes from the block.LastCommit for `height+1`.
// If no commit is found for the given height, it returns nil.
func (bs *BlockStore) LoadBlockCommit(height int64) *types.Commit {
comm, ok := bs.blockCommitCache.Get(height)
if ok {
return comm.Clone()
}
pbc := new(cmtproto.Commit)
bz, err := bs.db.Get(calcBlockCommitKey(height))
if err != nil {
Expand All @@ -240,13 +264,18 @@ func (bs *BlockStore) LoadBlockCommit(height int64) *types.Commit {
if err != nil {
panic(fmt.Sprintf("Error reading block commit: %v", err))
}
return commit
bs.blockCommitCache.Add(height, commit)
return commit.Clone()
}

// LoadSeenCommit returns the locally seen Commit for the given height.
// This is useful when we've seen a commit, but there has not yet been
// a new block at `height + 1` that includes this commit in its block.LastCommit.
func (bs *BlockStore) LoadSeenCommit(height int64) *types.Commit {
comm, ok := bs.seenCommitCache.Get(height)
if ok {
return comm.Clone()
}
pbc := new(cmtproto.Commit)
bz, err := bs.db.Get(calcSeenCommitKey(height))
if err != nil {
Expand All @@ -264,7 +293,8 @@ func (bs *BlockStore) LoadSeenCommit(height int64) *types.Commit {
if err != nil {
panic(fmt.Errorf("error from proto commit: %w", err))
}
return commit
bs.seenCommitCache.Add(height, commit)
return commit.Clone()
}

// PruneBlocks removes block up to (but not including) a height. It returns number of blocks pruned.
Expand Down
9 changes: 9 additions & 0 deletions types/block.go
Original file line number Diff line number Diff line change
Expand Up @@ -801,6 +801,15 @@ func (commit *Commit) GetVote(valIdx int32) *Vote {
}
}

// Clone creates a deep copy of this commit.
func (commit *Commit) Clone() *Commit {
sigs := make([]CommitSig, len(commit.Signatures))
copy(sigs, commit.Signatures)
commCopy := *commit
commCopy.Signatures = sigs
return &commCopy
}

// VoteSignBytes returns the bytes of the Vote corresponding to valIdx for
// signing.
//
Expand Down

0 comments on commit 8d09b54

Please sign in to comment.