Description
Hello iavl team, we are the exchain development team. We have built an EVM implementation on top of cosmos-sdk and have launched our mainnet.
While profiling the exchain system with pprof, we found an optimization opportunity in the iavl project and designed a solution for it. We would appreciate your help in assessing whether this solution is feasible.
In our tests, this change reduced ABCI time consumption by 80%.
Problem Definition
In the cosmos-sdk logic:
- In the abci.commit phase, MutableTree.SaveVersion is called to write the state increment to nodedb and leveldb.
- Then the pruneStores operation in cosmos-sdk calls iavl's MutableTree.DeleteVersionsRange method to delete the Orphan nodes of old heights.
For leveldb, this means a large number of Orphan nodes are deleted shortly after being written. The frequent writes and deletions increase the load on leveldb, and the leveldb.Compaction, leveldb.Set and leveldb.Get operations take a long time, which becomes the performance bottleneck of the exchain system.
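The write-then-delete pattern described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not iavl code: `countingDB`, `nodeKey` and `simulatePerBlockCommit` are hypothetical names, each block is assumed to rewrite a single root-to-leaf path of `pathLen` nodes, and pruning is assumed to keep only the latest version.

```go
package main

import "strconv"

// countingDB is a hypothetical stand-in for leveldb that only counts operations.
type countingDB struct {
	data    map[string][]byte
	sets    int
	deletes int
}

func newCountingDB() *countingDB {
	return &countingDB{data: make(map[string][]byte)}
}

func (db *countingDB) Set(key string, value []byte) {
	db.data[key] = value
	db.sets++
}

func (db *countingDB) Delete(key string) {
	delete(db.data, key)
	db.deletes++
}

// nodeKey builds a unique key for the i-th node written at a given height.
func nodeKey(height, i int) string {
	return strconv.Itoa(height) + "/" + strconv.Itoa(i)
}

// simulatePerBlockCommit writes pathLen new nodes at every height and then
// deletes the nodes written at the previous height, which are now orphans.
func simulatePerBlockCommit(db *countingDB, blocks, pathLen int) {
	var prev []string
	for h := 1; h <= blocks; h++ {
		cur := make([]string, 0, pathLen)
		for i := 0; i < pathLen; i++ {
			key := nodeKey(h, i)
			db.Set(key, []byte{1})
			cur = append(cur, key)
		}
		for _, key := range prev {
			db.Delete(key)
		}
		prev = cur
	}
}
```

In this model, calling `simulatePerBlockCommit(db, 100, 10)` records 1000 writes and 990 deletes while only 10 nodes remain stored at the end, which is the kind of churn the profile points at.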
Proposal
We can try to reduce the reads and writes to leveldb.
We don't need to execute nodeDB.SaveBranch and nodeDB.Commit at every block height.
Define a variable CommitIntervalHeight, which represents how many blocks to wait before executing nodeDB.Commit, for example 100.
With that value, nodeDB.SaveBranch and nodeDB.Commit are executed once every 100 block heights.
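The interval rule can be written as a small predicate. The helper names `shouldCommit` and `commitHeights` are hypothetical, and treating an interval below 1 as "commit every block" is an assumption of this sketch, not something the proposal specifies.

```go
package main

// shouldCommit reports whether the given version is a commit height.
// Assumption: an interval below 1 falls back to committing every block.
func shouldCommit(version, interval int64) bool {
	if interval < 1 {
		return true
	}
	return version%interval == 0
}

// commitHeights lists every commit height in the inclusive range [from, to].
func commitHeights(from, to, interval int64) []int64 {
	var heights []int64
	for v := from; v <= to; v++ {
		if shouldCommit(v, interval) {
			heights = append(heights, v)
		}
	}
	return heights
}
```

With an interval of 100, heights 1 through 350 contain three commit heights: 100, 200 and 300. All other heights would skip the leveldb write.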
One problem is that the non-Orphan nodes created at heights 1 to 100 must be written to leveldb when the height reaches 100, so we need to index these non-Orphan nodes to facilitate the batch write.
Our idea is to keep these non-Orphan nodes in MutableTree and write them to nodedb and leveldb at height 100. (In the current version of iavl, MutableTree is reset by nodeDB.SaveBranch in MutableTree.SaveVersion; we will modify it so that the reset happens only every 100 blocks.)
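The buffering idea can be sketched with a toy store that keeps new nodes in memory until a commit height. This is a simplified model rather than the proposed iavl change: `bufferedStore`, `runBlocks` and their fields are hypothetical, each block is assumed to rewrite one path of `pathLen` nodes, and only the latest version is assumed to be kept.

```go
package main

import "strconv"

// bufferedStore is a hypothetical model: pending plays the role of the nodes
// held in MutableTree, disk plays the role of leveldb.
type bufferedStore struct {
	pending     map[string][]byte
	disk        map[string][]byte
	diskSets    int
	diskDeletes int
	interval    int64
}

func newBufferedStore(interval int64) *bufferedStore {
	if interval < 1 {
		interval = 1 // assumption: fall back to committing every block
	}
	return &bufferedStore{
		pending:  make(map[string][]byte),
		disk:     make(map[string][]byte),
		interval: interval,
	}
}

// SaveNode records a newly created node in memory only.
func (s *bufferedStore) SaveNode(hash string, value []byte) {
	s.pending[hash] = value
}

// Orphan drops a node. A node that is still pending never reaches disk.
func (s *bufferedStore) Orphan(hash string) {
	if _, ok := s.pending[hash]; ok {
		delete(s.pending, hash)
		return
	}
	if _, ok := s.disk[hash]; ok {
		delete(s.disk, hash)
		s.diskDeletes++
	}
}

// SaveVersion flushes pending nodes only at multiples of the interval and
// reports whether a flush happened.
func (s *bufferedStore) SaveVersion(version int64) bool {
	if version%s.interval != 0 {
		return false
	}
	for hash, value := range s.pending {
		s.disk[hash] = value
		s.diskSets++
	}
	s.pending = make(map[string][]byte)
	return true
}

// runBlocks creates pathLen nodes per block, orphans the previous block's
// nodes, and then saves the version.
func runBlocks(s *bufferedStore, blocks, pathLen int) {
	var prev []string
	for h := 1; h <= blocks; h++ {
		cur := make([]string, 0, pathLen)
		for i := 0; i < pathLen; i++ {
			key := strconv.Itoa(h) + "/" + strconv.Itoa(i)
			s.SaveNode(key, []byte{1})
			cur = append(cur, key)
		}
		for _, key := range prev {
			s.Orphan(key)
		}
		prev = cur
		s.SaveVersion(int64(h))
	}
}
```

In this model, 100 blocks with an interval of 100 produce 10 disk writes and no disk deletes, whereas an interval of 1 produces 1000 writes and 990 deletes. Real savings depend on how many nodes are orphaned before the next commit height.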
// Sample code (simplified)
var CommitIntervalHeight int64 = 100

func (tree *MutableTree) SaveVersion() ([]byte, int64, error) {
	version := tree.version + 1
	if version%CommitIntervalHeight == 0 {
		// Commit height: save the in-memory branch, drop orphans, then commit to leveldb.
		tree.ndb.SaveBranch(tree.root)
		tree.ndb.DeleteOrphans(version, tree.orphans)
		tree.ndb.Commit()
	} else {
		// Intermediate height: only drop orphans from the node cache; nothing is committed.
		tree.ndb.DeleteOrphans(version, tree.orphans)
	}
	tree.version = version
	tree.versions[version] = true
	return tree.Hash(), version, nil
}

// DeleteOrphans removes the orphaned nodes from the node cache.
func (ndb *nodeDB) DeleteOrphans(version int64, orphans map[string]int64) {
	ndb.mtx.Lock()
	defer ndb.mtx.Unlock()
	toVersion := ndb.getPreviousVersion(version)
	for hash, fromVersion := range orphans {
		debug("Delete ORPHAN %v-%v %X\n", fromVersion, toVersion, hash)
		ndb.uncacheNode([]byte(hash))
	}
}