-
Notifications
You must be signed in to change notification settings - Fork 566
Problem: contract call which does lots of message calls is slow to execute #710
Comments
I tried same script with geth and same gas limit, it's much faster, the request returns normally:
|
how about logging the loop count which calls evm evaluation in binary searching? maybe too many calls? starting gas allowance maybe too high? |
it's the worst case of binary search, so it should be around
yeah, possibly, or make the performance closer to go-ethereum. |
I did a single |
25 times of search count is not bad,
|
I did a profile, most of the time are spent on the commit of context stack. I think we have an |
The benchmark is added here: https://github.com/yihuang/ethermint/tree/bench-tx
|
…st (#10486) We can shave off some milliseconds, but also cut down some Megabytes of RAM consumed by only requesting from the cache if needed, but also using the map clearing idiom which is recognized by the compiler to make fast code. Noticed in profiles from Tharsis' Ethermint per evmos/ethermint#710 - Before * Memory profiles ```shell 19.50MB 19.50MB 134: store.cache = make(map[string]*cValue) 18.50MB 18.50MB 135: store.deleted = make(map[string]struct{}) 15.50MB 15.50MB 136: store.unsortedCache = make(map[string]struct{}) ``` * CPU profiles ```go . . 118: // TODO: Consider allowing usage of Batch, which would allow the write to . . 119: // at least happen atomically. 150ms 150ms 120: for _, key := range keys { 220ms 3.64s 121: cacheValue := store.cache[key] . . 122: . . 123: switch { . 250ms 124: case store.isDeleted(key): . . 125: store.parent.Delete([]byte(key)) 210ms 210ms 126: case cacheValue.value == nil: . . 127: // Skip, it already doesn't exist in parent. . . 128: default: 240ms 27.94s 129: store.parent.Set([]byte(key), cacheValue.value) . . 130: } . . 131: } ... 10ms 60ms 134: store.cache = make(map[string]*cValue) . 40ms 135: store.deleted = make(map[string]struct{}) . 50ms 136: store.unsortedCache = make(map[string]struct{}) . 110ms 137: store.sortedCache = dbm.NewMemDB() ``` - After * Memory profiles ```shell . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . . 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . . 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . . 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } ``` * CPU profiles ```shell . . 111: // TODO: Consider allowing usage of Batch, which would allow the write to . . 112: // at least happen atomically. 110ms 110ms 113: for _, key := range keys { . 210ms 114: if store.isDeleted(key) { . . 115: // We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot . . 116: // be sure if the underlying store might do a save with the byteslice or . . 117: // not. Once we get confirmation that .Delete is guaranteed not to . . 118: // save the byteslice, then we can assume only a read-only copy is sufficient. . . 119: store.parent.Delete([]byte(key)) . . 120: continue . . 121: } . . 122: 50ms 2.45s 123: cacheValue := store.cache[key] 910ms 920ms 124: if cacheValue.value != nil { . . 125: // It already exists in the parent, hence delete it. 120ms 29.56s 126: store.parent.Set([]byte(key), cacheValue.value) . . 127: } . . 128: } . . 129: . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . 210ms 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . 10ms 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . 170ms 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } . 260ms 142: store.sortedCache = dbm.NewMemDB() . 10ms 143:} ``` Fixes #10487 Updates evmos/ethermint#710
…st (cosmos#10486) We can shave off some milliseconds, but also cut down some Megabytes of RAM consumed by only requesting from the cache if needed, but also using the map clearing idiom which is recognized by the compiler to make fast code. Noticed in profiles from Tharsis' Ethermint per evmos/ethermint#710 - Before * Memory profiles ```shell 19.50MB 19.50MB 134: store.cache = make(map[string]*cValue) 18.50MB 18.50MB 135: store.deleted = make(map[string]struct{}) 15.50MB 15.50MB 136: store.unsortedCache = make(map[string]struct{}) ``` * CPU profiles ```go . . 118: // TODO: Consider allowing usage of Batch, which would allow the write to . . 119: // at least happen atomically. 150ms 150ms 120: for _, key := range keys { 220ms 3.64s 121: cacheValue := store.cache[key] . . 122: . . 123: switch { . 250ms 124: case store.isDeleted(key): . . 125: store.parent.Delete([]byte(key)) 210ms 210ms 126: case cacheValue.value == nil: . . 127: // Skip, it already doesn't exist in parent. . . 128: default: 240ms 27.94s 129: store.parent.Set([]byte(key), cacheValue.value) . . 130: } . . 131: } ... 10ms 60ms 134: store.cache = make(map[string]*cValue) . 40ms 135: store.deleted = make(map[string]struct{}) . 50ms 136: store.unsortedCache = make(map[string]struct{}) . 110ms 137: store.sortedCache = dbm.NewMemDB() ``` - After * Memory profiles ```shell . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . . 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . . 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . . 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } ``` * CPU profiles ```shell . . 111: // TODO: Consider allowing usage of Batch, which would allow the write to . . 112: // at least happen atomically. 110ms 110ms 113: for _, key := range keys { . 210ms 114: if store.isDeleted(key) { . . 115: // We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot . . 116: // be sure if the underlying store might do a save with the byteslice or . . 117: // not. Once we get confirmation that .Delete is guaranteed not to . . 118: // save the byteslice, then we can assume only a read-only copy is sufficient. . . 119: store.parent.Delete([]byte(key)) . . 120: continue . . 121: } . . 122: 50ms 2.45s 123: cacheValue := store.cache[key] 910ms 920ms 124: if cacheValue.value != nil { . . 125: // It already exists in the parent, hence delete it. 120ms 29.56s 126: store.parent.Set([]byte(key), cacheValue.value) . . 127: } . . 128: } . . 129: . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . 210ms 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . 10ms 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . 170ms 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } . 260ms 142: store.sortedCache = dbm.NewMemDB() . 10ms 143:} ``` Fixes cosmos#10487 Updates evmos/ethermint#710
This issue is stale because it has been open 45 days with no activity. Remove |
…st (#10486) We can shave off some milliseconds, but also cut down some Megabytes of RAM consumed by only requesting from the cache if needed, but also using the map clearing idiom which is recognized by the compiler to make fast code. Noticed in profiles from Tharsis' Ethermint per evmos/ethermint#710 - Before * Memory profiles ```shell 19.50MB 19.50MB 134: store.cache = make(map[string]*cValue) 18.50MB 18.50MB 135: store.deleted = make(map[string]struct{}) 15.50MB 15.50MB 136: store.unsortedCache = make(map[string]struct{}) ``` * CPU profiles ```go . . 118: // TODO: Consider allowing usage of Batch, which would allow the write to . . 119: // at least happen atomically. 150ms 150ms 120: for _, key := range keys { 220ms 3.64s 121: cacheValue := store.cache[key] . . 122: . . 123: switch { . 250ms 124: case store.isDeleted(key): . . 125: store.parent.Delete([]byte(key)) 210ms 210ms 126: case cacheValue.value == nil: . . 127: // Skip, it already doesn't exist in parent. . . 128: default: 240ms 27.94s 129: store.parent.Set([]byte(key), cacheValue.value) . . 130: } . . 131: } ... 10ms 60ms 134: store.cache = make(map[string]*cValue) . 40ms 135: store.deleted = make(map[string]struct{}) . 50ms 136: store.unsortedCache = make(map[string]struct{}) . 110ms 137: store.sortedCache = dbm.NewMemDB() ``` - After * Memory profiles ```shell . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . . 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . . 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . . 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } ``` * CPU profiles ```shell . . 111: // TODO: Consider allowing usage of Batch, which would allow the write to . . 112: // at least happen atomically. 110ms 110ms 113: for _, key := range keys { . 210ms 114: if store.isDeleted(key) { . . 115: // We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot . . 116: // be sure if the underlying store might do a save with the byteslice or . . 117: // not. Once we get confirmation that .Delete is guaranteed not to . . 118: // save the byteslice, then we can assume only a read-only copy is sufficient. . . 119: store.parent.Delete([]byte(key)) . . 120: continue . . 121: } . . 122: 50ms 2.45s 123: cacheValue := store.cache[key] 910ms 920ms 124: if cacheValue.value != nil { . . 125: // It already exists in the parent, hence delete it. 120ms 29.56s 126: store.parent.Set([]byte(key), cacheValue.value) . . 127: } . . 128: } . . 129: . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . 210ms 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . 10ms 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . 170ms 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } . 260ms 142: store.sortedCache = dbm.NewMemDB() . 10ms 143:} ``` Fixes #10487 Updates evmos/ethermint#710 (cherry picked from commit 5399e72) # Conflicts: # CHANGELOG.md
…st, avoid keys sort (backport #10486) (#10848) * perf: store/cachekv: avoid a map lookup if unnecessary, clear maps fast (#10486) We can shave off some milliseconds, but also cut down some Megabytes of RAM consumed by only requesting from the cache if needed, but also using the map clearing idiom which is recognized by the compiler to make fast code. Noticed in profiles from Tharsis' Ethermint per evmos/ethermint#710 - Before * Memory profiles ```shell 19.50MB 19.50MB 134: store.cache = make(map[string]*cValue) 18.50MB 18.50MB 135: store.deleted = make(map[string]struct{}) 15.50MB 15.50MB 136: store.unsortedCache = make(map[string]struct{}) ``` * CPU profiles ```go . . 118: // TODO: Consider allowing usage of Batch, which would allow the write to . . 119: // at least happen atomically. 150ms 150ms 120: for _, key := range keys { 220ms 3.64s 121: cacheValue := store.cache[key] . . 122: . . 123: switch { . 250ms 124: case store.isDeleted(key): . . 125: store.parent.Delete([]byte(key)) 210ms 210ms 126: case cacheValue.value == nil: . . 127: // Skip, it already doesn't exist in parent. . . 128: default: 240ms 27.94s 129: store.parent.Set([]byte(key), cacheValue.value) . . 130: } . . 131: } ... 10ms 60ms 134: store.cache = make(map[string]*cValue) . 40ms 135: store.deleted = make(map[string]struct{}) . 50ms 136: store.unsortedCache = make(map[string]struct{}) . 110ms 137: store.sortedCache = dbm.NewMemDB() ``` - After * Memory profiles ```shell . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . . 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . . 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . . 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } ``` * CPU profiles ```shell . . 111: // TODO: Consider allowing usage of Batch, which would allow the write to . . 112: // at least happen atomically. 110ms 110ms 113: for _, key := range keys { . 210ms 114: if store.isDeleted(key) { . . 115: // We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot . . 116: // be sure if the underlying store might do a save with the byteslice or . . 117: // not. Once we get confirmation that .Delete is guaranteed not to . . 118: // save the byteslice, then we can assume only a read-only copy is sufficient. . . 119: store.parent.Delete([]byte(key)) . . 120: continue . . 121: } . . 122: 50ms 2.45s 123: cacheValue := store.cache[key] 910ms 920ms 124: if cacheValue.value != nil { . . 125: // It already exists in the parent, hence delete it. 120ms 29.56s 126: store.parent.Set([]byte(key), cacheValue.value) . . 127: } . . 128: } . . 129: . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . 210ms 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . 10ms 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . 170ms 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } . 260ms 142: store.sortedCache = dbm.NewMemDB() . 10ms 143:} ``` Fixes #10487 Updates evmos/ethermint#710 (cherry picked from commit 5399e72) # Conflicts: # CHANGELOG.md * fix conflicts Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>
…st, avoid keys sort (backport #10486) (#10848) * perf: store/cachekv: avoid a map lookup if unnecessary, clear maps fast (#10486) We can shave off some milliseconds, but also cut down some Megabytes of RAM consumed by only requesting from the cache if needed, but also using the map clearing idiom which is recognized by the compiler to make fast code. Noticed in profiles from Tharsis' Ethermint per evmos/ethermint#710 - Before * Memory profiles ```shell 19.50MB 19.50MB 134: store.cache = make(map[string]*cValue) 18.50MB 18.50MB 135: store.deleted = make(map[string]struct{}) 15.50MB 15.50MB 136: store.unsortedCache = make(map[string]struct{}) ``` * CPU profiles ```go . . 118: // TODO: Consider allowing usage of Batch, which would allow the write to . . 119: // at least happen atomically. 150ms 150ms 120: for _, key := range keys { 220ms 3.64s 121: cacheValue := store.cache[key] . . 122: . . 123: switch { . 250ms 124: case store.isDeleted(key): . . 125: store.parent.Delete([]byte(key)) 210ms 210ms 126: case cacheValue.value == nil: . . 127: // Skip, it already doesn't exist in parent. . . 128: default: 240ms 27.94s 129: store.parent.Set([]byte(key), cacheValue.value) . . 130: } . . 131: } ... 10ms 60ms 134: store.cache = make(map[string]*cValue) . 40ms 135: store.deleted = make(map[string]struct{}) . 50ms 136: store.unsortedCache = make(map[string]struct{}) . 110ms 137: store.sortedCache = dbm.NewMemDB() ``` - After * Memory profiles ```shell . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . . 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . . 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . . 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } ``` * CPU profiles ```shell . . 111: // TODO: Consider allowing usage of Batch, which would allow the write to . . 112: // at least happen atomically. 110ms 110ms 113: for _, key := range keys { . 210ms 114: if store.isDeleted(key) { . . 115: // We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot . . 116: // be sure if the underlying store might do a save with the byteslice or . . 117: // not. Once we get confirmation that .Delete is guaranteed not to . . 118: // save the byteslice, then we can assume only a read-only copy is sufficient. . . 119: store.parent.Delete([]byte(key)) . . 120: continue . . 121: } . . 122: 50ms 2.45s 123: cacheValue := store.cache[key] 910ms 920ms 124: if cacheValue.value != nil { . . 125: // It already exists in the parent, hence delete it. 120ms 29.56s 126: store.parent.Set([]byte(key), cacheValue.value) . . 127: } . . 128: } . . 129: . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . 210ms 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . 10ms 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . 170ms 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } . 260ms 142: store.sortedCache = dbm.NewMemDB() . 10ms 143:} ``` Fixes #10487 Updates evmos/ethermint#710 (cherry picked from commit 5399e72) # Conflicts: # CHANGELOG.md * fix conflicts Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>
…st, avoid keys sort (backport cosmos#10486) (cosmos#10848) * perf: store/cachekv: avoid a map lookup if unnecessary, clear maps fast (cosmos#10486) We can shave off some milliseconds, but also cut down some Megabytes of RAM consumed by only requesting from the cache if needed, but also using the map clearing idiom which is recognized by the compiler to make fast code. Noticed in profiles from Tharsis' Ethermint per evmos/ethermint#710 - Before * Memory profiles ```shell 19.50MB 19.50MB 134: store.cache = make(map[string]*cValue) 18.50MB 18.50MB 135: store.deleted = make(map[string]struct{}) 15.50MB 15.50MB 136: store.unsortedCache = make(map[string]struct{}) ``` * CPU profiles ```go . . 118: // TODO: Consider allowing usage of Batch, which would allow the write to . . 119: // at least happen atomically. 150ms 150ms 120: for _, key := range keys { 220ms 3.64s 121: cacheValue := store.cache[key] . . 122: . . 123: switch { . 250ms 124: case store.isDeleted(key): . . 125: store.parent.Delete([]byte(key)) 210ms 210ms 126: case cacheValue.value == nil: . . 127: // Skip, it already doesn't exist in parent. . . 128: default: 240ms 27.94s 129: store.parent.Set([]byte(key), cacheValue.value) . . 130: } . . 131: } ... 10ms 60ms 134: store.cache = make(map[string]*cValue) . 40ms 135: store.deleted = make(map[string]struct{}) . 50ms 136: store.unsortedCache = make(map[string]struct{}) . 110ms 137: store.sortedCache = dbm.NewMemDB() ``` - After * Memory profiles ```shell . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . . 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . . 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . . 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } ``` * CPU profiles ```shell . . 111: // TODO: Consider allowing usage of Batch, which would allow the write to . . 112: // at least happen atomically. 110ms 110ms 113: for _, key := range keys { . 210ms 114: if store.isDeleted(key) { . . 115: // We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot . . 116: // be sure if the underlying store might do a save with the byteslice or . . 117: // not. Once we get confirmation that .Delete is guaranteed not to . . 118: // save the byteslice, then we can assume only a read-only copy is sufficient. . . 119: store.parent.Delete([]byte(key)) . . 120: continue . . 121: } . . 122: 50ms 2.45s 123: cacheValue := store.cache[key] 910ms 920ms 124: if cacheValue.value != nil { . . 125: // It already exists in the parent, hence delete it. 120ms 29.56s 126: store.parent.Set([]byte(key), cacheValue.value) . . 127: } . . 128: } . . 129: . . 130: // Clear the cache using the map clearing idiom . . 131: // and not allocating fresh objects. . . 132: // Please see https://bencher.orijtech.com/perfclinic/mapclearing/ . 210ms 133: for key := range store.cache { . . 134: delete(store.cache, key) . . 135: } . 10ms 136: for key := range store.deleted { . . 137: delete(store.deleted, key) . . 138: } . 170ms 139: for key := range store.unsortedCache { . . 140: delete(store.unsortedCache, key) . . 141: } . 260ms 142: store.sortedCache = dbm.NewMemDB() . 10ms 143:} ``` Fixes cosmos#10487 Updates evmos/ethermint#710 (cherry picked from commit 5399e72) # Conflicts: # CHANGELOG.md * fix conflicts Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>
System info: ethermint main
Test Contract
EstimateGas Slow
When call
eth_estimateGas
for thetest(40000)
contract method call, the web3 HTTP client timeout, the node log shows it takes 17 seconds to finish.A single
eth_call
can be finished in 584ms, and the binary search in estimateGas didceil(log2(25000000)) = 25
number of calls, it sums up to25 * 584ms = 14
seconds, close to what we see in theeth_estimateGas
call.Solution:
Tx Execution Slow
Further investigation shows that the actual execution of the tx is even slower, a single run of the tx takes 30 seconds, the profile shows the hotspot is in the commit of a deep context stack.
Solution:
Expected behavior: RPC API and transactions should return fast
Actual behavior: blocked for dozens of seconds
Additional info:
Compiled contract:
TestMessageCall.json.zip
The text was updated successfully, but these errors were encountered: