core, eth, trie: streaming GC for the trie cache #16810

Merged
merged 2 commits into from
Jun 4, 2018

Member

@karalabe karalabe commented May 25, 2018

This PR tries to fine tune the initial trie pruning work introduced in Geth 1.8.0.

The current version of pruning works by accumulating trie writes in an intermediate in-memory database layer, which also tracks the reference relationships between nodes. This in-memory layer tracks a window of 128 tries (to aid fast sync), and whenever a trie gets old enough, it is dereferenced and any dangling nodes are garbage collected. This process is repeated until either a memory limit or a time limit is reached, at which point an entire trie is flushed to disk. For more details, please see #15857.

After running the current pruning code for a few months, we can observe a few rough edges that make it suboptimal. On chain head, processing blocks takes a significant amount of time, and the baked-in 5 minute flush window is too small to meaningfully collect enough garbage. This results in flushes like Persisted trie from memory database nodes=779937 size=224.01mB time=6.34419045s gcnodes=1631823 gcsize=701.68mB gctime=6.624533035s livenodes=301196 livesize=114.13mB where we push 225MB of data to disk out of 725MB total. This leads to excessive chain growth, which acts as a vicious cycle, growing the chain ever faster. The 5 minute window was meant as a sanity fallback against crashes, but it's just too wasteful. This PR bumps the sanity write window to 1 hour. Yes, that's a lot more painful if Geth crashes, but keeping the chain under control is more important than catering for unstable environments.

Although the above is a theoretical solution to disk growth, it does not work in practice on its own. Bumping the flush timeout from 5 minutes to 1 hour would, on mainnet, still result in flushes every 10 minutes or so due to hitting the permitted memory limits (Geth by default runs with a 256MB cache limit, i.e. 25% of --cache).

The second important observation we need to make with trie pruning is that most of the trie is junk; or rather will become junk after some number of blocks. However, the longer a trie node survives without becoming junk, the higher the probability it never will (never = rarely active pathway). When the current trie pruning code reaches its memory limit, it blindly flushes out nodes (a recent entire trie) to disk, most of which we know will end up as junk very fast. But there is no reason to flush an entire trie vs. random nodes... we just need to flush something. By tracking the age of nodes within the trie cache, we could free up memory by flushing old nodes to disk. This should significantly reduce disk junk, since a node can only end up on disk if it has been actively referenced for a very long time, making it very unlikely to become junk in the near future. Memory cap wise we still enforce the same limits, we just pick and choose what to write in a more meaningful way.

This "tracking by age" is a bit more involved, as a single timestamp field would make it hard to quickly find nodes to flush: we are constantly adding and removing nodes. The obvious data structure to handle this would be a heap, which is still O(log n) complexity, where n is the number of live nodes (1M+ on mainnet). This PR implements this age tracking with a doubly linked list representing the age order, each item of the list also being inserted into a map. The linked list permits O(1) iteration complexity to find and flush the next node, and also O(1) complexity to add/remove a node. The map part ensures that when we're deleting a node, we can find it in the linked list in O(1) too.
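The list-inside-a-map idea described above can be sketched roughly as follows. This is only an illustration, not the code in trie/database.go: the `hash`, `cachedNode` and `cache` types and their methods are hypothetical names, the zero hash is assumed to mean "no node", and reference counting is left out entirely.

```go
package main

import "fmt"

// hash stands in for common.Hash; the zero value is assumed to mean "no node".
type hash [32]byte

// cachedNode is a hypothetical cache entry: the blob plus the two flush-list
// links (previous and next node in age order).
type cachedNode struct {
	blob      []byte
	flushPrev hash
	flushNext hash
}

// cache keeps nodes in a map and threads the age-ordered list through them.
type cache struct {
	nodes  map[hash]*cachedNode
	oldest hash
	newest hash
}

func newCache() *cache { return &cache{nodes: make(map[hash]*cachedNode)} }

// insert appends a node at the young end of the list in O(1).
func (c *cache) insert(h hash, blob []byte) {
	if _, ok := c.nodes[h]; ok {
		return // already tracked, keep its original age
	}
	c.nodes[h] = &cachedNode{blob: blob, flushPrev: c.newest}
	if c.oldest == (hash{}) {
		c.oldest = h
	} else {
		c.nodes[c.newest].flushNext = h
	}
	c.newest = h
}

// remove unlinks a node located through the map, also in O(1).
func (c *cache) remove(h hash) {
	n, ok := c.nodes[h]
	if !ok {
		return
	}
	if n.flushPrev == (hash{}) {
		c.oldest = n.flushNext
	} else {
		c.nodes[n.flushPrev].flushNext = n.flushNext
	}
	if n.flushNext == (hash{}) {
		c.newest = n.flushPrev
	} else {
		c.nodes[n.flushNext].flushPrev = n.flushPrev
	}
	delete(c.nodes, h)
}

// order walks the list from oldest to newest and returns the first byte of
// each hash, i.e. the order in which nodes would be flushed.
func (c *cache) order() []byte {
	var out []byte
	for h := c.oldest; h != (hash{}); h = c.nodes[h].flushNext {
		out = append(out, h[0])
	}
	return out
}

func main() {
	c := newCache()
	for _, b := range []byte{1, 2, 3, 4} {
		c.insert(hash{b}, []byte{b})
	}
	c.remove(hash{2}) // e.g. garbage collected before it got old enough to flush
	fmt.Println(c.order())
}
```

Storing the links as hashes rather than pointers keeps the list inside the existing map, at the cost of one extra map lookup per hop.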

The last piece of the puzzle is creating the flush-list in the first place, since the doubly-linked list needs to retain the invariant that writing a node to disk must entail all its children already being present on disk. This invariant is already satisfied by the trie cache insertion order, since we're always pushing a child into the cache first, and the parent after (since the parent needs the hash of the child). As such, if we create the flush-list in the node insertion order, the flush order will also retain the child-first-parent-after storage. I.e. the complexity of insertion is also O(1).
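To illustrate the ordering argument, here is a toy check (not taken from the PR) that flushing any prefix of a children-first insertion order never writes a parent before its children. The `node` type and `flushOldest` helper are made up for this example.

```go
package main

import "fmt"

// node is a hypothetical trie node identified by name, listing its children.
type node struct {
	name     string
	children []string
}

// flushOldest writes the first n nodes of the insertion order to a pretend
// disk and reports whether every written node already had all of its
// children on disk at the time it was written.
func flushOldest(order []node, n int) (disk map[string]bool, ok bool) {
	disk = make(map[string]bool)
	ok = true
	for _, nd := range order[:n] {
		for _, child := range nd.children {
			if !disk[child] {
				ok = false
			}
		}
		disk[nd.name] = true
	}
	return disk, ok
}

func main() {
	// Children are inserted before their parent, because the parent needs
	// the hashes of its children.
	order := []node{
		{name: "leafA"},
		{name: "leafB"},
		{name: "branch", children: []string{"leafA", "leafB"}},
		{name: "root", children: []string{"branch"}},
	}
	for n := 0; n <= len(order); n++ {
		_, ok := flushOldest(order, n)
		fmt.Printf("flushed %d oldest nodes, children-first invariant holds: %v\n", n, ok)
	}
}
```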

Memory complexity wise the cache still retains its current O(n) complexity: previously its size was n * common.HashLength + SUM(blobs), and now it's n * 3 * common.HashLength + SUM(blobs).

This PR should also fix #16674, at least for the general flushes. The slowdown will still be felt during the "hourly" snapshot flushes.


Stats at block 4.8M:

| | Import Time | Datadir | Disk Reads | Disk Writes |
|---|---|---|---|---|
| master | 40h 25m | 105.5GB | 10.75TB | 9.3TB |
| PR | 34h 55m | 74.5GB | 6.16TB | 5.6TB |

Stats at chain head (5.7M):

| | Import Time | Datadir | Disk Reads | Disk Writes |
|---|---|---|---|---|
| master | 100h | 267GB | 27.4TB | 20.7TB |
| PR | 86h | 190GB | 21.8TB | 17.2TB |

@karalabe karalabe requested a review from holiman as a code owner May 25, 2018 15:17
trie/database.go Outdated
}
if batch.ValueSize() > ethdb.IdealBatchSize {
if err := batch.Write(); err != nil {
return err
Contributor

What about db.lock.RUnlock()?

Member Author

Yeah... copy paste from a different part and forgot to clean up.

trie/database.go Outdated
// Fetch the oldest referenced node and push into the batch
node := db.nodes[oldest]
if err := batch.Put(oldest[:], node.blob); err != nil {
return err
Contributor

What about db.lock.RUnlock()?

Member Author

Yeah... copy paste from a different part and forgot to clean up.

delete(db.nodes, db.oldest)
db.oldest = node.flushNext

db.nodesSize -= common.StorageSize(common.HashLength + len(node.blob))
Contributor

Earlier, you're using common.StorageSize(3*common.HashLength + len(node.blob)) (3*), but here a different formula. Why?

Member Author

Damn it. Originally I tracked the size differently. Apparently I didn't update all paths.... then the cache is a bit f-ed now.

Member Author

Ah, no, it's fine. But definitely need to document this.

The size of the cache is defined as nodesSize + flushlistSize. The nodesSize is made up of a map, mapping hashes to blobs (nodesSize = SUM(common.Hash + len(blob))). The flushlist is a doubly linked list maintained inside the nodes map, so it's a prevHash and nextHash, i.e. SUM(2 * common.Hash).

The net total weight of a tracked node is thus 3*common.Hash+len(blob).

The reason I'm tracking the size of the nodes and the flush list separately is to allow printing the correct GC amount when logging. The flushlist is just metadata... it is counted to limit memory usage, but it should not be counted when reporting the total amount not written to disk.
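As a worked example of this accounting, the sketch below computes both counters for a handful of blobs. The `sizes` helper and the blob lengths are invented for illustration; `hashLength` is assumed to equal common.HashLength (32 bytes).

```go
package main

import "fmt"

// hashLength is assumed to mirror common.HashLength.
const hashLength = 32

// sizes returns the two counters described above for nodes with the given
// blob lengths: nodesSize covers the map key plus the blob, flushlistSize
// covers the prev/next hashes of the doubly linked list.
func sizes(blobLens []int) (nodesSize, flushlistSize int) {
	for _, l := range blobLens {
		nodesSize += hashLength + l
		flushlistSize += 2 * hashLength
	}
	return nodesSize, flushlistSize
}

func main() {
	nodes, flushlist := sizes([]int{100, 532, 68})

	// Only the node data is reported as "not yet written to disk"...
	fmt.Println("reported in logs:", nodes)
	// ...but the flush-list metadata still counts against the memory cap.
	fmt.Println("counted against the cap:", nodes+flushlist)
}
```

Per node the total weight is 3*hashLength + len(blob), which matches the formula quoted in the review comment.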

I need to document it better.

@karalabe karalabe force-pushed the trie-streaming-gc branch from 3c60bc4 to e00fd72 Compare May 29, 2018 08:35
@karalabe karalabe force-pushed the trie-streaming-gc branch from e00fd72 to 456bdc2 Compare May 29, 2018 09:10
@karalabe karalabe changed the title [WIP] core, eth, trie: streaming GC for the trie cache core, eth, trie: streaming GC for the trie cache May 29, 2018
@karalabe karalabe requested a review from fjl May 30, 2018 11:16
Contributor

@holiman holiman left a comment

Generally looks good to me, with some questions/comments

@@ -47,7 +47,7 @@ var DefaultConfig = Config{
LightPeers: 100,
DatabaseCache: 768,
TrieCache: 256,
- TrieTimeout: 5 * time.Minute,
+ TrieTimeout: 60 * time.Minute,
Contributor

In blockchain.go NewBlockchain, some other (hardcoded) defaults are used in case cacheconfig is nil. Perhaps they should use this value instead?

Member Author

Hmmm, I thought these would be used.

nodes, imgs = triedb.Size()
limit = common.StorageSize(bc.cacheConfig.TrieNodeLimit) * 1024 * 1024
)
if nodes > limit || imgs > 4*1024*1024 {
Contributor

What's the 4*1024*1024 limit for imgs, and why is that not a named parameter like the other settings?

Member Author

This is a random number really, just needs to be small enough not to be bothersome, large enough to dedup data. It's arbitrary really.

limit = common.StorageSize(bc.cacheConfig.TrieNodeLimit) * 1024 * 1024
)
if nodes > limit || imgs > 4*1024*1024 {
triedb.Cap(limit - ethdb.IdealBatchSize)
Contributor

This will tell the triedb to Cap at 256M - 100k. What's the reason to remove 100k from the limit?

Member Author

Hysteresis. If we cap it at the limit, we will keep writing out 32 byte blobs. This way when we go over the limit, we push out 100KB, and then we have a bit of buffer to accumulate data before flushing again. At least that's the theory. In practice, it might be interesting to see how much we overflow the limit.
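The effect of this hysteresis can be seen in a toy simulation. Everything here is an assumption made for illustration: `simulate` is a hypothetical helper, resetting the size to `target` merely stands in for triedb.Cap(target), and `batchSize` stands in for the roughly 100KB ethdb.IdealBatchSize mentioned above.

```go
package main

import "fmt"

// simulate adds `writes` nodes of nodeSize bytes each to a pretend cache.
// Whenever the cache grows past limit it is flushed down to target, and the
// number of such flush rounds is returned.
func simulate(limit, target, nodeSize, writes int) int {
	size, flushes := 0, 0
	for i := 0; i < writes; i++ {
		size += nodeSize
		if size > limit {
			size = target // stand-in for triedb.Cap(target)
			flushes++
		}
	}
	return flushes
}

func main() {
	const (
		limit     = 256 * 1024 * 1024 // 256MB trie cache limit
		batchSize = 100 * 1024        // stand-in for ethdb.IdealBatchSize
		nodeSize  = 32                // tiny blobs, the worst case named above
		writes    = 10000000
	)
	fmt.Println("flush rounds, cap at limit:        ", simulate(limit, limit, nodeSize, writes))
	fmt.Println("flush rounds, cap at limit - batch:", simulate(limit, limit-batchSize, nodeSize, writes))
}
```

Capping exactly at the limit triggers a tiny flush on every subsequent write, while capping slightly below it batches those writes into far fewer rounds.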

}
db.flushnodes += uint64(nodes - len(db.nodes))
db.flushsize += storage - db.nodesSize
db.flushtime += time.Since(start)
Contributor

How about exposing the flush stats as metrics? Wouldn't it be pretty useful to have on our graphs?

Member Author

Added them.

@karalabe karalabe added this to the 1.8.10 milestone May 31, 2018
@@ -47,7 +47,7 @@ var DefaultConfig = Config{
LightPeers: 100,
    LightPeers:    100,
DatabaseCache: 768,
TrieCache:     256,
TrieTimeout:   60 * time.Minute,

@karalabe karalabe merged commit 143c434 into ethereum:master Jun 4, 2018
@karalabe karalabe modified the milestones: 1.8.10, 1.8.11 Jun 11, 2018
kielbarry pushed a commit to kielbarry/go-ethereum that referenced this pull request Jul 9, 2018
* core, eth, trie: streaming GC for the trie cache

* trie: track memcache statistics
firmianavan pushed a commit to firmianavan/go-ethereum that referenced this pull request Aug 28, 2018
* core, eth, trie: streaming GC for the trie cache

* trie: track memcache statistics
TuitionCoin added a commit to FinTechToken/go-ethereum that referenced this pull request Jun 16, 2019
* build: add -e and -X flags to get more information on ethereum#16433 (ethereum#16443)

* core: remove stray account creations in state transition (ethereum#16470)

The 'from' and 'to' methods on StateTransitions are reader methods and
shouldn't have inadvertent side effects on state.

It is safe to remove the check in 'from' because account existence is
implicitly checked by the nonce and balance checks. If the account has
non-zero balance or nonce, it must exist. Even if the sender account has
nonce zero at the start of the state transition or no balance, the nonce
is incremented before execution and the account will be created at that
time.

It is safe to remove the check in 'to' because the EVM creates the
account if necessary.

Fixes ethereum#15119

* travis, appveyor: bump to Go 1.10.1

* travis.yml: add TEST_PACKAGES to speed up swarm testing (ethereum#16456)

This commit is meant to allow ecosystem projects such as ethersphere
to minimize CI build times by specifying an environment variable with
the packages to run tests on.

If the environment variable isn't defined the build script will test
all packages so this shouldn't affect the main go-ethereum repository.

* les: add ps.lock.Unlock() before return (ethereum#16360)

* core/state: fix bug in copy of copy State

* core/state: fix ripemd-cornercase in Copy

* core: txpool stable underprice drop order, perf fixes

* miner: remove contention on currentMu for pending data retrievals (ethereum#16497)

* ethdb: add leveldb write delay statistic (ethereum#16499)

* eth/downloader: wait for all fetcher goroutines to exit before terminating (ethereum#16509)

* cmd/clef, signer: initial poc of the standalone signer (ethereum#16154)

* signer: introduce external signer command

* cmd/signer, rpc: Implement new signer. Add info about remote user to Context

* signer: refactored request/response, made use of urfave.cli

* cmd/signer: Use common flags

* cmd/signer: methods to validate calldata against abi

* cmd/signer: work on abi parser

* signer: add mutex around UI

* cmd/signer: add json 4byte directory, remove passwords from api

* cmd/signer: minor changes

* cmd/signer: Use ErrRequestDenied, enable lightkdf

* cmd/signer: implement tests

* cmd/signer: made possible for UI to modify tx parameters

* cmd/signer: refactors, removed channels in ui comms, added UI-api via stdin/out

* cmd/signer: Made lowercase json-definitions, added UI-signer test functionality

* cmd/signer: update documentation

* cmd/signer: fix bugs, improve abi detection, abi argument display

* cmd/signer: minor change in json format

* cmd/signer: rework json communication

* cmd/signer: implement mixcase addresses in API, fix json id bug

* cmd/signer: rename fromaccount, update pythonpoc with new json encoding format

* cmd/signer: make use of new abi interface

* signer: documentation

* signer/main: remove redundant  option

* signer: implement audit logging

* signer: create package 'signer', minor changes

* common: add 0x-prefix to mixcaseaddress in json marshalling + validation

* signer, rules, storage: implement rules + ephemeral storage for signer rules

* signer: implement OnApprovedTx, change signing response (API BREAKAGE)

* signer: refactoring + documentation

* signer/rules: implement dispatching to next handler

* signer: docs

* signer/rules: hide json-conversion from users, ensure context is cleaned

* signer: docs

* signer: implement validation rules, change signature of call_info

* signer: fix log flaw with string pointer

* signer: implement custom 4byte databsae that saves submitted signatures

* signer/storage: implement aes-gcm-backed credential storage

* accounts: implement json unmarshalling of url

* signer: fix listresponse, fix gas->uint64

* node: make http/ipc start methods public

* signer: add ipc capability+review concerns

* accounts: correct docstring

* signer: address review concerns

* rpc: go fmt -s

* signer: review concerns+ baptize Clef

* signer,node: move Start-functions to separate file

* signer: formatting

* light: new CHTs (ethereum#16515)

* params: release Geth v1.8.4

* VERSION, params: begin v1.8.5 release cycle

* build: enable goimports and varcheck linters (ethereum#16446)

* core/asm: remove unused condition (ethereum#16487)

* cmd/utils: fix help template issue for subcommands (ethereum#16351)

* rpc: clean up IPC handler (ethereum#16524)

This avoids logging accept errors on shutdown and removes
a bit of duplication. It also fixes some goimports lint warnings.

* core/asm: accept uppercase instructions (ethereum#16531)

* all: fix various typos (ethereum#16533)

* fix typo

* fix typo

* fix typo

* rpc: handle HTTP response error codes (ethereum#16500)

* whisper/whisperv6: post returns the hash of sent message (ethereum#16495)

* ethclient: add DialContext and Close (ethereum#16318)

DialContext allows users to pass a Context object for cancellation.
Close closes the underlying RPC connection.

* vendor: update elastic/gosigar so that it compiles on OpenBSD (ethereum#16542)

* eth/downloader: fix for Issue ethereum#16539 (ethereum#16546)

* params: release Geth v1.8.5 - Dirty Derivative²

* VERSION, params: begin Geth 1.8.6 release cycle

* cmd/geth: update the copyright year in the geth command usage (ethereum#16537)

* Revert "Dockerfile.alltools: fix invalid command"

* Revert "cmd/puppeth: fix node deploys for updated dockerfile user"

* Dockerfile: revert the user change PR that broke all APIs

* Dockerfile: drop legacy discovery v5 port mappings

* params: release v1.8.6 to fix docker images

* VERSION, params: begin release cycle 1.8.7

* cmd/geth, mobile: add memsize to pprof server (ethereum#16532)

* cmd/geth, mobile: add memsize to pprof server

This is a temporary change, to be reverted before the next release.

* cmd/geth: fix variable name

* core/types: avoid duplicating transactions on changing signer (ethereum#16435)

* core/state: cache missing storage entries (ethereum#16584)

* cmd/utils: point users to --syncmode under DEPRECATED (ethereum#16572)

Indicate that --light and --fast options are replaced by --syncmode

* trie: remove unused `buf` parameter (ethereum#16583)

* core, eth: fix tracer dirty finalization

* travis.yml: remove obsolete brew-cask install

* whisper: Golint fixes in whisper packages (ethereum#16637)

* vendor: fix leveldb crash when bigger than 1 TiB

* core: ensure local transactions aren't discarded as underpriced

This fixes an issue where local transactions are discarded as
underpriced when the pool and queue are full.

* evm/main: use blocknumber from genesis

* accounts: golint updates for this or self warning (ethereum#16627)

* tests: golint fixes for tests directory (ethereum#16640)

* trie: golint iterator fixes (ethereum#16639)

* internal: golint updates for this or self warning (ethereum#16634)

* core: golint updates for this or self warning (ethereum#16633)

* build: Add ldflags -s -w when building aar

Smaller size on mobile is always good.
Might also solve our maven central upload problem

* cmd/clef: documentation about setup (ethereum#16568)

clef: documentation about setup

* params: release geth 1.8.7

* VERSION, params: begin v1.8.8 release cycle

* log: changed if-else blocks to conform with golint (ethereum#16661)

* p2p: changed if-else blocks to conform with golint (ethereum#16660)

* les: changed if-else blocks to conform with golint (ethereum#16658)

* accounts: changed if-else blocks to conform with golint (ethereum#16654)

* rpc: golint error with context as last parameter (ethereum#16657)

* rpc/*: golint error with context as last parameter

* Update json.go

* metrics: golint updates for this or self warning (ethereum#16635)

* metrics/*: golint updates for this or self warning

* metrics/*: golint updates for this or self warning, updated pr from feedback

* consensus/ethash: fixed typo (ethereum#16665)

* event: golint updates for this or self warning (ethereum#16631)

* event/*: golint updates for this or self warning

* event/*: golint updates for this or self warning, pr updated per feedback

* eth: golint updates for this or self warning (ethereum#16632)

* eth/*:golint updates for this or self warning

* eth/*: golint updates for this or self warning, pr updated per feedback

* signer: fix golint errors (ethereum#16653)

* signer/*: golint fixes

Specifically naming and comment formatting for documentation

* signer/*: fixed naming error crashing build

* signer/*: corrected error

* signer/core: fix tiny error whitespace

* signer/rules: fix test refactor

* whisper/mailserver: pass init error to the caller (ethereum#16671)

* whisper/mailserver: pass init error to the caller

* whisper/mailserver: add returns to fmt.Errorf

* whisper/mailserver: check err in mailserver init test

* common: changed if-else blocks to conform with golint (ethereum#16656)

* mobile: add GetStatus Method for Receipt (ethereum#16598)

* core/rawdb: separate raw database access to own package (ethereum#16666)

* rlp: fix some golint warnings (ethereum#16659)

* p2p: fix some golint warnings (ethereum#16577)

* eth/filters: derive FilterCriteria from ethereum.FilterQuery (ethereum#16629)

* p2p/simulations/adapters: fix websocket log line parsing in exec adapter (ethereum#16667)

* build: specify the key to use when invoking gpg:sign-and-deploy-file (ethereum#16696)

* crypto: fix golint warnings (ethereum#16710)

* p2p: don't discard reason set by Disconnect (ethereum#16559)

Peer.run was discarding the reason for disconnection sent to the disc
channel by Disconnect.

* cmd: various golint fixes (ethereum#16700)

* cmd: various golint fixes

* cmd: update to pr change request

* cmd: update to pr change request

* eth: golint fixes to variable names (ethereum#16711)

* eth/filter: check nil pointer when unsubscribe (ethereum#16682)

* eth/filter: check nil pointer when unsubscribe

* eth/filters, accounts, rpc: abort system if subscribe failed

* eth/filter: add crit log before exit

* eth/filter, event: minor fixes

* whisper/shhclient: update call to shh_generateSymKeyFromPassword to pass a string (ethereum#16668)

* all: get rid of error when creating memory database (ethereum#16716)

* all: get rid of error when create mdb

* core: clean up variables definition

* all: inline mdb definition

* event: document select case slice use and add edge case test (ethereum#16680)

Feed keeps active subscription channels in a slice called 'f.sendCases'.
The Send method tracks the active cases in a local variable 'cases'
whose value is f.sendCases initially. 'cases' shrinks to a shorter
prefix of f.sendCases every time a send succeeds, moving the successful
case out of range of the active case list.

This can be confusing because the two slices share a backing array. Add
more comments to document what is going on. Also add a test for removing
a case that is in 'f.sentCases' but not 'cases'.

* travis: use Android NDK 16b (ethereum#16562)

* bmt: golint updates for this or self warning (ethereum#16628)

* bmt/*: golint updates for this or self warning

* Update bmt.go

* light: new CHT for mainnet and ropsten (ethereum#16736)

* params: release go-ethereum v1.8.8

* VERSION, params: start 1.8.9 release cycle

* accounts/abi: allow abi: tags when unpacking structs

Go code users can now tag event struct members with `abi:` to specify in what fields the event will be de-serialized.

See PR ethereum#16648 for details.

* travis: try to upgrade android builder to trusty

* p2p/enr: updates for discovery v4 compatibility (ethereum#16679)

This applies spec changes from ethereum/EIPs#1049 and adds support for
pluggable identity schemes.

Some care has been taken to make the "v4" scheme standalone. It uses
public APIs only and could be moved out of package enr at any time.

A couple of minor changes were needed to make identity schemes work:

- The sequence number is now updated in Set instead of when signing.
- Record is now copy-safe, i.e. calling Set on a shallow copy doesn't
  modify the record it was copied from.

* all: collate new transaction events together

* core, eth: minor txpool event cleanups

* travis, appveyor: bump Go release to 1.10.2

* core, consensus: fix some typos in comment code and output log

* eth: propagate blocks and transactions async

* trie: fixes to comply with golint (ethereum#16771)

* log: fixes for golint warnings (ethereum#16775)

* node: all golint warnings fixed (ethereum#16773)

* node: all golint warnings fixed

* node: rm per peter

* node: rm per peter

* vendor, ethdb: print warning log if leveldb is performing compaction (ethereum#16766)

* vendor: update leveldb package

* ethdb: print warning log if db is performing compaction

* ethdb: update annotation and log

* core/types: convert status type from uint to uint64 (ethereum#16784)

* trie: support proof generation from the iterator

* core/vm: fix typo in instructions.go (ethereum#16788)

* core: use a wrapped map to remove contention in `TxPool.Get`. (ethereum#16670)

* core: use a wrapped `map` and `sync.RWMutex` for `TxPool.all` to remove contention in `TxPool.Get`.

* core: Remove redundant `txLookup.Find` and improve comments on txLookup methods.

* trie: cleaner logic, one less func call

* eth, node, trie: fix minor typos (ethereum#16802)

* params: release go-ethereum v1.8.9

* VERSION, params: begin 1.8.10 release cycle

* ethereum: fix a typo in FilterQuery{} (ethereum#16827)

Fix a spelling mistake in comment

* eth/fetcher: reuse variables for hash and number (ethereum#16819)

* whisper/shhclient: update call to shh_post to expect string instead of bool (ethereum#16757)

Fixes ethereum#16756

* common: improve documentation comments (ethereum#16701)

This commit adds many comments and removes unused code.
It also removes the EmptyHash function, which had some uses
but was silly.

* core/vm: fix typo in comment

* p2p/discv5: add egress/ingress traffic metrics to discv5 udp transport (ethereum#16369)

* core: improve test for TransactionPriceNonceSort (ethereum#16413)

* trie: rename TrieSync to Sync and improve hexToKeybytes (ethereum#16804)

This removes a golint warning: type name will be used as trie.TrieSync by
other packages, and that stutters; consider calling this Sync.

In hexToKeybytes len(hex) is even and (even+1)/2 == even/2, remove the +1.

* core: fix transaction event asynchronicity

* params: release Geth 1.8.10 hotfix

* VERSION, params: begin 1.8.11 release cycle

* ethstats: fix last golint warning (ethereum#16837)

* console: squash golint warnings (ethereum#16836)

* rpc: use HTTP request context as top-level context (ethereum#16861)

* consensus/ethash: reduce keccak hash allocations (ethereum#16857)

Use Read instead of Sum to avoid internal allocations and
copying the state.

name                      old time/op  new time/op  delta
CacheGeneration-8          764ms ± 1%   579ms ± 1%  -24.22%  (p=0.000 n=20+17)
SmallDatasetGeneration-8  75.2ms ±12%  60.6ms ±10%  -19.37%  (p=0.000 n=20+20)
HashimotoLight-8          1.58ms ±11%  1.55ms ± 8%     ~     (p=0.322 n=20+19)
HashimotoFullSmall-8      4.90µs ± 1%  4.88µs ± 1%   -0.31%  (p=0.013 n=19+18)

* core, eth, trie: streaming GC for the trie cache (ethereum#16810)

* core, eth, trie: streaming GC for the trie cache

* trie: track memcache statistics

* rpc: set timeouts for http server, see ethereum#16859

* metrics: expvar support for ResettingTimer (ethereum#16878)

* metrics: expvar support for ResettingTimer

* metrics: use integers for percentiles; remove Overall

* metrics: fix edge-case panic for index-out-of-range

* cmd/geth: cap cache allowance

* core: fix typo in comment code

* les: add Skip overflow check to GetBlockHeadersMsg handler (ethereum#16891)

* eth/tracers: fix minor off-by-one error (ethereum#16879)

* tracing: fix minor off-by-one error

* tracers: go generate

* core: concurrent background transaction sender ecrecover

* miner: not call commitNewWork if it's a side block (ethereum#16751)

* cmd/abigen: support for reading solc output from stdin (ethereum#16683)

Allow the --abi flag to be given - to indicate that it should read the
ABI information from standard input. It expects to read the solc output
with the --combined-json flag providing bin, abi, userdoc, devdoc, and
metadata, and works very similarly to the internal invocation of solc,
except it allows external invocation of solc.

This facilitates integration with more complex solc invocations, such
as invocations that require path remapping or --allow-paths tweaks.

Simple usage example:

    solc --combined-json bin,abi,userdoc,devdoc,metadata *.sol | abigen --abi -

* params: fix golint warnings (ethereum#16853)

params: fix golint warnings

* vendor: added vendor packages necessary for the swarm-network-rewrite merge (ethereum#16792)

* vendor: added vendor packages necessary for the swarm-network-rewrite merge into ethereum master

* vendor: removed multihash deps

* trie: reduce hasher allocations (ethereum#16896)

* trie: reduce hasher allocations

name    old time/op    new time/op    delta
Hash-8    4.05µs ±12%    3.56µs ± 9%  -12.13%  (p=0.000 n=20+19)

name    old alloc/op   new alloc/op   delta
Hash-8    1.30kB ± 0%    0.66kB ± 0%  -49.15%  (p=0.000 n=20+20)

name    old allocs/op  new allocs/op  delta
Hash-8      11.0 ± 0%       8.0 ± 0%  -27.27%  (p=0.000 n=20+20)

* trie: bump initial buffer cap in hasher

* whisper: re-insert ethereum#16757 that has been lost during a merge (ethereum#16889)

* cmd/puppeth: fixed a typo in a wizard input query (ethereum#16910)

* core: relax type requirement for bc in ApplyTransaction (ethereum#16901)

* trie: avoid unnecessary slicing on shortnode decoding (ethereum#16917)

optimization code

* cmd/ethkey: add command to change key passphrase (ethereum#16516)

This change introduces 

    ethkey changepassphrase <keyfile>

to change the passphrase of a key file.

* metrics: return an empty snapshot for NilResettingTimer (ethereum#16930)

* light: new CHTs for mainnet and ropsten (ethereum#16926)

* ethclient: fix RPC parse error of Parity response (ethereum#16924)

The error produced when using a Parity RPC was the following:

ERROR: transaction did not get mined: failed to get tx for txid 0xbdeb094b3278019383c8da148ff1cb5b5dbd61bf8731bc2310ac1b8ed0235226: json: cannot unmarshal non-string into Go struct field txExtraInfo.blockHash of type common.Hash

* core: improve getBadBlocks to return full block rlp (ethereum#16902)

* core: improve getBadBlocks to return full block rlp

* core, eth, ethapi: changes to getBadBlocks formatting

* ethapi: address review concerns

* rpc: fix a comment typo (ethereum#16929)

* rpc: support returning nil pointer big.Ints (null)

* trie: don't report the root flushlist as an alloc

* metrics: removed repetitive calculations (ethereum#16944)

* core/rawdb: wrap db key creations (ethereum#16914)

* core/rawdb: use wrappered helper to assemble key

* core/rawdb: wrappered helper to assemble key

* core/rawdb: rewrite the wrapper, pass common.Hash

* ethdb: gracefullly handle quit channel (ethereum#16794)

* ethdb: gratefullly handle quit channel

* ethdb: minor polish

* internal/ethapi: reduce pendingTransactions to O(txs+accs) from O(txs*accs)

* les: pass server pool to protocol manager (ethereum#16947)

* metrics: fix gofmt linter warnings

* crypto: replace ToECDSAPub with error-checking func UnmarshalPubkey (ethereum#16932)

ToECDSAPub was unsafe because it returned a non-nil key with nil X, Y in
case of invalid input. This change replaces ToECDSAPub with
UnmarshalPubkey across the codebase.

* core, eth, les: more efficient hash-based header chain retrieval (ethereum#16946)

* les: fix retriever logic (ethereum#16776)

This PR fixes a retriever logic bug. When a peer had a soft timeout
and then a response arrived, it always assumed it was the same peer
even though it could have been a later requested one that did not time
out at all yet. In this case the logic went to an illegal state and
deadlocked, causing a goroutine leak.

Fixes ethereum#16243 and replaces ethereum#16359.
Thanks to @riceke for finding the bug in the logic.

* params: release go-ethereum v1.8.11

* VERSION, params: begin v1.8.12 release cycle

* core: change comment to match code more closely (ethereum#16963)

* internal/web3ext: fix method name for enabling mutex profiling (ethereum#16964)

* eth/fetcher: fix annotation (ethereum#16969)

* core/asm: correct comments typo (ethereum#16975)

core/asm/lexer: correct comments typo

* console: correct some comments typo (ethereum#16971)

console/console: correct some comments typo

* ethereum#15685 made peer_test.go more portable by using a random free port instead of the hardcoded port 30303 (ethereum#15687)

Improves test portability by resolving 127.0.0.1:0
to get a random free port instead of the hard coded one. Now
the test works if you have a running node on the same
interface already.

Fixes ethereum#15685
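The free-port technique described in this commit message can be sketched as follows. This is an illustration only; the helper name `listenOnFreePort` is hypothetical and the actual test may obtain the port differently.

```go
package main

import "net"

// listenOnFreePort binds to port 0 on the loopback interface, which asks the
// operating system to pick any unused TCP port. It returns the listener and
// the port that was actually assigned.
func listenOnFreePort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	// For a TCP listener the address is always a *net.TCPAddr.
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}
```

Because the port is assigned at bind time, the test no longer collides with a node already listening on 30303 on the same interface.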

* all: library changes for swarm-network-rewrite (ethereum#16898)

This commit adds all changes needed for the merge of swarm-network-rewrite.
The changes:

- build: increase linter timeout
- contracts/ens: export ensNode
- log: add Output method and enable fractional seconds in format
- metrics: relax test timeout
- p2p: reduced some log levels, updates to simulation packages
- rpc: increased maxClientSubscriptionBuffer to 20000

* core/vm: optimize MSTORE and SLOAD (ethereum#16939)

* vm/test: add tests+benchmarks for mstore

* core/vm: less alloc and copying for mstore

* core/vm: less allocs in sload

* vm: check for errors more correctly

* eth/filters: make filterLogs func more readable (ethereum#16920)

* cmd/utils: fix NetworkId default when -dev is set (ethereum#16833)

Prior to this change, when geth was started with `geth -dev -rpc`,
it would report a network id of `1` in response to the `net_version` RPC
request. But the actual network id it used to verify transactions
was `1337`.

This change causes geth to respond with `1337` to the `net_version`
RPC instead when geth is started with `geth -dev -rpc`.

* travis, appveyor: update to Go 1.10.3

* common: all golint warnings removed (ethereum#16852)

* common: all golint warnings removed

* common: fixups

* eth: conform better to the golint standards (ethereum#16783)

* eth: made changes to conform better to the golint standards

* eth: fix comment nit

* core: reduce nesting in transaction pool code (ethereum#16980)

* bmt: fix package documentation comment (ethereum#16909)

* common/number: delete unused package (ethereum#16983)

This package was meant to hold an improved 256 bit integer library, but
the effort was abandoned in 2015. AFAIK nothing ever used this package.
Time to say goodbye.

* core/asm: correct comments typo (ethereum#16974)

* core/asm/compiler: correct comments typo

core/asm/compiler: correct comments typo

* Correct comments typo

* internal/debug: use pprof goroutine writer for debug_stacks (ethereum#16892)

* debug: Use pprof goroutine writer in debug.Stacks() to ensure all goroutines are captured.

* Up to a 64MB limit; the previous code only captured the first 1MB of goroutines.

* internal/debug: simplify stacks handler

* fix typo

* fix pointer receiver
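A minimal sketch of dumping all goroutine stacks through the pprof goroutine profile, as the debug_stacks change above describes. The function name `allGoroutineStacks` is hypothetical, and the real handler may wire the writer differently.

```go
package main

import (
	"bytes"
	"runtime/pprof"
)

// allGoroutineStacks writes the stack of every goroutine into a buffer.
// Debug level 2 prints the stacks in the same form as an unrecovered panic,
// and the profile writer grows its own buffer, so the caller does not have
// to guess an output size up front.
func allGoroutineStacks() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```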

* accounts/keystore: assign schema as const instead of var (ethereum#16985)

* cmd: remove faucet/puppeth dead code (ethereum#16991)

* cmd/faucet: authGitHub is not used anymore

* cmd/puppeth: remove not used code

* mobile: correct comment typo in geth.go (ethereum#17021)

* accounts/usbwallet: correct comment typo (ethereum#17008)

* core: remove dead code, limit test code scope (ethereum#17006)

* core: move test util var/func to test file

* core: remove useless func

* accounts/usbwallet: correct comment typo (ethereum#16998)

* signer: remove useless errorWrapper (ethereum#17003)

* travis: use NDK 17b for Android archives (ethereum#17029)

* tracers: fix err in 4byte, add some opcode analysis tools

* accounts: remove deadcode isSigned (ethereum#16990)

* mobile: correct comment typo in ethereum.go (ethereum#17040)

* cmd/geth: remove the tail "," from genesis config (ethereum#17028)

Remove the trailing "," from the genesis config, which causes a genesis config parse error.

* trie: cache collapsed trie nodes, not rlp blobs (ethereum#16876)

The current trie memory database/cache that we do pruning on stores
trie nodes as binary rlp encoded blobs, and also stores the node
relationships/references for GC purposes. However, most of the trie
nodes (everything apart from a value node) are in essence just
collections of references.

This PR switches out the RLP encoded trie blobs with the
collapsed-but-not-serialized trie nodes. This permits most of the
references to be recovered from within the node data structure,
avoiding the need to track them a second time (expensive memory wise).
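To illustrate why the references can be recovered from the node itself, here is a simplified sketch. The type names echo common trie terminology, but the definitions and the `childRefs` helper are assumptions made for this example and are not the types used in the trie package.

```go
package main

// node is any element of a collapsed trie.
type node interface{}

type (
	hashNode  [32]byte // reference to another node by its hash
	valueNode []byte   // leaf payload, holds no references
	shortNode struct { // extension or leaf: a key fragment plus one child
		Key []byte
		Val node
	}
	fullNode struct { // branch: 16 children plus one value slot
		Children [17]node
	}
)

// childRefs walks a collapsed node and collects every hash it points to.
// Since the structure is kept instead of an opaque RLP blob, no separate
// child bookkeeping is needed to answer this question.
func childRefs(n node) []hashNode {
	switch n := n.(type) {
	case hashNode:
		return []hashNode{n}
	case *shortNode:
		return childRefs(n.Val)
	case *fullNode:
		var refs []hashNode
		for _, child := range n.Children {
			refs = append(refs, childRefs(child)...)
		}
		return refs
	default: // valueNode or an empty slot
		return nil
	}
}
```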

* swarm: network rewrite merge

* les: handle conn/disc/reg logic in the eventloop (ethereum#16981)

* les: handle conn/disc/reg logic in the eventloop

* les: try to dial before starting the eventloop

* les: handle disconnect logic more safely

* les: grammar fix

* log: Change time format

- Keep the trailing zeros.
- Limit precision to milliseconds.

* swarm/fuse: Disable fuse tests, they are flaky (ethereum#17072)

* swarm/pss: Hide big network tests under longrunning flag (ethereum#17074)

* whisper: Reduce message loop log from Warn to Info (ethereum#17055)

* core/vm: clear linter warnings (ethereum#17057)

* core/vm: clear linter warnings

* core/vm: review input

* core/vm.go: revert lint in noop as per request

* build: make build/goimports.sh more portable

* node: remove formatting from ResettingTimer metrics if requested in raw

* ethstats: comment minor correction (ethereum#17102)

spelling correction from `repors` to `reports`

* ethdb, core: implement delete for db batch (ethereum#17101)

* vendor: update docker/docker/pkg/reexec so that it compiles on OpenBSD (ethereum#17084)

* trie: fix a temporary memory leak in the memcache

* cmd/geth: export metrics to InfluxDB (ethereum#16979)

* cmd/geth: add flags for metrics export

* cmd/geth: update usage fields for metrics flags

* metrics/influxdb: update reporter logger to adhere to geth logging convention

* node: documentation typo fix (ethereum#17113)

* core/vm: reuse bigint pools across transactions (ethereum#17070)

* core/vm: A pool for int pools

* core/vm: fix rebase issue

* core/vm: push leftover stack items after execution, not before

* cmd/p2psim: add exit error output and exit code (ethereum#17116)

* p2p/discover: move bond logic from table to transport (ethereum#17048)

* p2p/discover: move bond logic from table to transport

This commit moves node endpoint verification (bonding) from the table to
the UDP transport implementation. Previously, adding a node to the table
entailed pinging the node if needed. With this change, the ping-back
logic is embedded in the packet handler at a lower level.

It is easy to verify that the basic protocol is unchanged: we still
require a valid pong reply from the node before findnode is accepted.

The node database tracked the time of last ping sent to the node and
time of last valid pong received from the node. Node endpoints are
considered verified when a valid pong is received and the time of last
pong was called 'bond time'. The time of last ping sent was unused. In
this commit, the last ping database entry is repurposed to mean last
ping _received_. This entry is now used to track whether the node needs
to be pinged back.

The other big change is how nodes are added to the table. We used to add
nodes in Table.bond, which ran when a remote node pinged us or when we
encountered the node in a neighbors reply. The transport now adds to the
table directly after the endpoint is verified through ping. To ensure
that the Table can't be filled just by pinging the node repeatedly, we
retain the isInitDone check. During init, only nodes from neighbors
replies are added.

* p2p/discover: reduce findnode failure counter on success

* p2p/discover: remove unused parameter of loadSeedNodes

* p2p/discover: improve ping-back check and comments

* p2p/discover: add neighbors reply nodes always, not just during init

* consensus/ethash: fixed documentation typo (ethereum#17121)

"proot-of-work" to "proof-of-work"

* light: new CHTs (ethereum#17124)

* les: add announcement safety check to light fetcher (ethereum#17034)

* params: v1.8.12 stable

* 1.8.12
)
if nodes > limit || imgs > 4*1024*1024 {
	triedb.Cap(limit - ethdb.IdealBatchSize)
}
// Find the next state trie we need to commit
header := bc.GetHeaderByNumber(current - triesInMemory)
chosen := header.Number.Uint64()
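The decision in the snippet above can be restated as a small self-contained sketch. The helper names `shouldCap` and `nextCommit` are hypothetical, the sizes are plain byte counts rather than the storage-size type used in the codebase, and the value 128 is taken from the window of tries described in the pull request text.

```go
package main

// triesInMemory is the number of recent tries kept in memory.
const triesInMemory = 128

// shouldCap mirrors the condition in the snippet: the in-memory trie database
// is capped once the node cache exceeds the configured limit or the preimage
// cache exceeds 4 MiB.
func shouldCap(nodes, imgs, limit uint64) bool {
	return nodes > limit || imgs > 4*1024*1024
}

// nextCommit returns the block number whose state trie is the next candidate
// for being committed to disk: the oldest one in the in-memory window. It
// reports false while the chain is still shorter than the window.
func nextCommit(current uint64) (uint64, bool) {
	if current <= triesInMemory {
		return 0, false
	}
	return current - triesInMemory, true
}
```

For example, at block 5,000,000 the candidate is block 4,999,872, so the 128 most recent states remain in memory.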
Contributor

Hi @karalabe, sorry to dig this out. I'm reading the geth source code and just came across this part. May I ask why we choose `current - triesInMemory` as the next trie to commit on timeout (or when memory runs out)? Is it chosen to guard against chain reorgs?

Otherwise, would `current - triesInMemory / 2`, or the more aggressive `current - 1`, be a better choice?

Member Author

It is to support fast / snap sync. I need the latest 128 blocks' state available in the network.

Contributor

@windycrypto windycrypto Jun 20, 2022

Yeah, what I mean is the trie chosen to be committed to disk when TrieTimeLimit is exceeded. Currently we commit the oldest of the 128. Could we commit a newer one instead, and would that be a better choice?
(Even if we commit a newer one, we can still keep the latest 128 in memory.)


Successfully merging this pull request may close these issues.

Block processing time slowdown following trie persistence
4 participants