-
Notifications
You must be signed in to change notification settings - Fork 20.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core/les: import header chains in batches #19456
Conversation
This chart is from this PR but with an additional fix that I later removed for this PR, where I also modded the db interface for state lookups -- which I think is unnessecary. The linear section in before the marker is when it downloads headers. The less linear section after is when it wraps up the receipts/bodies. This PR is markedly lower on db reads during this section.
I'll spin it up a new benchmark with this PR |
Ok, restarted on |
I increased the size of the A remaining TODO is to see if we can port over lightchain to use the new format. Also, I believe I return the wrong integers from |
da1feba
to
c700c88
Compare
I fixed up LES so it uses the same mechanism. The only tangible change from before, is that when a new chain is imported that reorgs and takes over the old chain, previously the events would be e.g. like this: @zsfelfoldi PTAL |
c700c88
to
613e2c4
Compare
@@ -199,15 +199,24 @@ func (ethash *Ethash) VerifyUncles(chain consensus.ChainReader, block *types.Blo | |||
|
|||
number, parent := block.NumberU64()-1, block.ParentHash() | |||
for i := 0; i < 7; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be the worst thing to turn this into a constant while you're here, such as maxKin
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really a fan of that. The way it's written, it's very clear upon inspection what the limit is, without having to look up the constant in some params-file. Also, since we're not likely to change it ever, I don't see what the benefit would be in parameterizing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. It was not about parameterizing it, but about making it clear what the 7 means as some people might not be able to immediately know the kin relation for ommers.
// | ||
// Note: This method is not concurrent-safe with inserting blocks simultaneously | ||
// into the chain, as side effects caused by reorganisations cannot be emulated | ||
// without the real blocks. Hence, writing headers directly should only be done | ||
// in two scenarios: pure-header mode of operation (light clients), or properly | ||
// separated header/block phases (non-archive clients). | ||
func (hc *HeaderChain) WriteHeader(header *types.Header) (status WriteStatus, err error) { | ||
// Cache some values to prevent constant recalculation | ||
func (hc *HeaderChain) WriteHeaders(headers []*types.Header, pwCallbackFn PWCallback) (ignored, imported int, status WriteStatus, lastHash common.Hash, lastHeader *types.Header, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this feels like a lot of vales to return from a single function. It makes me wonder if it is trying to do too much or there is another way to split this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it's a bit silly. I'll see if I can shave off some of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The easy way out is to combine all of these into a WriteHeadersResult
since you've already given all the fields name. I'd still leave error
separate if you decide to go that route.
if len(headers) == 0 { | ||
return ignored, imported, NonStatTy, lastHash, nil, err | ||
} | ||
ptd := hc.GetTd(headers[0].ParentHash, headers[0].Number.Uint64()-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it was not immediately clear to me that ptd
is parentTD
just by looking at its name. You may want to consider a name change.
core/headerchain.go
Outdated
number = header.Number.Uint64() | ||
lastNumber = headers[0].Number.Uint64() - 1 // Last successfully imported number | ||
externTd *big.Int // TD of successfully imported chain | ||
inserted []numberHash // Quick lookup of number/hash for the chain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if you could get away with using headers
instead of inserted
since you will know lastNumber
and have the hash and number of the first header at header[0]
. I'm not a big fan of method-scoped struct
definitions personally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One problem with that is that I don't want to write the hashes until later, and if I don't stash the hashes here, I'll have to call Hash()
on each header again.
types.Block.Hash()
calls Header.Hash
and remembers the hash internally, but headers don't store it internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but headers don't store it internally.
Perhaps this is something that could be done (and could potentially benefit the rest of the code base) (maybe in the future).
whFunc := func(header *types.Header) error { | ||
status, err := lc.hc.WriteHeader(header) | ||
|
||
postWriteCallback := func(header *types.Header, status core.WriteStatus) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider moving this function declaration outside this function and naming it to something that describes that it's doing.
I have now rebased this on top of latest master, and squashed it. @rjl493456442 @karalabe PTAL |
} | ||
log.Info("Imported new block headers", context...) | ||
|
||
return 0, nil | ||
return 0, status, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also return the last index of the inserted header instead of returning 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I understood the return parameters (which are poorly documented) is that the first return value is the "index of the erroring header".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except the compilation error, otherwise LGTM
core/headerchain.go
Outdated
hash = header.Hash() | ||
number = header.Number.Uint64() | ||
lastNumber uint64 // Last successfully imported number | ||
lastHash common.Hash // Last successfully imported hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
core/headerchain.go:148:3: lastHash redeclared in this block
Doing another run with this one, combined with #20197. Idea being that this PR makes headers use batches, and may get a boost from larger batches. |
Although they're pretty close to one another now, here's the most recent logs from papertrail:
master:
So, from the looks of it, (356+439+377+444+307+342+365+529+211+360+382+351+377+239+370+223)/16= 354.5 (379+657+768+409+711+586+580+597+418+616+916+584+684+756+562+710+562)/17 = 617.35 |
Both master and experimental were quite fast: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=mon08&var-master=mon09&var-percentile=50&from=1571993737000&to=1572013800000 Experimental slightly faster |
Closing in favour of #21471 |
This PR tries to improve a couple of things are in fast-sync.
HasHeader
a part of theChainReader
interface. This can save a few lookups where we don't have to load from disknumberCache
Needs some general cleanup, and probably some tests will fail, and it needs some fixes for LES to keep working, but I'll post some charts here later on.