rlp: optimize big.Int decoding for size <= 32 bytes #22927

Merged: 3 commits merged on May 25, 2021

@fjl fjl commented May 23, 2021

This change grows the static integer buffer in Stream to 32 bytes,
making it possible to decode 256-bit integers without allocating a
temporary buffer.

In the recent commit 088da24, the Stream struct size decreased from 120
bytes to 88 bytes. This commit grows the struct back to 112 bytes,
but the size change will not degrade performance because Stream
instances are internally cached in sync.Pool.

    name             old time/op    new time/op    delta
    DecodeBigInts-8    12.2µs ± 0%     8.6µs ± 4%  -29.58%  (p=0.000 n=9+10)

    name             old speed      new speed      delta
    DecodeBigInts-8   230MB/s ± 0%   326MB/s ± 4%  +42.04%  (p=0.000 n=9+10)

fjl added 3 commits May 23, 2021 20:38
My initial implementation of the optimization contained a bug: the
Stream didn't advance when encountering an empty big.Int because s.kind
was not re-armed. The test checks for this.
@fjl fjl force-pushed the rlp-bigint-opt branch from f43ec83 to 00a22fa Compare May 23, 2021 19:23
@holiman holiman left a comment

LGTM

@fjl fjl merged commit 4d33de9 into ethereum:master May 25, 2021
@fjl fjl added this to the 1.10.4 milestone May 25, 2021
maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Aug 20, 2021
* focus on performance improvement in many aspects.

1. Do BlockBody verification concurrently;
2. Do calculation of intermediate root concurrently;
3. Preload accounts before processing blocks;
4. Make the snapshot layers configurable;
5. Reuse some objects to reduce GC.

add

* rlp: improve decoder stream implementation (ethereum#22858)

This commit makes various cleanup changes to rlp.Stream.

* rlp: shrink Stream struct

This removes a lot of unused padding space in Stream by reordering the
fields. The size of Stream changes from 120 bytes to 88 bytes. Stream
instances are internally cached and reused using sync.Pool, so this does
not improve performance.

* rlp: simplify list stack

The list stack kept track of the size of the current list context as
well as the current offset into it. The size had to be stored in the
stack in order to subtract it from the remaining bytes of any enclosing
list in ListEnd. It seems that this can be implemented in a simpler
way: just subtract the size from the enclosing list context in List instead.

* rlp: use atomic.Value for type cache (ethereum#22902)

All encoding/decoding operations read the type cache to find the
writer/decoder function responsible for a type. When analyzing CPU
profiles of geth during sync, I found that the use of sync.RWMutex in
cache lookups appears in the profiles. It seems we are running into
CPU cache contention problems when package rlp is heavily used
on all CPU cores during sync.

This change makes it use atomic.Value + a writer lock instead of
sync.RWMutex. In the common case where the typeinfo entry is present in
the cache, we simply fetch the map and lookup the type.

* rlp: optimize byte array handling (ethereum#22924)

This change improves the performance of encoding/decoding [N]byte.

    name                     old time/op    new time/op    delta
    DecodeByteArrayStruct-8     336ns ± 0%     246ns ± 0%  -26.98%  (p=0.000 n=9+10)
    EncodeByteArrayStruct-8     225ns ± 1%     148ns ± 1%  -34.12%  (p=0.000 n=10+10)

    name                     old alloc/op   new alloc/op   delta
    DecodeByteArrayStruct-8      120B ± 0%       48B ± 0%  -60.00%  (p=0.000 n=10+10)
    EncodeByteArrayStruct-8     0.00B          0.00B          ~     (all equal)

* rlp: optimize big.Int decoding for size <= 32 bytes (ethereum#22927)

This change grows the static integer buffer in Stream to 32 bytes,
making it possible to decode 256-bit integers without allocating a
temporary buffer.

In the recent commit 088da24, Stream struct size decreased from 120
bytes down to 88 bytes. This commit grows the struct to 112 bytes again,
but the size change will not degrade performance because Stream
instances are internally cached in sync.Pool.

    name             old time/op    new time/op    delta
    DecodeBigInts-8    12.2µs ± 0%     8.6µs ± 4%  -29.58%  (p=0.000 n=9+10)

    name             old speed      new speed      delta
    DecodeBigInts-8   230MB/s ± 0%   326MB/s ± 4%  +42.04%  (p=0.000 n=9+10)

* eth/protocols/eth, les: avoid Raw() when decoding HashOrNumber (ethereum#22841)

Getting the raw value is not necessary to decode this type, and
decoding it directly from the stream is faster.

* fix testcase

* debug no lazy

* fix can not repair

* address comments

Co-authored-by: Felix Lange <fjl@twurst.com>
atif-konasl pushed a commit to frozeman/pandora-execution-engine that referenced this pull request Oct 15, 2021
yperbasis pushed a commit to erigontech/erigon that referenced this pull request Dec 5, 2021
AlexeyAkhunov pushed a commit to erigontech/erigon that referenced this pull request Dec 6, 2021
…3089)

* Clean up test runners. Don't run legacy tests

* Cherry pick ethereum/go-ethereum#22927

* Tests update 10.1: Transaction Tests

* Port decodeBigInt changes to decodeUint256

* Introduce (*Stream) Uint256Bytes

* Temporarily disable stTransactionTest/HighGasPrice

* linter

* ttWrongRLP transaction tests pass now

* Fix stTransactionTest/HighGasPrice

Co-authored-by: Felix Lange <fjl@twurst.com>
AlexeyAkhunov pushed a commit to erigontech/erigon that referenced this pull request Dec 6, 2021
…3089)

* Clean up test runners. Don't run legacy tests

* Cherry pick ethereum/go-ethereum#22927

* Tests update 10.1: Transaction Tests

* Port decodeBigInt changes to decodeUint256

* Introduce (*Stream) Uint256Bytes

* Temporarily disable stTransactionTest/HighGasPrice

* linter

* ttWrongRLP transaction tests pass now

* Fix stTransactionTest/HighGasPrice

Co-authored-by: Felix Lange <fjl@twurst.com>
AlexeyAkhunov added a commit to erigontech/erigon that referenced this pull request Dec 6, 2021
…3089) (#3095)

* Clean up test runners. Don't run legacy tests

* Cherry pick ethereum/go-ethereum#22927

* Tests update 10.1: Transaction Tests

* Port decodeBigInt changes to decodeUint256

* Introduce (*Stream) Uint256Bytes

* Temporarily disable stTransactionTest/HighGasPrice

* linter

* ttWrongRLP transaction tests pass now

* Fix stTransactionTest/HighGasPrice

Co-authored-by: Felix Lange <fjl@twurst.com>

Co-authored-by: Andrew Ashikhmin <34320705+yperbasis@users.noreply.github.com>
Co-authored-by: Felix Lange <fjl@twurst.com>