@daniilrrr
Collaborator

Ticket

  • Related Linear Ticket: SEQ-240

What does this PR do?

  • Summary: Adds a better transaction-generation function so the benchmarks measure compression performance more accurately

Does this PR introduce any breaking changes (API/schema)?

  • No

Do any environment variables need to be updated or added before deployment?

  • No

How can this PR be tested?

Tests pass

@linear

linear bot commented Oct 29, 2024

SEQ-240 [optimization] improve compression

Far-off nice-to-have

Details TBD

AC:

  • Unit tests/benchmarks have representative unique txn sets
  • More TBD

@RomanHodulak
Contributor

Benchmarks generally

Comment on lines 172 to 192
/// Generates a random raw Ethereum transaction in hexadecimal format
pub fn generate_random_raw_transaction_rlp() -> String {
Contributor

Will this get added to the production codebase? We should ensure that it is only included in dev builds.

Collaborator Author

Can you explain what you mean?

  • We'll only have one codebase, which will eventually be open source, right?
  • @RomanHodulak is there a way to use [#dev] annotations or something similar to do what Will is describing?

@WillPapper also, what in particular concerns you about including this function in "prod"?

Contributor

As a benchmark it won't be part of the production codebase.
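
For a helper that lives in the library itself rather than under benches/, conditional compilation is one way to keep it out of release builds. The following is only a sketch under assumptions: the `bench-utils` feature name and both function names are hypothetical, and the xorshift generator is a stand-in, not the generator from this PR.

```rust
/// Hypothetical stand-in for a benchmark-only generator: hex-encodes
/// `len_bytes` pseudo-random bytes produced by a xorshift64 step.
/// Compiled only under `cargo test` or when the assumed `bench-utils`
/// Cargo feature is enabled, so it never reaches a default release build.
/// Note: a seed of 0 stays 0 under xorshift, so pass a non-zero seed.
#[cfg(any(test, feature = "bench-utils"))]
pub fn fake_raw_transaction_hex(mut seed: u64, len_bytes: usize) -> String {
    let mut out = String::with_capacity(len_bytes * 2);
    for _ in 0..len_bytes {
        seed ^= seed << 13;
        seed ^= seed >> 7;
        seed ^= seed << 17;
        // Keep the low byte and append it as two hex digits.
        out.push_str(&format!("{:02x}", (seed & 0xff) as u8));
    }
    out
}

/// Reports, at compile time, whether the helper above exists in this build.
pub fn bench_utils_enabled() -> bool {
    cfg!(any(test, feature = "bench-utils"))
}

fn main() {
    println!("bench helpers compiled in: {}", bench_utils_enabled());
}
```

With this layout a benchmark target would need the feature switched on explicitly, whereas code placed directly in a file under benches/ is only built by `cargo bench` and needs no gating at all.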

@daniilrrr daniilrrr force-pushed the daniil/SEQ-240-better-txns branch from deb445f to 9b50e8c Compare November 8, 2024 15:23
@daniilrrr daniilrrr force-pushed the daniil/SEQ-240-better-txns branch from 685efe4 to fd49132 Compare November 8, 2024 15:59
Contributor

@RomanHodulak RomanHodulak left a comment

Nicely done! I appreciate the way you addressed the review.

Interesting results

     Running benches/zlib_compression.rs (target/release/deps/zlib_compression-c2f3f3d6b31def3c)
Gnuplot not found, using plotters backend
single_tx_compression   time:   [13.966 µs 13.970 µs 13.973 µs]
                        change: [-0.2983% -0.2380% -0.1926%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe


Single TX compression
Compression ratio: -10.00%
Original size: 110 bytes
Compressed size: 121 bytes
Decompressed size: 110 bytes
Compression time: 56.895µs
Decompression time: 4.137µs
batch_multiple_tx       time:   [23.190 µs 23.235 µs 23.275 µs]
                        change: [-0.2423% +0.0410% +0.2746%] (p = 0.77 > 0.05)
                        No change in performance detected.


Multiple TX compression
Compression ratio: 23.01%
Original size: 365 bytes
Compressed size: 281 bytes
Decompressed size: 365 bytes
Compression time: 50.292µs
Decompression time: 12.8µs
batch_sizes/100         time:   [735.16 µs 735.40 µs 735.65 µs]
                        change: [-1.5510% -0.9114% -0.4428%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe

Batch compression (n=100)
Compression ratio: 46.05%
Original size: 25052 bytes
Compressed size: 13516 bytes
Decompressed size: 25052 bytes
Compression time: 960.891µs
Decompression time: 124.479µs
batch_sizes/1000        time:   [9.9003 ms 9.9244 ms 9.9561 ms]
                        change: [+0.1384% +0.4073% +0.7175%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

Batch compression (n=1000)
Compression ratio: 48.46%
Original size: 249610 bytes
Compressed size: 128638 bytes
Decompressed size: 249610 bytes
Compression time: 8.705869ms
Decompression time: 1.122659ms

@daniilrrr
Collaborator Author

daniilrrr commented Nov 8, 2024

Nicely done! I appreciate the way you addressed the review.

Interesting results
...

Yes, my takeaway is that the zlib compression ratio flattens out and approaches 50% as input size increases (46.05% at n=100, 48.46% at n=1000).
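
The printed ratios are consistent with "space saved relative to the original size", that is (original - compressed) / original. A minimal sketch that recomputes them from the sizes in the benchmark output; the function name is hypothetical and not taken from the PR:

```rust
/// Percentage of space saved by compression.
/// Positive when the output is smaller than the input, negative when it grew.
fn compression_ratio_percent(original: usize, compressed: usize) -> f64 {
    (original as f64 - compressed as f64) / original as f64 * 100.0
}

fn main() {
    // (label, original bytes, compressed bytes), copied from the benchmark output.
    let runs: [(&str, usize, usize); 4] = [
        ("single tx", 110, 121),
        ("multiple tx", 365, 281),
        ("batch n=100", 25052, 13516),
        ("batch n=1000", 249610, 128638),
    ];

    for &(label, original, compressed) in runs.iter() {
        println!(
            "{}: {:.2}%",
            label,
            compression_ratio_percent(original, compressed)
        );
    }
}
```

The single-transaction case comes out negative because the compressed output is 11 bytes larger than the input; zlib framing alone (a 2-byte header and a 4-byte Adler-32 checksum) accounts for 6 of those bytes.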

@daniilrrr daniilrrr merged commit eb1dae5 into main Nov 8, 2024
2 checks passed
@daniilrrr daniilrrr deleted the daniil/SEQ-240-better-txns branch November 8, 2024 17:16
4 participants