baseline: Track stack evolution separately #418

chfast · 2022-01-17T13:58:39Z

Track EVM stack top item outside of State object. This make the stack
top pointer always visible to the compiler and it is able to better
optimize the interpreter.

chfast · 2022-01-17T14:17:38Z

Benchmark together with #416.

baseline/execute/main/blake2b_huff/empty_mean                               -0.1074         -0.1074            14            13            14            13
baseline/execute/main/blake2b_huff/2805nulls_mean                           -0.1104         -0.1104           309           275           309           275
baseline/execute/main/blake2b_huff/5610nulls_mean                           -0.1039         -0.1039           603           540           603           540
baseline/execute/main/blake2b_huff/8415nulls_mean                           -0.1018         -0.1018           884           794           884           794
baseline/execute/main/blake2b_huff/65536nulls_mean                          -0.1033         -0.1033          6856          6148          6856          6148
baseline/execute/main/blake2b_shifts/2805nulls_mean                         -0.0724         -0.0724          3011          2793          3011          2793
baseline/execute/main/blake2b_shifts/5610nulls_mean                         -0.0752         -0.0752          6016          5563          6016          5564
baseline/execute/main/blake2b_shifts/8415nulls_mean                         -0.0756         -0.0756          9010          8329          9010          8329
baseline/execute/main/blake2b_shifts/65536nulls_mean                        -0.0773         -0.0773         69839         64437         69840         64438
baseline/execute/main/sha1_divs/empty_mean                                  -0.0462         -0.0462            56            54            56            54
baseline/execute/main/sha1_divs/1351_mean                                   -0.0421         -0.0421          1178          1128          1178          1128
baseline/execute/main/sha1_divs/2737_mean                                   -0.0421         -0.0421          2297          2200          2297          2200
baseline/execute/main/sha1_divs/5311_mean                                   -0.0425         -0.0425          4486          4296          4486          4296
baseline/execute/main/sha1_divs/65536_mean                                  -0.0419         -0.0419         54667         52376         54668         52377
baseline/execute/main/sha1_shifts/empty_mean                                -0.0853         -0.0853            33            30            33            30
baseline/execute/main/sha1_shifts/1351_mean                                 -0.0850         -0.0850           703           643           703           643
baseline/execute/main/sha1_shifts/2737_mean                                 -0.0852         -0.0852          1372          1255          1372          1255
baseline/execute/main/sha1_shifts/5311_mean                                 -0.0856         -0.0856          2680          2451          2680          2451
baseline/execute/main/sha1_shifts/65536_mean                                -0.0849         -0.0849         32684         29910         32685         29910
baseline/execute/main/weierstrudel/0_mean                                   -0.0195         -0.0195           172           169           172           169
baseline/execute/main/weierstrudel/1_mean                                   -0.0259         -0.0259           379           369           379           369
baseline/execute/main/weierstrudel/3_mean                                   -0.0276         -0.0276           593           577           593           577
baseline/execute/main/weierstrudel/9_mean                                   -0.0303         -0.0303          1229          1192          1229          1192
baseline/execute/main/weierstrudel/14_mean                                  -0.0303         -0.0303          1760          1707          1760          1707
baseline/execute/micro/jump_around/empty_mean                               -0.0915         -0.0915            24            22            24            22
baseline/execute/micro/jumpdests_0xffff/empty_mean                          +0.0051         +0.0051           117           117           117           117
baseline/execute/micro/loop_with_many_jumpdests/empty_mean                  +0.0544         +0.0544         13899         14655         13899         14655
baseline/execute/micro/memory_grow_mload/nogrow_mean                        -0.0664         -0.0664            55            51            55            51
baseline/execute/micro/memory_grow_mload/by1_mean                           -0.0880         -0.0879            59            53            59            53
baseline/execute/micro/memory_grow_mload/by16_mean                          -0.0902         -0.0902            64            58            64            58
baseline/execute/micro/memory_grow_mload/by32_mean                          -0.0889         -0.0889            73            66            73            66
baseline/execute/micro/memory_grow_mstore/nogrow_mean                       -0.0209         -0.0209            55            53            55            54
baseline/execute/micro/memory_grow_mstore/by1_mean                          -0.0285         -0.0285            58            57            58            57
baseline/execute/micro/memory_grow_mstore/by16_mean                         -0.0182         -0.0182            65            63            65            63
baseline/execute/micro/memory_grow_mstore/by32_mean                         -0.0365         -0.0365            75            72            75            72
baseline/execute/micro/signextend/zero_mean                                 -0.0993         -0.0993            69            62            69            62
baseline/execute/micro/signextend/one_mean                                  -0.0989         -0.0989            69            62            69            62
OVERALL_GEOMEAN                                                             -0.0621         -0.0621             0             0             0             0

chfast · 2022-01-19T19:05:00Z

lib/evmone/tracing.hpp

@@ -38,17 +39,19 @@ class Tracer
    }

    void notify_instruction_start(  // NOLINT(misc-no-recursion)
-        uint32_t pc, const ExecutionState& state) noexcept
+        uint32_t pc, intx::uint256* stack_top, int stack_height,


The parameters are not beautiful here. The best option seems to be std::span<const uint256>, but we wait for C++20.

In Silkworm we use gsl::span from https://github.com/microsoft/GSL.

Any reason not using std::span from C++20?

Because Silkworm is compatible with C++17.

codecov · 2022-01-19T19:13:16Z

Codecov Report

Merging #418 (0574645) into master (6b02dd0) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #418   +/-   ##
=======================================
  Coverage   99.62%   99.62%           
=======================================
  Files          31       31           
  Lines        4013     4017    +4     
=======================================
+ Hits         3998     4002    +4     
  Misses         15       15

Flag	Coverage Δ
consensus	`86.51% <50.00%> (+0.01%)`	⬆️
unittests	`99.67% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
lib/evmone/baseline.cpp	`100.00% <100.00%> (ø)`
lib/evmone/tracing.cpp	`100.00% <100.00%> (ø)`
lib/evmone/tracing.hpp	`94.11% <100.00%> (ø)`
test/unittests/tracing_test.cpp	`100.00% <100.00%> (ø)`

gumb0 · 2022-01-26T11:41:12Z

Please rebase.

gumb0 · 2022-01-26T11:50:02Z

lib/evmone/baseline.cpp

@@ -193,17 +211,22 @@ evmc_result execute(const VM& vm, ExecutionState& state, const CodeAnalysis& ana
    const auto& cost_table = get_baseline_cost_table(state.rev);

    const auto* const code = state.code.data();
-    auto code_it = code;  // Code iterator for the interpreter loop.
-    while (true)          // Guaranteed to terminate because padded code ends with STOP.
+    const auto stack_bottom = state.stack.top_item;


~~Why is this not~~ I got it, initially top points to the bottom, I was confused

Suggested change

const auto stack_bottom = state.stack.top_item;

const auto stack_top = state.stack.top_item;

It's kind of confusing that now we there exist state.stack.top_item and position.stack_top and only one of them is valid.

I guess ExecutionState::Stack should be removed soon and replaced with just a pair of stack storage and "initial_stack_top" pointer.

This is because Baseline leaves .top_item untouched but Advanced is manipulating it (as Baseline did before).
We could more Stack to AdvancedExecutionState, but this would allocate it after the stack storage. We can also migrate Advanced to StackTop. All to be decided later and with low priority.

So far I used state.stack.storage here instead.

lib/evmone/baseline.cpp

lib/evmone/tracing.cpp

gumb0 · 2022-01-26T12:27:35Z

It looks fine overall, but intuition why this helps optimizations better is a bit beyond me...

chfast · 2022-01-26T18:04:23Z

It looks fine overall, but intuition why this helps optimizations better is a bit beyond me...

This is actually easy to explain. This is stack check for ADD before and after.

--- old.asm     2022-01-26 18:48:34.751835959 +0100
+++ new.asm     2022-01-26 18:48:48.287972455 +0100
@@ -1,10 +1,8 @@
        #APP
        # OP_ADD
        #NO_APP
-       movq    (%r13), %rax  ; %rax has current stack top
-       movq    %rax, %rcx
-       subq    %r15, %rcx
-       addq    $32, %rcx
-       shrq    $5, %rcx
-       cmpl    $2, %ecx
-       jl      .LBB3_690
+       movq    %rbx, %rax  ; %rbx has current stack top
+       subq    %r14, %rax
+       shrq    $5, %rax
+       cmpl    $2, %eax
+       jl      .LBB3_668

If we keep the stack top pointer in ExecutionState it is unlikely to be kept in a register because it is passed to instruction implementations by reference and it works as in-out param. And not all instructions are inlined and some implementations are not even visible (calls are implemented in separate translation unit).

The same can be done for gas_left.

We can also further optimize stack height computation, see #420. Currently this is

       subq    %r14, %rax  ; stack_top - stack_bottom
       shrq    $5, %rax    ; convert to sizeof(uint256)

gumb0

Fine as it is, but I'd vote to rename Stack::end() -> Stack::bottom()

Track EVM stack top item outside of State object. This make the stack top pointer always visible to the compiler and it is able to better optimize the interpreter.

yperbasis · 2022-01-28T11:07:18Z

Hmm. It looks like this change overflows 16MB stack in the Debug mode: https://app.circleci.com/pipelines/github/torquem-ch/silkworm/3728/workflows/2da56982-dbc3-4897-90dd-490b40278d8b/jobs/10490

chfast force-pushed the instr_core branch from d4a0a01 to 925f3e3 Compare January 19, 2022 18:42

Base automatically changed from instr_core to master January 19, 2022 18:53

chfast force-pushed the baseline_stack branch from 4b4cdbc to 9a03aeb Compare January 19, 2022 19:03

chfast requested review from gumb0 and yperbasis January 19, 2022 19:03

chfast commented Jan 19, 2022

View reviewed changes

chfast mentioned this pull request Jan 26, 2022

Update evmone erigontech/silkworm#547

Merged

chfast force-pushed the baseline_stack branch from 9a03aeb to b3b4aa4 Compare January 26, 2022 11:46

gumb0 reviewed Jan 26, 2022

View reviewed changes

lib/evmone/baseline.cpp Show resolved Hide resolved

gumb0 reviewed Jan 26, 2022

View reviewed changes

lib/evmone/baseline.cpp Outdated Show resolved Hide resolved

gumb0 reviewed Jan 26, 2022

View reviewed changes

lib/evmone/tracing.cpp Show resolved Hide resolved

chfast force-pushed the baseline_stack branch from b3b4aa4 to 08b2a5a Compare January 26, 2022 17:36

chfast force-pushed the baseline_stack branch from 08b2a5a to af8116b Compare January 26, 2022 18:28

gumb0 approved these changes Jan 26, 2022

View reviewed changes

chfast force-pushed the baseline_stack branch from af8116b to a78f6bd Compare January 26, 2022 19:43

chfast mentioned this pull request Jan 26, 2022

Refactoring and clean-ups #424

Merged

chfast added 2 commits January 26, 2022 21:03

baseline: Track stack evolution separately

0e7675a

Track EVM stack top item outside of State object. This make the stack top pointer always visible to the compiler and it is able to better optimize the interpreter.

Fix tracing: pass stack top pointer to tracers

0574645

chfast force-pushed the baseline_stack branch from a78f6bd to 0574645 Compare January 26, 2022 20:04

chfast merged commit 75b316a into master Jan 26, 2022

chfast deleted the baseline_stack branch January 26, 2022 20:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

baseline: Track stack evolution separately #418

baseline: Track stack evolution separately #418

chfast commented Jan 17, 2022

chfast commented Jan 17, 2022

chfast Jan 19, 2022

yperbasis Jan 20, 2022

chfast Jan 20, 2022

yperbasis Jan 20, 2022

codecov bot commented Jan 19, 2022 •

edited

Loading

gumb0 commented Jan 26, 2022

gumb0 Jan 26, 2022 •

edited

Loading

gumb0 Jan 26, 2022

chfast Jan 26, 2022

gumb0 commented Jan 26, 2022 •

edited

Loading

chfast commented Jan 26, 2022

gumb0 left a comment

yperbasis commented Jan 28, 2022

	const auto stack_bottom = state.stack.top_item;
	const auto stack_top = state.stack.top_item;

baseline: Track stack evolution separately #418

baseline: Track stack evolution separately #418

Conversation

chfast commented Jan 17, 2022

chfast commented Jan 17, 2022

chfast Jan 19, 2022

Choose a reason for hiding this comment

yperbasis Jan 20, 2022

Choose a reason for hiding this comment

chfast Jan 20, 2022

Choose a reason for hiding this comment

yperbasis Jan 20, 2022

Choose a reason for hiding this comment

codecov bot commented Jan 19, 2022 • edited Loading

Codecov Report

gumb0 commented Jan 26, 2022

gumb0 Jan 26, 2022 • edited Loading

Choose a reason for hiding this comment

gumb0 Jan 26, 2022

Choose a reason for hiding this comment

chfast Jan 26, 2022

Choose a reason for hiding this comment

gumb0 commented Jan 26, 2022 • edited Loading

chfast commented Jan 26, 2022

gumb0 left a comment

Choose a reason for hiding this comment

yperbasis commented Jan 28, 2022

codecov bot commented Jan 19, 2022 •

edited

Loading

gumb0 Jan 26, 2022 •

edited

Loading

gumb0 commented Jan 26, 2022 •

edited

Loading