refactor(testing): Implement IteratingBytecode, FixedIterationsBytecode #2030

marioevz · 2026-01-16T16:21:37Z

🗒️ Description

Implements some changes to the benchmark tests to use bytecode.gas_cost instead of manually adding fork.gas_costs cost values.

`IteratingBytecode`, `FixedIterationsBytecode`

Two new classes that subclass Bytecode but contain it in three different sections: setup, iterating (could be renamed to iteration perhaps?) and cleanup.

This allows the implementation of a new function gas_cost_by_iteration_count which returns the gas cost given a number of iterations of the iterating bytecode.

Also included is a field called warm_iterating which defaults to None but when set will be used as the iterating bytecode from the second iteration and onwards.

This allows passing bytecodes with different gas behaviors on the first run when compared to the second run and after. E.g. SSTORE accessing a cold key at the beginning but starting from the second iteration it becomes warm.

FixedIterationsBytecode is very similar but simply contains another field iteration_count which makes it so that bytecode.gas_cost for this instance uses this iteration count and returns the gas used after all the iterations are run.

🔗 Related Issues or PRs

N/A.

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

marioevz · 2026-01-16T16:25:05Z

cc @LouisTsai-Csie this is a work in progress but I wanted to highlight some ideas in this PR that could benefit benchmarking, and also so we don't duplicate work.

cc @jochem-brouwer also PTAL, I would appreciate your feedback on these new ideas!

codecov · 2026-01-17T01:52:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.14%. Comparing base (58b4bb0) to head (b593678).
⚠️ Report is 1 commits behind head on forks/amsterdam.

Additional details and impacted files

@@               Coverage Diff                @@
##           forks/amsterdam    #2030   +/-   ##
================================================
  Coverage            86.14%   86.14%           
================================================
  Files                  599      599           
  Lines                39472    39472           
  Branches              3780     3780           
================================================
  Hits                 34002    34002           
  Misses                4848     4848           
  Partials               622      622

Flag	Coverage Δ
unittests	`86.14% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…2030)

LouisTsai-Csie

I like this idea, especially the txs_with_gas_limit_cap helper that uses a sliding-window approach to handle contract creation and the transaction gas limit cap. I think this could be useful for several bloatnet scenarios as well. After this PR (and PR #1962 ) merge, I can follow up by refactoring the bloatnet scenarios accordingly.

Also, for IteratingBytecode and FixedIterationsBytecode: since these are not limited to compute scenarios, what do you think about moving them under packages/testing/src/execution_testing/benchmark/? That would let us reuse them as templates for additional helper objects (e.g., MaxSizedContractInitcode, MaxSizedContractFactory, etc.).

LouisTsai-Csie · 2026-01-22T06:39:19Z

tests/benchmark/compute/instruction/test_storage.py

        # All the storage slots in the contract are initialized to their index.
        # That is, storage slot `i` is initialized to `i`.


Suggested change

# All the storage slots in the contract are initialized to their index.

# That is, storage slot `i` is initialized to `i`.

This is only true if absent_slots=False, we could also simplify this:

if absent_slots: execution_code_body = Op.SSTORE( Op.DUP2, 0, original_value=0, new_value=0 ) else: execution_code_body = Op.SSTORE( Op.DUP1, Op.DUP1, original_value=1, new_value=1 )

into this:

execution_code_body = Op.SSTORE( Op.DUP1, Op.DUP1, original_value=int(not absent_slots), new_value=1 )

Not sure if this breaks the readability.

I've updated the comment, but the optimization does not quite work because if we use Op.DUP1 with zero->zero case, we duplicate the index which is not zero.

LouisTsai-Csie · 2026-01-22T09:13:41Z

tests/benchmark/compute/helpers.py

+            + Op.ADD(1, Op.CALLDATALOAD(32))
+            + Op.CALLDATALOAD(0)


It would be nice if we have some comment for the CALLDATA layout here.

CALLDATA[0..31] = start_index
CALLDATA[32..63] = end_index

LouisTsai-Csie · 2026-01-23T07:38:58Z

Hi @marioevz could we try to get this merged first so we don’t get file conflict ? I’d like to refactor the test_storage_access_cold benchmark to make it Osaka-compatible using your approach after this PR merged

fix feat: Implement deployer txs method fixes words

LouisTsai-Csie

I’ve reviewed most of the changes, except for test_iterating_bytecode.py, I’ll continue on it later today. I left a few comments, but overall it looks really good. Amazing work!

LouisTsai-Csie · 2026-01-27T06:46:02Z

tests/benchmark/compute/instruction/test_keccak.py

        target_opcode=Op.SHA3,
        code_generator=JumpLoopGenerator(
            setup=Op.PUSH20[optimal_input_length],
            attack_block=Op.POP(Op.SHA3(Op.PUSH0, Op.DUP1)),


Suggested change

attack_block=iterating_bytecode,

LouisTsai-Csie · 2026-01-27T08:15:26Z

packages/testing/src/execution_testing/tools/tools_code/generators.py

+        # Estimate maximum possible iterations
+        low = 0
+        high = 1
+        high_gas_cost = self.tx_gas_limit_by_iteration_count(
+            fork=fork,
+            iteration_count=high,
+            **intrinsic_cost_kwargs,
+        )
+        while high_gas_cost < gas_limit_cap:
+            low = high
+            high *= 2
+            high_gas_cost = self.tx_gas_limit_by_iteration_count(
+                fork=fork,
+                iteration_count=high,
+                **intrinsic_cost_kwargs,
+            )
+
+        # Binary search for the maximum number of iterations that fits
+        # within both remaining_gas and gas_limit_cap
+        best_iterations = 0
+        best_iterations_gas = 0
+
+        while low <= high:
+            mid = (low + high) // 2
+            if mid == 0:
+                break
+
+            tx_gas = self.tx_gas_limit_by_iteration_count(
+                fork=fork,
+                iteration_count=mid,
+                **intrinsic_cost_kwargs,
+            )
+
+            if tx_gas <= gas_limit_cap:
+                best_iterations = mid
+                best_iterations_gas = tx_gas
+                low = mid + 1
+            else:
+                high = mid - 1


I found this logic duplicate in tx_iterations_by_gas_limit_cap and tx_iterations_by_gas_limit, could we replace with this _binary_search_iterations helper function?

def _binary_search_iterations( self, *, fork: Fork, gas_limit: int, initial_low: int = 0, initial_high: int = 1, **intrinsic_cost_kwargs: Any, ) -> Tuple[int, int]: """ Binary search for the maximum iterations that fit within a gas limit. """ low = initial_low high = initial_high # Exponential search to find upper bound high_gas_cost = self.tx_gas_limit_by_iteration_count( fork=fork, iteration_count=high, **intrinsic_cost_kwargs, ) while high_gas_cost < gas_limit: low = high high *= 2 high_gas_cost = self.tx_gas_limit_by_iteration_count( fork=fork, iteration_count=high, **intrinsic_cost_kwargs, ) # Binary search for exact fit best_iterations = 0 best_iterations_gas = 0 while low <= high: mid = (low + high) // 2 if mid == 0: break tx_gas = self.tx_gas_limit_by_iteration_count( fork=fork, iteration_count=mid, **intrinsic_cost_kwargs, ) if tx_gas <= gas_limit: best_iterations = mid best_iterations_gas = tx_gas low = mid + 1 else: high = mid - 1 return best_iterations, best_iterations_gas # Updated version def tx_iterations_by_gas_limit_cap( self, *, fork: Fork, **intrinsic_cost_kwargs: Any, ) -> Tuple[int, int]: """ Calculate the number of iterations needed to reach the fork's transaction gas limit cap. """ gas_limit_cap = fork.transaction_gas_limit_cap() if gas_limit_cap is None: raise ValueError("Fork has no gas limit cap.") return self._binary_search_iterations( fork=fork, gas_limit=gas_limit_cap, initial_low=0, initial_high=1, **intrinsic_cost_kwargs, )

LouisTsai-Csie · 2026-01-27T08:20:45Z

packages/testing/src/execution_testing/tools/tools_code/generators.py

+        gas_limit_cap = fork.transaction_gas_limit_cap()
+        if gas_limit_cap is None:
+            raise ValueError("Fork has no gas limit cap.")


Some benchmarks are still running under the Prague fork like fixed-opcode-count feature. With the current structure, Prague benchmark should use tx_iterations_by_gas_limit, while Osaka uses tx_iterations_by_gas_limit_cap.

We could simplify this by unifying the logic in tx_iterations_by_gas_limit_cap: if gas_limit_cap is None, fall back to using the block gas limit. This would avoid having separate code paths for Prague vs. Osaka.

Example:

execution-specs/tests/benchmark/conftest.py

Lines 90 to 93 in c6c380a

@pytest.fixture

def tx_gas_limit(fork: Fork, gas_benchmark_value: int) -> int:

"""Return the transaction gas limit cap."""

return fork.transaction_gas_limit_cap() or gas_benchmark_value

LouisTsai-Csie · 2026-01-27T08:23:24Z

packages/testing/src/execution_testing/tools/tools_code/generators.py

+        gas_limit_cap = fork.transaction_gas_limit_cap()
+        if gas_limit_cap is None:
+            # No limit, all iterations fit in a single transaction.
+            return [total_iterations]


We should be careful with this approach. In some cases, we use an optimistic estimation, and not all operations may fit into a single transaction. In those scenarios, we may need to split the execution into two transactions across separate blocks.

LouisTsai-Csie · 2026-01-27T09:04:32Z

packages/testing/src/execution_testing/tools/tools_code/generators.py

+        remaining_iterations -= max_iterations_per_tx * full_txs
+        return result + [remaining_iterations]


What if there is no remaining iteration, there might be trailing zero in the list.

LouisTsai-Csie self-requested a review January 19, 2026 13:21

marioevz force-pushed the update-benchmark-tests branch from 674c395 to 79218ed Compare January 20, 2026 00:00

jsign pushed a commit to jsign/execution-specs that referenced this pull request Jan 20, 2026

chore(fill): make slow tests use separate pre-alloc groups (ethereum#…

0836ed2

…2030)

spencer-tb mentioned this pull request Jan 20, 2026

chore(test-tests): migrate tests from fork.gas_costs to bytecode.gas_cost(fork) #2049

Open

65 tasks

LouisTsai-Csie requested changes Jan 22, 2026

View reviewed changes

marioevz added 5 commits January 26, 2026 17:42

refactor(tests/benchmark): use bytecode.gas_cost

f080d61

fix feat: Implement deployer txs method fixes words

fix: argument

6f61366

refactor(testing): Move IteratingBytecode, FixedIterationsBytecode

ac664d5

fix(testing): IteratingBytecode case zero iterations

57fd02f

fix: Use create 2 preimage methods

d32b3d1

marioevz force-pushed the update-benchmark-tests branch from 79218ed to d32b3d1 Compare January 26, 2026 19:24

fix(tests): tox

a5ccc83

marioevz changed the title ~~WIP: Update Benchmark Tests to use bytecode.gas_cost~~ refactor(testing): Implement IteratingBytecode, FixedIterationsBytecode Jan 26, 2026

marioevz added 2 commits January 27, 2026 01:06

refactor

97963f7

review comments

b593678

marioevz requested a review from LouisTsai-Csie January 27, 2026 01:09

marioevz marked this pull request as ready for review January 27, 2026 01:10

LouisTsai-Csie requested changes Jan 27, 2026

View reviewed changes

		# All the storage slots in the contract are initialized to their index.
		# That is, storage slot `i` is initialized to `i`.

	@pytest.fixture
	def tx_gas_limit(fork: Fork, gas_benchmark_value: int) -> int:
	"""Return the transaction gas limit cap."""
	return fork.transaction_gas_limit_cap() or gas_benchmark_value

		remaining_iterations -= max_iterations_per_tx * full_txs
		return result + [remaining_iterations]

refactor(testing): Implement IteratingBytecode, FixedIterationsBytecode #2030

Are you sure you want to change the base?

refactor(testing): Implement IteratingBytecode, FixedIterationsBytecode #2030

Uh oh!

Conversation

marioevz commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🗒️ Description

IteratingBytecode, FixedIterationsBytecode

🔗 Related Issues or PRs

✅ Checklist

Cute Animal Picture

Uh oh!

marioevz commented Jan 16, 2026

Uh oh!

codecov bot commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

LouisTsai-Csie left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LouisTsai-Csie commented Jan 23, 2026

Uh oh!

LouisTsai-Csie left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

marioevz commented Jan 16, 2026 •

edited

Loading

`IteratingBytecode`, `FixedIterationsBytecode`

codecov bot commented Jan 17, 2026 •

edited

Loading