Skip to content

Conversation

@marioevz
Copy link
Member

@marioevz marioevz commented Jan 16, 2026

🗒️ Description

Implements some changes to the benchmark tests to use bytecode.gas_cost instead of manually adding fork.gas_costs cost values.

IteratingBytecode, FixedIterationsBytecode

Two new classes that subclass Bytecode but contain it in three different sections: setup, iterating (could be renamed to iteration perhaps?) and cleanup.

This allows the implementation of a new function gas_cost_by_iteration_count which returns the gas cost given a number of iterations of the iterating bytecode.

Also included is a field called warm_iterating which defaults to None but when set will be used as the iterating bytecode from the second iteration and onwards.

This allows passing bytecodes with different gas behaviors on the first run when compared to the second run and after. E.g. SSTORE accessing a cold key at the beginning but starting from the second iteration it becomes warm.

FixedIterationsBytecode is very similar but simply contains another field iteration_count which makes it so that bytecode.gas_cost for this instance uses this iteration count and returns the gas used after all the iterations are run.

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

@marioevz
Copy link
Member Author

cc @LouisTsai-Csie this is a work in progress but I wanted to highlight some ideas in this PR that could benefit benchmarking, and also so we don't duplicate work.

cc @jochem-brouwer also PTAL, I would appreciate your feedback on these new ideas!

@codecov
Copy link

codecov bot commented Jan 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.14%. Comparing base (58b4bb0) to head (b593678).
⚠️ Report is 1 commits behind head on forks/amsterdam.

Additional details and impacted files
@@               Coverage Diff                @@
##           forks/amsterdam    #2030   +/-   ##
================================================
  Coverage            86.14%   86.14%           
================================================
  Files                  599      599           
  Lines                39472    39472           
  Branches              3780     3780           
================================================
  Hits                 34002    34002           
  Misses                4848     4848           
  Partials               622      622           
Flag Coverage Δ
unittests 86.14% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Collaborator

@LouisTsai-Csie LouisTsai-Csie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea, especially the txs_with_gas_limit_cap helper that uses a sliding-window approach to handle contract creation and the transaction gas limit cap. I think this could be useful for several bloatnet scenarios as well. After this PR (and PR #1962 ) merge, I can follow up by refactoring the bloatnet scenarios accordingly.

Also, for IteratingBytecode and FixedIterationsBytecode: since these are not limited to compute scenarios, what do you think about moving them under packages/testing/src/execution_testing/benchmark/? That would let us reuse them as templates for additional helper objects (e.g., MaxSizedContractInitcode, MaxSizedContractFactory, etc.).

Comment on lines 159 to 160
# All the storage slots in the contract are initialized to their index.
# That is, storage slot `i` is initialized to `i`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# All the storage slots in the contract are initialized to their index.
# That is, storage slot `i` is initialized to `i`.

This is only true if absent_slots=False, we could also simplify this:

 if absent_slots:
execution_code_body = Op.SSTORE(
    Op.DUP2, 0, original_value=0, new_value=0
)
else:
execution_code_body = Op.SSTORE(
    Op.DUP1, Op.DUP1, original_value=1, new_value=1
)

into this:

execution_code_body = Op.SSTORE(
    Op.DUP1, Op.DUP1, original_value=int(not absent_slots), new_value=1
)

Not sure if this breaks the readability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the comment, but the optimization does not quite work because if we use Op.DUP1 with zero->zero case, we duplicate the index which is not zero.

Comment on lines +556 to +346
+ Op.ADD(1, Op.CALLDATALOAD(32))
+ Op.CALLDATALOAD(0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we have some comment for the CALLDATA layout here.

CALLDATA[0..31] = start_index
CALLDATA[32..63] = end_index

@LouisTsai-Csie
Copy link
Collaborator

Hi @marioevz could we try to get this merged first so we don’t get file conflict ? I’d like to refactor the test_storage_access_cold benchmark to make it Osaka-compatible using your approach after this PR merged

@marioevz marioevz force-pushed the update-benchmark-tests branch from 79218ed to d32b3d1 Compare January 26, 2026 19:24
@marioevz marioevz changed the title WIP: Update Benchmark Tests to use bytecode.gas_cost refactor(testing): Implement IteratingBytecode, FixedIterationsBytecode Jan 26, 2026
@marioevz marioevz marked this pull request as ready for review January 27, 2026 01:10
Copy link
Collaborator

@LouisTsai-Csie LouisTsai-Csie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve reviewed most of the changes, except for test_iterating_bytecode.py, I’ll continue on it later today. I left a few comments, but overall it looks really good. Amazing work!

target_opcode=Op.SHA3,
code_generator=JumpLoopGenerator(
setup=Op.PUSH20[optimal_input_length],
attack_block=Op.POP(Op.SHA3(Op.PUSH0, Op.DUP1)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
attack_block=iterating_bytecode,

Comment on lines +674 to +712
# Estimate maximum possible iterations
low = 0
high = 1
high_gas_cost = self.tx_gas_limit_by_iteration_count(
fork=fork,
iteration_count=high,
**intrinsic_cost_kwargs,
)
while high_gas_cost < gas_limit_cap:
low = high
high *= 2
high_gas_cost = self.tx_gas_limit_by_iteration_count(
fork=fork,
iteration_count=high,
**intrinsic_cost_kwargs,
)

# Binary search for the maximum number of iterations that fits
# within both remaining_gas and gas_limit_cap
best_iterations = 0
best_iterations_gas = 0

while low <= high:
mid = (low + high) // 2
if mid == 0:
break

tx_gas = self.tx_gas_limit_by_iteration_count(
fork=fork,
iteration_count=mid,
**intrinsic_cost_kwargs,
)

if tx_gas <= gas_limit_cap:
best_iterations = mid
best_iterations_gas = tx_gas
low = mid + 1
else:
high = mid - 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this logic duplicate in tx_iterations_by_gas_limit_cap and tx_iterations_by_gas_limit, could we replace with this _binary_search_iterations helper function?

def _binary_search_iterations(
    self,
    *,
    fork: Fork,
    gas_limit: int,
    initial_low: int = 0,
    initial_high: int = 1,
    **intrinsic_cost_kwargs: Any,
) -> Tuple[int, int]:
    """
    Binary search for the maximum iterations that fit within a gas limit.
    """
    low = initial_low
    high = initial_high

    # Exponential search to find upper bound
    high_gas_cost = self.tx_gas_limit_by_iteration_count(
        fork=fork,
        iteration_count=high,
        **intrinsic_cost_kwargs,
    )
    while high_gas_cost < gas_limit:
        low = high
        high *= 2
        high_gas_cost = self.tx_gas_limit_by_iteration_count(
            fork=fork,
            iteration_count=high,
            **intrinsic_cost_kwargs,
        )

    # Binary search for exact fit
    best_iterations = 0
    best_iterations_gas = 0

    while low <= high:
        mid = (low + high) // 2
        if mid == 0:
            break

        tx_gas = self.tx_gas_limit_by_iteration_count(
            fork=fork,
            iteration_count=mid,
            **intrinsic_cost_kwargs,
        )

        if tx_gas <= gas_limit:
            best_iterations = mid
            best_iterations_gas = tx_gas
            low = mid + 1
        else:
            high = mid - 1

    return best_iterations, best_iterations_gas

# Updated version
def tx_iterations_by_gas_limit_cap(
    self,
    *,
    fork: Fork,
    **intrinsic_cost_kwargs: Any,
) -> Tuple[int, int]:
    """
    Calculate the number of iterations needed to reach the fork's
    transaction gas limit cap.
    """
    gas_limit_cap = fork.transaction_gas_limit_cap()
    if gas_limit_cap is None:
        raise ValueError("Fork has no gas limit cap.")

    return self._binary_search_iterations(
        fork=fork,
        gas_limit=gas_limit_cap,
        initial_low=0,
        initial_high=1,
        **intrinsic_cost_kwargs,
    )

Comment on lines +670 to +672
gas_limit_cap = fork.transaction_gas_limit_cap()
if gas_limit_cap is None:
raise ValueError("Fork has no gas limit cap.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some benchmarks are still running under the Prague fork like fixed-opcode-count feature. With the current structure, Prague benchmark should use tx_iterations_by_gas_limit, while Osaka uses tx_iterations_by_gas_limit_cap.

We could simplify this by unifying the logic in tx_iterations_by_gas_limit_cap: if gas_limit_cap is None, fall back to using the block gas limit. This would avoid having separate code paths for Prague vs. Osaka.

Example:

@pytest.fixture
def tx_gas_limit(fork: Fork, gas_benchmark_value: int) -> int:
"""Return the transaction gas limit cap."""
return fork.transaction_gas_limit_cap() or gas_benchmark_value

Comment on lines +820 to +823
gas_limit_cap = fork.transaction_gas_limit_cap()
if gas_limit_cap is None:
# No limit, all iterations fit in a single transaction.
return [total_iterations]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be careful with this approach. In some cases, we use an optimistic estimation, and not all operations may fit into a single transaction. In those scenarios, we may need to split the execution into two transactions across separate blocks.

Comment on lines +838 to +839
remaining_iterations -= max_iterations_per_tx * full_txs
return result + [remaining_iterations]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is no remaining iteration, there might be trailing zero in the list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants