Tags · ethereum/execution-spec-tests

v5.4.0

chore: update steel website blog post links (#2318)

Nov 20, 2025
88e9fb8

bal@v2.0.0

chore: update steel website blog post links (#2318)

Nov 20, 2025
88e9fb8

bal@v1.8.0

chore: update steel website blog post links (#2318)

Nov 20, 2025
88e9fb8

bal@v1.7.0

chore: update steel website blog post links (#2318)

Nov 20, 2025
88e9fb8

benchmark@v0.0.6

chore(docs,ci): update docs for weld finalization; disable ci workflows (#2317)

* chore(ci): disable cron schedule for eip checksums and links

* doc: update pr template and readme for weld finalization

* doc: remove bullet point

Nov 6, 2025
e9958ed

bal@v1.6.0

chore(docs,ci): update docs for weld finalization; disable ci workflows (#2317)

Nov 6, 2025
e9958ed

bal@v1.5.0

chore(docs,ci): update docs for weld finalization; disable ci workflows (#2317)

Nov 6, 2025
e9958ed

bal@v1.4.1

chore(docs,ci): update docs for weld finalization; disable ci workflows (#2317)

Nov 6, 2025
e9958ed

benchmark@v0.0.5

feat(benchmark): add SLOAD/SSTORE benchmark test with multi-contract support (#2256)

* feat(benchmark): add SLOAD benchmark test with multi-contract support

Add test_sload_empty_erc20_balanceof to benchmark SLOAD operations on
non-existing storage slots using ERC20 balanceOf() queries.

The idea of this benchmark is to query non-existing addresses, either
within a single contract or across a series of N contracts. In this way,
we force clients to resolve as many tree branches as possible.

* feat(benchmark): add SSTORE benchmark test using ERC20 approve

Add test_sstore_erc20_approve that benchmarks SSTORE operations by calling
approve(spender, amount) on pre-deployed ERC20 contracts. Follows the same
pattern as the SLOAD benchmark:
- Auto-discovers ERC20 contracts from stubs
- Splits gas budget evenly across all discovered contracts
- Uses counter as both spender address and amount
- Forces SSTOREs to allowance mapping storage slots

The test measures client performance when writing to many storage slots
across multiple contracts, stressing state-handling write operations.

* fix(benchmark): correct SSTORE benchmark gas calculation

Fixed gas calculation for test_sstore_erc20_approve to ensure accurate
gas usage prediction and prevent transaction reverts:

Key fixes:
- Added memory expansion cost (15 gas per contract)
- Corrected G_LOW gas values in comments (5 gas, not 3)
- Separated per-contract overhead from per-iteration costs
- Improved cost calculation clarity with detailed opcode breakdown

Gas calculation (10M gas, 3 contracts):
- Intrinsic: 21,000
- Overhead per contract: 38
- Cost per iteration: 20,226
- Calls per contract: 164
- Expected gas used: 9,972,306 (99.72% utilization)
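
The figures above can be reproduced with a short sketch. It assumes that the budget left after the intrinsic cost is split evenly across contracts and that each contract pays its overhead once; the function and variable names are illustrative and are not taken from the test suite.

```python
# Sketch of the gas arithmetic quoted above (illustrative names only).
INTRINSIC_GAS = 21_000
OVERHEAD_PER_CONTRACT = 38
COST_PER_ITERATION = 20_226


def sstore_gas_plan(gas_limit: int, num_contracts: int) -> tuple[int, int]:
    """Return (calls per contract, expected total gas used)."""
    # Assumption: the budget after intrinsic gas is split evenly.
    per_contract = (gas_limit - INTRINSIC_GAS) // num_contracts
    # Each contract pays its fixed overhead once, then a cost per call.
    calls = (per_contract - OVERHEAD_PER_CONTRACT) // COST_PER_ITERATION
    used = INTRINSIC_GAS + num_contracts * (
        OVERHEAD_PER_CONTRACT + calls * COST_PER_ITERATION
    )
    return calls, used


calls, used = sstore_gas_plan(10_000_000, 3)
print(calls, used, f"{used / 10_000_000:.2%}")  # 164 9972306 99.72%
```

Rounding the call count down keeps the expected usage under the gas limit, which is what prevents the out-of-gas revert.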

* feat(benchmark): add mixed SLOAD/SSTORE benchmark with configurable ratios

Add test_mixed_sload_sstore to test_multi_opcode.py that combines SLOAD
and SSTORE operations with parameterized gas distribution ratios (50-50,
70-30, 90-10).

The test stresses clients with mixed read/write workloads by:
- Dividing gas budget evenly across all discovered ERC20 contract stubs
- Splitting each contract's allocation by the specified percentage ratio
- Executing balanceOf (cold SLOAD on empty slots) for the SLOAD portion
- Executing approve (SSTORE to new allowance slots) for the SSTORE portion

Verified gas calculations for 10M gas budget with 3 contracts (50-50 ratio):
- SLOAD operations: ~2,312 gas/iteration → 719 calls per contract
- SSTORE operations: ~20,226 gas/iteration → 82 calls per contract
- Total operations: 2,403 state operations (2,157 SLOADs + 246 SSTOREs)
- Gas usage: 9.98M / 10M (16K buffer, no out-of-gas errors)
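
As a rough check of the 50-50 numbers, the split can be sketched as below. The sketch assumes the intrinsic cost is deducted before the even split and ignores per-contract overhead, so it only approximates the real test; all names are hypothetical.

```python
# Sketch of the mixed SLOAD/SSTORE split (illustrative names only).
INTRINSIC_GAS = 21_000
SLOAD_COST = 2_312    # approximate gas per balanceOf iteration
SSTORE_COST = 20_226  # approximate gas per approve iteration


def mixed_plan(gas_limit: int, num_contracts: int, sload_percent: int):
    """Return (SLOAD calls, SSTORE calls) per contract and total gas used."""
    per_contract = (gas_limit - INTRINSIC_GAS) // num_contracts
    # Split each contract's allocation by the requested percentage.
    sload_budget = per_contract * sload_percent // 100
    sstore_budget = per_contract - sload_budget
    sload_calls = sload_budget // SLOAD_COST
    sstore_calls = sstore_budget // SSTORE_COST
    used = INTRINSIC_GAS + num_contracts * (
        sload_calls * SLOAD_COST + sstore_calls * SSTORE_COST
    )
    return sload_calls, sstore_calls, used


sload_calls, sstore_calls, used = mixed_plan(10_000_000, 3, 50)
total_ops = 3 * (sload_calls + sstore_calls)
print(sload_calls, sstore_calls, total_ops, 10_000_000 - used)
# 719 82 2403 16420
```

The remaining 16,420 gas corresponds to the "16K buffer" quoted above.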

This benchmark enables testing different read/write ratios to identify
client performance characteristics under varying state operation mixes.

* refactor(benchmark): optimize SLOAD/SSTORE benchmarks per review feedback

Address review comments by optimizing loop efficiency:

1. Move function selector MSTORE outside loops (Comment #2)
   - BALANCEOF_SELECTOR and APPROVE_SELECTOR now stored once per contract
   - Saves 3 gas (G_VERY_LOW) per iteration
   - Total savings: ~6,471 gas for 50-50 ratio with 10M budget and 3 contracts

2. Remove unused return data from CALL operations (Comment #1)
   - Changed ret_offset=96/128, ret_size=32 to ret_offset=0, ret_size=0
   - Eliminates unnecessary memory expansion
   - Minor gas savings, cleaner implementation

Skipped Comment #3 (use Op.GAS for addresses):
- Would lose determinism (GAS varies per iteration)
- Adds complexity for minimal benefit
- Counter still needed for loop control

Changes applied to:
- test_sload_empty_erc20_balanceof
- test_sstore_erc20_approve
- test_mixed_sload_sstore (both SLOAD and SSTORE loops)

* refactor(benchmark): simplify SLOAD benchmark memory layout and fix calldata encoding

- Move selector MSTORE outside for-loop (saves gas per contract)
- Use single counter at MEM[32] instead of duplicate at MEM[0] and MEM[64]
- Fix calldata encoding by using args_offset=28 for correct ABI format
- Selector now properly positioned at start of calldata
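
Why args_offset=28 gives correctly encoded calldata can be illustrated by modelling EVM memory as a byte array. MSTORE writes a right-aligned 32-byte word, so a 4-byte selector stored at MEM[0] occupies bytes 28..31, immediately before the counter word at MEM[32]. The args_size of 36 (selector plus one 32-byte argument) is an assumption; it is not stated in the commit message.

```python
# Model of the memory layout described above (a sketch, not the test code).
BALANCEOF_SELECTOR = 0x70A08231  # selector of balanceOf(address)


def build_calldata(counter: int) -> bytes:
    memory = bytearray(64)
    # MSTORE(0, selector): the value is right-aligned in the 32-byte word,
    # so the 4 selector bytes end up at offsets 28..31.
    memory[0:32] = BALANCEOF_SELECTOR.to_bytes(32, "big")
    # MSTORE(32, counter): the counter doubles as the queried address.
    memory[32:64] = counter.to_bytes(32, "big")
    # Assumed CALL arguments: args_offset=28, args_size=4 + 32.
    args_offset, args_size = 28, 36
    return bytes(memory[args_offset:args_offset + args_size])


calldata = build_calldata(7)
print(calldata[:4].hex())  # 70a08231
```

With args_offset=0 the calldata would instead start with 28 zero bytes, and the callee would not see the selector in the first four bytes.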

* refactor(benchmark): simplify SSTORE benchmark memory layout and fix calldata encoding

- Move selector MSTORE outside for-loop (saves gas per contract)
- Use single counter at MEM[32] instead of duplicate at MEM[0]
- Fix calldata encoding by using args_offset=28 for correct ABI format
- Selector now properly positioned at start of calldata

* refactor(benchmark): simplify mixed SLOAD/SSTORE memory layout and fix calldata encoding

- Move selectors MSTORE outside for-loop (saves gas per contract)
- Use separate memory regions for balanceOf and approve to avoid conflicts
- Fix calldata encoding by using correct args_offset for proper ABI format
- Selectors now properly positioned at start of calldata

* refactor(benchmark): simplify mixed test to reuse memory layout consistently

- Reuse MEM[0] for both selectors (sequential operations, no conflict)
- Reuse MEM[32] for both counters (balanceOf then approve)
- Reuse MEM[64] and MEM[96] for parameters
- Consistent args_offset=28 for both operations (was 28 and 128)
- Matches single-opcode test pattern for easier understanding
- Reduces memory footprint from 196 bytes to 96 bytes

* feat(benchmark): add parametrized contract count and stub filtering to single-opcode tests

- Add parametrization for num_contracts [1, 5, 10, 20, 100]
- Implement stub prefix filtering based on test function name
- Add validation that raises an error if there are too few matching stubs
- Add SSTORE benchmark architecture documentation
- Create README with setup instructions and stubs.json format
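
A minimal sketch of prefix-based stub selection with validation could look like the following. The mapping shape (stub name to address), the function name, and the example prefix are all hypothetical; the actual stubs.json format is documented in the README mentioned above and may differ.

```python
# Hypothetical sketch: select stubs by name prefix and validate the count.
def select_stubs(
    stubs: dict[str, str], prefix: str, num_contracts: int
) -> list[str]:
    """Return addresses of the first num_contracts stubs matching prefix."""
    matching = sorted(name for name in stubs if name.startswith(prefix))
    if len(matching) < num_contracts:
        raise ValueError(
            f"need {num_contracts} stubs with prefix {prefix!r}, "
            f"found {len(matching)}"
        )
    return [stubs[name] for name in matching[:num_contracts]]


# Made-up stub names and addresses, for illustration only.
example_stubs = {
    "sload_erc20_a": "0x01",
    "sload_erc20_b": "0x02",
    "sstore_erc20_a": "0x03",
}
print(select_stubs(example_stubs, "sload_", 2))  # ['0x01', '0x02']
```

Failing early when too few stubs match avoids running a benchmark against fewer contracts than the parametrization claims.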

* fix(benchmark): add type annotations to test functions

* fix(benchmark): add AddressStubs type annotation to address_stubs parameter

* feat(benchmark): add parametrized contract count, stub filtering, and correct gas calculations

- Add num_contracts parametrization [1, 5, 10, 20, 100] to multi-opcode test
- Implement stub prefix filtering for all benchmarks
- Fix gas cost calculations to account for COLD/WARM account access
- CALL operations: first call to each contract is COLD (2600), subsequent are WARM (100)
- SSTORE operations: add cold storage access cost (2100) for zero-to-non-zero writes
- Update gas calculation formulas to solve for calls per contract correctly
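
The cold/warm distinction changes how the number of calls per contract is solved for: the first CALL carries a one-time surcharge over the warm price. The sketch below uses the 2600 and 100 gas figures quoted above; the function names and the warm_iteration_cost parameter are assumptions made for illustration.

```python
# Sketch of cold/warm account access accounting (illustrative names only).
COLD_ACCOUNT_ACCESS = 2_600
WARM_ACCOUNT_ACCESS = 100


def call_access_gas(num_calls: int) -> int:
    """Account access gas for num_calls CALLs to one contract."""
    if num_calls <= 0:
        return 0
    # The first call is cold, every later call is warm.
    return COLD_ACCOUNT_ACCESS + (num_calls - 1) * WARM_ACCOUNT_ACCESS


def max_calls(budget: int, warm_iteration_cost: int) -> int:
    """Largest n such that cold surcharge + n * iteration cost <= budget.

    warm_iteration_cost is the full cost of one loop iteration when the
    target account is already warm (a hypothetical input).
    """
    cold_surcharge = COLD_ACCOUNT_ACCESS - WARM_ACCOUNT_ACCESS
    if budget < cold_surcharge + warm_iteration_cost:
        return 0
    return (budget - cold_surcharge) // warm_iteration_cost


print(call_access_gas(164))          # 18900
print(max_calls(1_000_000, 20_000))  # 49
```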

Oct 23, 2025
54b46ea

bal@v1.4.0

feat(benchmark): add SLOAD/SSTORE benchmark test with multi-contract support (#2256)

Oct 23, 2025
54b46ea
