Tags: ethereum/execution-spec-tests
feat(benchmark): add SLOAD/SSTORE benchmark test with multi-contract support (#2256)

* feat(benchmark): add SLOAD benchmark test with multi-contract support

Add test_sload_empty_erc20_balanceof to benchmark SLOAD operations on non-existing storage slots using ERC20 balanceOf() queries.

The idea of this benchmark is to issue, within a single contract or a series of N contracts, balanceOf() calls for non-existing addresses. In this way, we force clients to resolve as many tree branches as possible.

* feat(benchmark): add SSTORE benchmark test using ERC20 approve

Add test_sstore_erc20_approve that benchmarks SSTORE operations by calling approve(spender, amount) on pre-deployed ERC20 contracts.

Follows the same pattern as the SLOAD benchmark:
- Auto-discovers ERC20 contracts from stubs
- Splits gas budget evenly across all discovered contracts
- Uses counter as both spender address and amount
- Forces SSTOREs to allowance mapping storage slots

The test measures client performance when writing to many storage slots across multiple contracts, stressing state-handling write operations.

* fix(benchmark): correct SSTORE benchmark gas calculation

Fixed gas calculation for test_sstore_erc20_approve to ensure accurate gas usage prediction and prevent transaction reverts.

Key fixes:
- Added memory expansion cost (15 gas per contract)
- Corrected G_LOW gas values in comments (5 gas, not 3)
- Separated per-contract overhead from per-iteration costs
- Improved cost calculation clarity with detailed opcode breakdown

Gas calculation (10M gas, 3 contracts):
- Intrinsic: 21,000
- Overhead per contract: 38
- Cost per iteration: 20,226
- Calls per contract: 164
- Expected gas used: 9,972,306 (99.72% utilization)

* feat(benchmark): add mixed SLOAD/SSTORE benchmark with configurable ratios

Add test_mixed_sload_sstore to test_multi_opcode.py that combines SLOAD and SSTORE operations with parameterized gas distribution ratios (50-50, 70-30, 90-10).

The test stresses clients with mixed read/write workloads by:
- Dividing gas budget evenly across all discovered ERC20 contract stubs
- Splitting each contract's allocation by the specified percentage ratio
- Executing balanceOf (cold SLOAD on empty slots) for the SLOAD portion
- Executing approve (SSTORE to new allowance slots) for the SSTORE portion

Verified gas calculations for 10M gas budget with 3 contracts (50-50 ratio):
- SLOAD operations: ~2,312 gas/iteration → 719 calls per contract
- SSTORE operations: ~20,226 gas/iteration → 82 calls per contract
- Total operations: 2,403 state operations (2,157 SLOADs + 246 SSTOREs)
- Gas usage: 9.98M / 10M (16K buffer, no out-of-gas errors)

This benchmark enables testing different read/write ratios to identify client performance characteristics under varying state operation mixes.

* refactor(benchmark): optimize SLOAD/SSTORE benchmarks per review feedback

Address review comments by optimizing loop efficiency:

1. Move function selector MSTORE outside loops (Comment #2)
- BALANCEOF_SELECTOR and APPROVE_SELECTOR now stored once per contract
- Saves 3 gas (G_VERY_LOW) per iteration
- Total savings: ~6,471 gas for 50-50 ratio with 10M budget and 3 contracts

2. Remove unused return data from CALL operations (Comment #1)
- Changed ret_offset=96/128, ret_size=32 to ret_offset=0, ret_size=0
- Eliminates unnecessary memory expansion
- Minor gas savings, cleaner implementation

Skipped Comment #3 (use Op.GAS for addresses):
- Would lose determinism (GAS varies per iteration)
- Adds complexity for minimal benefit
- Counter still needed for loop control

Changes applied to:
- test_sload_empty_erc20_balanceof
- test_sstore_erc20_approve
- test_mixed_sload_sstore (both SLOAD and SSTORE loops)

* refactor(benchmark): simplify SLOAD benchmark memory layout and fix calldata encoding

- Move selector MSTORE outside for-loop (saves gas per contract)
- Use single counter at MEM[32] instead of duplicate at MEM[0] and MEM[64]
- Fix calldata encoding by using args_offset=28 for correct ABI format
- Selector now properly positioned at start of calldata

* refactor(benchmark): simplify SSTORE benchmark memory layout and fix calldata encoding

- Move selector MSTORE outside for-loop (saves gas per contract)
- Use single counter at MEM[32] instead of duplicate at MEM[0]
- Fix calldata encoding by using args_offset=28 for correct ABI format
- Selector now properly positioned at start of calldata

* refactor(benchmark): simplify mixed SLOAD/SSTORE memory layout and fix calldata encoding

- Move selectors MSTORE outside for-loop (saves gas per contract)
- Use separate memory regions for balanceOf and approve to avoid conflicts
- Fix calldata encoding by using correct args_offset for proper ABI format
- Selectors now properly positioned at start of calldata

* refactor(benchmark): simplify mixed test to reuse memory layout consistently

- Reuse MEM[0] for both selectors (sequential operations, no conflict)
- Reuse MEM[32] for both counters (balanceOf then approve)
- Reuse MEM[64] and MEM[96] for parameters
- Consistent args_offset=28 for both operations (was 28 and 128)
- Matches single-opcode test pattern for easier understanding
- Reduces memory footprint from 196 bytes to 96 bytes

* feat(benchmark): add parametrized contract count and stub filtering to single-opcode tests

- Add parametrization for num_contracts [1, 5, 10, 20, 100]
- Implement stub prefix filtering based on test function name
- Add validation to error if insufficient matching stubs
- Add SSTORE benchmark architecture documentation
- Create README with setup instructions and stubs.json format

* fix(benchmark): add type annotations to test functions

* fix(benchmark): add AddressStubs type annotation to address_stubs parameter

* feat(benchmark): add parametrized contract count, stub filtering, and correct gas calculations

- Add num_contracts parametrization [1, 5, 10, 20, 100] to multi-opcode test
- Implement stub prefix filtering for all benchmarks
- Fix gas cost calculations to account for COLD/WARM account access
- CALL operations: first call to each contract is COLD (2600), subsequent are WARM (100)
- SSTORE operations: add cold storage access cost (2100) for zero-to-non-zero writes
- Update gas calculation formulas to solve for calls per contract correctly
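
The gas figures quoted in the commit message (10M gas, 3 contracts) can be reproduced with simple integer arithmetic. The following is a minimal sketch, not the test's actual code: the function names are hypothetical, the per-iteration costs are taken as given from the message, and the mixed-workload estimate ignores the small per-contract overhead, which the message's "~" figures appear to do as well. The later commits that add COLD/WARM access accounting change these formulas and are not modelled here.

```python
INTRINSIC_GAS = 21_000  # intrinsic transaction cost quoted in the commit message


def calls_per_contract(gas_limit: int, num_contracts: int,
                       overhead: int, cost_per_iteration: int) -> int:
    """Split the budget evenly, then fit whole loop iterations into each share."""
    share = (gas_limit - INTRINSIC_GAS) // num_contracts
    return (share - overhead) // cost_per_iteration


def expected_gas_used(num_contracts: int, overhead: int,
                      cost_per_iteration: int, calls: int) -> int:
    """Total gas: intrinsic cost plus every contract's overhead and loop iterations."""
    return INTRINSIC_GAS + num_contracts * (overhead + calls * cost_per_iteration)


def mixed_calls(gas_limit: int, num_contracts: int, sload_percent: int,
                sload_cost: int, sstore_cost: int) -> tuple[int, int]:
    """Split each contract's share by percentage (assumed split, overhead ignored)."""
    share = (gas_limit - INTRINSIC_GAS) // num_contracts
    sload_budget = share * sload_percent // 100
    sstore_budget = share - sload_budget
    return sload_budget // sload_cost, sstore_budget // sstore_cost


# SSTORE-only benchmark: 10M gas, 3 contracts, 38 gas overhead, 20,226 gas per iteration.
calls = calls_per_contract(10_000_000, 3, 38, 20_226)
used = expected_gas_used(3, 38, 20_226, calls)
print(calls, used, f"{used / 10_000_000:.2%}")  # 164 9972306 99.72%

# Mixed benchmark, 50-50 ratio: ~2,312 gas per SLOAD iteration, ~20,226 per SSTORE iteration.
sloads, sstores = mixed_calls(10_000_000, 3, 50, 2_312, 20_226)
print(sloads, sstores, 3 * (sloads + sstores))  # 719 82 2403
```

Rounding down at each step is what leaves the unused buffer: the transaction spends slightly less than its limit rather than risking an out-of-gas revert.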
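
The args_offset=28 fix follows from how MSTORE writes a 32-byte big-endian word: a 4-byte selector stored at MEM[0] ends up right-aligned in bytes 28..31, so calldata that starts at offset 28 begins with the selector and continues directly into the argument word at MEM[32]. The sketch below simulates this in plain Python. The helper names are hypothetical and the layout is an illustration of the description above, not the test's bytecode; the selector values are the standard ERC20 ones for balanceOf(address) and approve(address,uint256).

```python
BALANCEOF_SELECTOR = 0x70A08231  # balanceOf(address)
APPROVE_SELECTOR = 0x095EA7B3    # approve(address,uint256)


def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Mimic EVM MSTORE: write value as a 32-byte big-endian word, growing memory."""
    end = offset + 32
    if len(memory) < end:
        memory.extend(bytes(end - len(memory)))
    memory[offset:end] = value.to_bytes(32, "big")


def build_calldata(selector: int, args: list[int]) -> bytes:
    """Selector word at MEM[0], argument words from MEM[32]; read from offset 28."""
    memory = bytearray()
    mstore(memory, 0, selector)
    for index, arg in enumerate(args):
        mstore(memory, 32 + 32 * index, arg)
    args_offset = 28                # skip the 28 zero bytes padding the selector
    args_size = 4 + 32 * len(args)  # selector plus one word per argument
    return bytes(memory[args_offset:args_offset + args_size])


counter = 5
balance_of = build_calldata(BALANCEOF_SELECTOR, [counter])
approve = build_calldata(APPROVE_SELECTOR, [counter, counter])  # counter is spender and amount
print(balance_of.hex()[:8], len(balance_of))  # 70a08231 36
print(approve.hex()[:8], len(approve))        # 095ea7b3 68
```

With args_offset=0 instead, the callee would see 28 zero bytes before the selector and dispatch on an all-zero function selector, which is why the encoding fix matters.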