Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about opcode msize #981

Open
Alleysira opened this issue Jul 27, 2024 · 1 comment
Open

Question about opcode msize #981

Alleysira opened this issue Jul 27, 2024 · 1 comment
Labels
A-spec Area: specification C-bug Category: this is a bug, deviation, or other problem.

Comments

@Alleysira
Copy link

Metadata

  • Hardfork: cancun

What was wrong?

Hello developers,

I am testing various Ethereum virtual machines and have observed that when executing the opcode 0x59 (MSIZE), all the EVM implementations I tested (Geth, Besu, EthereumJS, Py-EVM) consistently return a value that is a multiple of 32 to the stack.

However, when I check the execution-spec at

def msize(evm: Evm) -> None:
"""
Push the size of active memory in bytes onto the stack.
Parameters
----------
evm :
The current EVM frame.
"""
# STACK
pass
# GAS
charge_gas(evm, GAS_BASE)
# OPERATION
push(evm.stack, U256(len(evm.memory)))
# PROGRAM COUNTER
evm.pc += 1

It seems that the specification directly pushes the length of evm.memory to the stack without accounting for the fact that the current memory size may not be a multiple of 32.

I noticed that when executing some opcodes that may cause memory expansion, this function will be called to make sure the length of memory to be multiple of 32.

def calculate_gas_extend_memory(
memory: bytearray, extensions: List[Tuple[U256, U256]]
) -> ExtendMemory:
"""
Calculates the gas amount to extend memory
Parameters
----------
memory :
Memory contents of the EVM.
extensions:
List of extensions to be made to the memory.
Consists of a tuple of start position and size.
Returns
-------
extend_memory: `ExtendMemory`
"""
size_to_extend = Uint(0)
to_be_paid = Uint(0)
current_size = Uint(len(memory))
for start_position, size in extensions:
if size == 0:
continue
before_size = ceil32(current_size)
after_size = ceil32(Uint(start_position) + Uint(size))
if after_size <= before_size:
continue
size_to_extend += after_size - before_size
already_paid = calculate_memory_gas_cost(before_size)
total_cost = calculate_memory_gas_cost(after_size)
to_be_paid += total_cost - already_paid
current_size = after_size
return ExtendMemory(to_be_paid, size_to_extend)

My question is whether the specification ensures that the length of memory will always be a multiple of 32, or if we need to add conditional checks or annotations to the MSIZE opcode to guarantee that it outputs a value that is a multiple of 32. Thank you for your attention.

Sources

evm.code

According to the definition of msize at evm.code, "The size is always a multiple of a word (32 bytes)."

yellow paper

I also check the shanghai version yellow book, in p36 it said:

image-20240727204640532

EVM implementation

I will give a bytecode example and its result after execution on different EVM implementations.

Bytecode as input : 631456015261af0b525960005260406000f3

PUSH4 0x14560152
PUSH2 0xaf0b
MSTORE 
MSIZE
PUSH1 0x40
PUSH1 0x00
RETURN

After execution, all EVMs returned with output 000000000000000000000000000000000000000000000000000000000000af400000000000000000000000000000000000000000000000000000000000000000

where 0xaf40 is the result of opcode msize and 0xaf40 is a multiple of 32.

The details are in the json files.
gethout.json, besuout.json, jsout.json, pyout.json

Thanks for your time.

@Alleysira Alleysira added A-spec Area: specification C-bug Category: this is a bug, deviation, or other problem. labels Jul 27, 2024
@SamWilsn
Copy link
Collaborator

I noticed that when executing some opcodes that may cause memory expansion, this function will be called to make sure the length of memory to be multiple of 32.

Did you notice anywhere that expanded memory without using calculate_gas_extend_memory? I think we're pretty consistent in using it, but I've been wrong many times before 😅


It might not be a bad idea to insert an assert into the MSIZE implementation (or maybe in the interpreter loop), to catch any bugs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-spec Area: specification C-bug Category: this is a bug, deviation, or other problem.
Projects
None yet
Development

No branches or pull requests

2 participants