Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eof: Introduce EOF container format support in Assembly::assemble (EIP-3540) #15367

Merged
merged 3 commits into from
Sep 20, 2024

Conversation

rodiazet
Copy link
Contributor

No description provided.

Copy link

Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.

@rodiazet rodiazet force-pushed the eof-prep-3 branch 5 times, most recently from 89784b7 to 5b61995 Compare August 29, 2024 11:19
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a full review yet - I still need to finish going through assemble() to make sure it preserves its old non-EOF behavior - but I'm done for today so here's what I have so far.

Overall nothing serious - in one place the non-EOF calculations seem to have changed, which I'm not sure if it's intentional or not, but does not seem dangerous either way. A bit wasteful at most.

Other than that, we could use more asserts in a few places. And some smaller tweaks that should be straightforward to apply.

libevmasm/Assembly.cpp Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
}
case PushProgramSize:
{
assertThrow(!eof, AssemblyException, "Push program size in EOF code");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the style you used for these messages above was much clearer. Here it sounds like it's telling me to push program size into the code which makes no sense :)

Suggested change
assertThrow(!eof, AssemblyException, "Push program size in EOF code");
assertThrow(!eof, AssemblyException, "PushProgramSize in EOF code");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is it commented out for PushData above?

libevmasm/Assembly.h Outdated Show resolved Hide resolved
libyul/YulStack.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
@cameel
Copy link
Member

cameel commented Aug 29, 2024

Ah wait. Just realized that the initial commits here are also in #15356. So some of my comments really should have gone into that other PR.

Dependencies of this kind should be mentioned in the description. We also have a has dependencies label that we can put on such PRs to avoid mistakenly reviewing or merging them out of order.

@cameel cameel added has dependencies The PR depends on other PRs that must be merged first EOF labels Aug 29, 2024
@rodiazet rodiazet force-pushed the eof-prep-3 branch 5 times, most recently from 859e524 to c82e78b Compare September 4, 2024 14:29
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through it again, this time there are no major issues and the behavior on legacy seems preserved. The main remaining problem is that this has a bit more duplication than it has to and I think it would be best to deal with it early instead of letting it grow (especially the opcode processing, which should be shared).

Overall there are still a few things that are worth adjusting before we merge but TBH none of them affects correctness of the implementation or non-EOF stuff so if you think some of them are better off dealt with separately or left as is I wouldn't be strongly opposed to approving the PR even without them.

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.h Outdated Show resolved Hide resolved

std::vector<size_t> codeSectionSizeOffsets;
auto setCodeSectionSize = [&](size_t _section, size_t _size) {
toBigEndian(_size, bytesRef(ret.bytecode.data() + codeSectionSizeOffsets.at(_section), 2));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the _size <= 0xFFFF assert.

Actually, maybe you should just add a setBigEndianUint16() helper next and replace this and setDataSectionSize()? This seems to be a common pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I see you don't have that assert in setBigEndianUint16() now :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because setBigEndian checks this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asserts in appendBigEndianUint16 are also not needed because appendBigEndian also calls setBigEndian which check the value to set already. Moreover toBigEndian also verifies that value is unsigned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. So it would be best to remove them from appendBigEndianUint16(). It confused me because I naturally assumed, they should be in neither or in both so if that's not the case, it's probably wrong :)

Comment on lines 1323 to 1343
switch (i.type())
{
case Operation:
ret.bytecode.push_back(static_cast<uint8_t>(i.instruction()));
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit that's worth extracting into a function shared between EOF and legacy, with ifs for the bits that differ between the two. What's here now already seems identical, and I think that in the end there will be more common code here than not.

The bit that loops over m_namedTags also looks like it would work for both and could be a helper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side-note, I know this was just carried over from our old code, so not requesting you to change it, but just wanted to say that names like byr, b or i really suck when you have to read the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I wanted to make as few as possible changes in legacy code, so decided to postpone functions encapsulation in a separated PR. If you want me to do it in this PR I can do it. LMK.
Will update the naming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want me to do it in this PR I can do it. LMK.

Sure, this can be done in a follow-up PR.

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libsolidity/interface/CompilerStack.cpp Outdated Show resolved Hide resolved
@rodiazet rodiazet force-pushed the eof-prep-3 branch 8 times, most recently from 6341b7b to 57d4c82 Compare September 20, 2024 11:09
@cameel cameel removed the has dependencies The PR depends on other PRs that must be merged first label Sep 20, 2024
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
@rodiazet rodiazet force-pushed the eof-prep-3 branch 2 times, most recently from dea3e8f to 4d4982a Compare September 20, 2024 12:46
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
libevmasm/Assembly.cpp Outdated Show resolved Hide resolved
======= yul_eof_container/input.yul (EVM) =======

Binary representation:
ef00010100040200010003030004001a001b003b001b040000000080ffff5f80fdef00010100040200010007040000000080ffff5f805260205ffdef00010100040200010008040000000080ffff60015f5260205ffdef00010100040200010008030001001b040000000080ffff60025f5260205ffdef00010100040200010008040000000080ffff60205f5260205ffdef00010100040200010008040000000080ffff60035f5260205ffd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we had a better way to output this. This is quite tedious to verify so I'm just going to assume it's doing what it's supposed to :)

You could use --opcodes instead of --bin, but what that will print will only be correct for the code sections, not for the header itself. And we're planning to deprecate it anyway: #14033.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is to use default outputs for --strict-assembly input but assembly output is not supported for EOF yet. When it will be I will remove --bin and we will have Text representation: output also.

For --opcodes now unfortunately I get this error Error: The following outputs are not supported in assembler mode: --opcodes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Text representation sounds good, as long as this is a format we're going to commit to and is suitable to show to users.

If not, we can still use it internally though - we can create a new test case type based on YulOptimizerTest and use it there just for testing.

cameel
cameel previously approved these changes Sep 20, 2024
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finished going through the PR, so to summarize, #15367 (comment) is basically the only real issue remaining.

I added a few more comments for minor things but those are just cosmetics and would not block the approval.

@cameel
Copy link
Member

cameel commented Sep 20, 2024

Ah wait, I didn't mean to approve yet. But well, given that you're fixing this as we speak, it would be approved in a moment anyway I guess.

@cameel cameel merged commit 33b67f0 into ethereum:develop Sep 20, 2024
72 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants