-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eof: Introduce EOF container format support in Assembly::assemble
(EIP-3540)
#15367
Conversation
Thank you for your contribution to the Solidity compiler! A team member will follow up shortly. If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother. If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix. |
89784b7
to
5b61995
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a full review yet - I still need to finish going through assemble()
to make sure it preserves its old non-EOF behavior - but I'm done for today so here's what I have so far.
Overall nothing serious - in one place the non-EOF calculations seem to have changed, which I'm not sure if it's intentional or not, but does not seem dangerous either way. A bit wasteful at most.
Other than that, we could use more asserts in a few places. And some smaller tweaks that should be straightforward to apply.
libevmasm/Assembly.cpp
Outdated
} | ||
case PushProgramSize: | ||
{ | ||
assertThrow(!eof, AssemblyException, "Push program size in EOF code"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the style you used for these messages above was much clearer. Here it sounds like it's telling me to push program size into the code which makes no sense :)
assertThrow(!eof, AssemblyException, "Push program size in EOF code"); | |
assertThrow(!eof, AssemblyException, "PushProgramSize in EOF code"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why is it commented out for PushData
above?
Ah wait. Just realized that the initial commits here are also in #15356. So some of my comments really should have gone into that other PR. Dependencies of this kind should be mentioned in the description. We also have a |
859e524
to
c82e78b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through it again, this time there are no major issues and the behavior on legacy seems preserved. The main remaining problem is that this has a bit more duplication than it has to and I think it would be best to deal with it early instead of letting it grow (especially the opcode processing, which should be shared).
Overall there are still a few things that are worth adjusting before we merge but TBH none of them affects correctness of the implementation or non-EOF stuff so if you think some of them are better off dealt with separately or left as is I wouldn't be strongly opposed to approving the PR even without them.
libevmasm/Assembly.cpp
Outdated
|
||
std::vector<size_t> codeSectionSizeOffsets; | ||
auto setCodeSectionSize = [&](size_t _section, size_t _size) { | ||
toBigEndian(_size, bytesRef(ret.bytecode.data() + codeSectionSizeOffsets.at(_section), 2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing the _size <= 0xFFFF
assert.
Actually, maybe you should just add a setBigEndianUint16()
helper next and replace this and setDataSectionSize()
? This seems to be a common pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I see you don't have that assert in setBigEndianUint16()
now :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because setBigEndian
checks this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asserts in appendBigEndianUint16
are also not needed because appendBigEndian
also calls setBigEndian
which check the value to set already. Moreover toBigEndian
also verifies that value is unsigned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. So it would be best to remove them from appendBigEndianUint16()
. It confused me because I naturally assumed, they should be in neither or in both so if that's not the case, it's probably wrong :)
libevmasm/Assembly.cpp
Outdated
switch (i.type()) | ||
{ | ||
case Operation: | ||
ret.bytecode.push_back(static_cast<uint8_t>(i.instruction())); | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a bit that's worth extracting into a function shared between EOF and legacy, with if
s for the bits that differ between the two. What's here now already seems identical, and I think that in the end there will be more common code here than not.
The bit that loops over m_namedTags
also looks like it would work for both and could be a helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a side-note, I know this was just carried over from our old code, so not requesting you to change it, but just wanted to say that names like byr
, b
or i
really suck when you have to read the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I wanted to make as few as possible changes in legacy code, so decided to postpone functions encapsulation in a separated PR. If you want me to do it in this PR I can do it. LMK.
Will update the naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want me to do it in this PR I can do it. LMK.
Sure, this can be done in a follow-up PR.
6341b7b
to
57d4c82
Compare
dea3e8f
to
4d4982a
Compare
======= yul_eof_container/input.yul (EVM) ======= | ||
|
||
Binary representation: | ||
ef00010100040200010003030004001a001b003b001b040000000080ffff5f80fdef00010100040200010007040000000080ffff5f805260205ffdef00010100040200010008040000000080ffff60015f5260205ffdef00010100040200010008030001001b040000000080ffff60025f5260205ffdef00010100040200010008040000000080ffff60205f5260205ffdef00010100040200010008040000000080ffff60035f5260205ffd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish we had a better way to output this. This is quite tedious to verify so I'm just going to assume it's doing what it's supposed to :)
You could use --opcodes
instead of --bin
, but what that will print will only be correct for the code sections, not for the header itself. And we're planning to deprecate it anyway: #14033.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan is to use default outputs for --strict-assembly
input but assembly output is not supported for EOF yet. When it will be I will remove --bin
and we will have Text representation:
output also.
For --opcodes
now unfortunately I get this error Error: The following outputs are not supported in assembler mode: --opcodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Text representation sounds good, as long as this is a format we're going to commit to and is suitable to show to users.
If not, we can still use it internally though - we can create a new test case type based on YulOptimizerTest
and use it there just for testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finished going through the PR, so to summarize, #15367 (comment) is basically the only real issue remaining.
I added a few more comments for minor things but those are just cosmetics and would not block the approval.
Ah wait, I didn't mean to approve yet. But well, given that you're fixing this as we speak, it would be approved in a moment anyway I guess. |
…EIP-3540) Co-authored-by: Daniel Kirchner <daniel@ekpyron.org>
4d4982a
to
bc9f3e0
Compare
No description provided.