-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions #12861
Conversation
@@ -479,6 +479,8 @@ bool SemanticInformation::invalidInPureFunctions(Instruction _instruction) | |||
case Instruction::EXTCODESIZE: | |||
case Instruction::EXTCODECOPY: | |||
case Instruction::EXTCODEHASH: | |||
case Instruction::RETURNDATASIZE: | |||
case Instruction::RETURNDATACOPY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what should or should not be pure. AFAIK, there is no proper definition for what is a pure function in EVM. So this is entirely arbitrary.
The only restriction I can think about is related to compile time constant evaluation. Which we don't properly have yet. If that is indeed the case, there should be more instructions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we don't even have full agreement if pure
should refer to stuff that can be evaluated at compilation time or to stuff that does not touch state (#8153). So I think this out of scope of this PR.
This particular change to RETURNDATACOPY was explicitly requested by @chriseth:
0age just mentioned it the ethsecurity chat that returndatacopt is not marked as invalid in pure functions - can someone please create an issue about it and potentially even fix?
What I'm not clear about is whether this should really be considered a bug. I'd consider it a breaking change but from that comment it seems to me like @chriseth sees it as a bugfix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. why is this actually needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh.. I guess because pure
is considered as being evaluatable at compile time, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly. Currently pure
is closer to "not reading or writing", though there are discussions about changing/clarifying that. Reading returndata is a bit borderline IMO. I could see it being allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, seem to depend on how pure
is exactly defined. I also understood it (now) as you said - not reading or writing from state (after reading the docs again). Isn't it correct to see it like that RETURNDATACOPY
& RETURNDATASIZE
is not reading or writing any state?
Why should RETURNDATACOPY
& RETURNDATASIZE
not be pure
? What is the big advantage to change that? Or isn't it like that RETURNDATACOPY
& RETURNDATASIZE
can basically only be used, if a call was done before? STATICCALL
is already considered as reading the state - it is already not allowed in pure functions.
If RETURNDATACOPY
& RETURNDATASIZE
is used without a call (e.g. in pure functions) I see it like that RETURNDATACOPY
will just not do anything. Where RETURNDATASIZE
will always return 0
.
For all other cases, calls are needed, that are not allowed in pure
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After having a short discussion with @chriseth and @hrkrshnn, I finally understand the underlying issue. The problem is basically, that it is possible to do a call before calling a pure function that maybe using RETURNDATACOPY
and RETURNDATASIZE
. In this case the values depend on the previous calls.
Thats why it makes totally sense to disallow RETURNDATACOPY
and RETURNDATASIZE
in pure functions.
697dbfb
to
3cf5a1e
Compare
f9d5ea0
to
a73ce85
Compare
Notes from the call: allow this. Might need to think about memory operations in the future. Also, |
mstore8(0, 1) | ||
//pop(sload(0)) | ||
//sstore(0, 1) | ||
//pop(gas()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not check the actual current behaviour. But wouldn't it be better to not comment those out, so that we can see the error messages later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's why I also added inline_assembly_instructions_disallowed_pure.sol
. It contains only those commented-out opcodes.
I left them commented-out here because otherwise it's not that easy to see which ones are missing from the list.
Also, I want this case to pass without error because then it goes all the way through codegen. The failing one (obviously) stops at the analysis stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not uncomment, but let the errors there.. e.g.
// ====
// EVMVersion: >=london
// ----
// Warning 5740: (94-1721): Unreachable code.
...
// TypeError 8961: (1759-1774): Function cannot be declared as pure because this expression (potentially) modifies the state.
// TypeError 8961: (1809-1819): Function cannot be declared as pure because this expression (potentially) modifies the state.
...
// TypeError 8961: (1919-1941): Function cannot be declared as pure because this expression (potentially) modifies the state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I would prefer having those errors.. but I also find the error messages a bit poor.. I think it would be better to know exactly what expression actually caused it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe two versions would make sense.. one where the stuff is commented out, and one that will produce an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand. If you don't uncomment them, how would you get the error messages?
Everything is good! I did not saw that you already did what I meant with the *disallowed*
tests. That makes totally sense now. Somehow I just missed those. However, I meant the error messages that are shown e.g. in the *disallowed*
tests.. E.g. TypeError 2527: (153-162): Function declared as pure, but this expression (potentially) reads from the environment or state and thus requires "view".
I was missing somehow the exact reference in the source code.. I was expecting that the thing is somehow written directly in the message. However, I also noticed now that the reference is shown, if solc
is used:
Error: Function declared as pure, but this expression (potentially) reads from the environment or state and thus requires "view".
--> /Users/alex/git/solidity/test/libsolidity/syntaxTests/test2.sol:68:17:
|
68 | pop(chainid())
| ^^^^^^^^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have e.g. TypeError 2527: (153-162)
... wouldn't it be nice to extract the corresponding string (eg. string between characters 153-162
).. so we could also read the tests more easily.. e.g. TypeError 2527: (153-162): 'chainid()' Function declared as pure, but this expression (potentially) reads from the environment or state and thus requires "view"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there were similar ideas in #4620 and #9581. Maybe we should restart the discussion there. @nishant-sachdeva also recently commented on the fact that we don't even have line numbers in tests, which is another similar idea.
The thing is, we don't really have an agreement on what to do so it requires some design discussion.
I think that adding line numbers would probably be uncontroversial, easy to implement and would help a bit so maybe we should create a good first issue
for that? Not sure if we have enough manpower now to push through a bigger change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, having line numbers would make the navigation in files much more easy. But it will not help during reviews, because the user still need to look at numbers. I just created #12890 that adds the location string to the message. I think that increases the readability of such tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a monster PR :)
I think line numbers still help a bit. You can see them in diffs. With them it's much easier than what I do now when reviewing, i.e. counting errors manually.
test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_allowed_pure.sol
Show resolved
Hide resolved
Allow what? Allow RETURNDATACOPY in pure functions or allow this PR to be merged? |
test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_disallowed_pure.sol
Show resolved
Hide resolved
I think @hrkrshnn meant allow the disallow ;) |
I meant to say that allow this change ;) Just do add one more comment. These opcodes contract C {
function f() pure external returns (uint) {}
}
contract Test {
function test() pure external returns (uint ret) {
ret = C(address(123)).f();
}
} |
@cameel Maybe we should rename this PR (and commit messages) to communicate that this is only disallowing this in inline assembly? |
@aarlt About high level calls: I'd argue, yes. In particular, we should make pure to be as close to compile time evaluation as possible. Changing the behaviour for high level calls would definitely be a breaking change. Technically, this one as well, but we're trying to argue that we haven't mentioned the assembly part in the documentation. |
a73ce85
to
f567eb1
Compare
Good point. Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually still a bit wary about calling this a non-breaking bugfix - especially since occurrences after an external call to an external pure function are arguably just fine :-)...
But it's probably fine and the code change and tests look good, so I'm approving!
We could consider mentioning this in the release notes though, or something like that.
Personally I think it would be better to treat it as breaking because it's more of an omission than a real bug. But on the other hand we were called out on this which means that people already expect it to work that way... |
} | ||
} | ||
// ---- | ||
// SyntaxError 6553: (47-362): The msize instruction cannot be used when the Yul optimizer is activated because it can change its semantics. Either disable the Yul optimizer or do not use the instruction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this actually work :-)? We only run these tests with enabled yul optimizer? I wouldn't have guessed :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently. I was surprised too since I usually run isoltest
without the --optimize
flag.
I'd consider it a proper bug - if these are called before any external call in the pure function itself, then it's definitely contrary to the intended meaning of |
Is returndata really a part of state though? I see it more like some random piece of memory, which is why it seems like a grey area and is just easier to lump together with state than treat as a thing of its own. |
Yeah, I was a bit torn on this when writing that comment myself :-)... I mean - in a lot of cases the returndata of the previous call will literally be lying around somewhere in memory and thereby be accessible via |
We need to maintain a map of polluted memory slots :P Anyway, I think the change is good, just not 100% convinced that the case is clear enough that it should be a bugfix. |
Another argument for considering this breaking after all is I guess those helper functions were specifically marked |
Yeah, let's make it breaking. |
Fixes #12860.
The PR also adds comprehensive test cases for opcodes allowed/disallowed at each visibility level.