Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions #12861

cameel · 2022-03-25T19:28:24Z

The PR also adds comprehensive test cases for opcodes allowed/disallowed at each visibility level.

hrkrshnn · 2022-03-25T19:36:42Z

libevmasm/SemanticInformation.cpp

@@ -479,6 +479,8 @@ bool SemanticInformation::invalidInPureFunctions(Instruction _instruction)
 	case Instruction::EXTCODESIZE:
 	case Instruction::EXTCODECOPY:
 	case Instruction::EXTCODEHASH:
+	case Instruction::RETURNDATASIZE:
+	case Instruction::RETURNDATACOPY:


I'm not sure what should or should not be pure. AFAIK, there is no proper definition for what is a pure function in EVM. So this is entirely arbitrary.

The only restriction I can think about is related to compile time constant evaluation. Which we don't properly have yet. If that is indeed the case, there should be more instructions here.

Well, we don't even have full agreement if pure should refer to stuff that can be evaluated at compilation time or to stuff that does not touch state (#8153). So I think this out of scope of this PR.

This particular change to RETURNDATACOPY was explicitly requested by @chriseth:

0age just mentioned it the ethsecurity chat that returndatacopt is not marked as invalid in pure functions - can someone please create an issue about it and potentially even fix?

What I'm not clear about is whether this should really be considered a bug. I'd consider it a breaking change but from that comment it seems to me like @chriseth sees it as a bugfix.

Hmm.. why is this actually needed?

Ahh.. I guess because pure is considered as being evaluatable at compile time, right?

Not exactly. Currently pure is closer to "not reading or writing", though there are discussions about changing/clarifying that. Reading returndata is a bit borderline IMO. I could see it being allowed.

Yeah, seem to depend on how pure is exactly defined. I also understood it (now) as you said - not reading or writing from state (after reading the docs again). Isn't it correct to see it like that RETURNDATACOPY & RETURNDATASIZE is not reading or writing any state?

Why should RETURNDATACOPY & RETURNDATASIZE not be pure? What is the big advantage to change that? Or isn't it like that RETURNDATACOPY & RETURNDATASIZE can basically only be used, if a call was done before? STATICCALL is already considered as reading the state - it is already not allowed in pure functions.

If RETURNDATACOPY & RETURNDATASIZE is used without a call (e.g. in pure functions) I see it like that RETURNDATACOPY will just not do anything. Where RETURNDATASIZE will always return 0.

For all other cases, calls are needed, that are not allowed in pure functions.

After having a short discussion with @chriseth and @hrkrshnn, I finally understand the underlying issue. The problem is basically, that it is possible to do a call before calling a pure function that maybe using RETURNDATACOPY and RETURNDATASIZE. In this case the values depend on the previous calls.

Thats why it makes totally sense to disallow RETURNDATACOPY and RETURNDATASIZE in pure functions.

Changelog.md

hrkrshnn · 2022-03-30T12:36:46Z

Notes from the call: allow this. Might need to think about memory operations in the future.

Also, returndatasize may be used by gas optimizers instead of push1 0. In that sense, this can break some contracts.

aarlt · 2022-04-01T15:35:12Z

test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_allowed_pure.sol

+            mstore8(0, 1)
+            //pop(sload(0))
+            //sstore(0, 1)
+            //pop(gas())


I did not check the actual current behaviour. But wouldn't it be better to not comment those out, so that we can see the error messages later?

Yeah, that's why I also added inline_assembly_instructions_disallowed_pure.sol. It contains only those commented-out opcodes.

I left them commented-out here because otherwise it's not that easy to see which ones are missing from the list.

Also, I want this case to pass without error because then it goes all the way through codegen. The failing one (obviously) stops at the analysis stage.

not uncomment, but let the errors there.. e.g.

// ==== // EVMVersion: >=london // ---- // Warning 5740: (94-1721): Unreachable code. ... // TypeError 8961: (1759-1774): Function cannot be declared as pure because this expression (potentially) modifies the state. // TypeError 8961: (1809-1819): Function cannot be declared as pure because this expression (potentially) modifies the state. ... // TypeError 8961: (1919-1941): Function cannot be declared as pure because this expression (potentially) modifies the state.

Right now I would prefer having those errors.. but I also find the error messages a bit poor.. I think it would be better to know exactly what expression actually caused it

Maybe two versions would make sense.. one where the stuff is commented out, and one that will produce an error?

I don't understand. If you don't uncomment them, how would you get the error messages?

Everything is good! I did not saw that you already did what I meant with the *disallowed* tests. That makes totally sense now. Somehow I just missed those. However, I meant the error messages that are shown e.g. in the *disallowed* tests.. E.g. TypeError 2527: (153-162): Function declared as pure, but this expression (potentially) reads from the environment or state and thus requires "view". I was missing somehow the exact reference in the source code.. I was expecting that the thing is somehow written directly in the message. However, I also noticed now that the reference is shown, if solc is used:

Error: Function declared as pure, but this expression (potentially) reads from the environment or state and thus requires "view". --> /Users/alex/git/solidity/test/libsolidity/syntaxTests/test2.sol:68:17: | 68 | pop(chainid()) | ^^^^^^^^^

If we have e.g. TypeError 2527: (153-162)... wouldn't it be nice to extract the corresponding string (eg. string between characters 153-162).. so we could also read the tests more easily.. e.g. TypeError 2527: (153-162): 'chainid()' Function declared as pure, but this expression (potentially) reads from the environment or state and thus requires "view".

Yeah, there were similar ideas in #4620 and #9581. Maybe we should restart the discussion there. @nishant-sachdeva also recently commented on the fact that we don't even have line numbers in tests, which is another similar idea.

The thing is, we don't really have an agreement on what to do so it requires some design discussion.

I think that adding line numbers would probably be uncontroversial, easy to implement and would help a bit so maybe we should create a good first issue for that? Not sure if we have enough manpower now to push through a bigger change.

Sure, having line numbers would make the navigation in files much more easy. But it will not help during reviews, because the user still need to look at numbers. I just created #12890 that adds the location string to the message. I think that increases the readability of such tests.

That's a monster PR :)

I think line numbers still help a bit. You can see them in diffs. With them it's much easier than what I do now when reviewing, i.e. counting errors manually.

test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_allowed_pure.sol

cameel · 2022-04-01T15:44:50Z

Notes from the call: allow this. Might need to think about memory operations in the future.

Allow what? Allow RETURNDATACOPY in pure functions or allow this PR to be merged?

test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_disallowed_pure.sol

aarlt · 2022-04-01T16:11:28Z

Notes from the call: allow this. Might need to think about memory operations in the future.

Allow what? Allow RETURNDATACOPY in pure functions or allow this PR to be merged?

I think @hrkrshnn meant allow the disallow ;)

hrkrshnn · 2022-04-01T16:14:32Z

I meant to say that allow this change ;)

Just do add one more comment. These opcodes returndatasize and returndatacopy are indeed used in high level calls. For example:

contract C {
    function f() pure external returns (uint) {}
}

contract Test {
    function test() pure external returns (uint ret) {
        ret = C(address(123)).f();
    }
}

aarlt · 2022-04-01T16:20:21Z

I meant to say that allow this change ;)

Just do add one more comment. These opcodes returndatasize and returndatacopy are indeed used in high level calls. For example:
contract C {
    function f() pure external returns (uint) {}
}

contract Test {
    function test() pure external returns (uint ret) {
        ret = C(address(123)).f();
    }
}

@cameel Maybe we should rename this PR (and commit messages) to communicate that this is only disallowing this in inline assembly?
@hrkrshnn I guess we don't want to allow this for high-level calls, right? Shouldn't be all calls disallowed in pure?

hrkrshnn · 2022-04-01T17:08:41Z

@aarlt About high level calls: I'd argue, yes. In particular, we should make pure to be as close to compile time evaluation as possible. Changing the behaviour for high level calls would definitely be a breaking change. Technically, this one as well, but we're trying to argue that we haven't mentioned the assembly part in the documentation.

…in pure functions

cameel · 2022-04-08T12:48:51Z

@aarlt

Maybe we should rename this PR (and commit messages) to communicate that this is only disallowing this in inline assembly?

Good point. Done.

ekpyron

I'm actually still a bit wary about calling this a non-breaking bugfix - especially since occurrences after an external call to an external pure function are arguably just fine :-)...
But it's probably fine and the code change and tests look good, so I'm approving!

We could consider mentioning this in the release notes though, or something like that.

cameel · 2022-04-08T13:09:01Z

Personally I think it would be better to treat it as breaking because it's more of an omission than a real bug. But on the other hand we were called out on this which means that people already expect it to work that way...

ekpyron · 2022-04-08T13:13:16Z

test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_disallowed.sol

+    }
+}
+// ----
+// SyntaxError 6553: (47-362): The msize instruction cannot be used when the Yul optimizer is activated because it can change its semantics. Either disable the Yul optimizer or do not use the instruction.


How does this actually work :-)? We only run these tests with enabled yul optimizer? I wouldn't have guessed :-).

Apparently. I was surprised too since I usually run isoltest without the --optimize flag.

ekpyron · 2022-04-08T13:20:57Z

Personally I think it would be better to treat it as breaking because it's more of an omission than a real bug. But on the other hand we were called out on this which means that people already expect it to work that way...

I'd consider it a proper bug - if these are called before any external call in the pure function itself, then it's definitely contrary to the intended meaning of pure (under every of the different views on what pure means ;-)) - you can construct pure functions that read state like that after all... (respectively pure functions with different behaviour based on some state that affected the previous external call).
We could of course take drastic measures and do control flow analysis and only flag this in paths, in which there is no pure external call up front - but that's significant effort and in 0.9 we may disallow external calls in pure functions anyways (undecided though)... and it also wouldn't be perfect and would also have false positives... so whatever we do, it's not perfectly nice and maybe disallowing as non-breaking bug-fix is indeed the least harmful way to deal with it.

cameel · 2022-04-08T13:25:39Z

Is returndata really a part of state though? I see it more like some random piece of memory, which is why it seems like a grey area and is just easier to lump together with state than treat as a thing of its own.

ekpyron · 2022-04-08T13:41:49Z

Is returndata really a part of state though? I see it more like some random piece of memory, which is why it seems like a grey area and is just easier to lump together with state than treat as a thing of its own.

Yeah, I was a bit torn on this when writing that comment myself :-)... I mean - in a lot of cases the returndata of the previous call will literally be lying around somewhere in memory and thereby be accessible via mload... so is mload also impure :-)?

cameel · 2022-04-08T13:52:37Z

We need to maintain a map of polluted memory slots :P

Anyway, I think the change is good, just not 100% convinced that the case is clear enough that it should be a bugfix.

ekpyron · 2022-05-11T10:51:18Z

Another argument for considering this breaking after all is
https://github.com/ensdomains/ens-contracts/blob/master/contracts/utils/LowLevelCallUtils.sol

I guess those helper functions were specifically marked pure because before this change the compiler actively suggested to make them pure... after which making this an error in a non-breaking release is probably a bit mean...

chriseth · 2022-05-11T11:17:08Z

Yeah, let's make it breaking.

cameel requested review from chriseth and hrkrshnn March 25, 2022 19:28

cameel self-assigned this Mar 25, 2022

hrkrshnn reviewed Mar 25, 2022

View reviewed changes

cameel mentioned this pull request Mar 25, 2022

Disallow codecopy in pure functions (and check for other cases) #12260

Open

cameel force-pushed the disallow-returndatacopy-in-pure-functions branch from 697dbfb to 3cf5a1e Compare March 25, 2022 19:51

cameel mentioned this pull request Mar 25, 2022

soltest CI jobs no longer properly report errors #12862

Closed

cameel force-pushed the disallow-returndatacopy-in-pure-functions branch 3 times, most recently from f9d5ea0 to a73ce85 Compare March 26, 2022 00:30

aarlt reviewed Apr 1, 2022

View reviewed changes

test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_allowed_pure.sol Show resolved Hide resolved

aarlt reviewed Apr 1, 2022

View reviewed changes

test/libsolidity/syntaxTests/viewPureChecker/inline_assembly_instructions_disallowed_pure.sol Show resolved Hide resolved

aarlt mentioned this pull request Apr 2, 2022

[test] Add location string to syntax tests. #12890

Closed

3 tasks

cameel changed the title ~~Disallow RETURNDATACOPY and RETURNDATASIZE in pure functions~~ Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly pure functions Apr 8, 2022

cameel changed the title ~~Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly pure functions~~ Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions Apr 8, 2022

cameel added 2 commits April 8, 2022 14:47

Tests for assembly instructions allowed with mutable/view/pure functions

fb7c969

Disallow RETURNDATASIZE and RETURNDATACOPY in inline assembly blocks …

f567eb1

…in pure functions

cameel force-pushed the disallow-returndatacopy-in-pure-functions branch from a73ce85 to f567eb1 Compare April 8, 2022 12:48

ekpyron approved these changes Apr 8, 2022

View reviewed changes

ekpyron reviewed Apr 8, 2022

View reviewed changes

chriseth approved these changes Apr 28, 2022

View reviewed changes

chriseth merged commit d55b84f into develop Apr 28, 2022

chriseth deleted the disallow-returndatacopy-in-pure-functions branch April 28, 2022 11:15

This was referenced May 12, 2022

Revert "Disallow RETURNDATASIZE and RETURNDATACOPY in inline assembly blocks in pure functions" #13013

Merged

Disallow RETURNDATASIZE and RETURNDATACOPY in inline assembly blocks in pure functions #13028

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions #12861

Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions #12861

cameel commented Mar 25, 2022

hrkrshnn Mar 25, 2022

cameel Mar 25, 2022 •

edited

Loading

aarlt Mar 29, 2022

aarlt Mar 29, 2022 •

edited

Loading

cameel Mar 29, 2022

aarlt Mar 29, 2022

aarlt Mar 30, 2022

hrkrshnn commented Mar 30, 2022 •

edited

Loading

aarlt Apr 1, 2022

cameel Apr 1, 2022

aarlt Apr 1, 2022

aarlt Apr 1, 2022

aarlt Apr 1, 2022 •

edited

Loading

aarlt Apr 1, 2022

aarlt Apr 1, 2022

cameel Apr 1, 2022

aarlt Apr 2, 2022

cameel Apr 8, 2022

cameel commented Apr 1, 2022

aarlt commented Apr 1, 2022

hrkrshnn commented Apr 1, 2022

aarlt commented Apr 1, 2022 •

edited

Loading

hrkrshnn commented Apr 1, 2022

cameel commented Apr 8, 2022

ekpyron left a comment

cameel commented Apr 8, 2022

ekpyron Apr 8, 2022

cameel Apr 8, 2022

ekpyron commented Apr 8, 2022 •

edited

Loading

cameel commented Apr 8, 2022

ekpyron commented Apr 8, 2022

cameel commented Apr 8, 2022 •

edited

Loading

ekpyron commented May 11, 2022

chriseth commented May 11, 2022

Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions #12861

Disallow RETURNDATACOPY and RETURNDATASIZE in inline assembly in pure functions #12861

Conversation

cameel commented Mar 25, 2022

Choose a reason for hiding this comment

cameel Mar 25, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aarlt Mar 29, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hrkrshnn commented Mar 30, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aarlt Apr 1, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel commented Apr 1, 2022

aarlt commented Apr 1, 2022

hrkrshnn commented Apr 1, 2022

aarlt commented Apr 1, 2022 • edited Loading

hrkrshnn commented Apr 1, 2022

cameel commented Apr 8, 2022

ekpyron left a comment

Choose a reason for hiding this comment

cameel commented Apr 8, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekpyron commented Apr 8, 2022 • edited Loading

cameel commented Apr 8, 2022

ekpyron commented Apr 8, 2022

cameel commented Apr 8, 2022 • edited Loading

ekpyron commented May 11, 2022

chriseth commented May 11, 2022

cameel Mar 25, 2022 •

edited

Loading

aarlt Mar 29, 2022 •

edited

Loading

hrkrshnn commented Mar 30, 2022 •

edited

Loading

aarlt Apr 1, 2022 •

edited

Loading

aarlt commented Apr 1, 2022 •

edited

Loading

ekpyron commented Apr 8, 2022 •

edited

Loading

cameel commented Apr 8, 2022 •

edited

Loading