Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strengthen the definition of pure potentially disallowing e.g. codecopy - or "the future of pure". #8153

Closed
ekpyron opened this issue Jan 15, 2020 · 18 comments
Labels
closed due inactivity The issue/PR was automatically closed due to inactivity. high impact Changes are very prominent and affect users or the project in a major way. medium effort Default level of effort needs design The proposal is too vague to be implemented right away stale The issue/PR was marked as stale because it has been open for too long.

Comments

@ekpyron
Copy link
Member

ekpyron commented Jan 15, 2020

Came up in #3835, in particular #3835 (comment) states the issue.

The main question is: since view already allows staticcall, is there any use for the coarse definition of pure, that allows for expressions that are not compile-time constant?

Personally, I'd say we should officially strengthen pure, so that everything pure can be used in place of compile-time-constant expressions, e.g. the most basic definition of pure is "everything that can be used as array size of a statically sized array".

We could also move towards the opposite direction and e.g. allow reading the address of this in pure functions and weaken pure to mainly mean "does not read from storage". But then we need yet another stricter concept for compile-time-constant expressions. So I'm not a fan of that course of action :-).

@ekpyron ekpyron changed the title Strengthen the definition of pure potentially disallowing e.g. codecopy. Strengthen the definition of pure potentially disallowing e.g. codecopy - or "the future of pure". Jan 15, 2020
@axic
Copy link
Member

axic commented Jan 15, 2020

This is what pure was supposed to be, but it was softened up somehow 😉

@chriseth
Copy link
Contributor

I agree, pure functions should not be allowed to read from data that can be modified by a constructor.

@ekpyron
Copy link
Member Author

ekpyron commented Jan 15, 2020

I agree, but then SemanticInformation::invalidInPureFunctions should be extended by CODECOPY - and there may be others, we should check. I'll move this to "accepted in 0.7.0" right away then (unless we call it "bug" and want to do it earlier).

@ekpyron ekpyron added S and removed to discuss labels Jan 15, 2020
@MicahZoltu
Copy link
Contributor

(sorry for the double-post, missed the comment that discussion about pure should be moved here)

@ekpyron Great breakdown (in the linked issue) of the two potential definitions of pure. I have always treated pure as along the lines of "if you call this method on this contract multiple times with the same inputs, you'll get the same result every time no matter what other conditions change between calls". This interpretation generally aligns with what I want as a dapp developer, and it also makes sense that codecopy would be allowed. To me, pure does not mean "this function would yield the same result no matter what contract it was implemented in".

I can kind of see the merits to the other definitions, but IMO "it yields the same result if called multiple times" feels both the most useful and the most natural.

Can someone express the argument as to why a user would want pure to mean "provides the same result no matter what bytecode the function is deployed with" is? I want to be careful that we are implementing pure in the most pure (pun intended) way because it offers the best developer experience, not just because it is the strictest definition we can come up with.

@chriseth
Copy link
Contributor

@MicahZoltu the only problem I see is a potential confusion that a pure method in the less restrictive sense yields a different result when moved to a different contract. Or put a different way: In the functional sense, pure means that there is no other state to take into account except the arguments to the function call.

On the other hand, when seen at the EVM level, then a pure function always reads the code, because that is what is executed, so maybe it should also be able to use codecopy.

Maybe we should ask the same question from the other end: How would you use "pure" as a feature when it is defined in the one or the other way?

@MicahZoltu
Copy link
Contributor

Honestly, I don't use it much. view gives huge benefits over non-view, it is an even bigger step than from payable to non-payable. pure however is a pretty small/marginal step. It is useful sometimes during auditing, just as any compiler guarantees are, but it doesn't come up that often.

When it does come up in auditing for me, I am usually just using it as a shortcut to avoid going down some particular code path when I need to validate that a call won't read some dirty storage variable. If pure didn't exist (only view did) I wouldn't really miss it that much personally, though I don't advocate for removing it as I'm a big fan of any additional guarantees the compiler can make.

I would also support splitting pure into two different modifiers:

  1. functionality can be determined by looking at the code of this function only
  2. function will not read any storage variables or block details, but its execution may be dependent on static parts of the contract like the contract's code, the contract's hash, the contract's address, immutable variables, etc.

@elenadimitrova elenadimitrova removed the S label Feb 12, 2020
@chriseth
Copy link
Contributor

Agreement from call: codecopy should not be pure (let alone for consistency with immutable).

@ekpyron which other things did you have in mind?

@ekpyron
Copy link
Member Author

ekpyron commented Apr 2, 2020

We should probably go through the list of opcodes to check if there's anything else, but codecopy was the main thing so far, that should be stricter.

@ekpyron
Copy link
Member Author

ekpyron commented Dec 10, 2020

This issue just came up again recently, when we realized that we - while we wanted to get all this sorted for 0.7.0 - don't really have consensus about pure and sorted this out for the upcoming 0.8.0 release :-). The plan now is to discuss this further and finally come to some consensus for 0.9.0.

While I think the two ways to think of pure have been described well already, let me offer another perspective on that that came to my mind:

I think that intuitively there is a difference between external pure and internal pure, so there is no wonder, that there is disagreement about what a general public pure is supposed to be :-).
As an example msg.data came up recently. In an external function msg.data can effectively be considered the same as a function argument, so - thinking of pure as "a function only depends on inputs" - msg.data should be allowed in external pure functions.
Also, it does not make sense to think about "compile-time constantness" in external pure functions - e.g. I can never use the result of a call to an external function as static array size, function f() external pure returns(uint) {...} function g() public { uint[this.f()] x; }.

However, internal pure functions are different. For those msg.data is not effectively the same as the function arguments, but rather "additional state". I can call the same internal pure function with the same arguments, expecting the same results, while having different contents in msg.data in the background.
Also, now for internal pure functions I would say "only depends on its inputs" effectively means it is "compile-time constant". So for internal pure functions, I would in fact expect to be able to use them in the size of statically sized arrays.

I'm definitely not saying that we should make the meaning of pure depend on visibility (especially since that would still not clarify what to do for public), but maybe this helps in the discussion. I actually - contrary to my earlier position - might be starting to lean towards going for what I just described as external pure (having internal pure just be "usable in external pure") and then have "compile-time constantness", which is quite a bit stricter, just be determined by the compiler via analysis (or there were suggestions to introduce something like constexpr for it, alternatively).

Still, even if we come to an agreement about this part, we still have to precisely clarify what counts as "state" and what as "input arguments" for those external pure functions, because the distinction is not always entirely clear.

@chriseth
Copy link
Contributor

Your reasoning sounds consistent, @ekpyron . It might be not that confusing to strengthen the restrictions imposed by 'pure` at least for free functions.

@wjmelements
Copy link
Contributor

Since the evaluation of a pure function depends on its immutable code, it seems silly to prevent codecopy or reading from immutable variables inside pure functions. Pushing a constant to the stack isn't conceptually different from pushing an immutable; both are in the code. But currently the constant allows pure but the immutable requires view.

@ekpyron
Copy link
Member Author

ekpyron commented Nov 1, 2021

The main reason for strengthening pure would be to be able to syntactically mark functions as compile-time-constant.

uint[f()] compileTimeConstantlySizedArray;

will only work sanely if f does not depend on its code and does not involve e.g. codecopy or especially immutable access (unless you want to run a recompilation in the constructor ;-)). It can very well involve pushing constants as much as it wants, though. So there is a conceptual difference between the two.

There is two ways to proceed with this:

  1. ultimately decide to weaken pure to allow it to depend on code, thereby splitting "compile-time-constant" and pure (this would involve making immutable access pure) or
  2. strengthen pure to really mean "compile-time-constant" (this would involve making codecopy view).

Both are OK, but we have an inconsistent mixture right now. 1 would mean that we either need a different way to mark functions "compile-time constant" or merely rely on the compiler to determine whether it is or not (both would be feasible, though). The main question is whether 2 forces things to be view that would really benefit from being marked as a weaker pure (as long as I don't evaluate things at compile-time there is no difference between pure and view execution-wise at all, so it is mainly a question of which kind of annotation has more value).

@deluca-mike
Copy link

deluca-mike commented Dec 12, 2021

As someone who is heavy user (and proponent) of pure functions (and functional programming in general), I prefer option 1.

It seems reasonable to strive for objective technical accuracy, rather than striving for some subjective semantic accuracy.

Immutables end up as in-line litterals. The on-chain byte code for a function that returns a constant, immutable, or literal, is identical.

With a perfect decompiler that could translate any on-chain bytecode into human-readable Solidity (as one would write it), the following 4 smart contracts would have identically decompiled foo functions, all marked at pure:

contract ImmutableTest1 {
    function foo() external pure returns (uint256) { return uint256(4); }
}

contract ImmutableTest2 {
    uint256 internal constant _foo = uint256(4);
    function foo() external pure returns (uint256) { return _foo; }
}

contract ImmutableTest3 {
    uint256 internal immutable _foo;
    constructor() { _foo = uint256(4); }
    function foo() external view returns (uint256) { return _foo; }
}

contract ImmutableTest4 {
    uint256 internal immutable _foo;
    constructor(uint256 foo_) { _foo = foo_; }
    function foo() external view returns (uint256) { return _foo; }
}

In my opinion, pure should define what happens (and can't happen) on-chain, not what humans expect to happen before the contract is even created.

When you deploy a contract to the chain, you already need to wait for the receipt to be sure of its address (i.e. consider a factory that deploys children using the blockhash as a create2 salt), so to interact with it, you need at least some data/event from the chain. If you really needed to know what the resulting immutables were, if any, of a contract, just get more data from the chain (i.e. emit them in events, or expose externally).

@chriseth
Copy link
Contributor

Thank you for your input, @deluca-mike ! I think your examples fall a bit short on the most generic case. Of course a function that uses an immutable that in fact is a compile-time constant can be pure in the most strict functional-programming sense.

The question is if code whose semantics depend on the time and circumstances it was deployed at can be considered pure or not.

@ekpyron
Copy link
Member Author

ekpyron commented Apr 8, 2022

Just because this has come up in the context of #12860 again:

I'm personally starting to change my mind on this issue. In #3835 (comment) I basically argued that it makes a difference whether one considers external pure (does not read from storage nor from blockchain state) or internal pure (which one may interpret as "fully determined by the function arguments").

However, there is a different notion one can use for internal pure, i.e. solely "it is admissible to call this internal function from an external pure function".

In this interpretation, it's perfectly fine to read arbitrary calldata and memory in internal pure functions - while arbitrary mload or calldataload (resp. access to msg.data) clearly violates for a function to be fully determined by its arguments. Also the argument that returndatacopy is an issue in pure functions does not apply to the "admissible to call in an external pure function" interpretation.

I think the weak notion of it being admissible to call an internal function from an external pure function, is useful - for example in inheritance context (e.g. leaving an internal pure function unimplemented, but calling it, for a derived contract to provide an implementation, which may e.g. read msg.data, but not state).

So I'm more and more leaning towards actually weakening pure to just the external pure interpretation.

There is still merit in decorating internal functions that are compile-time constant, i.e. that only depend on their immediate arguments and are pure in the classical functional sense. However, these days I'd rather introduce a new separate keyword for this property, like C++'s constexpr.

@howardpen9
Copy link

follow from #12829

@cameel cameel added medium effort Default level of effort high impact Changes are very prominent and affect users or the project in a major way. needs design The proposal is too vague to be implemented right away labels Sep 26, 2022
@github-actions
Copy link

This issue has been marked as stale due to inactivity for the last 90 days.
It will be automatically closed in 7 days.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Feb 26, 2023
@github-actions
Copy link

github-actions bot commented Mar 5, 2023

Hi everyone! This issue has been automatically closed due to inactivity.
If you think this issue is still relevant in the latest Solidity version and you have something to contribute, feel free to reopen.
However, unless the issue is a concrete proposal that can be implemented, we recommend starting a language discussion on the forum instead.

@github-actions github-actions bot added the closed due inactivity The issue/PR was automatically closed due to inactivity. label Mar 5, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed due inactivity The issue/PR was automatically closed due to inactivity. high impact Changes are very prominent and affect users or the project in a major way. medium effort Default level of effort needs design The proposal is too vague to be implemented right away stale The issue/PR was marked as stale because it has been open for too long.
Projects
None yet
Development

No branches or pull requests

9 participants