Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v0.7.0 Plan (Experimental) #346

Closed
cgewecke opened this issue Jul 14, 2019 · 9 comments
Closed

v0.7.0 Plan (Experimental) #346

cgewecke opened this issue Jul 14, 2019 · 9 comments

Comments

@cgewecke
Copy link
Member

cgewecke commented Jul 14, 2019

Rewrite this tool as a hybrid of solidity-coverage and 0xProject's opcode tracing sol-cov.

Mechanics

  • instead of a file of event hashes written by testrpc-sc, maintain an in memory map of them.
  • example instrumentation where bytes32(0xabc..etc) is a hash we can pick off the top of the VM step's stack and associate with an entry in the injection map.
contract Test {
  function coverage_0xab123(bytes32 c__0xab123) public pure {}

  function a(uint x) public {

    coverage_0xab123(0xdc08f...08ed1); /* function */ 
    coverage_0xab123(0xa9065...549e2); /* line */         
    coverage_0xab123(0x6ac83...f2037); /* statement */ 
  
    if (x == 1) {
  
      coverage_0xab123(0x85e7e...afacf); /* branch */ 
      coverage_0xab123(0xfa262...4fb28); /* line */  
      coverage_0xab123(0xa3e17...38d49); /* statement */ 
  
      x = 3;
  
    } else { 
      coverage_0xab123(0xb54c1...3ef6b); /* branch */ 
    }
}
  • listen to the vm step of a ganache-core in-process provider
const Ganache = require("ganache-core");
// ... some setup w/ web3
const vm = web3.currentProvider.engine.manager.state.blockchain.vm;

const self = this;
this.vm.on("step", function(info){

  if (info.opcode.name.includes("PUSH1") && info.stack.length > 0){
    const idx = info.stack.length - 1;
    const hash = web3Utils.toHex(info.stack[idx]).toString(); // Could be ours! 

    if(self.instrumentationData[hash]){
      self.instrumentationData[hash].hits++;
    }
  }
})

Interface

  • Truffle plugin. We'll get fed the config and then we:

    • Instantiate a ganache provider (hooking into the step)
    • instrument
    • run truffle-workflow-compile
      • .coverageEnv --> build directory
      • optimization off
      • hide compilation warnings about unused variables
    • run truffle-core.test
      • .coverageEnv --> contracts_directory
      • listen to step
    • generate coverage
  • Buidler plugin - similar, simpler.

  • Abstract the design enough that any build tool can consume.

Benefits

  • No more copying folders etc.
  • No more abi modification, double compilation, weird problems w/ libraries
  • No more testrpc-sc to update (eventually)
  • Get rid of most options, user config
  • Much faster & simpler.

Problems

  • Gas cost distortion (+13 gas per instrumentation statement).
  • Ternary expressions still un-coverable.
  • Stack too deep? This blog post suggests risks might be comparable to present. Not sure, could be a blocker though. (Solved)
  • Requires in-process client
  • Are there cases where gas sensitive solidity checks require compilation w/ solc optimization enabled? (This would impact any opcode tracer though) Does solc optimize assembly?

Why is this preferable to the opcode-to-sourcemap approach?

  • small modification of what we already have.
  • truffle-debugger and the 0x tools are complex
  • Not currently supported on 0x. Why?
  • source to opcode mapping / trace has gaps & oddities
  • embark, after spending months building an opcode tracing solution, abandoned that work to pursue event instrumentation (?!). Why?
  • we know we instrument ok (mostly).

In the medium term we can probably resolve any gas distortion issues using etherumjs-vm V4's Interpreter API which lets you process opcodes arbitrarily following a new EEI spec. Have opened an issue there seeking advice about this.

We can probably instrument assembly eventually (#176)

@alcuadrado
Copy link
Collaborator

Hey @cgewecke, I'm very interested in this redesign!

  • No more abi modification, double compilation, weird problems w/ libraries

Won't you have to recompile the project anyway? Or am I missing something from how it currently works?

  • Stack too deep? This blog post suggests risks might be comparable to present. Not sure, could be a blocker though.

Maybe this can also be workarounded by hooking into the vm

  • Requires in-process client

We have an experimental in-process client in Buidler, it's already functional, but it hasn't been released yet. It's also based on ethereumjs-vm

@alcuadrado
Copy link
Collaborator

  • Not sure, could be a blocker though.

Also, do you have any idea how this can be tested?

@cgewecke
Copy link
Member Author

cgewecke commented Aug 5, 2019

@alcuadrado Great!

We have an experimental in-process client in Buidler,

Oh awesome, I'd love to see this.

Won't you have to recompile the project anyway?

Yes, that's correct - but only once. At the moment we have to compile twice. Solidity does not allow event statements in view or pure functions since v0.5.0. To get around this we do an initial compilation of the original contracts to get the 'correct' ABI (with view/pure). Then we strip the state-mutability modifiers off and instrument the code to get the bytecode 'with coverage'. At the end we swap the original ABI back into the artifact to trick Truffle into invoking view/pure methods as .call rather than .send. (This strategy was thought up by Zeppelin engineers in #146 if you're interested in its history.)

Stack too deep .... maybe this can also be workarounded by hooking into the vm

Unfortunately this error is emitted by solc rather than at runtime which makes it difficult to address. In the strategy proposed above, we will introduce a single additional variable into each function, reducing the available stack depth by 1 (from 16 to 15). I don't know how problematic this will be yet... to some extent it's a question of how common it is for people to be at the limit. In general when this error is hit people are advised to use structs to organize their data and I believe abi-encoder V2 has made this more convenient.

do you have any idea how this can be tested?

Well . . . I think I can trigger the error, but my only ideas for resolving this are:

  • Detect the failure and explain to the user that coverage entails greater stack usage - ask them reorg their data. (idk about this)
  • Fallback to the current strategy (which is OK but comes with additional Truffle related problems when there are solidity files in the test folder)

(Possibly more info than you wanted :))

@alcuadrado
Copy link
Collaborator

I'd love to see this.

I'll publish something as soon as I have some time to polish it a little. There's not that much to see though, it's pretty simple 😅It's a minimal provider implemented on top of ethereumjs-vm.

Yes, that's correct - but only once. At the moment we have to compile twice. Solidity does not allow event statements in view or pure functions since v0.5.0. To get around this we do an initial compilation of the original contracts to get the 'correct' ABI (with view/pure). Then we strip the state-mutability modifiers off and instrument the code to get the bytecode 'with coverage'. At the end we swap the original ABI back into the artifact to trick Truffle into invoking those methods as .call rather than .send. (This strategy was thought up by Zeppelin engineers in #146 if you're interested in its history.)

It was way more complicated than I suspected. Nice hack though 😁

Unfortunately this error is emitted by solc rather than at runtime

Oh, yes, I was confused. I think detecting the error and giving a clear explanation would work for the vast majority of the cases. I've only hit this error once while getting lots of data from another contract.

I wonder if something else can be implemented with a small modification/hook into the vm. Something not using the stack? Is that even possible? Maybe using an instruction in an unexpected but recognizable way?

@cgewecke
Copy link
Member Author

cgewecke commented Aug 5, 2019

Something not using the stack? Is that even possible? Maybe using an instruction in an unexpected but recognizable way?

@alcuadrado This is the holy grail as far I'm concerned. If you think of something please lmk - my understanding of how solidity works at the opcode level is very limited & there could well be a cool trick somewhere.

@alcuadrado
Copy link
Collaborator

Did people complain about "stack too deep" errors with the previous approach? How often? I'm almost certain that it should be a cap on how often the new approach will fail. Solidity events use LOG1 - LOG4, which use more stack arguments than assigning a variable.

The only thing that makes me doubt it, is that in your example you are doing three assignments. Why three? I know there are some comments there, but I don't see what it can't be done with a single one.

    _sc_82e0891[0] = bytes32(0x85e7e...afacf); /* branch */ 
    _sc_82e0891[0] = bytes32(0xfa262...4fb28); /* line */  
    _sc_82e0891[0] = bytes32(0xa3e17...38d49); /* statement */ 

@cgewecke
Copy link
Member Author

cgewecke commented Aug 6, 2019

@alcuadrado

I'm almost certain that it should be a cap on how often the new approach will fail. Solidity events use LOG1 - LOG4, which use more stack arguments than assigning a variable.

Ah interesting.

Did people complain about "stack too deep" errors with the previous approach?

Occasionally, but now that I look again it's not clear our event injection was the direct cause. . . I think there are several slightly different cases here:

  • the current design (0.6.x) declares Events in the contract level scope and writes arbitrary data to them at the function scope.
  • in the blog post referenced above, the author passes parameters into the function and tries to reference them in an event statement
  • 0.7.0 proposes to declare a single new variable in the function scope and repeatedly assign to it.

I think the number of assignments as such doesn't matter - there's an open PR here for the core of this idea and it passes unit tests which introduce dozens of new ones.

you are doing three assignments

Yes, the report tracks branch, line, statement and function coverage. The case for each of these is evaluated by stepping through an AST and identifying instrumentation points at the relevant node. You are right that they could be aggregated ... I will look at this.

@alcuadrado
Copy link
Collaborator

alcuadrado commented Aug 7, 2019

I think the number of assignments as such doesn't matte

This is a good point. I think they shouldn't matter. As the stack before and after an assignment should be the same. If that's the case, there's no need to combine the assignments.

@cgewecke
Copy link
Member Author

0.7.0 is under way at #372 now. Closing in favor.

Copying over the remaining questions raised here....

  • Official API at ganache for obtaining the VM step (atm we're exposed to their internals)
  • Buidler field in package.json.
  • Can we minimize gas distortion through the new ethereumjs-vm API?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants