feat: Introduce ByteCode format #121
Comments
cc @onbjerg @brockelmore for when we'll want to further speed up fuzz tests
We would need a utility in either Forge or ethers-rs to convert from the old format to the new format, given that the Solidity compiler does not support EOF yet.
@onbjerg shouldn't the utility be inside revm? What I thought is to have create/create2 use
And you would call something like
Or just directly use it inside the db.
Makes sense, however I'm not sure what the gains are for Forge here, since we run each test in a database we later discard. The main beneficiary of an optimization here would be fuzz tests, but since we discard the database and cannot be sure that the same bytecode will be at the same address in the next run, we can't really change that behavior. Is there any way we can have a performant-ish bytecode cache keyed on the bytecode itself instead of addresses, which we could then reuse across test runs?
@onbjerg hm, so contracts for fuzz tests are created at EVM runtime. Can you run me through an outline of how fuzzing is done?
Setup phase (this state IS persisted):
Then, for each fuzz run we decide on some parameters using different strategies (not important for this outline) and we:
In most cases we will probably end up deploying the same contracts. However, we don't enforce that, so some users might conditionally deploy extra contracts based on the fuzz inputs. Does that make sense?
This would only help to pre-analyse contracts deployed in the setup phase, as you could save the analysed contracts in the db.
Once the bytecode is analyzed, can we somehow cache it across instantiations of the EVM? E.g. in Foundry we instantiate a new EVM each time, so it'd be nice if we could store the analyzed bytecode and initialize the EVM with the already analyzed one.
Here is an example of how one analysed contract is called multiple times: https://github.com/bluealloy/revm/pull/156/files#diff-c28df8837c5c77a683c9974376c1b53605fe01730f1b81b25c38771d58ecda6bR52-R59
Does this work for you?
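A cache keyed on the bytecode itself, as asked about above, could look roughly like the sketch below. It is not revm's or Foundry's API: `AnalysisCache` and `get_or_analyse` are hypothetical names, and std's `DefaultHasher` stands in for keccak256 to keep the example dependency-free. The cache lives outside any single EVM instance, so it survives a discarded database and does not care at which address the code was deployed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical cache of analysis results, keyed by a hash of the code bytes.
/// `T` is whatever the analysis produces (e.g. a jump table).
pub struct AnalysisCache<T> {
    // The original code is stored next to the result to guard against
    // collisions of the 64-bit stand-in hash.
    entries: HashMap<u64, (Vec<u8>, Arc<T>)>,
    /// Number of times the analysis actually had to run.
    pub misses: usize,
}

impl<T> AnalysisCache<T> {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Return the cached analysis for `code`, running `analyse` only on a miss.
    pub fn get_or_analyse(
        &mut self,
        code: &[u8],
        analyse: impl FnOnce(&[u8]) -> T,
    ) -> Arc<T> {
        // Assumption: DefaultHasher replaces keccak256 for illustration only.
        let mut hasher = DefaultHasher::new();
        code.hash(&mut hasher);
        let key = hasher.finish();

        if let Some((cached_code, analysis)) = self.entries.get(&key) {
            if cached_code.as_slice() == code {
                return Arc::clone(analysis);
            }
        }

        self.misses += 1;
        let analysis = Arc::new(analyse(code));
        self.entries.insert(key, (code.to_vec(), Arc::clone(&analysis)));
        analysis
    }
}
```

Each new EVM instance would then ask the shared cache for the analysis before executing a contract, so repeated fuzz runs that deploy the same code pay for the analysis only once.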
I think that it should work, yeah. An update of REVM in Foundry should probably provide a noticeable speedup for fuzz tests that deploy their contracts in Will give it a shot soon and report back 😄
Seems to have shaved off a second of solmate's testing time on my machine with no special handling added! (just replaced the Left some comments on the PR, but this is great work 😄 Thank you! (Note: I couldn't believe it so I recompiled master locally to make sure it wasn't my local Rust compiler being smart, but same results: using the new bytecode format is faster)
Okay, I didn't expect it to be this big of a difference. Can you do two flamegraphs, one for each execution, to better see where the execution time went down? And thank you for the review, I will make those changes before the PR is merged.
I needed to use This is running solmate on old foundry: sidenote: Additional improvement can potentially be done inside Second flamegraph is with the new revm and it does not have any I did optimize to_analysis to be better and more compact, but the results on the real contract test are amazing. @onbjerg instead of
@rakita For sure! Thanks for the pointer. Are there any other changes you see having some impact that we can do? I noticed you changed some stuff about the In some places (the debugger and coverage) we also build our own PC -> IC maps and it's not really ideal. Do you think it is possible to reuse the analysis made by REVM for this purpose as well? For context, source maps in Solidity use instruction counters (i.e. the offset of the instruction in the bytecode minus any push bytes), but during execution we get the program counter instead.
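For readers unfamiliar with the PC -> IC mapping described above, here is a minimal sketch of how such a map can be built from raw bytecode. The function name `pc_to_ic` is hypothetical and this is not the code Foundry or revm uses; it relies only on the fact that `PUSH1`..`PUSH32` (opcodes `0x60`..`0x7f`) are followed by 1 to 32 immediate bytes that are not instructions.

```rust
/// Map each program counter (byte offset) to its instruction counter
/// (index of the instruction, not counting push immediates).
/// Offsets that fall inside push data map to `None`.
pub fn pc_to_ic(code: &[u8]) -> Vec<Option<usize>> {
    let mut map = vec![None; code.len()];
    let mut pc = 0;
    let mut ic = 0;
    while pc < code.len() {
        map[pc] = Some(ic);
        let op = code[pc];
        // PUSH1 (0x60) carries 1 immediate byte, PUSH32 (0x7f) carries 32.
        let push_bytes = if (0x60..=0x7f).contains(&op) {
            (op - 0x5f) as usize
        } else {
            0
        };
        pc += 1 + push_bytes;
        ic += 1;
    }
    map
}
```

The same single pass over the code is what a jump-destination analysis does, which is why reusing one analysis for both purposes is attractive.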
* build: use new revm with analysis cache
* refactor: use checked bytecode
  See bluealloy/revm#121 (comment)
* build: use git revm
* build: use revm 1.8
* test: fix test
* fix: correct bytecode getters/setters
  Whenever we output the bytecode of an account, whether to a file or in a response, we need to return the *original* bytecode using `Bytecode::len`, otherwise the bytecode will differ depending on whether the bytecode has been checked, analyzed or not.
* refactor: use `Bytecode::hash`
* fix: get original account code for traces
* refactor: remove unsafe code
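The getter/setter fix in the commit list above comes down to the stored bytes no longer being identical to the deployed bytes once the code has been checked or analyzed. The sketch below illustrates the idea with a hypothetical `CheckedCode` type; the padding length of 33 is an assumption made for the example, not a statement about revm's internals.

```rust
/// Assumption: number of zero (STOP) bytes appended after the real code so the
/// interpreter can read past the last instruction without bounds checks.
const PADDING: usize = 33;

/// Hypothetical container for padded bytecode that remembers its original length.
pub struct CheckedCode {
    bytes: Vec<u8>,
    original_len: usize,
}

impl CheckedCode {
    pub fn new(raw: &[u8]) -> Self {
        let mut bytes = raw.to_vec();
        bytes.resize(raw.len() + PADDING, 0x00);
        Self { bytes, original_len: raw.len() }
    }

    /// Bytes as the interpreter sees them, padding included.
    pub fn padded(&self) -> &[u8] {
        &self.bytes
    }

    /// Bytes as they must appear in traces, files and RPC responses.
    pub fn original(&self) -> &[u8] {
        &self.bytes[..self.original_len]
    }
}
```

Anything that leaves the process should go through the `original()` style accessor, otherwise the output depends on whether the code happened to be checked yet.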
I just transformed the internal representation of
Just for comparison, on the old version, revm 1.7, on the same test I am getting ~570ms. Foundry uses inspector and
gg
With the restriction that contracts cannot start with `0xEF`, and with the possibility that in the future we will have a different `ByteCode` for the EVM Object Format, it makes sense to introduce support for it into revm. https://eips.ethereum.org/EIPS/eip-3541

The biggest benefit atm would be introducing a `ByteCode` type that contains the jumptable and gas blocks, which would let us skip the analysis of the contract. This is a fairly big speed-up for some use cases noticed in Foundry: fuzzing tests call a lot of contracts with not much execution, and in that case the analysis takes a lot of the time.

The proposal is to introduce an enum `ByteCode` that would somehow return `code`, `code_size` and `jumptable`.

Basically, replace this input with `ByteCode`:
revm/crates/revm/src/interpreter/contract.rs
Line 60 in 0852824
and propagate the change wherever needed.
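To make the proposal concrete, here is one possible shape for such an enum. This is only a sketch under assumptions: the variant names and the `analyse` method are hypothetical, the jumptable is a plain `Vec<bool>`, and gas blocks are left out entirely.

```rust
const JUMPDEST: u8 = 0x5b;

/// Hypothetical bytecode enum: either raw code, or code with a precomputed jumptable.
pub enum ByteCode {
    Raw(Vec<u8>),
    Analysed { code: Vec<u8>, jumptable: Vec<bool> },
}

impl ByteCode {
    pub fn code(&self) -> &[u8] {
        match self {
            ByteCode::Raw(code) | ByteCode::Analysed { code, .. } => code.as_slice(),
        }
    }

    pub fn code_size(&self) -> usize {
        self.code().len()
    }

    /// `None` until the code has been analysed.
    pub fn jumptable(&self) -> Option<&[bool]> {
        match self {
            ByteCode::Raw(_) => None,
            ByteCode::Analysed { jumptable, .. } => Some(jumptable.as_slice()),
        }
    }

    /// Run the jump-destination analysis once; already analysed code is returned as-is.
    pub fn analyse(self) -> ByteCode {
        let code = match self {
            ByteCode::Raw(code) => code,
            analysed => return analysed,
        };
        let mut jumptable = vec![false; code.len()];
        let mut pc = 0;
        while pc < code.len() {
            let op = code[pc];
            if op == JUMPDEST {
                jumptable[pc] = true;
            }
            // Skip the immediate bytes of PUSH1..PUSH32 so that a 0x5b inside
            // push data is not treated as a valid jump destination.
            let push_bytes = if (0x60..=0x7f).contains(&op) {
                (op - 0x5f) as usize
            } else {
                0
            };
            pc += 1 + push_bytes;
        }
        ByteCode::Analysed { code, jumptable }
    }
}
```

With a type like this stored in the database, a contract that is called many times carries its jumptable with it instead of being re-analysed on every call.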
`ByteCode` should set these fields in the contract:
revm/crates/revm/src/interpreter/contract.rs
Lines 11 to 13 in 0852824
revm/crates/revm/src/interpreter/contract.rs
Line 21 in 0852824
and do the same for database struct:
`ByteCode`: revm/crates/revm/src/db/traits.rs
Line 14 in 0852824
`BytesCode`: revm/crates/revm/src/models.rs
Line 24 in 0852824
To fully circle the story with EVM Object Format, CREATE and CREATE2 will need to return 0xEF type format in bytes that would get transformed to `ByteCode` for evm usage here:
revm/crates/revm/src/evm_impl.rs
Lines 395 to 399 in 0852824
But this is a consensus rule and out of scope for this feat; we should just continue using legacy `ByteCode` here. If needed, we could transform that into `ByteCode` in the Inspector `create_end`.
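For reference, the `0xEF` restriction from EIP-3541 that this proposal builds on is a check on the code returned by contract creation. A minimal sketch of that check follows; the function and error names are hypothetical and not taken from revm.

```rust
/// Hypothetical error type for rejected contract creation.
#[derive(Debug, PartialEq, Eq)]
pub enum CreateError {
    /// EIP-3541: newly deployed code must not start with the 0xEF byte.
    CodeStartsWithEf,
}

/// Validate the code returned by CREATE/CREATE2 before it is stored.
pub fn check_created_code(code: &[u8]) -> Result<(), CreateError> {
    if code.first() == Some(&0xEF) {
        Err(CreateError::CodeStartsWithEf)
    } else {
        Ok(())
    }
}
```

Because legacy code can never begin with `0xEF`, that leading byte is free to act as a marker for a future container format.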