-
-
Notifications
You must be signed in to change notification settings - Fork 670
Optimize shadow stack pass #1667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
any updates? I know this is not huge improvement but still improvement =) |
I guess we could just give it a try, but there is a potential for unexpected breakage there. On the other hand, without actually using it in the wild, it may be hard to find problems as these can be quite random. |
How about hide this changes under some experimental flag? And ask people turn it on for testing on real projects? But by default turn it off |
Hmm, in this case this may actually work. Going to check :) |
Or just add extra "incremental-experimental" mode for runtime |
Together with #1692:
|
Currently running this branch against Rtrace and worst-case config options, similar to what has been done in initial bootstrapping, and so far is looking good. |
Could you also test with assemblyscript-regexp? I guess it pretty great battletest example |
Did that now (after figuring out how), and it passes the tests. For everyone curious, this is how one can do it without hitting any of the errors (missing dist files, missing transforms, ...) I've encountered while trying more convenient things. In your local AS fork: npm run build
node scripts/prepublish
npm pack
node scripts/postpublish then replace |
Unable to provoke problems here, so probably not worth the trouble to introduce a flag. Suggesting to release it into the wild. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM
This is an experiment to reduce overhead of the shadow stack pass. The pass is currently maintaining more stack slots than necessary (anywhere something managed naively goes to the stack) to be sure, but there is room for improvement.
The relatively simple changes in this PR reduce optimized binary size by about 10% and increase throughput by a little less than 10% (testing with the compiler). Optimizations here are risky, though, because any mistake, not only in the shadow stack pass itself but also in compiler code, for instance where an actually necessary (pre-)instrumentation is missing, can lead to unforeseen consequences that only show themselves randomly.
As such it may make sense to test with various programs and various GC configurations first to minimize the risk.