-
Notifications
You must be signed in to change notification settings - Fork 87
Speed up linking by shrinking symbol and relocation tables #1222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is there a reason not to use the following type definition ?
|
Not particularly? I don't have a very strong opinion here: the record is slightly closer to the truth (the symbol name is independent of its global status) but the variant has more lightweight syntax. |
I'm reviewing this as well as working on a version with flambda2 that would try and make local every symbol non-reachable from the cmx. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good from my point of review, but note that I did not review the cfg part, and that I skimmed over the the to_cmm
part, since I rewrote most of it in #1246 which I expect to be merged very soon after this one.
Ideally, in the future, I'd very much like to do some more cleanup around symbols: this PR has small changes to make the local vs global symbols work, but this also means that there remains a lot of code handling symbols as strings. This makes it very easy to introduce bugs wrt locality (I have had to debug a few while writing #1246 ), so I think that after this one and #1246 , we should probably try and write another PR to use symbols instead of strings everywhere it makes sense (I'll probably do it). We could then consider all this work for upstreaming.
I looked at @stedolan 's fixes for @gretay-js 's comments and they look fine. Merging. |
Linking currently takes a long time because many symbol references (in particular, to statically allocated data) are not resolved until link time, despite many of these references being to data defined in the same file.
So far, this patch removes about half the symbols in compiler-libs, mostly unexported data. (The benefit applies only to Closure builds at the moment).
The major change is to use the following type for symbols in Cmm onwards, instead of
string
:At assembly time,
Local
symbols are generated as assembler labels rather than ELF symbols, so references can be resolved eagerly.Global
symbols are generated as both an ELF symbol and an assembler label, so thatLocal
references within the same file can use the label form.There is quite a lot of low-hanging fruit here. The last commit here removes another ~8% of symbol references by wrapping
caml_call_gc
once per file. We should also be able to turn references to functions in the same file into assembler label references, and to improve the encoding of debug info and frame tables to generate fewer relocations.