-
Notifications
You must be signed in to change notification settings - Fork 918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LTO support and C/CXX flags override hard to achieve #97
Comments
if you use Note that
Thanks, we have tried hard to make things "easy" for the beginner, but give the power user full control. This is quite hard to do (especially when we have both native and device compilation going on). So there are certainly things that are still a little hidden/hard coded (Note if you search elsewhere you can also configure per source flags with
Note things you can try to tweak your build config further (if you should so wish) other than just editing what is there.
Some of this stuff might be a little off the happy path for now, but we are keen to improve. |
Are you sure? It does use -O3 here for release builds. Not sure if this is due to some cmake system default (i definitely didn't set that). Maybe that should be explicitly set in the SDK. And btw i have always been disabling -fdata-sections because it can have a negative impact. I stumbled upon these issues when i noticed even the blink example is 12 KiB in size which is a bit heavy for something that toggles a GPIO + delays. I have barely any experience with cmake and rather stayed away from it when possible. I guess i need to dive deeper into what i can and can't set in CMakeLists.txt. Yes, i know the pain. LTO can do insane optimizations that break non-standard conforming or code with undefined behavior in weird ways. Like moving global variable writes completely elsewhere when other functions depend on it. edit: |
So if I add
For very small projects (aka hello world) it seems to work fine. But for anything useful, it simply fails with the message above or similar. gcc is:
binutils is v2.36.1 |
yeah, i think |
Note assigning myself, as we do plan to look at LTO in 1.3.0 release |
See #253 (comment) (and #260 ) |
If it helps, disabling LTO for The linker however then spits out a ton of undefined references related to
I tried using the I tried using
... so I'm assuming figuring out what to do with the ... Good luck with this :) |
Looks like I got a bit further after all. In #define WRAPPER_FUNC(x) __attribute__((used)) __wrap_ ## x Now it successfully compiles and run. Compared to having LTO activated for everything except the SDK (assuming I didn't mess up something), the generated
I have not tested for performance, however. (EDIT: as an interesting note, setting |
Thanks; ugh i wasn't aware of the |
While everything seems to run fine with my workarounds applied and I have not encountered any breakage, I have realized something (which in retrospect is rather explicit in the reports I've linked to): gcc+bfd with LTO could inline some of the original, non-wrapped functions. I have probably not encountered issues because function wraps that may have mattered were not inlined and the call site still referred to the symbol that got wrapped later on. This might be worth investigating further, because I'm not quite sure if that's happening for me here. Interestingly, using x86-64 desktop linux compilers and a simple test case:
... only clang+lld properly emitted a call to That being said, with clang thin LTO ( Rather disappointingly, in the above tests, Unfortunately, I'm quite not sure what the best approach would be. Since clang support is in the same milestone as this, it might be worth trying to support clang LTO with lld only first if it turns out to be simpler to support - or use an approach different from (FWIW, LTO with any linker but |
This is such a simple fix, why after almost 3 years hasn't it been implemented properly yet? Edit: Though the |
I am not up to date, but unless pico-sdk stopped using wrap functions, I do not think the problem I described is gone. In particular, gcc+bfd LTO is insidiously broken. |
Adding that attribute to all wrap functions as a proof of concept fixed it for me. |
Do make sure that this solution does not cause the issue that I mentioned above (i.e. that inlining may inline the original function rather than the wrapper). Depending on the wrapped function, there may not be a noticeable side-effect for your usecase (it has been fine for me), but it would likely not be upstreamable. |
do you have a fork of the pico-sdk where we could try that? |
This change would be a one-liner afaict, but beware the caveats I mentioned in my replies above. |
There was some activity in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88643 and it appears that the wrapping issue should be resolved with ld as of binutils EDIT: no dice, still getting undefined references to |
First of all thanks for making this so painless. It worked basically right out of the box even with a different arm-none-eabi toolchain.
LTO support
Linking is failing with these errors:
These errors are quite strange since it's usually the other way around with something not being supported in thumb mode. Have not digged too deep yet to find out what's causing these.
C/CXX flags
I tried to override the C and linker flags to have different optimization than default by doing
The result is this order of flags to the compiler:
...-I/home/blub/pico/pico-sdk/src/rp2_common/pico_stdio_uart/include -O2 -flto -fno-data-sections -O3 -DNDEBUG -ffunction-sections -fdata-sections -o CMakeFiles/blink.dir/home/blub/pico/pico-sdk/src/rp2_common/pico_stdio_uart/stdio_uart.c.obj -c /home/blub/pico/pico-sdk/src/rp2_common/pico_stdio_uart/stdio_uart.c
Which means it's basically overriding my flags instead of the other way around. Somewhere it's including the override flags before the usual SDK given flags.
Would be nice if at least the C/CXX flags issue can be fixed. I understand if LTO is not happening for a while since it tends to uncover undefined behavior or bugs not seen without but in my experience it does help shrink down code size more.
Full build log: https://gist.github.com/profi200/718e20b9d3ebec87db60f78d0deb994b
The text was updated successfully, but these errors were encountered: