-
Notifications
You must be signed in to change notification settings - Fork 2.1k
cpu/arm7_common: Make irq_*() compiler barriers #11440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously the compiler was allowed to reorder access to the interrupt control registers in regard to memory access not marked as `volatile` (at least some people - most notably some compiler developers - read the C standard this way). In practise this did not happen as irq_disable(), irq_restore(), irq_enable() are part of a separate compilation unit: Calls to external functions unknown to the compiler are treated as if they were memory barriers. But if link time optimization (LTO) is enabled, this no longer would work: The compiler could inline the code accessing the interrupt control registers and reorder the memory accesses wrapped in irq_disable() and irq_restore() outside of their protection. This commit adds the "memory" clobber to the inline assembly accessing the interrupt control registers. This makes those accesses explicit compiler memory barriers. The machine code generated without LTO enabled should not differ in any way by this commit. But the use of irq_*() should now be safe with LTO.
The binary diffs I created were incorrect: I did not build within the same branch name, but the branch name is printed in the boot up message. And I also used Surprisingly, the builds without LTO now differ in size (+4 B .text), while those with LTO have the same size. |
@gschorcht: Can you check if the ESP8266 and the ESP32 implementation of The Xtensa is out-of-order, so a compiler only barrier would not be sufficient there. But maybe the RSIL instruction does implicitly work as a full barrier? (I found no information on that. In other code I only found an (Note that an external function is not effectively a barrier, as the compiler can move memory access across theses calls once that external function is no longer a black box. That is the case with LTO enabled, see #11438 (comment)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
@maribu I will try to take a look, but it will take some time since I have to understand the problem first. I'm quite busy by my regular business at the moment. |
@gschorcht: I do lack the knowledge about the Xtensa ISA, but from the other code I so I would guess is that only So from my point of view this not urgent (even though I think it is important to eventually check/fix it). |
@maribu I am not sure whether any problem could occur with the following code in uint32_t _saved_intlevel;
__asm__ volatile ("rsil %0, 0" : "=a" (_saved_intlevel));
_saved_intlevel &= 0xf;
return _saved_intlevel; Local variables are hold in registers and the compiler produces the following instructions for the code above: rsil a2, 0
extui a2, a2, 0, 4
retw.n The return value is stored in Xtensa Instruction Set Architecture (ISA) Reference Manual says:
The code produced for __asm__ volatile ("wsr %0, ps; rsync" :: "a" (state)); is of course wsr.ps a2
rsync
ret.n where Xtensa Instruction Set Architecture (ISA) Reference Manual says: |
@gschorcht: Perfect, so only the |
Sorry to ask again, I'm still not sure whether I understand the problem. If I understood you correclty, you are afraid that LTO might inline the code of these functions and reorder the memory access arround. Right? If so, do you think that this could happen even if the functions are present in ELF binaries? One other question arose to me related to the topic. GCC for Xtensa knows an option |
Didn't know about that, thanks for the hint!
|
Anyhow, this PR is good to go. |
@kaspar030 > It seems it is default anyways: (from "man gcc") Not for ESPs:
|
Exactly. The experiment in #11438 (comment) shows, that GCC does this for AVR.
Yes :-/
There have been some heated discussions lately on whether My personal point of view is to stick with that consensus and do not use |
Contribution description
Previously the compiler was allowed to reorder access to the interrupt control registers in regard to memory access not marked as
volatile
(at least some people - most notably some compiler developers - read the C standard this way). In practise this did not happen as irq_disable(), irq_restore(), irq_enable() are part of a separate compilation unit: Calls to external functions unknown to the compiler are treated as if they were memory barriers. But if link time optimization (LTO) is enabled, this no longer would work: The compiler could inline the code accessing the interrupt control registers and reorder the memory accesses wrapped in irq_disable() and irq_restore() outside of their protection.This commit adds the "memory" clobber to the inline assembly accessing the interrupt control registers. This makes those accesses explicit compiler memory barriers. The machine code generated without LTO enabled should not differ in any way by this commit. But the use of irq_*() should now be safe with LTO.
Testing procedure
Compile some application for an ARMv7 (no CortexM) board, e.g. the MSB-A2. Without LTO no differences before and after this PR should be present, with this PR they could be present.
E.g. here is the output of elf_diff generated for the binaries in
examples/default
generated by:make BOARD=msba2
https://mari-bu.de/arm7_no_lto.htmlmake BOARD=msba2 LTO=1
https://mari-bu.de/arm7_lto.htmlIssues/PRs references
Related to #11438, as this implements the barrier claimed to be present in the doc there
Update 1: Updated binary diffs