cpu/arm7_common: Make irq_*() compiler barriers #11440

maribu · 2019-04-24T15:22:37Z

Contribution description

Previously the compiler was allowed to reorder access to the interrupt control registers in regard to memory access not marked as volatile (at least some people - most notably some compiler developers - read the C standard this way). In practise this did not happen as irq_disable(), irq_restore(), irq_enable() are part of a separate compilation unit: Calls to external functions unknown to the compiler are treated as if they were memory barriers. But if link time optimization (LTO) is enabled, this no longer would work: The compiler could inline the code accessing the interrupt control registers and reorder the memory accesses wrapped in irq_disable() and irq_restore() outside of their protection.

This commit adds the "memory" clobber to the inline assembly accessing the interrupt control registers. This makes those accesses explicit compiler memory barriers. The machine code generated without LTO enabled should not differ in any way by this commit. But the use of irq_*() should now be safe with LTO.

Testing procedure

Compile some application for an ARMv7 (no CortexM) board, e.g. the MSB-A2. Without LTO no differences before and after this PR should be present, with this PR they could be present.

E.g. here is the output of elf_diff generated for the binaries in examples/default generated by:

make BOARD=msba2 https://mari-bu.de/arm7_no_lto.html
make BOARD=msba2 LTO=1 https://mari-bu.de/arm7_lto.html

Issues/PRs references

Related to #11438, as this implements the barrier claimed to be present in the doc there

Update 1: Updated binary diffs

Previously the compiler was allowed to reorder access to the interrupt control registers in regard to memory access not marked as `volatile` (at least some people - most notably some compiler developers - read the C standard this way). In practise this did not happen as irq_disable(), irq_restore(), irq_enable() are part of a separate compilation unit: Calls to external functions unknown to the compiler are treated as if they were memory barriers. But if link time optimization (LTO) is enabled, this no longer would work: The compiler could inline the code accessing the interrupt control registers and reorder the memory accesses wrapped in irq_disable() and irq_restore() outside of their protection. This commit adds the "memory" clobber to the inline assembly accessing the interrupt control registers. This makes those accesses explicit compiler memory barriers. The machine code generated without LTO enabled should not differ in any way by this commit. But the use of irq_*() should now be safe with LTO.

cpu/arm7_common/VIC.c

maribu · 2019-04-26T08:14:08Z

The binary diffs I created were incorrect: I did not build within the same branch name, but the branch name is printed in the boot up message. And I also used LTO=yes instead of LTO=1. The bin diffs are now fixed accordingly.

Surprisingly, the builds without LTO now differ in size (+4 B .text), while those with LTO have the same size.

maribu · 2019-05-23T07:45:58Z

@gschorcht: Can you check if the ESP8266 and the ESP32 implementation of irq_disable(), irq_enable()andirq_restore()` is not affected by the same bug?

The Xtensa is out-of-order, so a compiler only barrier would not be sufficient there. But maybe the RSIL instruction does implicitly work as a full barrier? (I found no information on that. In other code I only found an rsync issued after wsr which your code in irq_restore() does as well.) If no additional barrier instructions need to be issued in the inline assembly, I still think that the inline assembly should have memory as clobber, so that the compiler does not move memory accesses across irq_disable() and irq_restore().

(Note that an external function is not effectively a barrier, as the compiler can move memory access across theses calls once that external function is no longer a black box. That is the case with LTO enabled, see #11438 (comment))

kaspar030

ACK.

gschorcht · 2019-05-23T09:59:03Z

@maribu I will try to take a look, but it will take some time since I have to understand the problem first. I'm quite busy by my regular business at the moment.

maribu · 2019-05-23T10:44:00Z

@gschorcht: I do lack the knowledge about the Xtensa ISA, but from the other code I so I would guess is that only memory needs to be added to the clobbers. If that is true, issues will not arise unless LTO is enabled, which by default is disabled.

So from my point of view this not urgent (even though I think it is important to eventually check/fix it).

gschorcht · 2019-05-23T22:20:06Z

@maribu I am not sure whether any problem could occur with the following code in irq_disable/irq_enable:

    uint32_t _saved_intlevel;
    __asm__ volatile ("rsil %0, 0" : "=a" (_saved_intlevel));
    _saved_intlevel &= 0xf;
    return _saved_intlevel;

Local variables are hold in registers and the compiler produces the following instructions for the code above:

    rsil   a2, 0
    extui  a2, a2, 0, 4
    retw.n

The return value is stored in a2.

Xtensa Instruction Set Architecture (ISA) Reference Manual says:
RSIL first reads the PS Special Register, writes this value to address register at, and then sets PS.INTLEVEL to a constant in the range 0..15 encoded in the instruction word. ... The instruction following the RSIL is guaranteed to be executed at the new interrupt level specified in PS.INTLEVEL, therefore it is not necessary to insert one of the SYNC instructions to force the interrupt level change to take effect.

rsil is a single instruction and the old interrupt level is stored in register a2 in any case before the new interrupt level is set. Even if a context switch would happen between rsil and the following extui to extract the lowest four bits, it shouldn't be a problem.

The code produced for irq_restore

    __asm__ volatile ("wsr %0, ps; rsync" :: "a" (state));

is of course

   wsr.ps  a2
   rsync
   ret.n

where a2 contains the new interrupt level as parameter.

Xtensa Instruction Set Architecture (ISA) Reference Manual says:
RSYNC waits for all previously fetched WSR.* instructions to be performed before interpreting the register fields of the next instruction.

maribu · 2019-05-24T05:35:22Z

@gschorcht: Perfect, so only the memory clobbers are needed there to make sure the compiler does not move memory accesses wrappen in irq_disable() / irq_restore() outside of it with LTO enabled.

gschorcht · 2019-05-24T08:24:06Z

@gschorcht: Perfect, so only the memory clobbers are needed there to make sure the compiler does not move memory accesses wrappen in irq_disable() / irq_restore() outside of it with LTO enabled.

Sorry to ask again, I'm still not sure whether I understand the problem. If I understood you correclty, you are afraid that LTO might inline the code of these functions and reorder the memory access arround. Right? If so, do you think that this could happen even if the functions are present in ELF binaries?

One other question arose to me related to the topic. GCC for Xtensa knows an option -mserialize-volatile. When this option is enabled, GCC inserts MEMW instructions before volatile memory references to guarantee sequential consistency. This option is disabled in GCC for ESPs by default. Do you think it makes sense to enable them?

kaspar030 · 2019-05-24T09:04:04Z

-mserialize-volatile

Didn't know about that, thanks for the hint!
It seems it is default anyways: (from "man gcc")

       -mserialize-volatile
       -mno-serialize-volatile
           When this option is enabled, GCC inserts "MEMW" instructions before "volatile" memory references to guarantee sequential
           consistency.  The default is -mserialize-volatile.  Use -mno-serialize-volatile to omit the "MEMW" instructions.

kaspar030 · 2019-05-24T09:05:09Z

Anyhow, this PR is good to go.

gschorcht · 2019-05-24T09:25:19Z

@kaspar030 > It seems it is default anyways: (from "man gcc")

Not for ESPs:

xtensa-lx106-elf-gcc -Q --help=target
The following options are target specific:
...
  -mno-serialize-volatile     		[enabled]

maribu · 2019-05-24T10:16:28Z

If I understood you correclty, you are afraid that LTO might inline the code of these functions and reorder the memory access arround.

Exactly. The experiment in #11438 (comment) shows, that GCC does this for AVR.

If so, do you think that this could happen even if the functions are present in ELF binaries?

Yes :-/

One other question arose to me related to the topic. GCC for Xtensa knows an option -mserialize-volatile. When this option is enabled, GCC inserts MEMW instructions before volatile memory references to guarantee sequential consistency. This option is disabled in GCC for ESPs by default. Do you think it makes sense to enable them?

There have been some heated discussions lately on whether volatile can be used for synchronization. As far as I know there is pretty much a consensus: E.g. the Linux Kernel Developers explicitly forbid this use of volatile, e.g. Intel considers volatile almost useless for multi-threaded programming, and the C standard guys have added an standard atomic library since C11, as they felt previous versions of C did not provide the required means to do synchronization.

My personal point of view is to stick with that consensus and do not use volatile for means of synchronization. As far as I know irq_disable() and irq_restore() are either used explicitly (in low level code) or indirectly through higher level APIs such as mutex (in high level code). Therefore, -mno-serialize-volatile should be safe. It would have a performance cost to enable -serialize-volatile and would slightly increase code size.

maribu added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Platform: ARM Platform: This PR/issue effects ARM-based platforms labels Apr 24, 2019

maribu requested a review from kaspar030 April 24, 2019 15:22

maribu mentioned this pull request Apr 24, 2019

sys: Added simple memory barrier API #11438

Closed

kaspar030 reviewed Apr 24, 2019

View reviewed changes

cpu/arm7_common/VIC.c Show resolved Hide resolved

maribu mentioned this pull request Apr 26, 2019

cpu/atmega_common: Bugfixes in irq_arch.c #11452

Merged

maribu requested a review from jcarrano May 23, 2019 07:12

kaspar030 approved these changes May 23, 2019

View reviewed changes

kaspar030 added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label May 23, 2019

kaspar030 merged commit 35d43cc into RIOT-OS:master May 24, 2019

gschorcht mentioned this pull request May 24, 2019

cpu/esp*: memory clobbers required for irq_* functions #11571

Closed

maribu deleted the arm7_barriers branch November 4, 2019 19:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

cpu/arm7_common: Make irq_*() compiler barriers #11440

cpu/arm7_common: Make irq_*() compiler barriers #11440

Uh oh!

maribu commented Apr 24, 2019 •

edited

Loading

Uh oh!

Uh oh!

maribu commented Apr 26, 2019

Uh oh!

maribu commented May 23, 2019

Uh oh!

kaspar030 left a comment

Uh oh!

gschorcht commented May 23, 2019

Uh oh!

maribu commented May 23, 2019

Uh oh!

gschorcht commented May 23, 2019

Uh oh!

maribu commented May 24, 2019

Uh oh!

gschorcht commented May 24, 2019

Uh oh!

kaspar030 commented May 24, 2019

Uh oh!

kaspar030 commented May 24, 2019

Uh oh!

gschorcht commented May 24, 2019

Uh oh!

maribu commented May 24, 2019

Uh oh!

Uh oh!

cpu/arm7_common: Make irq_*() compiler barriers #11440

cpu/arm7_common: Make irq_*() compiler barriers #11440

Uh oh!

Conversation

maribu commented Apr 24, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Contribution description

Testing procedure

Issues/PRs references

Uh oh!

Uh oh!

maribu commented Apr 26, 2019

Uh oh!

maribu commented May 23, 2019

Uh oh!

kaspar030 left a comment

Choose a reason for hiding this comment

Uh oh!

gschorcht commented May 23, 2019

Uh oh!

maribu commented May 23, 2019

Uh oh!

gschorcht commented May 23, 2019

Uh oh!

maribu commented May 24, 2019

Uh oh!

gschorcht commented May 24, 2019

Uh oh!

kaspar030 commented May 24, 2019

Uh oh!

kaspar030 commented May 24, 2019

Uh oh!

gschorcht commented May 24, 2019

Uh oh!

maribu commented May 24, 2019

Uh oh!

Uh oh!

maribu commented Apr 24, 2019 •

edited

Loading