Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

toolchain: gcc: Add compiler barrier at the end of UNALIGNED_PUT() #8251

Merged

Conversation

pfalcon
Copy link
Contributor

@pfalcon pfalcon commented Jun 7, 2018

compiler_barrier() is itself defined down in this file. Without
adding it, newer versions of GCC (7+) for ARM Cortex-M may mistakenly
coalesce multiple strb/strh/str (store byte/half-word/word)
instructions, which support unaligned access on some
sub-architectures (Cortex-M3 and higher, but not on Cortex-M0),
into strd (store double), which doesn't support unaligned access.

Fixes: #6307

Signed-off-by: Paul Sokolovsky paul.sokolovsky@linaro.org

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

@therealprof, @mazhenke: FYI.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

This should really fix #6307.

The testing I did (besides peering at the assembly is running dump_http_server sample re: #7831).

@pfalcon pfalcon requested a review from galak June 7, 2018 11:51
@codecov-io
Copy link

codecov-io commented Jun 7, 2018

Codecov Report

Merging #8251 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8251      +/-   ##
==========================================
- Coverage   64.54%   64.53%   -0.01%     
==========================================
  Files         420      420              
  Lines       40117    40117              
  Branches     6762     6762              
==========================================
- Hits        25892    25891       -1     
  Misses      11108    11108              
- Partials     3117     3118       +1
Impacted Files Coverage Δ
boards/posix/native_posix/timer_model.c 98.21% <0%> (-1.79%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b9042c...35fc5f0. Read the comment docs.

@galak
Copy link
Collaborator

galak commented Jun 7, 2018

@pfalcon curious how you figured out a compiler_barrier was the thing to do.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

@galak, @mazhenke actually did, I guess I should have given him credit in the commit message. He suggested it in #6307 (comment) , and I googled it up as indeed suggested way to deal with related problems: #6307 (comment) . It's somewhat stronger than we need, but as we deal with apparent compiler bug, there's little choice, e.g. volatile would be even worse (didn't try if it helps though).

Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is happening, that's a compiler bug. A packed struct is by definition allowed to be misaligned.

Can you predicate this on the known-bad compiler/version so we don't take the performance hit everywhere? A memory clobber is very expensive (in the general case it means the compiler needs to re-fill every register it's populated).

@andyross
Copy link
Contributor

andyross commented Jun 7, 2018

Actually thinking about this, I think this might be our brain damage. A field in a packed struct is allowed to be missaligned, and the compiler is required to generate correct code to load it. I think the struct itself isn't subject to that rule and the compiler is free to assume it's naturally aligned. Maybe.

If we want to do this portably we probably need to be smarter about it, and maybe make these macros part of the arch layer.

@andyross
Copy link
Contributor

andyross commented Jun 7, 2018

Thinking more.... we can save the use of this trick by putting a single char in front of the __v element in the struct to force 1-byte alignment. Then the compiler is required to generate code for arbitrary alignment. Might work without the need for a memory clobber.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

A field in a packed struct is allowed to be missaligned, and the compiler is required to generate correct code to load it.

@andyross, you might be interested to read comments on the original issue, e.g. I followed just the same stream of thought yesterday: #6307 (comment)

@andyross
Copy link
Contributor

andyross commented Jun 7, 2018

Maybe try something like this? Note that it adds the char, offsets the pointer. It also removes a senseless typeof in a cast expression as void* works fine for that.

#define UNALIGNED_GET(p)
__extension__ ({
    struct  __attribute__((__packed__)) {
        char __dummy;                
        __typeof__(*(p)) __v;
    } *__p = (void *) (((char *)(p)) - 1);
    __p->__v;
})

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

a single char in front of the __v

Linking directly to the pertinent comments if you don't feel like reading all of them: #6307 (comment)

@andyross
Copy link
Contributor

andyross commented Jun 7, 2018

Hah, OK. If it's a bug it's a bug. I'm happy if you just #if it to ARM + GCC 7/8.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

I'm happy if you just #if it to ARM + GCC 7/8.

I can do that, but that doesn't guarantee that another arch won't hit it.

@andyross
Copy link
Contributor

andyross commented Jun 7, 2018

I can do that, but that doesn't guarantee that another arch won't hit it.

I 100% guarantee x86 won't hit it. :)

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

I 100% guarantee x86 won't hit it. :)

What about xtensa or arc, who'll debug them? ;-) No problem, I'll do the #if.

@galak
Copy link
Collaborator

galak commented Jun 7, 2018

Hah, OK. If it's a bug it's a bug. I'm happy if you just #if it to ARM + GCC 7/8.

if its a bug we should file a bug w/gcc arm toolchain:

https://bugs.launchpad.net/gcc-arm-embedded/+filebug/+login

@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 7, 2018

@galak :

if its a bug we should file a bug w/gcc arm toolchain:

I guess it would need to go directly to gcc (maybe via hops, yeah). I can try to do it in Le Mans ;-).

compiler_barrier() is itself defined down in this file. Without
adding it, newer versions of GCC (7+) for ARM Cortex-M may mistakenly
coalesce multiple strb/strh/str (store byte/half-word/word)
instructions, which support unaligned access on some
sub-architectures (Cortex-M3 and higher, but not on Cortex-M0),
into strd (store double), which doesn't support unaligned access.

Fixes: zephyrproject-rtos#6307

Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 8, 2018

@andyross :

I'm happy if you just #if it to ARM + GCC 7/8.

Done.

Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tempted to whine that it would have been cleaner to #define a BARRIER_WORKAROUND() to put in there that could have been an empty macro on other platforms and saved a bunch of lines in the patch. But style arguments shouldn't keep crash fixes out of the tree.

@MaureenHelm MaureenHelm merged commit a37e037 into zephyrproject-rtos:master Jun 8, 2018
@pfalcon pfalcon deleted the unaligned-put-barrier branch September 6, 2018 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants