-
Notifications
You must be signed in to change notification settings - Fork 12
Fix UBSan "misaligned-pointer-use" warning on aarch64 #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
32197d1
to
43180d7
Compare
From bitcoin/bitcoin#29178 (comment):
Done. cc @theuni |
This doesn't make sense. 10 lines above, we already do the same thing - why isn't that an issue? Between then and this, data is only incremented by sizeof(uint64_t) which shouldn't break alignment... Can't we just guarantee the data is always aligned somehow? |
Compiling on an aarch64 Fedora machine, with So if there is an actual bug, this change just makes the UBSan (false positive?) warning "go away", and the bug would still be in the binary in any case? |
As far as I understand, there are two possible cases depending on the actual compiler and the ARM hardware.
Assuming that it is not feasible to test every ARM hardware and compiler combination, this change is an improvement. |
Agree this is strange. I have no explanation. Maybe an instruction that assumes alignment only gets emitted in the second loop? I don't have enough compiler-fu to even hazard a guess as to the difference. |
Oh 🤦 This could be the case if |
@hebasto Would it be possible for you to test the above theory by reproducing the UBSan warning with either:
? |
I have no access to the aarch64 machine at the moment. I'll try to regain it as soon as possible. |
Ok, I can confirm this is the case. I managed to reproduce on macos, and added some assertions and printfs. While fuzzing with So I think @hebasto and @luke-jr are both correct here. The cast is not safe and should be a memcpy. But that's the case for the I can confirm that we also fall into the Ping @hebasto . Mind extending the memcpy treatment to the other cases as well just to be safe? @dergoegge ping for visibility |
43180d7
to
4d32620
Compare
Thanks! Done. |
What about these: crc32c-subtree/src/crc32c_arm64.cc Lines 115 to 123 in 4d32620
I think I agree with cory that it'd be best to cover those too. |
I discussed this with @madars yesterday, who suggested using crc32c/src/crc32c_arm64.cc:91:28: warning: cast from ‘const uint8_t*’ {aka ‘const unsigned char*’} to ‘uint64_t*’ {aka ‘long unsigned int*’} increases required alignment of target type [-Wcast-align]
91 | crc = __crc32cd(crc3, *(uint64_t *)data);
| ^~~~~~~~~~~~~~~~
crc32c/src/crc32c_arm64.cc:101:27: warning: cast from ‘const uint8_t*’ {aka ‘const unsigned char*’} to ‘uint64_t*’ {aka ‘long unsigned int*’} increases required alignment of target type [-Wcast-align]
101 | crc = __crc32cd(crc, *(uint64_t *)data);
| ^~~~~~~~~~~~~~~~
crc32c/src/crc32c_arm64.cc:107:27: warning: cast from ‘const uint8_t*’ {aka ‘const unsigned char*’} to ‘uint32_t*’ {aka ‘unsigned int*’} increases required alignment of target type [-Wcast-align]
107 | crc = __crc32cw(crc, *(uint32_t *)data);
| ^~~~~~~~~~~~~~~~
crc32c/src/crc32c_arm64.cc:112:27: warning: cast from ‘const uint8_t*’ {aka ‘const unsigned char*’} to ‘uint16_t*’ {aka ‘short unsigned int*’} increases required alignment of target type [-Wcast-align]
112 | crc = __crc32ch(crc, *(uint16_t *)data); I think those warnings are legitimate. It also complains about the macros which lead to So basically the entire accelerated arm64 path is at best iffy and at worst unsafe. I'm wondering if we should just disable it completely? Edit: s/ |
Sure enough. I cloned the upstream crc32c repo and built/ran tests with sanitizers on my m1 macbook:
Yikes. |
4d32620
to
6b61672
Compare
Addressed all cases pointed out when using |
Nice. With this patch hacked into the upstream benchmarks, I don't see any significant difference before/after:
after:
However the sanitizer warnings are now gone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several different ways to do this, but I'm fine with this as-is because it demonstrates no obvious slowdown in the benchmarks.
ACK modulo optional nits.
6b61672
to
4b5d72a
Compare
Addressed @theuni's nits. |
70a8d49
to
68ab55a
Compare
As `data` type is a pointer to `uint8_t`, the access to the memory via the `(uint64_t *)data` pointer is misaligned. It can be an issue on ARM hardware. As suggested in https://en.cppreference.com/w/cpp/language/reinterpret_cast: "When it is needed to interpret the bytes of an object as a value of a different type, std::memcpy ... can be used". For more details, please refer to bitcoin/bitcoin#29178 Also: https://stackoverflow.com/questions/32062894/take-advantage-of-arm-unaligned-memory-access-while-writing-clean-c-code
68ab55a
to
1ac401e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you submit the same upstream? It may not be maintained anymore, but if it is, they may want to pick this as well? |
|
@luke-jr @dergoegge @maflcko Mind having a quick look at this to confirm correctness? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 1ac401e
Code change looks correct. I've also done some differential fuzzing of the x86 implementation against both pre- and post-patch arm64 version.
b60d2b7334 Merge bitcoin-core/crc32c-subtree#6: Fix UBSan "misaligned-pointer-use" warning on aarch64 1ac401e32b Fix UBSan "misaligned-pointer-use" warning on aarch64 git-subtree-dir: src/crc32c git-subtree-split: b60d2b733406cc64025095c6c2cb3933e222b529
5d45552 Squashed 'src/crc32c/' changes from 0bac72c455..b60d2b7334 (fanquake) Pull request description: Update the crc32c subtree. Includes: * bitcoin-core/crc32c-subtree#6 Which fixes #29178. ACKs for top commit: hebasto: ACK 359a8d9. theuni: ACK 359a8d9 dergoegge: ACK 359a8d9 Tree-SHA512: 2cec81a34ad26bbbc298aea5daffa41e56114d31cc2eb5fe486f46a77c3467bba22bdeca1c52ae97220e119d98818304272fc6337442af55282accabcd4c5833
b60d2b73340 Merge bitcoin-core/crc32c-subtree#6: Fix UBSan "misaligned-pointer-use" warning on aarch64 1ac401e32bd Fix UBSan "misaligned-pointer-use" warning on aarch64 git-subtree-dir: src/crc32c git-subtree-split: b60d2b733406cc64025095c6c2cb3933e222b529
b60d2b7334 Merge bitcoin-core/crc32c-subtree#6: Fix UBSan "misaligned-pointer-use" warning on aarch64 1ac401e32b Fix UBSan "misaligned-pointer-use" warning on aarch64 git-subtree-dir: src/crc32c git-subtree-split: b60d2b733406cc64025095c6c2cb3933e222b529
b60d2b7334 Merge bitcoin-core/crc32c-subtree#6: Fix UBSan "misaligned-pointer-use" warning on aarch64 1ac401e32b Fix UBSan "misaligned-pointer-use" warning on aarch64 git-subtree-dir: src/crc32c git-subtree-split: b60d2b733406cc64025095c6c2cb3933e222b529
b60d2b7334 Merge bitcoin-core/crc32c-subtree#6: Fix UBSan "misaligned-pointer-use" warning on aarch64 1ac401e32b Fix UBSan "misaligned-pointer-use" warning on aarch64 git-subtree-dir: src/crc32c git-subtree-split: b60d2b733406cc64025095c6c2cb3933e222b529
5d45552 Squashed 'src/crc32c/' changes from 0bac72c455..b60d2b7334 (fanquake) Pull request description: Update the crc32c subtree. Includes: * bitcoin-core/crc32c-subtree#6 Which fixes bitcoin#29178. ACKs for top commit: hebasto: ACK 359a8d9. theuni: ACK 359a8d9 dergoegge: ACK 359a8d9 Tree-SHA512: 2cec81a34ad26bbbc298aea5daffa41e56114d31cc2eb5fe486f46a77c3467bba22bdeca1c52ae97220e119d98818304272fc6337442af55282accabcd4c5833
5d45552 Squashed 'src/crc32c/' changes from 0bac72c455..b60d2b7334 (fanquake) Pull request description: Update the crc32c subtree. Includes: * bitcoin-core/crc32c-subtree#6 Which fixes bitcoin#29178. ACKs for top commit: hebasto: ACK 359a8d9. theuni: ACK 359a8d9 dergoegge: ACK 359a8d9 Tree-SHA512: 2cec81a34ad26bbbc298aea5daffa41e56114d31cc2eb5fe486f46a77c3467bba22bdeca1c52ae97220e119d98818304272fc6337442af55282accabcd4c5833
As
data
type is a pointer touint8_t
, the access to the memory via the(uint64_t *)data
pointer is misaligned. It can be an issue on ARM hardware. As suggested in https://en.cppreference.com/w/cpp/language/reinterpret_cast:For more details, please refer to bitcoin/bitcoin#29178.
Also: https://stackoverflow.com/questions/32062894/take-advantage-of-arm-unaligned-memory-access-while-writing-clean-c-code.