-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix problems found by clang+UBSan on Ubuntu 20.04 #6614
base: development
Are you sure you want to change the base?
Fix problems found by clang+UBSan on Ubuntu 20.04 #6614
Conversation
psa_cipher_encrypt() and psa_cipher_decrypt() sometimes add a zero offset to a null pointer when the cipher does not use an IV. This is undefined behavior, although it works as naively expected on most platforms. This can cause a crash under modern Asan (depending on compiler optimizations). Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
We had a GCC+ASan+UBSan full test run on both the default configuration and the full configuration. In the full configuration, use Clang instead. Empirically, a modern Clang's UBSan finds things that GCC's doesn't. Insist on a modern Clang (by default, our CI would pick the oldest available, which doesn't find as much stuff). Define "modern" as the most recent that's available on our CI right now, which is Ubuntu 20.04. Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
status = mbedtls_psa_cipher_finish( &operation, output + accumulated_length, | ||
status = mbedtls_psa_cipher_finish( &operation, | ||
( output == NULL ? NULL : | ||
output + accumulated_length ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with the introduction of mpi_sint_abs()
, the obvious question here is: since this construct is repeated four times, is it worth abstracting it, either with a function or a macro?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the fence. This is considerably simpler than mpi_sint_abs
. It feels like giving it a meaningful name will make the code longer and won't really be more readable.
Are you happy with seeing the code, or are you going “WTF why do we need a special case for NULL”? If the latter, then a macro/function with documentation that explains why it's needed sometimes is definitely warranted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's clear that it would be meaningless to compute an address based on NULL + offset, and so having a special case makes sense, but less obvious that it's UB (as opposed to simply a meaningless result) without the special case. So I think either way would be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's clear why it is done - no comment needed. It's just that the construct is clunky, and much less readable than the (UB-problematic) original code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A helper function would be useful here indeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is from #6609 and I won't change it again here. The pull request here is to arrange for such bugs to be caught by our CI.
My bad, only the bignum part was in the earlier PR, not the PSA crypto part. So this is still up for review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hmm, CI failing with
|
tests/scripts/all.sh
Outdated
@@ -924,10 +924,10 @@ component_test_default_cmake_gcc_asan () { | |||
tests/context-info.sh | |||
} | |||
|
|||
component_test_full_cmake_gcc_asan () { | |||
msg "build: full config, cmake, gcc, ASan" | |||
component_test_full_cmake_clang_asan () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a component_test_full_cmake_clang
and it would seem natural to merge it with component_test_full_cmake_clang_asan
. After all, we don't need to test both an asan and a non-instrumented build.
Unfortunately, the two components have different requirements. The whole point of this change is to insist on a recent clang. When running Ubuntu 20.04, that also means a recent openssl which doesn't support the same parts of compat.sh
. Between the two components, all of compat.sh
should run. But I don't think there's a single version of OpenSSL that supports everything (ARIA requires ≥1.1, null ciphersuites require <1.1, and for 2.28 there's also DES).
This is more difficult than I'd hoped and I may revise my ambitions down for Travis, especially considering that it's kind of on its way out now that we have OpenCI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In today's team meeting we decided that we could remove this component from Travis. The main reason were keeping Travis in addition to Jenkins was that it has public logs, but now with OpenCI so does Jenkins. We aren't going to invest time into Travis at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then again, having a run in the full config on Travis is really convenient. It lets us see at first glance if the “baseline” fails. So I've kept one, just not using all.sh
, so it doesn't have to have the same expectations of completeness.
Since this run doesn't use our official openssl and gnutls versions, it is not exactly identical to even a subset of a Jenkins job. This was already the case before: while we use the same versions of openssl and gnutls on Jenkins as in Ubuntu 16.04, we don't use the same builds, and this has bitten us before when Ubuntu removed some obsolete algorithms from the default configuration.
Having something on the CI that's close to a typical developer machine (at least closer than an out-of-support Ubuntu release) helps maintain the property that just running make && make test && tests/ssl-opt.sh && tests/compat.sh
can be expected to work out of the box.
I hope that by the time we retire Travis, we'll have modernized our interoperability testing (specifically, version reliance in the SSL test scripts) so that Jenkins can ensure this property.
Regarding the |
There was one with ASan and one without, for historical reasons. Have a single component with ASan. Build with make rather than CMake. We don't really care, and ASan+TEST_CPP is currently broken with CMake. Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
a5326df
d3e14f9
to
a5326df
Compare
a5326df
to
063b724
Compare
This makes ssl-opt.sh pass with non-ancient builds of openssl. Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
063b724
to
1a6894c
Compare
GnuTLS excludes some ciphers by default, and has no direct way to enable them all. So list them all and enable them. This fixes, at least, the exclusion of testing CAMELLIA with GnuTLS servers >= 3.4, which don't include CAMELLIA in NORMAL (GnuTLS 3.3.8 does). We no longer need an ad hoc, version-dependent list of things to enable beyond NORMAL. Remove the variable G_CLIENT_PRIO which was unused. Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Don't use an all.sh component because there isn't one that does what we want (modern clang with ASan, and test everything). * We need to set CC explicitly or tweak PATH, because clang in $PATH on Travis focal instances is Clang 7 which is too old (we want Clang 10). * Travis lacks the array of versions of openssl and gnutls that we normally use for testing, so we need to exclude some cipher suites (or build our own multiple versions of openssl and gnutls). Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
- make CC=clang-10 CXX=clang-10 CFLAGS='-Werror -Wall -Wextra -fsanitize=address,undefined -fno-sanitize-recover=all -O2' LDFLAGS='-Werror -Wall -Wextra -fsanitize=address,undefined -fno-sanitize-recover=all' | ||
- make test | ||
- tests/ssl-opt.sh | ||
# Modern openssl does not support fixed ECDH or null ciphers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect that in 2.28, we'll need to exclude more stuff, because 2.28 supports more obsolete features that the non-antique OpenSSL and GnuTLS don't.
This means that the backport risks being an even longer chain of whack-a-mole to upgrade or work around our obsolescent test data and scripts. If this can be merged before the 3.3 freeze (depending on CI success and reviewer time), I'd like to go ahead and merge even if the backport misses the release freeze.
@@ -3467,7 +3467,8 @@ psa_status_t psa_cipher_encrypt( mbedtls_svc_key_id_t key, | |||
status = psa_driver_wrapper_cipher_encrypt( | |||
&attributes, slot->key.data, slot->key.bytes, | |||
alg, local_iv, default_iv_length, input, input_length, | |||
output + default_iv_length, output_size - default_iv_length, | |||
( output == NULL ? NULL : output + default_iv_length ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a possibility, so we can have something concrete to discuss
( output == NULL ? NULL : output + default_iv_length ), | |
NULL_SAFE_PTR_ADD( output, default_iv_length ), |
which has the properties: the words NULL
and SAFE
give an indication of WHY this is done; PTR_ADD
should be sufficiently explanatory and easy to learn; doesn't require repeating output
or NULL
and is actually shorter.
The only disadvantage (depending how understandable you find PTR_ADD
) is that the +
sign disappears
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is from #6609 and I won't change it again here. I'm going to rebase those commits out, in fact, since they're causing confusion. (I hadn't rebased them because there are no merge conflicts and I thought keeping them would help incremental review.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like #6609 touched psa_crypto.c
- what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, my bad, only the bignum part was in the earlier PR, not the PSA crypto part. So this is still up for review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, why not separate that bit out into an independent PR? Then this one really can be "get the CI running a later/better sanitiser"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally wanted to have non-regression testing for the issues I'm fixing. But given how hard getting even one of Travis or Jenkins to run on Ubuntu 20.04 is turning out, I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tentative name mbedtls_buffer_offset
. I made it a function rather than a macro because it's safer, but a downside is that it's typed so there needs to be a separate name for the const
version.
@@ -924,27 +924,47 @@ component_test_default_cmake_gcc_asan () { | |||
tests/context-info.sh | |||
} | |||
|
|||
component_test_full_cmake_gcc_asan () { | |||
msg "build: full config, cmake, gcc, ASan" | |||
component_test_full_make_clang_asan () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://ci.trustedfirmware.org/job/mbed-tls-pr-merge/job/PR-6614-merge/8/flowGraphTable/ doesn't list this job. What's going on?
The pre-test check should have caught test_full_make_clang_asan
from all.sh --list-all-components
, and should have either complained that none of Docker images can run it or decided to dispatch on one of the Docker images, based on all.sh --list-components
output. And the Ubuntu 20.04 image should have included test_full_make_clang_asan
in its all.sh --list-components
.
The Jenkins logs don't show the ouput of all.sh --list-…
, unfortunately. The pipeline has no branch for all_u20-test_full_make_clang_asan
. So where did test_full_make_clang_asan
go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our Groovy code is silently skipping ubuntu-20.04 jobs 😦 Patch incoming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daverodgman @tom-cosgrove-arm The UB fix part, plus a reduced Travis test, is up for review at #6648 |
What's happening with this PR now? |
|
As part of our review of historical PRs we have made the decision to convert older PRs that have not been updated in 3 months into drafts until they are worked on again. |
Fix some issues reported by UBSan with a recent Clang (but not by a recent GCC or an old Clang). They are genuine instances of undefined behavior, although in practice compilers and processors are very likely to do what we want. This includes an issue fixed in #6609. Now tests will pass when building with Clang and UBSan on Ubuntu 20.04.
Run UBSan with a recent Clang on the CI. I replaced a GCC+UBSan component, so the CI load will remain the same. Fixes #6610.
Gatekeeper checklist