Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix problems found by clang+UBSan on Ubuntu 20.04 #6614

Draft
wants to merge 6 commits into
base: development
Choose a base branch
from

Conversation

gilles-peskine-arm
Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm commented Nov 16, 2022

Fix some issues reported by UBSan with a recent Clang (but not by a recent GCC or an old Clang). They are genuine instances of undefined behavior, although in practice compilers and processors are very likely to do what we want. This includes an issue fixed in #6609. Now tests will pass when building with Clang and UBSan on Ubuntu 20.04.

Run UBSan with a recent Clang on the CI. I replaced a GCC+UBSan component, so the CI load will remain the same. Fixes #6610.

Gatekeeper checklist

  • changelog no (test only)
  • backport TODO
  • tests N/A (it's all test)

psa_cipher_encrypt() and psa_cipher_decrypt() sometimes add a zero offset to
a null pointer when the cipher does not use an IV. This is undefined
behavior, although it works as naively expected on most platforms. This
can cause a crash under modern Asan (depending on compiler optimizations).

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
We had a GCC+ASan+UBSan full test run on both the default configuration and
the full configuration. In the full configuration, use Clang instead.
Empirically, a modern Clang's UBSan finds things that GCC's doesn't.

Insist on a modern Clang (by default, our CI would pick the oldest
available, which doesn't find as much stuff). Define "modern" as the most
recent that's available on our CI right now, which is Ubuntu 20.04.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
@gilles-peskine-arm gilles-peskine-arm added bug enhancement component-crypto Crypto primitives and low-level interfaces needs-ci Needs to pass CI tests needs-preceding-pr Requires another PR to be merged first component-test Test framework and CI scripts priority-medium Medium priority - this can be reviewed as time permits labels Nov 16, 2022
status = mbedtls_psa_cipher_finish( &operation, output + accumulated_length,
status = mbedtls_psa_cipher_finish( &operation,
( output == NULL ? NULL :
output + accumulated_length ),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with the introduction of mpi_sint_abs(), the obvious question here is: since this construct is repeated four times, is it worth abstracting it, either with a function or a macro?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence. This is considerably simpler than mpi_sint_abs. It feels like giving it a meaningful name will make the code longer and won't really be more readable.

Are you happy with seeing the code, or are you going “WTF why do we need a special case for NULL”? If the latter, then a macro/function with documentation that explains why it's needed sometimes is definitely warranted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's clear that it would be meaningless to compute an address based on NULL + offset, and so having a special case makes sense, but less obvious that it's UB (as opposed to simply a meaningless result) without the special case. So I think either way would be fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's clear why it is done - no comment needed. It's just that the construct is clunky, and much less readable than the (UB-problematic) original code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A helper function would be useful here indeed

Copy link
Contributor Author

@gilles-peskine-arm gilles-peskine-arm Nov 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is from #6609 and I won't change it again here. The pull request here is to arrange for such bugs to be caught by our CI.
My bad, only the bignum part was in the earlier PR, not the PSA crypto part. So this is still up for review.

@daverodgman daverodgman self-requested a review November 16, 2022 15:56
daverodgman
daverodgman previously approved these changes Nov 16, 2022
Copy link
Contributor

@tom-cosgrove-arm tom-cosgrove-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tom-cosgrove-arm
Copy link
Contributor

Hmm, CI failing with

Component test_full_cmake_clang_asan was explicitly requested, but is not known or not supported.

.travis.yml Outdated Show resolved Hide resolved
@@ -924,10 +924,10 @@ component_test_default_cmake_gcc_asan () {
tests/context-info.sh
}

component_test_full_cmake_gcc_asan () {
msg "build: full config, cmake, gcc, ASan"
component_test_full_cmake_clang_asan () {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a component_test_full_cmake_clang and it would seem natural to merge it with component_test_full_cmake_clang_asan. After all, we don't need to test both an asan and a non-instrumented build.

Unfortunately, the two components have different requirements. The whole point of this change is to insist on a recent clang. When running Ubuntu 20.04, that also means a recent openssl which doesn't support the same parts of compat.sh. Between the two components, all of compat.sh should run. But I don't think there's a single version of OpenSSL that supports everything (ARIA requires ≥1.1, null ciphersuites require <1.1, and for 2.28 there's also DES).

This is more difficult than I'd hoped and I may revise my ambitions down for Travis, especially considering that it's kind of on its way out now that we have OpenCI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In today's team meeting we decided that we could remove this component from Travis. The main reason were keeping Travis in addition to Jenkins was that it has public logs, but now with OpenCI so does Jenkins. We aren't going to invest time into Travis at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then again, having a run in the full config on Travis is really convenient. It lets us see at first glance if the “baseline” fails. So I've kept one, just not using all.sh, so it doesn't have to have the same expectations of completeness.

Since this run doesn't use our official openssl and gnutls versions, it is not exactly identical to even a subset of a Jenkins job. This was already the case before: while we use the same versions of openssl and gnutls on Jenkins as in Ubuntu 16.04, we don't use the same builds, and this has bitten us before when Ubuntu removed some obsolete algorithms from the default configuration.

Having something on the CI that's close to a typical developer machine (at least closer than an out-of-support Ubuntu release) helps maintain the property that just running make && make test && tests/ssl-opt.sh && tests/compat.sh can be expected to work out of the box.

I hope that by the time we retire Travis, we'll have modernized our interoperability testing (specifically, version reliance in the SSL test scripts) so that Jenkins can ensure this property.

@tom-cosgrove-arm
Copy link
Contributor

tom-cosgrove-arm commented Nov 17, 2022

Regarding the preceding-pr #6609, is this one instead of that older one, or does 6609 need reviewing in additon?

@gilles-peskine-arm
Copy link
Contributor Author

This PR has all the commits of #6609 (so far) plus more stuff. I'd prefer to get #6609 (and trivial backport) in first, then rebase this one (and backport which might be nontrivial depending on exactly how all.sh evolves and what this means with SSL testing).

There was one with ASan and one without, for historical reasons. Have a
single component with ASan.

Build with make rather than CMake. We don't really care, and ASan+TEST_CPP
is currently broken with CMake.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
This makes ssl-opt.sh pass with non-ancient builds of openssl.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
GnuTLS excludes some ciphers by default, and has no direct way to enable
them all. So list them all and enable them.

This fixes, at least, the exclusion of testing CAMELLIA with GnuTLS servers
>= 3.4, which don't include CAMELLIA in NORMAL (GnuTLS 3.3.8 does).

We no longer need an ad hoc, version-dependent list of things to enable
beyond NORMAL.

Remove the variable G_CLIENT_PRIO which was unused.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
Don't use an all.sh component because there isn't one that does what we
want (modern clang with ASan, and test everything).

* We need to set CC explicitly or tweak PATH, because clang in $PATH on
  Travis focal instances is Clang 7 which is too old (we want Clang 10).
* Travis lacks the array of versions of openssl and gnutls that we normally
  use for testing, so we need to exclude some cipher suites (or build our
  own multiple versions of openssl and gnutls).

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
- make CC=clang-10 CXX=clang-10 CFLAGS='-Werror -Wall -Wextra -fsanitize=address,undefined -fno-sanitize-recover=all -O2' LDFLAGS='-Werror -Wall -Wextra -fsanitize=address,undefined -fno-sanitize-recover=all'
- make test
- tests/ssl-opt.sh
# Modern openssl does not support fixed ECDH or null ciphers.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that in 2.28, we'll need to exclude more stuff, because 2.28 supports more obsolete features that the non-antique OpenSSL and GnuTLS don't.

This means that the backport risks being an even longer chain of whack-a-mole to upgrade or work around our obsolescent test data and scripts. If this can be merged before the 3.3 freeze (depending on CI success and reviewer time), I'd like to go ahead and merge even if the backport misses the release freeze.

@@ -3467,7 +3467,8 @@ psa_status_t psa_cipher_encrypt( mbedtls_svc_key_id_t key,
status = psa_driver_wrapper_cipher_encrypt(
&attributes, slot->key.data, slot->key.bytes,
alg, local_iv, default_iv_length, input, input_length,
output + default_iv_length, output_size - default_iv_length,
( output == NULL ? NULL : output + default_iv_length ),
Copy link
Contributor

@tom-cosgrove-arm tom-cosgrove-arm Nov 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a possibility, so we can have something concrete to discuss

Suggested change
( output == NULL ? NULL : output + default_iv_length ),
NULL_SAFE_PTR_ADD( output, default_iv_length ),

which has the properties: the words NULL and SAFE give an indication of WHY this is done; PTR_ADD should be sufficiently explanatory and easy to learn; doesn't require repeating output or NULL and is actually shorter.

The only disadvantage (depending how understandable you find PTR_ADD) is that the + sign disappears

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is from #6609 and I won't change it again here. I'm going to rebase those commits out, in fact, since they're causing confusion. (I hadn't rebased them because there are no merge conflicts and I thought keeping them would help incremental review.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like #6609 touched psa_crypto.c - what am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, my bad, only the bignum part was in the earlier PR, not the PSA crypto part. So this is still up for review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, why not separate that bit out into an independent PR? Then this one really can be "get the CI running a later/better sanitiser"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally wanted to have non-regression testing for the issues I'm fixing. But given how hard getting even one of Travis or Jenkins to run on Ubuntu 20.04 is turning out, I'll do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tentative name mbedtls_buffer_offset. I made it a function rather than a macro because it's safer, but a downside is that it's typed so there needs to be a separate name for the const version.

@@ -924,27 +924,47 @@ component_test_default_cmake_gcc_asan () {
tests/context-info.sh
}

component_test_full_cmake_gcc_asan () {
msg "build: full config, cmake, gcc, ASan"
component_test_full_make_clang_asan () {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://ci.trustedfirmware.org/job/mbed-tls-pr-merge/job/PR-6614-merge/8/flowGraphTable/ doesn't list this job. What's going on?

The pre-test check should have caught test_full_make_clang_asan from all.sh --list-all-components, and should have either complained that none of Docker images can run it or decided to dispatch on one of the Docker images, based on all.sh --list-components output. And the Ubuntu 20.04 image should have included test_full_make_clang_asan in its all.sh --list-components.

The Jenkins logs don't show the ouput of all.sh --list-…, unfortunately. The pipeline has no branch for all_u20-test_full_make_clang_asan. So where did test_full_make_clang_asan go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our Groovy code is silently skipping ubuntu-20.04 jobs 😦 Patch incoming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gilles-peskine-arm
Copy link
Contributor Author

@daverodgman @tom-cosgrove-arm The UB fix part, plus a reduced Travis test, is up for review at #6648

@tom-cosgrove-arm
Copy link
Contributor

What's happening with this PR now?

@gilles-peskine-arm
Copy link
Contributor Author

@tom-cosgrove-arm

  1. Get the UB fix and minimum-effort Travis test in (Fix NULL+0 undefined behavior in PSA crypto ECB #6648 and I'm still working on the minimum-effort Travis test for 2.28, hope to get that passing today).
  2. Merge Fix dispatch to Ubuntu 20.04 mbedtls-test#76
  3. Rebase this PR to be just the all.sh changes (which are passing! But not running until the Groovy code is fixed)

@gilles-peskine-arm gilles-peskine-arm added the needs-preceding-pr Requires another PR to be merged first label Nov 24, 2022
@tom-daubney-arm tom-daubney-arm added the historical-reviewed Reviewed & agreed to keep legacy PR/issue label May 18, 2023
@tom-daubney-arm
Copy link
Contributor

As part of our review of historical PRs we have made the decision to convert older PRs that have not been updated in 3 months into drafts until they are worked on again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug component-crypto Crypto primitives and low-level interfaces component-test Test framework and CI scripts enhancement historical-reviewed Reviewed & agreed to keep legacy PR/issue needs-ci Needs to pass CI tests needs-preceding-pr Requires another PR to be merged first needs-work priority-medium Medium priority - this can be reviewed as time permits
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI run with modern Clang+ASan+UBSan
6 participants