Skip to content

Enabling FIPS mode on plain Ubuntu 22.04 and using crypto leads to infinite hang in CSPRNG #46200

Closed
@addaleax

Description

@addaleax

Version

v18.13.0, v19.4.0, main

Platform

Ubuntu 22.04 without modifications; Linux desktop-ua 5.15.0-57-generic #63-Ubuntu SMP Thu Nov 24 13:43:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

crypto

What steps will reproduce the bug?

Dockerfile

$ cat fips-loop.js 
const crypto = require('crypto');
crypto.setFips(1);
crypto.randomBytes(20, console.log);
$ gdb --args ./node-v19.4.0-linux-x64/bin/node ./fips-loop.js
[…]
(gdb) r
[…]
^C
[…]
(gdb) thread apply all bt
[…]
#0  0x0000555b0aef4018 in _dopr ()
#1  0x0000555b0aef5112 in BIO_vsnprintf ()
#2  0x0000555b0af952dd in ERR_vset_error ()
#3  0x0000555b0af95433 in ERR_set_error ()
#4  0x0000555b0afbcacb in evp_generic_fetch ()
#5  0x0000555b0afc2192 in EVP_RAND_fetch ()
#6  0x0000555b0b026660 in rand_new_drbg ()
#7  0x0000555b0b0277d6 in RAND_get0_public ()
#8  0x0000555b0b027858 in RAND_bytes_ex ()
#9  0x0000555b09ac730f in node::crypto::CSPRNG (buffer=0x555b0f0b7960, length=20) at ../src/crypto/crypto_util.cc:66
#10 0x0000555b09ab9054 in node::crypto::RandomBytesTraits::DeriveBits (env=0x555b0f32db30, params=..., unused=0x555b0f34cca0) at ../src/crypto/crypto_random.cc:69
#11 0x0000555b09abd1c3 in node::crypto::DeriveBitsJob<node::crypto::RandomBytesTraits>::DoThreadPoolWork (this=0x555b0f34cb90) at ../src/crypto/crypto_util.h:500
#12 0x0000555b097dd98c in node::ThreadPoolWork::ScheduleWork()::{lambda(uv_work_s*)#1}::operator()(uv_work_s*) const (__closure=0x0, req=0x555b0f34cbd8) at ../src/threadpoolwork-inl.h:44
#13 0x0000555b097dda7d in node::ThreadPoolWork::ScheduleWork()::{lambda(uv_work_s*)#1}::_FUN(uv_work_s*) () at ../src/threadpoolwork-inl.h:47
#14 0x0000555b0ac81a58 in worker (arg=0x0) at ../deps/uv/src/threadpool.c:122
#15 0x00007fafcb516b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#16 0x00007fafcb5a8a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

How often does it reproduce? Is there a required condition?

Always. No.

What is the expected behavior?

Some type of error indicating that OpenSSL is not configured properly for FIPS mode on the machine, which I assume this is the root cause here.

(I am not expecting this to really work and give me random bytes.)

What do you see instead?

Infinite hang.

Additional information

I think this is a problem that other people have run into before, e.g. #38633 (review) cc @danbev @richardlau

In the debugger, it’s visible that RAND_poll and RAND_status keep returning 1 but RAND_bytes keeps returning 0 (code).

Activity

added
cryptoIssues and PRs related to the crypto subsystem.
opensslIssues and PRs related to the OpenSSL dependency.
on Jan 13, 2023
addaleax

addaleax commented on Jan 13, 2023

@addaleax
MemberAuthor

@nodejs/crypto

tniessen

tniessen commented on Jan 13, 2023

@tniessen
Member

Quoting myself from an internal discussion of 5cc36c3:

FWIW this can result in an endless loop when RAND_bytes() fails for some reason other than missing entropy, but that probably does not happen in practice.

@richardlau mentioned running into this case when experimenting with FIPS.

richardlau

richardlau commented on Jan 13, 2023

@richardlau
Member

Yes, I ran into this last year when attempting to extend #44148 to cover the FIPS provider. I got sidetracked onto other Build things, but from memory:

The issue here is that one of the RAND_* calls under the covers attempts to load an algorithm from the providers. On OpenSSL 3, Node.js' crypto.setFips() calls EVP_default_properties_enable_fips, which effectively filters the algorithms to those from the FIPS provider:

if (!EVP_default_properties_enable_fips(nullptr, enable)) {
.
However, it is possible to load Node.js without the FIPS provider configured (either not there at all or incorrectly configured in the openssl conf being loaded) and still enable the filter, in which case no matching algorithm will be available and we end up in the observed loop.

(Note that we do not currently attempt to load the FIPS provider in crypto.setFips() (but do if the command line options are used, although we then unload, which I'm uncertain is what we want:

bool ProcessFipsOptions() {
/* Override FIPS settings in configuration file, if needed. */
if (per_process::cli_options->enable_fips_crypto ||
per_process::cli_options->force_fips_crypto) {
#if OPENSSL_VERSION_MAJOR >= 3
OSSL_PROVIDER* fips_provider = OSSL_PROVIDER_load(nullptr, "fips");
if (fips_provider == nullptr)
return false;
OSSL_PROVIDER_unload(fips_provider);
return EVP_default_properties_enable_fips(nullptr, 1) &&
EVP_default_properties_is_fips_enabled(nullptr);
#else
return FIPS_mode() == 0 && FIPS_mode_set(1);
#endif
}
return true;
}
).)

I think I looked at, or was going to look at, adding a OSSL_provider_available() (which I also remember confusingly checks if the provider is actually loaded rather than being available to load) check to crypto.setFips(). I don't remember if that actually worked, or if I had second thoughts because theoretically you could have FIPS providers that were not called "fips" (i.e. someone added their own provider). Supporting FIPS when your FIPS provider is not called "fips" could be an edge case that we choose not to support via a Node.js API (i.e. you'd have to do it via openssl config).

Or another option may have been to unconditionally load the "fips" provider (an assumption here being this is always the provider called "fips") in crypto.setFips() when enabling fips -- I think OpenSSL does reference counting so loading the provider multiple times should be okay. This should error out in the recreate in this issue's description. I think my question then was whether disabling FIPS via crypto.setFips() should also attempt to unload the "fips" provider.

richardlau

richardlau commented on Jan 13, 2023

@richardlau
Member

I've just written all of that and then went back to find:

@richardlau mentioned running into this case when experimenting with FIPS.

and I actually quoted an example (contrived) without FIPS being involved 😆:

$ cat test/fixtures/openssl3-conf/base_only.cnf
nodejs_conf = nodejs_init

[nodejs_init]
providers = provider_sect

# List of providers to load
[provider_sect]
base = base_sect

[base_sect]
activate = 1
$ OPENSSL_CONF=test/fixtures/openssl3-conf/base_only.cnf ./node

Same basic cause -- the RAND_* function that is attempting to load an algorithm doesn't find a matching one and then our code continually loops. I don't recall if I identified the algorithm it was looking for.

addaleax

addaleax commented on Jan 17, 2023

@addaleax
MemberAuthor

@richardlau Sooo … I definitely don’t feel like I am familiar enough with this subject to have an opinion about how to move forward, exactly.

I don’t think this would be a full solution to this problem, but would it make sense to at least detect this specific error in the CSPRNG call and return instead of infinite looping?

diff --git a/src/crypto/crypto_util.cc b/src/crypto/crypto_util.cc
index 780dab082459..2927645e6405 100644
--- a/src/crypto/crypto_util.cc
+++ b/src/crypto/crypto_util.cc
@@ -62,9 +62,22 @@ int VerifyCallback(int preverify_ok, X509_STORE_CTX* ctx) {
 
 MUST_USE_RESULT CSPRNGResult CSPRNG(void* buffer, size_t length) {
   do {
-    if (1 == RAND_status())
-      if (1 == RAND_bytes(static_cast<unsigned char*>(buffer), length))
+    if (1 == RAND_status()) {
+      if (1 == RAND_bytes(static_cast<unsigned char*>(buffer), length)) {
         return {true};
+      } else {
+#if OPENSSL_VERSION_MAJOR >= 3
+        auto code = ERR_peek_last_error();
+        // A misconfigured OpenSSL 3 installation may report 1 from RAND_poll()
+        // and RAND_status() but fail in RAND_bytes() if it cannot look up
+        // a matching algorithm for the CSPRNG.
+        if (ERR_GET_LIB(code) == ERR_LIB_RAND &&
+            ERR_GET_REASON(code) == RAND_R_UNABLE_TO_FETCH_DRBG) {
+          return {false};
+        }
+      }
+#endif
+    }
   } while (1 == RAND_poll());
 
   return {false};

or would that be too naïve? I think for me it would solve the problem because then it would be possible to try to see what crypto.randomBytes() does immediately after enabling FIPS and seeing if it works, which is at least easily detectable, unlike an infinite loop.

richardlau

richardlau commented on Jan 17, 2023

@richardlau
Member

@addaleax I've been playing around with something similar: richardlau@def2b01

diff --git a/src/crypto/crypto_util.cc b/src/crypto/crypto_util.cc
index 780dab082459..be467457887d 100644
--- a/src/crypto/crypto_util.cc
+++ b/src/crypto/crypto_util.cc
@@ -62,6 +62,12 @@ int VerifyCallback(int preverify_ok, X509_STORE_CTX* ctx) {
 
 MUST_USE_RESULT CSPRNGResult CSPRNG(void* buffer, size_t length) {
   do {
+#if OPENSSL_VERSION_MAJOR >= 3
+    const uint32_t err = ERR_peek_error();
+    if (err == ERR_PACK(ERR_LIB_RAND, 0, RAND_R_UNABLE_TO_FETCH_DRBG)) {
+      return {false};
+    }
+#endif
     if (1 == RAND_status())
       if (1 == RAND_bytes(static_cast<unsigned char*>(buffer), length))
         return {true};
(I like your version of using `ERR_GET_LIB` and `ERR_GET_REASON` more.) Hadn't opened the PR yet because I was still working on the testcases (the one in my commit doesn't work properly if FIPS is enabled).

I'm a bit unsure myself whether this checking for this specific error is distinct enough from the missing entropy case the loop is meant to address, but I'm also tending towards having a detectable error in these cases.

addaleax

addaleax commented on Jan 17, 2023

@addaleax
MemberAuthor

I'm a bit unsure myself whether this checking for this specific error is distinct enough from the missing entropy case the loop is meant to address

I’d actually feel like in an ideal world, we’d be checking for the missing-entropy error and only in that case continue to run the loop, but at the same time that comes with a larger risk of unintentional breakage (or at least it feels like that to me).

3 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    cryptoIssues and PRs related to the crypto subsystem.opensslIssues and PRs related to the OpenSSL dependency.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      Enabling FIPS mode on plain Ubuntu 22.04 and using crypto leads to infinite hang in CSPRNG · Issue #46200 · nodejs/node