Skip to content

crypto: revert dangerous uses of std::string_view #57816

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 15, 2025

Conversation

tniessen
Copy link
Member

An std::string_view v is a const char* v.data() along with an std::size_t v.size() that guarantees that v.size() contiguous elements of type char can be accessed relative to the pointer v.data().

One of the main reasons behind the existence of std::string_view is the ability to operate on char sequences without requiring null termination, which otherwise often requires expensive copies of strings to be made. As a consequence, it is generally incorrect to assume that v.data() points to a null-terminated sequence of char, and the only way to obtain a null-terminated string from an std::string_view is to make a copy. It is not even possible to check if the sequence pointed to by v.data() is null-terminated because the null character would be at position v.data() + v.size(), which is outside of the range that v guarantees safe access to. (A default-constructed std::string_view even sets its own data pointer to a nullptr, which is fine because it only needs to guarantee safe access to zero elements, i.e., to no elements).

In deps/ncrypto and src/crypto, there are various APIs that consume std::string_view v arguments but then ignore v.size() and treat v.data() as a C-style string of type const char*. However, that is not what call sites would expect from functions that explicitly ask for std::string_view arguments, since it makes assumptions beyond the guarantees provided by std::string_view and leads to undefined behavior unless the given view either contains an embedded null character or the char at address v.data() + v.size() is a null character. This is not a reasonable assumption for std::string_view in general, and it also defeats the purpose of std::string_view for the most part since, when v.size() is being ignored, it is essentially just a const char*.

Constructing an std::string_view from a const char* is also not "free" but requires computing the length of the C-style string (unless the length can be computed at compile time, e.g., because the value is just a string literal). Repeated conversion between const char* as used by OpenSSL and std::string_view as used by ncrypto thus incurs the additional overhead of computing the length of the string whenever an std::string_view is constructed from a const char*. (This seems negligible compared to the safety argument though.)

Similarly, returning a const char* pointer to a C-style string as an std::string_view has two downsides: the function must compute the length of the string in order to construct the view, and the caller can no longer assume that the return value is null-terminated and thus cannot pass the returned view to functions that require their arguments to be null terminated. (And, for the reasons explained above, the caller also cannot check if the value is null-terminated without potentially invoking undefined behavior.)

C++20 unfortunately does not have a type similar to Rust's CStr or GSL czstring. Therefore, this commit changes many occurrences of std::string_view back to const char*, which is conventional for null-terminated C-style strings and does not require computing the length of strings.

There are a lot of occurrences of std::string_view in ncrypto and for each one, we need to evaluate if it is safe and a good abstraction. I tried to do so, but I might have changed too few or too many, so please feel free to give feedback on individual occurrences.

cc @nodejs/cpp-reviewers

@tniessen tniessen added crypto Issues and PRs related to the crypto subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. security Issues and PRs related to security. dependencies Pull requests that update a dependency file. labels Apr 10, 2025
@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/crypto
  • @nodejs/security-wg

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Apr 10, 2025
@tniessen tniessen force-pushed the crypto-revert-stringview branch from a8e27f4 to 6fdef91 Compare April 10, 2025 11:00
Copy link

codecov bot commented Apr 10, 2025

Codecov Report

Attention: Patch coverage is 85.71429% with 4 lines in your changes missing coverage. Please review.

Project coverage is 90.23%. Comparing base (9bbbe60) to head (bea5e74).
Report is 215 commits behind head on main.

Files with missing lines Patch % Lines
src/crypto/crypto_context.cc 50.00% 2 Missing ⚠️
src/crypto/crypto_hash.cc 50.00% 1 Missing ⚠️
src/crypto/crypto_tls.cc 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #57816      +/-   ##
==========================================
- Coverage   90.23%   90.23%   -0.01%     
==========================================
  Files         630      630              
  Lines      185481   185480       -1     
  Branches    36373    36373              
==========================================
- Hits       167375   167362      -13     
+ Misses      10994    10991       -3     
- Partials     7112     7127      +15     
Files with missing lines Coverage Δ
src/crypto/crypto_cipher.cc 77.05% <100.00%> (ø)
src/crypto/crypto_cipher.h 60.00% <ø> (ø)
src/crypto/crypto_ec.cc 67.56% <100.00%> (ø)
src/crypto/crypto_hkdf.cc 65.27% <100.00%> (ø)
src/crypto/crypto_hmac.cc 68.23% <100.00%> (ø)
src/crypto/crypto_keys.cc 72.71% <100.00%> (ø)
src/crypto/crypto_pbkdf2.cc 68.11% <100.00%> (ø)
src/crypto/crypto_rsa.cc 63.93% <100.00%> (ø)
src/crypto/crypto_sig.cc 70.96% <100.00%> (ø)
src/crypto/crypto_sig.h 63.63% <ø> (ø)
... and 4 more

... and 23 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

An `std::string_view v` is a `const char* v.data()` along with an
`std::size_t v.size()` that guarantees that `v.size()` contiguous
elements of type `char` can be accessed relative to the pointer
`v.data()`.

One of the main reasons behind the existence of  `std::string_view` is
the ability to operate on `char` sequences without requiring null
termination, which otherwise often requires expensive copies of strings
to be made. As a consequence, it is generally incorrect to assume that
`v.data()` points to a null-terminated sequence of `char`, and the only
way to obtain a null-terminated string from an `std::string_view` is to
make a copy. It is not even possible to check if the sequence pointed to
by `v.data()` is null-terminated because the null character would be at
position `v.data() + v.size()`, which is outside of the range that `v`
guarantees safe access to. (A default-constructed `std::string_view`
even sets its own data pointer to a `nullptr`, which is fine because it
only needs to guarantee safe access to zero elements, i.e., to no
elements).

In `deps/ncrypto` and `src/crypto`, there are various APIs that consume
`std::string_view v` arguments but then ignore `v.size()` and treat
`v.data()` as a C-style string of type `const char*`. However, that is
not what call sites would expect from functions that explicitly ask for
`std::string_view` arguments, since it makes assumptions beyond the
guarantees provided by `std::string_view` and leads to undefined
behavior unless the given view either contains an embedded null
character or the `char` at address `v.data() + v.size()` is a null
character. This is not a reasonable assumption for `std::string_view` in
general, and it also defeats the purpose of `std::string_view` for the
most part since, when `v.size()` is being ignored, it is essentially
just a `const char*`.

Constructing an `std::string_view` from a `const char*` is also not
"free" but requires computing the length of the C-style string (unless
the length can be computed at compile time, e.g., because the value is
just a string literal). Repeated conversion between `const char*` as
used by OpenSSL and `std::string_view` as used by ncrypto thus incurs
the additional overhead of computing the length of the string whenever
an `std::string_view` is constructed from a `const char*`. (This seems
negligible compared to the safety argument though.)

Similarly, returning a `const char*` pointer to a C-style string as an
`std::string_view` has two downsides: the function must compute the
length of the string in order to construct the view, and the caller
can no longer assume that the return value is null-terminated and thus
cannot pass the returned view to functions that require their arguments
to be null terminated. (And, for the reasons explained above, the caller
also cannot check if the value is null-terminated without potentially
invoking undefined behavior.)

C++20 unfortunately does not have a type similar to Rust's `CStr` or
GSL `czstring`. Therefore, this commit changes many occurrences of
`std::string_view` back to `const char*`, which is conventional for
null-terminated C-style strings and does not require computing the
length of strings.

There are _a lot_ of occurrences of `std::string_view` in ncrypto and
for each one, we need to evaluate if it is safe and a good abstraction.
I tried to do so, but I might have changed too few or too many, so
please feel free to give feedback on individual occurrences.
@tniessen tniessen force-pushed the crypto-revert-stringview branch from 6fdef91 to bea5e74 Compare April 10, 2025 12:34
Copy link
Member

@anonrig anonrig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really nice pull request. I love the description. Thank you Tobias!

@nodejs-github-bot
Copy link
Collaborator

@tniessen tniessen added the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 15, 2025
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 15, 2025
@nodejs-github-bot nodejs-github-bot merged commit 0a1d5d3 into nodejs:main Apr 15, 2025
55 checks passed
@nodejs-github-bot
Copy link
Collaborator

Landed in 0a1d5d3

RafaelGSS pushed a commit that referenced this pull request May 1, 2025
An `std::string_view v` is a `const char* v.data()` along with an
`std::size_t v.size()` that guarantees that `v.size()` contiguous
elements of type `char` can be accessed relative to the pointer
`v.data()`.

One of the main reasons behind the existence of  `std::string_view` is
the ability to operate on `char` sequences without requiring null
termination, which otherwise often requires expensive copies of strings
to be made. As a consequence, it is generally incorrect to assume that
`v.data()` points to a null-terminated sequence of `char`, and the only
way to obtain a null-terminated string from an `std::string_view` is to
make a copy. It is not even possible to check if the sequence pointed to
by `v.data()` is null-terminated because the null character would be at
position `v.data() + v.size()`, which is outside of the range that `v`
guarantees safe access to. (A default-constructed `std::string_view`
even sets its own data pointer to a `nullptr`, which is fine because it
only needs to guarantee safe access to zero elements, i.e., to no
elements).

In `deps/ncrypto` and `src/crypto`, there are various APIs that consume
`std::string_view v` arguments but then ignore `v.size()` and treat
`v.data()` as a C-style string of type `const char*`. However, that is
not what call sites would expect from functions that explicitly ask for
`std::string_view` arguments, since it makes assumptions beyond the
guarantees provided by `std::string_view` and leads to undefined
behavior unless the given view either contains an embedded null
character or the `char` at address `v.data() + v.size()` is a null
character. This is not a reasonable assumption for `std::string_view` in
general, and it also defeats the purpose of `std::string_view` for the
most part since, when `v.size()` is being ignored, it is essentially
just a `const char*`.

Constructing an `std::string_view` from a `const char*` is also not
"free" but requires computing the length of the C-style string (unless
the length can be computed at compile time, e.g., because the value is
just a string literal). Repeated conversion between `const char*` as
used by OpenSSL and `std::string_view` as used by ncrypto thus incurs
the additional overhead of computing the length of the string whenever
an `std::string_view` is constructed from a `const char*`. (This seems
negligible compared to the safety argument though.)

Similarly, returning a `const char*` pointer to a C-style string as an
`std::string_view` has two downsides: the function must compute the
length of the string in order to construct the view, and the caller
can no longer assume that the return value is null-terminated and thus
cannot pass the returned view to functions that require their arguments
to be null terminated. (And, for the reasons explained above, the caller
also cannot check if the value is null-terminated without potentially
invoking undefined behavior.)

C++20 unfortunately does not have a type similar to Rust's `CStr` or
GSL `czstring`. Therefore, this commit changes many occurrences of
`std::string_view` back to `const char*`, which is conventional for
null-terminated C-style strings and does not require computing the
length of strings.

There are _a lot_ of occurrences of `std::string_view` in ncrypto and
for each one, we need to evaluate if it is safe and a good abstraction.
I tried to do so, but I might have changed too few or too many, so
please feel free to give feedback on individual occurrences.

PR-URL: #57816
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
RafaelGSS pushed a commit that referenced this pull request May 2, 2025
An `std::string_view v` is a `const char* v.data()` along with an
`std::size_t v.size()` that guarantees that `v.size()` contiguous
elements of type `char` can be accessed relative to the pointer
`v.data()`.

One of the main reasons behind the existence of  `std::string_view` is
the ability to operate on `char` sequences without requiring null
termination, which otherwise often requires expensive copies of strings
to be made. As a consequence, it is generally incorrect to assume that
`v.data()` points to a null-terminated sequence of `char`, and the only
way to obtain a null-terminated string from an `std::string_view` is to
make a copy. It is not even possible to check if the sequence pointed to
by `v.data()` is null-terminated because the null character would be at
position `v.data() + v.size()`, which is outside of the range that `v`
guarantees safe access to. (A default-constructed `std::string_view`
even sets its own data pointer to a `nullptr`, which is fine because it
only needs to guarantee safe access to zero elements, i.e., to no
elements).

In `deps/ncrypto` and `src/crypto`, there are various APIs that consume
`std::string_view v` arguments but then ignore `v.size()` and treat
`v.data()` as a C-style string of type `const char*`. However, that is
not what call sites would expect from functions that explicitly ask for
`std::string_view` arguments, since it makes assumptions beyond the
guarantees provided by `std::string_view` and leads to undefined
behavior unless the given view either contains an embedded null
character or the `char` at address `v.data() + v.size()` is a null
character. This is not a reasonable assumption for `std::string_view` in
general, and it also defeats the purpose of `std::string_view` for the
most part since, when `v.size()` is being ignored, it is essentially
just a `const char*`.

Constructing an `std::string_view` from a `const char*` is also not
"free" but requires computing the length of the C-style string (unless
the length can be computed at compile time, e.g., because the value is
just a string literal). Repeated conversion between `const char*` as
used by OpenSSL and `std::string_view` as used by ncrypto thus incurs
the additional overhead of computing the length of the string whenever
an `std::string_view` is constructed from a `const char*`. (This seems
negligible compared to the safety argument though.)

Similarly, returning a `const char*` pointer to a C-style string as an
`std::string_view` has two downsides: the function must compute the
length of the string in order to construct the view, and the caller
can no longer assume that the return value is null-terminated and thus
cannot pass the returned view to functions that require their arguments
to be null terminated. (And, for the reasons explained above, the caller
also cannot check if the value is null-terminated without potentially
invoking undefined behavior.)

C++20 unfortunately does not have a type similar to Rust's `CStr` or
GSL `czstring`. Therefore, this commit changes many occurrences of
`std::string_view` back to `const char*`, which is conventional for
null-terminated C-style strings and does not require computing the
length of strings.

There are _a lot_ of occurrences of `std::string_view` in ncrypto and
for each one, we need to evaluate if it is safe and a good abstraction.
I tried to do so, but I might have changed too few or too many, so
please feel free to give feedback on individual occurrences.

PR-URL: #57816
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
@aduh95 aduh95 added the backport-requested-v22.x PRs awaiting manual backport to the v22.x-staging branch. label May 6, 2025
@aduh95
Copy link
Contributor

aduh95 commented May 6, 2025

This needs a manual backport for v22.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-requested-v22.x PRs awaiting manual backport to the v22.x-staging branch. c++ Issues and PRs that require attention from people who are familiar with C++. crypto Issues and PRs related to the crypto subsystem. dependencies Pull requests that update a dependency file. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. security Issues and PRs related to security.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants