-
Notifications
You must be signed in to change notification settings - Fork 199
Add support for native root certs #1782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for native root certs #1782
Conversation
Wow. This is brilliant! Thank you @s6eskand. I'm testing the changes locally and then will definitely stamp. Give me a moment. We'll also need to cut a point release to bring this change into a released version cc @luis-munoz @jaroeichler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is almost perfect, but I'm not sure either comment should block this PR for expediency's sake. There are three items that I think are worth discussing at some point, or quickly.
(1) we need to document some example of a config for each tls option (certificate, key, native root CA). This might be better done in a blogpost explaining the different options.
(2) For our safety and usability backlog, we might consider a warning to customers if they have no tls_config
defined. I'm not sure that is a good change at this time because I need to do more thinking on it. Trust is for the customer to decide.
(3) Does it make sense to test and account for a garbage ca file? I think this is a novice yet honest mistake. I don't think we need to optimize for this user quite yet, but we should think about them. At a minimum, I want to have a helpful error a la "Invalid CA. Must be a valid .pem or .crt file."
I am curious what the others think, but I'm ready to approve this because it works.
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 6 files reviewed, and pending CI: Installation / macos-15, Local / lre-rs / macos-15, NativeLink.com Cloud / Remote Cache / macos-15, Remote / lre-cc / xlarge-ubuntu-24.04, Remote / lre-rs / xlarge-ubuntu-24.04, windows-2022 / stable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 2 LGTMs obtained, and 0 of 6 files reviewed, and pending CI: Installation / macos-15, Local / lre-rs / macos-15, NativeLink.com Cloud / Remote Cache / macos-15, Remote / lre-cc / xlarge-ubuntu-24.04, Remote / lre-rs / xlarge-ubuntu-24.04, windows-2022 / stable
ebd4e37
to
ae5f9f3
Compare
|
I don't think we need to say something about invalid cert today because of tonic's validation, but I also believe it can be helpful in the future to provide an even easier to understand warning that customers can take specific action on. I don't want our beginning users to need to think about Tonic too much, as great as it is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot!
cc @jaroeichler I noticed that we use webpki_roots in the S3 which probably causes other issues like the one fixed here down the line, i.e. the s3 client doesn't use the infrastructure that this PR touches. Could you look into implementing something similar for that? I believe contrary to tonic, the aws packages already support native-platform-verifier
.
Reviewed 3 of 6 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: 1 of 2 LGTMs obtained, and 5 of 6 files reviewed, and 3 discussions need to be resolved
-- commits
line 2 at r2:
nit: Add the described behavior change from the PR in the commit message body.
nativelink-util/Cargo.toml
line 54 at r2 (raw file):
serde = { version = "1.0.219", default-features = false } sha2 = { version = "0.10.8", default-features = false } tempfile = "3.20.0"
nit: Add this as tempfile = { version = "xxx", default-features = false }
and add the enabled features explicitly. I.e. tempfile = { version = "xxx", default-features = false, features = ["getrandom"] }
nativelink-util/Cargo.toml
line 66 at r2 (raw file):
tokio-util = { version = "0.7.14" } tonic = { version = "0.13.0", features = [ "tls-native-roots",
FYI I noticed that tonic upstream still appears to use rustls-native-certs
which has been superseded by rustls-platform-verifier
. We might fix this upstream in the future.
nativelink-util/src/tls_utils.rs
line 27 at r2 (raw file):
if config.use_native_roots == Some(true) { return Ok(Some(
In case 3 of your described behavior change, this look like it would entirely ignore the other ca config options. That's fine, but it could lead to UX issues where one is wondering why some specified ca file is not being used.
I think the general behavior is fine, but let's e.g. print a warning (use tracing::warn
and then warn!("XXXX")
in this branch if any of the other ca options are set so that the user gets informed about those options being ignored.
nativelink-util/tests/tls_utils_test.rs
line 20 at r2 (raw file):
use nativelink_util::tls_utils::{endpoint_from, load_client_config}; use tempfile::NamedTempFile;
FYI Thanks a lot for these nice tests ❤️
ae5f9f3
to
2c5ac2a
Compare
The warn is perfect @s6eskand, thank you! |
There currently is no way to connect to a given gRPC endpoint using TLS without manually providing a CA certificate file, certificate file, and/or key file. This causes issues when using managed certificates where it is not possible to connect via TLS to a secured endpoint unless the relevant secrets are distributed across workers and manually accessed via the config. The proposed changes add an option for the ClientTlsConfig of use_native_roots. This will ensure that tonic will use the root native certs for it's tls config. If the field use_native_roots is present ca, cert, and key files will be ignored if provided but are not required. This adds the tonic tls-native-roots feature. NOTE: The ca_file field in the nativelink_config::stores::ClientTlsConfig is no longer required.
2c5ac2a
to
e7e8cd7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks!
Reviewed 6 of 6 files at r1, 1 of 2 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status: 1 of 2 LGTMs obtained, and all files reviewed, and pending CI: Analyze (javascript-typescript), Bazel Dev / macos-15, Cargo Dev / ubuntu-24.04, Installation / macos-15, NativeLink.com Cloud / Remote Cache / macos-15, Publish image, Web Platform Deployment / ubuntu-24.04, and 4 discussions need to be resolved
-- commits
line 4 at r3:
nit: Limit to 72 characters per line.
I reinitiated all of the jobs that GitHub stopped doing because they failed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Reviewed 1 of 6 files at r1, 2 of 2 files at r3, all commit messages.
Reviewable status:complete! 2 of 2 LGTMs obtained, and all files reviewed
Previously, jaroeichler wrote…
nit: Limit to 72 characters per line.
Done. I'll add this during merge.
Description
There currently is no way to connect to a given gRPC endpoint using TLS without manually providing a CA certificate file, certificate file, and/or key file. This causes issues when using managed certificates where it is not possible to connect via TLS to a secured endpoint unless the relevant secrets are distributed across workers and manually accessed via the config.
The proposed changes add an option for the
ClientTlsConfig
ofuse_native_roots
. This will ensure that tonic will use the root native certs for it's tls config. If the fielduse_native_roots
is present ca, cert, and key files will be ignored if provided but are not required. This adds the tonictls-native-roots
feature.NOTE: The
ca_file
field in thenativelink_config::stores::ClientTlsConfig
is no longer required.Type of change
How Has This Been Tested?
Tested connecting remote store and the scheduler using different configs for the
tls_config
. Also added unit tests fortls_utils.rs
Cases:
use_native_roots
option true -> uses native roots and successfully creates connectionuse_native_roots
option false -> Raises same error as 2Checklist
bazel test //...
passes locallygit amend
see some docsThis change is