Skip to content

[Collections] Signing: support for non-Apple platforms #3272

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Mar 24, 2021

Conversation

yim-lee
Copy link
Contributor

@yim-lee yim-lee commented Feb 13, 2021

This adds support for package collection signing on all platforms.

@yim-lee yim-lee changed the title [Collections] Signing (all, 4): signature validation [DNM][Collections] Signing (all, 4): signature validation Feb 13, 2021
@tomerd tomerd added the next waiting for next merge window label Feb 16, 2021
yim-lee added a commit to swiftlang/swift-package-collection-generator that referenced this pull request Feb 22, 2021
This is **experimental**!!!

Depends on swiftlang/swift-package-manager#3272
@yim-lee yim-lee force-pushed the wire-signing-all branch 4 times, most recently from 1f3f797 to 33d3f63 Compare March 2, 2021 01:26
@yim-lee yim-lee force-pushed the wire-signing-all branch from 33d3f63 to 170d9a1 Compare March 9, 2021 05:09
@yim-lee yim-lee changed the title [DNM][Collections] Signing (all, 4): signature validation [Collections] Signing: support for non-Apple platforms Mar 9, 2021
@yim-lee
Copy link
Contributor Author

yim-lee commented Mar 9, 2021

This is ready for review.

@Lukasa @FredericJacobs @catmurdoch: the plan is to merge the PR by end of this week. If you have any feedback on the code please include them before then. Thanks.

// Certs could have unknown critical extensions and cause them to be rejected.
// Instead of disabling all critical extension checks with X509_V_FLAG_IGNORE_CRITICAL
// we will just ignore this specific error.
if errorCode == X509_V_ERR_UNHANDLED_CRITICAL_EXTENSION {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we brought this up before but we didn't get to the end of it: why are we allowing unknown critical extensions? The point of the critical bit on the extension is that you explicitly are supposed to fail if the extension is unknown and you cannot act on it.

If we expect specific extensions to be present that Boring does not know but that are critical, it would be fine for us to manage that directly, but I don't think we should just tolerate all unknown critical extensions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended 98d27c8


// Construct the OCSP request
let digest = CCryptoBoringSSL_EVP_sha1()
guard let certid = OCSP_cert_to_id(digest, certificate.underlying, issuer.underlying),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certid is heap located and needs to be freed.

Copy link
Contributor Author

@yim-lee yim-lee Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since certid is ultimately a part of request, wouldn't freeing request take care of certid too? Or that's my guess at least since it crashes when I try to free certid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, if the immediate next usage is to pass to a function labelled add0 then we are handing ownership to that new type. However, there is still a failure mode here: if the call to OCSP_request_add0_id fails then we will leak this pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OCSP_request_add0_id frees OCSP_ONEREQ in case of error:
https://github.com/apple/swift-package-manager/pull/3272/files#diff-9f556dcdb9de0622179dc1aa4fd297a26a83fe0ea78a99c60f4bd5de78e3e90dR50

My assumption is that OCSP_ONEREQ has ownership of certid at that point, so certid should be freed as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if it returned NULL from OCSP_ONEREQ_new. This is actually a nasty part of this API: there appear to be two paths out, one of which calls a free and the other does not. This can only be correct if OCSP_ONEREQ does not take ownership of the pointer. Unfortunately, because this is openssl, it's basically impossible to know if it is correct.

In this instance we can assume that failure path is impossible (we can't survive OOM conditions in Swift) so I think for now we can take the hit. Worst case we leak: not the end of the world in this kind of tool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if it returned NULL from OCSP_ONEREQ_new

Ah true.

I think for now we can take the hit. Worst case we leak: not the end of the world in this kind of tool.

Yes, I would rather take the hit than crashing while trying to free certid. Also since the CLI just runs and exits (i.e., not long running) the impact is minimal.

defer { CCryptoBoringSSL_sk_X509_free(x509Stack) }

for i in 1 ..< certChain.count {
guard certChain[i].withUnsafeMutablePointer({ CCryptoBoringSSL_sk_X509_push(x509Stack, $0) }) > 0 else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to call out a very subtle behaviour here that's worth keeping an eye on. sk_X509_push does not increase the refcount of the X509 object you push into it. Essentially, the pointer you pass there is the pointer owned by certChain[i], and that will be potentially be freed when certChain goes out of scope.

As the code is written today, this works: certChain's last usage site is after the last usage site of this stack, and so none of the pointers dangle. However, nothing in this code enforces that constraint, and a refactoring of the code could expose this potential bug.

It may be worth using withExtendedLifetime on the certChain array to keep it alive. Ideally you'd use it to bound the block where the stack itself is alive (i.e. through to the verify_cert call), which will also forcibly clean the stack up early and reveal any misuse of the object as well. Not a hard requirement as the code as written is not wrong per se, but it may be useful to code defensively here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended 6cec1bf

}
CCryptoBoringSSL_X509_STORE_CTX_set_purpose(x509StoreCtx, X509_PURPOSE_ANY)

anchorCerts.forEach { anchorCert in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting for myself (and for completeness) that the extended lifetime point I made above does not apply here: X509_STORE_add_cert explicitly takes a +1 on the X509 object, so we do not need to keep the anchor certs alive. Not change required for this one.


group.notify(queue: callbackQueue) {
// If there's no result then something must have gone wrong
guard !results.isEmpty, results.compactMap({ $0.failure }).isEmpty else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question here: this code as written fails closed in a number of scenarios, including if any of the OCSP responders are unavailable (or indeed all of them). Is that behaviour intentional? What should happen if all but one of the responders is unavailable but one returns success? What about if all of them are unavailable due to a CA outage: should users be unable to download new package collections or verify existing ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this logic to fail only when there is no result or if all results are failures. With the successful results, it fails if at least one of them indicates the cert is bad.

I guess a question to ask is if we want to base our decision on all responders or a subset is acceptable. The implementation that Cory commented on requires all, but arguably that gives more weight to those who don't respond than the ones who do. The updated implementation uses whatever data it can (i.e., subset is ok).

If all responders are unavailable then the code wouldn't be able to verify the signature: new collection cannot be added (user can retry later or skip signature check if allowed); refresh of an existing collection would fail and its data would be removed from the database, but its URL will remain in the config for future refreshes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most common approach would be to "fail open": that is, so long as one responder gave an authoritative response to follow that response, but if no responders are working then to assume the certificate is not revoked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK makes sense. I will update the logic.

results.append(.success(cachedResult.isCertGood))
continue
} else {
_ = self.resultCache.removeValue(forKey: cacheKey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically speaking there's a time-of-check-time-of-use problem here where we could race this removal with an insertion from a different thread. Do we care?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove is not really necessary so I am deleting it.

// Set `REAL_CERT_USER_ID` env var when running ENABLE_REAL_CERT_TEST tests
let expectedSubjectUserID = ProcessInfo.processInfo.environment["REAL_CERT_USER_ID"] ?? "<USER ID>"

let callbackQueue = DispatchQueue(label: "org.swift.swiftpm.PackageCollectionsSigningTests", attributes: .concurrent)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular need for this queue to be concurrent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might run tests in parallel. Is there any concern with it being concurrent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, none really beyond concurrent queues not being very good! 😁

@yim-lee yim-lee force-pushed the wire-signing-all branch 2 times, most recently from 6cec1bf to b1510fa Compare March 12, 2021 01:00
yim-lee added a commit to yim-lee/swift-package-collection-generator that referenced this pull request Mar 12, 2021
This is **experimental**!!!

Depends on swiftlang/swift-package-manager#3272
let requestData = Data(UnsafeBufferPointer(start: out, count: count))

// X509_STORE is freed up after all OCSP_basic_verify calls finish.
// Don't do defer-free here because it would get freed up. `withExtendedLifetime(anchorCerts)` doesn't help.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this is. It happens for sk_x509 as well but we don't need it for OCSP_basic_verify. "free" is called on L389.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear to me that this use of X509_STORE is thread-safe. Is it going to be better to construct one per invocation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to create a new X509_STORE each time (3514d2c)

// Note (Yim Lee): BoringSSL has X509_get_pubkey but not X509_get0_pubkey.
// The only difference between the two is that the key returned by X509_get_pubkey
// must be freed up after use. We choose to call X509_get_pubkey instead to avoid
// copying a bunch of code and risk complications.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took the shortcut here and changed the code to use X509_get_pubkey instead of X509_get0_pubkey. The code we would have to copy/edit was not straight forward and I didn't think it would worth it.

untrusted = sk_X509_dup(bs->certs);
// Note (Yim Lee): BoringSSL doesn't have X509_add_certs. Since here we are just adding contents of
// one stack to another, we'd rather not copy a bunch of code from OpenSSL and risk complications.
// if (!X509_add_certs(untrusted, certs, X509_ADD_FLAG_DEFAULT))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly for this. Copying the functions properly would involve x509v3_cache_extensions and I didn't feel comfortable with the idea at all. The usage here doesn't need any of the fancy flags so I change it to just call sk_X509_insert directly.

yim-lee added 20 commits March 22, 2021 15:06
This is part 1 of a series of PRs to support package collection signing on **all** platforms. Originally swiftlang#3238.

Depends on swiftlang#3259
This is part 2 of a series of PRs to support package collection signing on **all** platforms. Originally swiftlang#3242.

Depends on swiftlang#3260, swiftlang#3264
This is part 3 of a series of PRs to support package collection signing on **all** platforms. Originally swiftlang#3245.

Depends on swiftlang#3265, swiftlang#3269
This is part 3.b of a series of PRs to support package collection signing on **all** platforms.

Depends on swiftlang#3270

Modifications:
- Add `PackageCollectionsSigningLibc` module
- Add OCSP support for non-Apple platforms through `PackageCollectionsSigningLibc`
@yim-lee
Copy link
Contributor Author

yim-lee commented Mar 22, 2021

@swift-ci please smoke test

@yim-lee
Copy link
Contributor Author

yim-lee commented Mar 22, 2021

@swift-ci please test

yim-lee added a commit to swiftlang/swift-package-collection-generator that referenced this pull request Mar 22, 2021
This is **experimental**!!!

Depends on swiftlang/swift-package-manager#3272
@yim-lee
Copy link
Contributor Author

yim-lee commented Mar 23, 2021

@swift-ci please test Linux

@yim-lee yim-lee merged commit acc9791 into swiftlang:main Mar 24, 2021
yim-lee added a commit to swiftlang/swift-package-collection-generator that referenced this pull request Mar 29, 2021
* Depends on swiftlang/swift-package-manager#3272
* Add backtrace
* Better handling of paths
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DO NOT MERGE ready Author believes the PR is ready to be merged & any feedback has been addressed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants