Skip to content

Revisit the OpenSSL interop exception model #55973

Closed
@bartonjs

Description

@bartonjs

For various reasons, the model we ended up with for OpenSSL is that when we decide it's time to throw an exception we drain out the error queue and throw for whatever the last error was. Largely this worked because each operation (probably) only really reports one error. Any time the error queue had more than one thing in it was because of spillover from a recoverable failure.

e.g.: new X509Certificate2(bytes)

  • call d2i_X509. If that failed, continue to next format
    • If it failed, it probably wrote some ASN decode errors... but all we checked was that it returned NULL. Error queue length 1.
  • call PEM_read_bio_X509.
    • Error queue length is probably 2.
  • call d2i_PKCS7
    • Probably 3.
  • PEM_read_bio_PKCS7
    • ~4.
  • Oh, look, it's a PKCS#12.

So we succeeded, and left 4 errors in the queue (unless we cleared them, but I doubt it).

Later, when we run into some other error (e.g. signing with a public key) we go "oh, exception throwing time... drain down to the last error and throw on that, since it's what we just encountered". Assuming that the operation only assigned one error code, that's true.

The "only one error" didn't quite hold up in an unfortunate way in #55787. Calling EVP_PKEY2PKCS8 with a public-only DSA key on OpenSSL 3 Beta 1 produced two errors: 1) ERR_LIB_PROV | ERR_R_NOT_A_PRIVATE_KEY, and 2) ERR_LIB_PROV | ERR_R_MALLOC_FAILURE. While the second error seems pretty spurious for the situation, it shows that we really wanted "the first error that happened during our operation".

Suggested changes:

  • Each operation probably wants to start off with ERR_clear_error(). It's relatively cheap these days.
    • (Probably worth a perf measurement to quantify this).
  • Change CreateOpenSslCryptographicException to call one shim function that returns the oldest error, uses a ref/out value to indicate anything special (e.g. malloc failure => OOM), and clears the queue
    • Perf on exception paths is less important, but reducing N calls down to 1 seems like goodness either way.
  • Do a one-time test where each P/Invoke ends with reporting if there was more than one error enqueued. For each place that we see multiple errors enqueued check that it is, in fact, the oldest one that looks like the best exception. (Or at least looks like a suitable exception)
    • Checking on 1.0.2 (Ubuntu 16.04), 1.1.0 (Ubuntu 18.04), 1.1.1 (Ubuntu 20.04) and 3.0 (?), of course.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions