Skip to content

Add support for decrypting S/MIME messages #11555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Nov 26, 2024
Merged

Conversation

nitneuqr
Copy link
Contributor

@nitneuqr nitneuqr commented Sep 6, 2024

As discussed on #6263, I'm opening this PR with an initial implementation of S/MIME decryption, in order to better discuss the API design, the algorithms we want to support, and how we want to approach testing.

My essential thoughts for testing were to do the round-trip: encryption using the PKCS7EnvelopeBuilder and decryption using the PKCS7EnvelopeDecryptor (or whatever its name will be). No tests were made against openssl, even though it might interesting.

For the Python API, I've tried to stick as much as possible with the current API design (PKCS7SignatureBuilder, PKCS7EnvelopeBuilder).

I'm new to rust, so please let me know if you see some issues in variable lifetime, or some unnecessary copying between Python & Rust.

During development, I've realized that we could migrate PKCS7 unpadding to rust instead of using types, so I did migrate it. I'll probably open another smaller PR for that matter (if that makes sense?). I still have some code using PKCS7 Unpadding on Python if needed.

cc @alex

@alex
Copy link
Member

alex commented Sep 6, 2024

Thanks for working on this! I haven't had a chance to look in depth, but re: PKCS7Unpadder, it'd be great to have that in a stand-alone PR.

@nitneuqr
Copy link
Contributor Author

nitneuqr commented Sep 6, 2024

Thanks for working on this! I haven't had a chance to look in depth, but re: PKCS7Unpadder, it'd be great to have that in a stand-alone PR.

Sure! Opened #11556 for that matter.

@nitneuqr nitneuqr force-pushed the smime-decryption branch 2 times, most recently from cb082ce to e1c4620 Compare September 8, 2024 09:01
@nitneuqr
Copy link
Contributor Author

nitneuqr commented Sep 8, 2024

Rebased to main after integration of #11556. Should I split it again in smaller PRs? Maybe around symmetric_decrypt or other subfunctions in the code.

@alex
Copy link
Member

alex commented Sep 8, 2024

We're always happy to have smaller PRs split out (though it can sometimes be complicated due to coverage). I'm hoping to have time to review this today, though it might be tomorrow.

@alex
Copy link
Member

alex commented Sep 10, 2024

@facutuesca FYI, if you're interested

@nitneuqr
Copy link
Contributor Author

@alex @reaperhulk took into account the first comments and adapted the tests / coverage accordingly.

Do you have any time for another review? Do you need anything else from me before reviewing? 😊

@alex
Copy link
Member

alex commented Oct 5, 2024

Sorry I got behind, will have a look now. Thanks for your patience!

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did an initial review of the code -- I haven't really sat down and reviewed the core PKCS#7 logic against hte RFC yet though.

};

// Deserialize the content info
let content_info = asn1::parse_single::<pkcs7::ContentInfo<'_>>(&extracted_data).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reaperhulk This would be our first time using our DER parser for PKCS7. On the one hand: new API, so there's no backwards-compat concerns. On the other hand, surely someone is emitting awful BER PKCS#7. Any concerns?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reaperhulk just going to keep pinging you on this :-) (Enjoy your vacation)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's assume we release with DER only parsing. In that scenario we're attempting to empirically determine whether or not there's enough BER out there to matter. If no one comes to complain then great, job done. However, if we discover we're wrong and BER is relevant, what is our plan? Are we going to add BER support in rust?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do determine that this API is basically low utility for many users because their S/MIME messages aren't valid DER, it seems we have a few options:

  1. Declare that's how it goes. This is a net new API, so no users are worse off than they were before. May result in greater timelines for adoption of this, and migration away from other libraries.

  2. Re-implement in terms of OpenSSL S/MIME APIs. Probably straightforward, although as-ever, assessing the compatibility surface will be a PITA.

  3. Add BER support to rust-asn1. No fucking chance, not worth discussing.

  4. We force ourselves to do a survey of the ecosystem before landing this so we can contact producers of non-DER S/MIMEs.

I have a soft spot for 1, I could be persuaded of 4, and if we really need to do it, I guess 2 is the way to go.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Go support this? They're DER only for whatever set of PKCS7 they support right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'm comfortable with doing this DER only and finding out how it goes. It's net-new API surface so we're not regressing and I'd love to find out BER doesn't matter.

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reaperhulk would still like your views on deserialization for PKCS#7 in Rust.

@alex
Copy link
Member

alex commented Oct 28, 2024

(Thanks for updating, I'm going to pause on reviewing until #11843 is merged, hopefully that'll be quick.)

@nitneuqr
Copy link
Contributor Author

nitneuqr commented Nov 6, 2024

#11843 is merged, I've rebased the branch onto main to take the new changes into account. It should b ready to review again!

@alex
Copy link
Member

alex commented Nov 6, 2024 via email

@nitneuqr
Copy link
Contributor Author

Hey, any news on this? Do you need anything else from me? 😊

@alex
Copy link
Member

alex commented Nov 17, 2024

Ooops, sorry this fell off my radar. I will get to it today!

@reaperhulk
Copy link
Member

@nitneuqr Could you rebase this to fix the conflict? 😄

@@ -263,6 +263,81 @@ def encrypt(
return rust_pkcs7.encrypt_and_serialize(self, encoding, options)


def pkcs7_decrypt_der(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should move all these functions to rust and export them here.

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which also means, passing the smime-related function in Python as private, and calling it on Rust, I guess? And also doing all the initial checks in Rust?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at pkcs7.rs and search for SMIME_ENVELOPED_ENCODE you can see how we call into the S/MIME Python from Rust.

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! One last question: should I also move the logic of the new _check_pkcs7_decrypt_arguments function to rust, or just leave it as such and call it in the rust code, the same way I'm calling SMIME_ENVELOPED_DECODE?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's move that as well -- it actually will probably become a fair bit easier, as many things are automatically handled by giving parameters types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be done. I'd love a quick review on the check_decrypt_parameters function, it is probably more intricate than it should be!

}
};

// Decrypt the key using the private key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this, you need to check that key_encryption_algorithm is the expected type.

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this on the RFC :

image

I'll try to create another PKCS#7 with the OAEP padding instead of PKCS#1 v1.5, to have full coverage then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way it could happen is a bug, or an attempt to exploit something. We just want to be conservative in making sure we strictly enforce things.

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's even more relevant than that. Basically our key decryption does not support RSAES-OAEP for now so we need to specify this case, add a vector and tests for coverage. Just pushed it now, letting you check.

Comment on lines 575 to 583
let header = b"Content-Type: text/plain\r\n\r\n";
if data.starts_with(header) {
&data[header.len()..]
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the header isn't present in text mode, should that be an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea how OpenSSL handles it for now. I'll check it out when I have time and let you know.

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some checking with my version of OpenSSL (on Windows!), it seems the behavior of the -text and -binary options on decrypt are much weirder than I expected. They are not symmetrical to their usage in encrypt :

  • No options passed : replaces the \n into \r\n
  • -binary : leaves the \n as such, do not replace them by \r\n
  • -text : Removes the header, raises an error if there is none (your comment is correct)
  • Both options can be combined. In encrypt, -binary was taking precedence.

This implies that if you encrypt then decrypt data using openssl smime and the text option, you can start with Hello, world!\n and end up with Hello, world!\r\r\n 😢

I was mostly using -binary optons in my use cases, so I never quite found out. Adding this right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nitneuqr this looks similar to a bug I encountered while implementing s/mime signing: openssl/openssl#23886. There is also a comment on that issue from someone finding the same bug during encryption

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that. To me, the documentation for the -binary option makes sense, at least during the -encrypt part. But for decryption, it doesn't.

-binary

Normally the input message is converted to "canonical" format which is effectively using CR and LF as end of line: as required by the S/MIME specification. When this option is present no translation occurs. This is useful when handling binary data which may not be in MIME format.

To be honest, for this PR I'd stick with the current behavior of OpenSSL (even if seems incorrect). What's your opinion on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, for this PR I'd stick with the current behavior of OpenSSL (even if seems incorrect). What's your opinion on this?

I think it's up to Alex and Paul. For context, the decision for signing was to follow the RFC, rather than imitate what OpenSSL does.

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have misunderstood your first comment, so I'm rephrasing just to be sure:

Your issue was with the format of the output file (which -crlfeol is supposed to control in OpenSSL). Ours is about the format of the ciphered message inside it (which is controlled by -text or -binary), at least for encryption / decryption.

As far as my testing goes, the -crlfeol parameter does not seem to have any effect in decryption. Nevertheless, the issue looks similar and I guess we need to address it the same way as yours (by following the RFC instead of OpenSSL).

Thanks for the feedback 😄

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small nits, but this has come together great!

I want @reaperhulk's opinion on the text vs. binary stuff, since IIRC he did the original impl for the encryption side.

}
};

// If text_mode, strip the header if possible, else return an error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reaperhulk can I get your opinion on the right behavior for text vs. binary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added these flags to behave like OpenSSL (sigh).

OpenSSL calls SMIME_text in its decryption path when you pass PKCS7_TEXT. That function errors for a few conditions, but most notably if parsing the MIME headers fails, if there is no “content-type” header, or if the content type is not “text/plain”.

This code appears to make two assumptions that may not match OpenSSL's though:

  • Case sensitivity on Content-Type
  • That the Content-Type header is guaranteed to be the last header before the payload.

The behavior I believe we want is to parse the MIME headers, check that content-type is text/plain, and then discard them entirely and only keep the payload.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any suggestion of a lib to handle MIME efficiently in rust? For now, I'm thinking about
using some Python function. Seems simpler to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just use Python for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First version pushed, will compare with openssl and let you know!

And yeah, it should be up to the user to sanitize the decrypted output after pkcs7_decrypt, especially with the libraries available on Python that are perfect for this ...

Copy link
Contributor Author

@nitneuqr nitneuqr Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested against some custom headers (you can see them in tests), behavior is the same as openssl (using test_support.pkcs7_decrypt from main).

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides these unwrap comments and @reaperhulk's on text, I think this is ready. Thanks so much for your continued work on it.

cleaned tests, added some others

added more vectors
added tests accordingly

will compare with OpenSSL
@nitneuqr
Copy link
Contributor Author

nitneuqr commented Nov 25, 2024

Besides these unwrap comments and @reaperhulk's on text, I think this is ready. Thanks so much for your continued work on it.

Thanks for your reviews! Should be good now, header removal is now tested in test_pkcs7_decrypt_der_text_handmade_header, and unwraps have been removed.

Copy link
Member

@reaperhulk reaperhulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my small nits LGTM on the text path now. Thanks!

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One final nit + Paul's comments and we can merge!

@nitneuqr
Copy link
Contributor Author

nitneuqr commented Nov 26, 2024

One final nit + Paul's comments and we can merge!

Should be done now! Merci beaucoup! Also added some documentation 🛩️

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all your hard work and persistence on this!

@alex alex merged commit d6cac75 into pyca:main Nov 26, 2024
60 checks passed
@alex alex mentioned this pull request Nov 26, 2024
@nitneuqr nitneuqr deleted the smime-decryption branch November 26, 2024 14:21
@reaperhulk
Copy link
Member

Yes, thank you for sticking with this and being so responsive. It's a pleasure having contributors like you!

@nitneuqr
Copy link
Contributor Author

My pleasure! I might come back in a few days / weeks to tackle some more problems (namely, pkcs verify and encrypt/decrypt with other content encryption algorithms)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants