Skip to content

GH-177 - Adds support for DER formatted certficates #336

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 25, 2023

Conversation

tdstein
Copy link
Collaborator

@tdstein tdstein commented Jan 17, 2023

Description

Modifies the --cacert option to support binary-encoded certificates,
such as DER (Distinguished Encoding Rules) format. This change is
backwards compatible and has no effect on text-encoded certificate.

Additionally, file-type checks have been added for common certificate
file types. This includes '.cer', '.der', '.ca-bundle', '.crt',
'.key', and '.pem'.

Connected to #177

Testing Notes / Validation Steps

The included unit tests validate the suffix routing for each type of certificate.

Additional manual testing is required to validate that requests using the --cacert option are successful. To facilate testing of DER certificiate types, I have generate a valid DER file in this pull request: https://github.com/rstudio/connect/pull/22753

To validate DER formatting:

rsconnect details --api-key $CONNECT_API_KEY --server https://alwayssecure.rsc:3443 --cacert alwayssecure.rsc.der

The output should be identical to the same call with the existing CRT file.

rsconnect details --api-key $CONNECT_API_KEY --server https://alwayssecure.rsc:3443 --cacert alwayssecure.rsc.crt

@tdstein tdstein force-pushed the 177-cacert-option branch 2 times, most recently from 11ef4eb to 1031fe7 Compare January 18, 2023 19:16
@tdstein tdstein changed the title GH-177 - Adds support for DER-encoded certficates GH-177 - Adds support for DER formatted certficates Jan 18, 2023
@tdstein tdstein force-pushed the 177-cacert-option branch 5 times, most recently from e0f8c2b to d0de2ca Compare January 18, 2023 19:59
@tdstein tdstein requested review from mmarchetti and bcwu January 18, 2023 20:06
@tdstein tdstein marked this pull request as ready for review January 18, 2023 20:06
@@ -360,7 +360,7 @@ def __init__(
url: str = None,
api_key: str = None,
insecure: bool = False,
cacert: IO = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cacert should be handled as a file, hence IO; whereas cadata is text, hence str

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Here the cacert variable is a string representing the file path, instead of a file pointer. The call on line 436 opens the file pointer, which was originally handled by Click), and reads the file contents in either text mode or byte mode depending on the file-type.

The existing implementation assumes that the file pointer has been opened in text mode, which doesn't work when using DER files since they are encoded in binary format. L436 onwards, cadata is either a str or bytes, both of which are accepted when creating an SSLContext. This is the same as the existing implementation.

The Click documentation goes into scenario a little further: https://click.palletsprojects.com/en/8.1.x/arguments/#file-path-arguments

TEXT_ENCODED_FILETYPES = [".ca-bundle", ".crt", ".key", ".pem"]


def read_certificate_file(location: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are changing cacert to location then we need to be internally consistent and change existing cacert to something like cacert_location.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going down that route can potentially break backwards compatibility so that's an important aspect to consider.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @bcwu - could you help clarify your comment. Are you suggesting that I rename the variables named cacert to cacert_location so that the variable type is more obvious?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a little more digging. It looks like cafile is the typical nomenclature for file name references. https://docs.python.org/3/library/ssl.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about a couple of things:

  1. Whether changing the type hints of cacert (from IO to str) will break any existing behavior
  • In this case either way (type hint set to IO or str) is probably all right. In general it helps to think through whether changes are backwards compatible.
  1. Possibly rename cacert if we do change the type hint
  • this will probably cause more confusion than it clarifies. It will certainly break existing behavior; so, on second thought not a good idea, please disregard.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the usage of cacert in each of the methods where the type signature is changed. In each case the variable is only used in combination with read_certificate_file. So, this shouldn't break any exisiting behavior.

Modifies the --cacert option to support binary-encoded certificates,
such as DER (Distinguished Encoding Rules) format. This change is
backwards compatible and has no effect on text-encoded certificate.

Additionally, file-type checks have been added for common certificate
file types. This includes '.cer', '.der', '.ca-bundle', '.crt',
'.key', and '.pem'.
"""

path = Path(location)
suffix = path.suffix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the file type by examining its magic number is a safer way of inference than checking the suffix. This especially pertains to security related files like certificates.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I'll look into this a bit.

Do you happen to know of any documentation that shows how to accomplish this across each of the certificate types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use mimetypes to infer some file types. It may not have full coverage of certificates, but it's a good starting point.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't appear that certificates contain standard header blocks, but PEM files and DER files both adhere to a standard. https://www.openssl.org/docs/man3.0/man1/openssl-format-options.html

I can add additional validation that inspects the contents to check expecatations based on this documentation. But, I'm not sure if that is necessary.

@mmarchetti - do you have any additional insights into this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mimetypes determines types by file suffix, so it's probably equivalent to what we're doing. I don't think we need to inspect file contents - if an invalid certificate file is provided, it will fail with an SSL error when we try to connect.

@kgartland-rstudio
Copy link
Contributor

Verified fix.

Reproduced the issue on 1.14.0:

> rsconnect deploy fastapi ./ -s https://alwayssecure:3443 -k {API_KEY} --cacert ../../ssl/alwayssecure.rsc.der
    Warning: the existing manifest.json file will not be used or considered.
Traceback (most recent call last):
  File "/Users/kgartland/.pyenv/versions/3.8.2/lib/python3.8/site-packages/rsconnect/main.py", line 91, in wrapper
    result = func(*args, **kwargs)
  File "/Users/kgartland/.pyenv/versions/3.8.2/lib/python3.8/site-packages/rsconnect/main.py", line 1196, in deploy_app
    ce = RSConnectExecutor(**kwargs)
  File "/Users/kgartland/.pyenv/versions/3.8.2/lib/python3.8/site-packages/rsconnect/api.py", line 376, in __init__
    self.setup_remote_server(
  File "/Users/kgartland/.pyenv/versions/3.8.2/lib/python3.8/site-packages/rsconnect/api.py", line 436, in setup_remote_server
    ca_data = text_type(cacert.read())
  File "/Users/kgartland/.pyenv/versions/3.8.2/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 1: invalid start byte
Internal error: 'utf-8' codec can't decode byte 0x82 in position 1: invalid start byte

On this branch (1.14.1.dev3+gf4f56c1), using the same command, the der cert works as expected.

  • other certs continue to work
  • works on Windows VM
  • --insecure option still works as expected without supplying --cacert

@tdstein
Copy link
Collaborator Author

tdstein commented Jan 24, 2023

@bcwu / @mmarchetti - Now that @kgartland-rstudio has validated the change, am I safe to merge?

@bcwu
Copy link
Contributor

bcwu commented Jan 24, 2023

@bcwu / @mmarchetti - Now that @kgartland-rstudio has validated the change, am I safe to merge?

Yes, go ahead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants