Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bpo-28414] Make all hostnames in SSL module IDN A-labels #5128

Merged
merged 6 commits into from
Feb 24, 2018

Conversation

tiran
Copy link
Member

@tiran tiran commented Jan 7, 2018

Alternative approach to PR #3010

  • Really all server_hostname attributes are IDN A-labels
  • SSLContext.set_servername_callback argument is IDN A-label, too
  • Replace all externally replaced certs with proper certs from internal test helper

https://bugs.python.org/issue28414

expected hostname in A-label form (``"xn--pythn-mua.org"``), rather
than the U-label form (``"pythön.org"``). The server name argument to
:meth:`ssl.SSLContext.set_servername_callback`` is an A-label instead of
an U-labe, too. (Contributed by Nathaniel J. Smith and Christian Heimes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"labe"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also not sure what the deal with github rendering all this with underlines is; potential reST issue?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's caused by a supernumerous backtick. I have fixed both issues.

@@ -1181,6 +1181,8 @@ def test_create_unix_server_ssl_verified(self):
server.close()
self.loop.run_until_complete(proto.done)

maxDiff = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remnants of debugging failing tests

@tiran tiran force-pushed the die-idna-die-hard branch from eacb9e9 to 9bd40df Compare January 7, 2018 16:06
@njsmith
Copy link
Contributor

njsmith commented Jan 10, 2018

I did a quick skim and the code seems fine to me, but... I do think that for set_servername_callback, instead of just breaking things we maybe we need to rename it and add a compatibility shim. Quoting from https://mail.python.org/pipermail/python-dev/2017-December/151527.html:

On the server, the obvious fix would be to start passing
A-label-encoded names to the servername_callback, instead of
U-label-encoded names. Unfortunately, this is a bit trickier, because
this has historically worked (AFAIK) for IDNA names, so long as they
didn't use one of the four magic characters who changed meaning
between IDNA 2003 and IDNA 2008. But we do still need to do something.
For example, right now, it's impossible to use the ssl module to
implement a web server at https://straße.de, because incoming
connections will use SNI to say that they expect a cert for
"xn--strae-oqa.de", and then the ssl module will freak out and throw
an exception instead of invoking the servername callback.

It's ugly, but probably the simplest thing is to add a new function
like set_servername_callback2 that uses the A-label, and then redefine
set_servername_callback as a deprecated compatibility shim:

def set_servername_callback(self, cb):
    def shim_cb(sslobj, servername, sslctx):
        if servername is not None:
            servername = servername.encode("ascii").decode("idna")
        return cb(sslobj, servername, sslctx)
    self.set_servername_callback2(shim_cb)

We can bikeshed what the new name should be. Maybe set_sni_callback?
or set_server_hostname_callback, since the corresponding client-mode
argument is server_hostname?

@tiran tiran force-pushed the die-idna-die-hard branch from 9bd40df to 31b862e Compare January 13, 2018 19:19
@tiran
Copy link
Member Author

tiran commented Jan 13, 2018

Meh! :(

Let's make it PEP 543 compatible and add sni_callback property. The Python subclass can provide your shim cb. Thanks for proposing it!

njsmith and others added 3 commits February 22, 2018 18:48
Historically, our handling of international domain names (IDNs) in the
ssl module has been very broken. The flow went like:

1. User passes server_hostname= to the SSLSocket/SSLObject
   constructor. This gets normalized to an A-label by using the
   PyArg_Parse "et" mode: bytes objects get passed through
   unchanged (assumed to already be A-labels); str objects get run
   through .encode("idna") to convert them into A-labels.

2. newPySSLSocket takes this A-label, and for some reason decodes
   it *back* to a U-label, and stores that as the object's
   server_hostname attribute.

3. Later, this U-label server_hostname attribute gets passed to
   match_hostname, to compare against the hostname seen in the
   certificate. But certificates contain A-labels, and match_hostname
   expects to be passed an A-label, so this doesn't work at all.

This PR fixes the problem by removing the pointless decoding at step
2, so that internally we always use A-labels, which matches how
internet protocols are designed in general: A-labels are used
everywhere internally and on-the-wire, and U-labels are basically just
for user interfaces.

This also matches the general advice to handle encoding/decoding once
at the edges, though for backwards-compatibility we continue to use
'str' objects to store A-labels, even though they're now always
ASCII. Technically there is a minor compatibility break here: if a
user examines the .server_hostname attribute of an ssl-wrapped socket,
then previously they would have seen a U-label like "pythön.org", and
now they'll see an A-label like "xn--pythn-mua.org". But this only
affects non-ASCII domain names, which have never worked in the first
place, so it seems unlikely that anyone is relying on the old
behavior.

This PR also adds an end-to-end test for IDN hostname
validation. Previously there were no tests for this functionality.

Fixes bpo-28414.
All test certs must be generated by CPython's own test helper.

Signed-off-by: Christian Heimes <christian@python.org>
Signed-off-by: Christian Heimes <christian@python.org>
@tiran tiran force-pushed the die-idna-die-hard branch from 5ec02e4 to 1ec0678 Compare February 22, 2018 18:19
@tiran tiran changed the title [bpo-28414][WIP] Make all hostnames in SSL module IDN A-labels [bpo-28414] Make all hostnames in SSL module IDN A-labels Feb 22, 2018
@tiran tiran requested a review from njsmith February 22, 2018 18:29
Copy link
Contributor

@njsmith njsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed some prose changes, and here are some comments on the code. Overall looks good, but there are a few small cleanups to consider, and a few points where I have questions.

Lib/ssl.py Outdated
self.sni_callback = None
else:
if not hasattr(server_name_callback, '__call__'):
raise TypeError("not a callable object")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not if not callable(server_name_callback)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, callable was re-added in Python 3.2. I'll change it.

Lib/ssl.py Outdated

def shim_cb(sslobj, servername, sslctx):
if servername is not None:
servername = servername.encode("ascii").decode("idna")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be simpler to use _encode_hostname here, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind to use the method here, too.

@@ -355,13 +355,20 @@ def __new__(cls, protocol=PROTOCOL_TLS, *args, **kwargs):
self = _SSLContext.__new__(cls, protocol)
return self

def __init__(self, protocol=PROTOCOL_TLS):
self.protocol = protocol
def _encode_hostname(self, hostname):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make this a top-level utility function, instead of a method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to keep this a method. It makes it easier to modify the behavior in subclasses or in tests.

{"options", (getter) get_options,
(setter) set_options, NULL},
{"protocol", (getter) get_protocol,
NULL, NULL},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this new protocol attribute in this PR? Should it be documented or anything?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not new: https://docs.python.org/3/library/ssl.html#ssl.SSLContext.protocol

I just moved it from Python code into C code so I can access the value from C methods. The new setter for the callback uses the value to prevent setting a SNI callback for a client-only context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, gotcha.

Modules/_ssl.c Outdated
Py_INCREF(result);
Py_SETREF(ssl->ctx, (PySSLContext *)result);
SSL_set_SSL_CTX(ssl->ssl, ssl->ctx->ctx);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs just say that sni_callback has to return None. What's this extra context setting stuff about? Should it be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the code for future compatibility with PEP 543. Rather than modifying socket.context, the callback can return a new context and let the API take care of the rest.

By the way it doesn't have to be None. Other values signal an error:

The server_name_callback function must return None to allow the TLS negotiation to continue. If a TLS failure is required, a constant ALERT_DESCRIPTION_* can be returned. Other return values will result in a TLS fatal error with ALERT_DESCRIPTION_INTERNAL_ERROR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok yeah, I think adding new undocumented features in b2 for partial compatibility with an unfinished PEP is probably a bad plan, so thanks for removing that :-). It'll be easy to add back later.

Drop extra code for PEP 543 future compatibility in sni callback

Use callable() and encode_hostname in shim for old SNI callback.

Signed-off-by: Christian Heimes <christian@python.org>
@tiran
Copy link
Member Author

tiran commented Feb 24, 2018

The context special handling is pining for the fjords.

@njsmith njsmith merged commit 11a1493 into python:master Feb 24, 2018
@miss-islington
Copy link
Contributor

Thanks @tiran for the PR, and @njsmith for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Feb 24, 2018
)

Previously, the ssl module stored international domain names (IDNs)
as U-labels. This is problematic for a number of reasons -- for
example, it made it impossible for users to use a different version
of IDNA than the one built into Python.

After this change, we always convert to A-labels as soon as possible,
and use them for all internal processing. In particular, server_hostname
attribute is now an A-label, and on the server side there's a new
sni_callback that receives the SNI servername as an A-label rather than
a U-label.
(cherry picked from commit 11a1493)

Co-authored-by: Christian Heimes <christian@python.org>
@bedevere-bot
Copy link

GH-5843 is a backport of this pull request to the 3.7 branch.

njsmith pushed a commit that referenced this pull request Feb 24, 2018
…H-5843)

Previously, the ssl module stored international domain names (IDNs)
as U-labels. This is problematic for a number of reasons -- for
example, it made it impossible for users to use a different version
of IDNA than the one built into Python.

After this change, we always convert to A-labels as soon as possible,
and use them for all internal processing. In particular, server_hostname
attribute is now an A-label, and on the server side there's a new
sni_callback that receives the SNI servername as an A-label rather than
a U-label.
(cherry picked from commit 11a1493)

Co-authored-by: Christian Heimes <christian@python.org>
@mattip
Copy link
Contributor

mattip commented Dec 15, 2020

Why are there no tests for the new function name, the tests in the stdlib all refer to set_servername_callback ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants