Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-32221: makeipaddr: remove interface part + speedup #4724

Merged
merged 1 commit into from
Jan 27, 2018
Merged

bpo-32221: makeipaddr: remove interface part + speedup #4724

merged 1 commit into from
Jan 27, 2018

Conversation

socketpair
Copy link
Contributor

@socketpair socketpair commented Dec 5, 2017

https://bugs.python.org/issue32221

recvfrom from multicast socket is painfull slow. In fact, it returns sender address in form:

('fe80::941f:f6ff:fe04:c560%qwe', 42133, 0, 78)
which is superfluous, since interface-name part (%qwe) is not actually used. Actually, scopeid (78) signify interface/scope/zone_id. This tuple can be used for .sendto() either with this interface-name-part or without.

The problem is in the performance. For each socket.recvrfom(), libc.getnameinfo() internally converts interface index to interface name using three syscalls, i.e. socket(), getsockopt()?, close() , which slows down receiving (I have not measured result, but see additional syscalls in strace).

In order to convert from tuple to string-based full address one may use socket.getnameinfo():
As you can see, initial interface is ignored (but without my patch it is also validated uselessly):

In[1]: socket.getnameinfo(('fe80::941f:f6ff:fe04:c560%qwe', 42133, 0, 78), socket.NI_NUMERICHOST)
Out[1]: ('fe80::941f:f6ff:fe04:c560%qwe', '42133')
In[2]: socket.getnameinfo(('fe80::941f:f6ff:fe04:c560', 42133, 0, 78), socket.NI_NUMERICHOST)
Out[2]: ('fe80::941f:f6ff:fe04:c560%qwe', '42133')

@asvetlov
Copy link
Contributor

asvetlov commented Dec 5, 2017

Would you add a test for new functionality?

@socketpair socketpair changed the title makeipaddr: remove interface part + speedup bpo-32221: makeipaddr: remove interface part + speedup Dec 5, 2017
@socketpair
Copy link
Contributor Author

@asvetlov done

@@ -77,6 +77,11 @@ created. Socket addresses are represented as follows:
backward compatibility. Note, however, omission of *scopeid* can cause problems
in manipulating scoped IPv6 addresses.

.. versionchanged:: 3.6
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know which version to write here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.6 is in bugfix mode.
The PR should land into 3.7 only IMHO

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 3.7 only.

@asvetlov
Copy link
Contributor

asvetlov commented Dec 6, 2017

Please add NEWS.d entry

@socketpair
Copy link
Contributor Author

Is there anybody who can make a review ?

@socketpair
Copy link
Contributor Author

socketpair commented Dec 14, 2017

@asvetlov could you please speedup review of this PR somehow ?

@socketpair
Copy link
Contributor Author

socketpair commented Dec 20, 2017

Original (not patched) python:

In [1]s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
In [2]: s.bind(('ff02::1de:c0db', 1234, 0, 2))
In [3]: timeit s.getsockname()
The slowest run took 12.06 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.8 µs per loop

In [5]: d = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
In [6]: d.bind(('::', 1235, 0, 0))
In [7]: timeit d.getsockname()
The slowest run took 23.18 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 532 ns per loop

On patched version, times for both cases should be the same exactly (i.e. 532 ns)

{
char buf[NI_MAXHOST];
int error;
char buf[(INET_ADDRSTRLEN > INET6_ADDRSTRLEN ? INET_ADDRSTRLEN : INET6_ADDRSTRLEN) + 1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Py_MAX

ptr = &((const struct sockaddr_in6*) addr)->sin6_addr;
break;
default:
abort();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually want an error set and NULL returned here.

Copy link
Contributor Author

@socketpair socketpair Dec 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is impossible to trigger. It's part of C logic, so no exception can be imagined here. It should be like assert(0). Or, even, I can remove default: branch.

This function is called (and should be) only from AF_INET and AF_INET6 branches from upper code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then rewrite this to:

if (family == AF_INET) {
   ...
}
else {
   assert(family == AF_INET6);
   ...
}

@1st1 1st1 self-assigned this Dec 20, 2017
@1st1
Copy link
Member

1st1 commented Dec 20, 2017

Overall looks fine, please fix the couple of things I left my comments on.

@socketpair
Copy link
Contributor Author

I have found two bugs in my patch:

  1. It did not check #ifdef ENABLE_IPV6
  2. makeipaddr had unused argument addrlen.

Since it hard to code Py_MAX + ifdef and also difficulties with assert() I decide to split this function to two. Actually, all places who call it know address family, so automatic logic is not needed.

Please have a look.

@1st1

@@ -1097,23 +1097,35 @@ setipaddr(const char *name, struct sockaddr *addr_ret, size_t addr_ret_size, int


/* Create a string object representing an IP address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this comment, it's outdated.

Add a comment to make_ipv4_addr "Convert IPv4 sockaddr to a Python str." Same for make_ipv6_addr.

@1st1
Copy link
Member

1st1 commented Dec 20, 2017

I decide to split this function to two.

I like it.

@socketpair
Copy link
Contributor Author

@1st1 done

@socketpair
Copy link
Contributor Author

After my patch:

$ python3.7 -m timeit --setup 'import socket; s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM); s.bind(("ff02::1de:c0db", 1234, 0, 2))' 's.getsockname()'
500000 loops, best of 5: 613 nsec per loop

$ python3.7 -m timeit --setup 'import socket; s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM); s.bind(("::", 1235, 0, 0))' 's.getsockname()'
500000 loops, best of 5: 420 nsec per loop

Copy link
Contributor

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second iteration looks much better!

@asvetlov
Copy link
Contributor

@1st1 please review again

@1st1
Copy link
Member

1st1 commented Dec 22, 2017

@asvetlov the pr looks fine, but please don't merge it, I'm still not sure if it's backwards compatible "enough", and I don't have time for research right now.

@asvetlov
Copy link
Contributor

Ok, let's wait for a while

@socketpair
Copy link
Contributor Author

@1st1 ping

@socketpair
Copy link
Contributor Author

How can I speed up merging ?

set_gaierror(error);
char buf[INET_ADDRSTRLEN + 1];
if (inet_ntop(AF_INET, &addr->sin_addr, buf, sizeof(buf)) == NULL) {
PyErr_SetFromErrno(PyExc_OSError);
return NULL;
}
return PyUnicode_FromString(buf);
Copy link
Contributor Author

@socketpair socketpair Jan 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will PyUnicode_DecodeASCII be faster ? Is it better to use it here ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes.

@@ -5911,14 +5916,11 @@ socket_inet_ntop(PyObject *self, PyObject *args)
Py_buffer packed_ip;
const char* retval;
#ifdef ENABLE_IPV6
char ip[Py_MAX(INET_ADDRSTRLEN, INET6_ADDRSTRLEN) + 1];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not related to my issue, but this is a cosmetic fix.

@socketpair
Copy link
Contributor Author

Regarding meltdown vulnerability: this patch will speed up even more!

@socketpair
Copy link
Contributor Author

socketpair commented Jan 7, 2018

Because 3 syscalls per addr serialisation eliminated.

@socketpair
Copy link
Contributor Author

@1st1 @asvetlov maybe someone else can merge this ?

@asvetlov
Copy link
Contributor

The pr is good for me but @1st1 explicitly asked for additional time.

@1st1
Copy link
Member

1st1 commented Jan 15, 2018

I'll take a look this week.

@socketpair
Copy link
Contributor Author

@1st1 ping

@socketpair
Copy link
Contributor Author

@1st1
@asvetlov

@1st1
Copy link
Member

1st1 commented Jan 26, 2018

i'll take a look today/tomorrow.

@1st1
Copy link
Member

1st1 commented Jan 26, 2018

In the meanwhile, can you explain why this patch is important? and what is the speedup?

@socketpair
Copy link
Contributor Author

socketpair commented Jan 26, 2018

@1st1
#4724 (comment)
#4724 (comment)

This patch speeds up all functions that return 4-item tuple with address. Old code uses wrong function to convert binary representation of IPv6 address to textual one.

For non-scoped addresses result is exactly the same. But for scoped addresses this wrong function does useless interface index to name resolution. This slows down for example recvfrom(). This slowdown is significant for multicast UDP applications (like my own :) )

@1st1 1st1 merged commit 47c0b1f into python:master Jan 27, 2018
1st1 added a commit to 1st1/cpython that referenced this pull request Jan 28, 2018
1st1 added a commit that referenced this pull request Jan 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants