-
Notifications
You must be signed in to change notification settings - Fork 7.8k
getaddrinfo conversion #15555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
getaddrinfo conversion #15555
Conversation
php_network_getaddresses is our wrapper function around getaddrinfo, the modern replacement for gethostbyname/addr. This converts these userland functions to use the newer API, which does have some benefits... that these functions can't yet take advantage of. Other references to gethostbyname/addr will be removed in future commits, to eventually remove the PHP wrappers in the next ABI break. Note that php_network_getaddresses does its own E_WARNINGs. Note that these functions are documented as only supporting IPv4 addresses; we need to do another BC break to return IPv6 addresses.
Allows controlling the address family and flags to use.
Not ideal right now; it should really be using the PHP wrapper function, and have bool conversion. But it does get rid of a use of gethostbyname.
This is curiously IPv4 only, so I kept the the same semantics.
A common idiom is only caring about the first IP address, for i.e. resolving localhost -> 127.0.0.1/::1. Iterating through and freeing the whole list in this case is tedious. In this case, we provide a function that takes a pointer to a sockaddr storage (so it can be stack allocated), as well as information like address family (for i.e. only IPv4 addresses). This should hopefully be only as boilerplate as gethostbyname was.
The underlying php_network_getaddresses will docref this for us, so there's no point in taking a zend_string we don't have to do anything with.
Also curiously IPv4 only...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only had a quick look at it, but thank you for taking care of this!
int address_count = php_network_getaddress(&resolved, hostname, 0, AF_INET, 0, NULL); | ||
if (address_count == 0) { | ||
/* don't need to docref here, getaddresses E_WARNINGs for us */ | ||
RETURN_STRINGL(hostname, hostname_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated optimization, but if we would use Z_PARAM_PATH_STR
to get a zend_string we could just return a copy here.
We will no longer be using this function.
Callers should use php_network_getaddress* instead.
If breaking ABI is fine, then it should be possible to completely remove gethostbyname/addr in this PR. I'll see how much that removes. |
Since this can never return -1 (0 on error/none, >0 for addresses)
Includes some global state as well.
this borks curl tests, so supress warnings on it there
ext/standard/file.c
Outdated
static void file_globals_dtor(php_file_globals *file_globals_p) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call to this should also be removed then.
@@ -7,7 +7,7 @@ curl | |||
--SKIPIF-- | |||
<?php | |||
$addr = "www.".uniqid().".".uniqid(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely we could try to come up with something that we know does not exist? Like a weird subdomain name of php.net
Windows failure is curious, because it doesn't seem to happen on on Unix? |
On Windows the relevant part of the stack backtrace is
and Old stack backtrace (i.e. without this patch):
Lines 235 to 240 in c5bce0d
I haven't checked the behavior on Linux, but would have expected that By the way, that test ( |
Just checked on WSL2 (Debian 11), and
May actually have be a good idea to do this alreay. See https://3v4l.org/iP10j. And maybe the documentation ("Returns the IPv4 address or a string containing the unmodified hostname on failure.") and Windows implementation would need to be fixed accordingly. |
Oh, it's trying to turn an integer into an IP address on Unix, and Windows doesn't do that? I agree this test doesn't make sense then - the error and non-error return types are the same, but what it actually returns seems different per platform. (And the behaviour seems identical between |
It seems to me that the Unix behavior is non standard. From the spec:
Maybe change that test, and see what actually happens on CI. Perhaps there is different behavior even on POSIX platforms. |
It seems on Unix, the behaviour is to convert integers into IP addresses, whereas Windows tries to resolve it. Regardless, it's silly to test what the type is, since it returns the unmodified hostname if it can't be resolved, so it'll always be a string, error or not.
Hm, all non Windows systems seem to behave identically; however, FreeBSD and Travis have not been run at all for whatever reason. No sure what to do now; documenting the behavior is a no-brainer (besides someone would have to do that), but should we adapt the Windows behavior? What exactly would need to be done? Just turning an integer to an IP address (basically (Aside: apparently the MacOS builds could need some TLC.) |
Looks like the same behaviour. I looked at NetBSD's implementation as well as musl's, and it seems they do similar things - call Annoyingly, I don't see any flags to control this behaviour. |
To my knowledge, there is no
Indeed. |
I meant |
Circling back to the hardcoded IPv4 usage and raw usage of getaddrinfo things I brought up: I think the right thing to do would probably be adding ai_protocol (for the sockets function, because it can use it for the hint) and service arguments to the new functions like Alternatively, if we're OK with that sockets function using getaddrinfo directly, then the next best thing is to add the service only, the port is returned in the sockaddr, and can be consumed directly with less copies/additional structure writes that we'd have conditions for if-v4/v6 - and it'd make those functions handle v6 more easily, since they can just work with the returned sockaddr, which was the intent of getaddrinfo. |
Maybe @devnexen has some thoughts about this. |
#ifdef HAVE_GETADDRINFO | ||
struct sockaddr_in6 *sin6 = (struct sockaddr_in6*)sin; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be possible to factorise it further via macro to avoid this double assignation and the family conditions below, wdyt ?
Probably perilous. Tested only on my Mac and a Gentoo/ppc64le system so far. Concerns:
php_network_getaddress
, for common situations like resolvinglocalhost
- we only want just the one) and necessity (php_network_getaddresses_ex
- exposes more ofgetaddrinfo
like address family).sockaddr_conv.c
should probably be converted tobool
returns, though I'm not sure if it's an ABI break (since it can be represented differently). It's a little ugly anyways...socket_addrinfo_lookup
callsgetaddrinfo
directly. It could callphp_network_getaddresses_ex
, though we'd need to expose service there too.This doesn't dropWe can do ABI breaks before RC1, so I'm removing the remaining support.php_network_gethostbyname
and the associated detritus just yet, since it's an ABI break. We could do it before RC1...HAVE_IPV6
or building without it even make sense anymore? We kinda assumeinet_ntop
for example, so assuminggetaddrinfo
also makes sense.gethostbyaddr
actually calls the moderngethostbyname
equivalent already when possible. The fallback path to the legacy function can be removed withgethostbyname
.Fixes GH-15531