Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows] Cache v6only option when we create ipv6 socket #11793

Merged
merged 7 commits into from
Jul 27, 2020

Conversation

davinci26
Copy link
Member

@davinci26 davinci26 commented Jun 28, 2020

On Windows, getsockopt fails if the socket driver is performing another
non-blocking operation. For example if we query for a specific socket
option while the socket is connecting then the getsockopt is going to
fail with error (10022).

When we create an ipv6 socket we set it to be ipv6 only. Now we cache
this configuration and we query it before we connect to the socket

Signed-off-by: Sotiris Nanopoulos sonanopo@microsoft.com

Additional Description:
This fixes IpVersions/TcpClientConnectionImplTest.BadConnectConnRefused/IPv6 test on Windows.

Risk Level: Low
Testing: Existing test cover this scenario.
Docs Changes:
Release Notes: Fixes an issue on Windows that get socket option returns invalid parameters when Envoy is asynchronously connecting to it. The resolution was to cache the socket option (v6only) option when the socket is created.
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]

@davinci26
Copy link
Member Author

@davinci26
Copy link
Member Author

@florincoras can you please take a look since you have been tinkering with the socket interface :)

@@ -122,5 +127,22 @@ absl::optional<Address::IpVersion> SocketImpl::ipVersion() const {
return absl::nullopt;
}

bool SocketImpl::isSocketV6Only(const Network::Address::InstanceConstSharedPtr address) const {
Copy link
Member Author

@davinci26 davinci26 Jun 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be moved to address_impl

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it depends on io_handle_, I'm not sure we can do the move. As per my general comment, this is needed because we "forget" what we set on the Socket/IoHandle at creation time. We could potentially improve that.

@florincoras
Copy link
Member

@davinci26 thanks for the heads up! This looks okay but it got me thinking that we are explicitly setting v6only when we create the socket, so we should probably store upfront the value in the IoHandle impl. In theory, this would then avoid the need to retrieve the setting via getOption() and it would simplify localAddress() avoid complicating connect.

@ggreenway what do you think? And a more naive question: do we ever want IPV6_V6ONLY to be false in practice?

@davinci26
Copy link
Member Author

Awesome, your comment makes sense. I was thinking about this but then I wasnt sure if there is any scenario that we set to IPV6_V6ONLY implicitly ( maybe from a user configuration).

If there is no such scenario it makes sense to store this info at construction and remove that call entirely

Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the CI failure.

Should there be any tests added for this case? Or is this fixing a race condition, and so no new coverage is needed?

/wait

source/common/network/socket_impl.cc Outdated Show resolved Hide resolved
@davinci26
Copy link
Member Author

@ggreenway I will fix the CI. I think since the race condition was found from an existing (on Windows which will be soon be part of the CI) we dont need additional test coverage. What do you think?

@florincoras I think that we need to do the query unfortunately because the default value is set by the system.

According to man/ipv6:

The default value for this flag is defined by the contents of the file /proc/sys/net/ipv6/bindv6only. The default value for that file is 0 (false).

The way I interpret this is that there might be a case for a system/environment is ipv6_only due to a different system configuration and it is not Envoy that sets the property. In that case we should be aware and query the socket to get the option.

@florincoras
Copy link
Member

The way I interpret this is that there might be a case for a system/environment is ipv6_only due to a different system configuration and it is not Envoy that sets the property. In that case we should be aware and query the socket to get the option.

We are currently explicitly setting IPV6_V6ONLY when we create the socket via this api. That is, when provided with an address, we do override the system default. We should probably check the system default and store the value in the IoHandle when using any of the other apis. That would entirely avoid the checking the value in localAddress(), but I guess it could be done at a later time.

@ggreenway
Copy link
Contributor

The build is broken.

Please add a release note about the bug being fixed.

@florincoras thanks for your input on this PR. Since you've had the most experience recently in this area, can I assign this PR to you for review?

/wait

@florincoras
Copy link
Member

@ggreenway sure!

@stale
Copy link

stale bot commented Jul 18, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jul 18, 2020
@davinci26
Copy link
Member Author

Apologies for letting this get stale last week was crazy. I am working on the changes now.

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jul 18, 2020
Signed-off-by: davinci26 <sotirisnan@gmail.com>
Signed-off-by: davinci26 <sotirisnan@gmail.com>
@davinci26
Copy link
Member Author

@florincoras I opted for the implementation that you proposed since its way cleaner ( I get to delete code instead of writing 😄 ) I am getting the CI to be fully green and I will ping you when I am ready for a review

@florincoras
Copy link
Member

@florincoras I opted for the implementation that you proposed since its way cleaner ( I get to delete code instead of writing 😄 ) I am getting the CI to be fully green and I will ping you when I am ready for a review

Sounds good! ;-)

Signed-off-by: davinci26 <sotirisnan@gmail.com>
Signed-off-by: davinci26 <sotirisnan@gmail.com>
Signed-off-by: davinci26 <sotirisnan@gmail.com>
@davinci26
Copy link
Member Author

@florincoras ready for review

the CI is clean minus cx_limit_integration_test which seems to be flaky #11841

@davinci26 davinci26 changed the title [Windows] Query for IPV6_V6ONLY before connect [Windows] Cache v6only option when we create ipv6 socket Jul 21, 2020
florincoras
florincoras previously approved these changes Jul 21, 2020
@davinci26
Copy link
Member Author

@ggreenway I am ready for your review as well!

@ggreenway
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall, just one thing to fix.

/wait

include/envoy/network/socket.h Outdated Show resolved Hide resolved
Signed-off-by: davinci26 <sotirisnan@gmail.com>
Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davinci26
Copy link
Member Author

davinci26 commented Jul 23, 2020

@ggreenway can we q tsan again and see what the process it to merge it in.

I want to add it to #12172 and see if it makes the tests there less flaky

@ggreenway
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@davinci26
Copy link
Member Author

davinci26 commented Jul 25, 2020

I merged master and pushed again and hopefully it will help with the flaky tests

@ggreenway ggreenway merged commit ff0beb1 into envoyproxy:master Jul 27, 2020
KBaichoo pushed a commit to KBaichoo/envoy that referenced this pull request Jul 30, 2020
On Windows, getsockopt fails if the socket driver is performing another
non-blocking operation. For example if we query for a specific socket
option while the socket is connecting then the getsockopt is going to
fail with error (10022).

When we create an ipv6 socket we set it to be ipv6 only. Now we cache
this configuration and we query it before we connect to the socket

This fixes IpVersions/TcpClientConnectionImplTest.BadConnectConnRefused/IPv6 test on Windows.

Signed-off-by: Sotiris Nanopoulos <sonanopo@microsoft.com>
Signed-off-by: Kevin Baichoo <kbaichoo@google.com>
chaoqin-li1123 pushed a commit to chaoqin-li1123/envoy that referenced this pull request Aug 7, 2020
On Windows, getsockopt fails if the socket driver is performing another
non-blocking operation. For example if we query for a specific socket
option while the socket is connecting then the getsockopt is going to
fail with error (10022).

When we create an ipv6 socket we set it to be ipv6 only. Now we cache
this configuration and we query it before we connect to the socket

This fixes IpVersions/TcpClientConnectionImplTest.BadConnectConnRefused/IPv6 test on Windows.

Signed-off-by: Sotiris Nanopoulos <sonanopo@microsoft.com>
chaoqin-li1123 pushed a commit to chaoqin-li1123/envoy that referenced this pull request Aug 7, 2020
On Windows, getsockopt fails if the socket driver is performing another
non-blocking operation. For example if we query for a specific socket
option while the socket is connecting then the getsockopt is going to
fail with error (10022).

When we create an ipv6 socket we set it to be ipv6 only. Now we cache
this configuration and we query it before we connect to the socket

This fixes IpVersions/TcpClientConnectionImplTest.BadConnectConnRefused/IPv6 test on Windows.

Signed-off-by: Sotiris Nanopoulos <sonanopo@microsoft.com>
Signed-off-by: chaoqinli <chaoqinli@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants