Skip to content

[TESTS] Set SO_LINGER and SO_REUSEADDR on the mock socket #34211

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 3, 2018

Conversation

bizybot
Copy link
Contributor

@bizybot bizybot commented Oct 2, 2018

In SessionFactoryLoadBalancingTests#testRoundRobinWithFailures()
we kill ldap servers randomly and immediately bind to that port
connecting to mock server socket. This is done to avoid someone else
listening to this port. As the creation of mock socket and binding to the
port is immediate, sometimes the earlier socket would be in TIME_WAIT state
thereby having problems with either bind or connect.
This commit sets the SO_REUSEADDR explicitly to true and also sets
the linger on time to 0(as we are not writing any data) so as to
allow re-use of the port and close immediately.

Note: I could not find other places where this might be problematic
but looking at test runs and netstat output I do see lot of sockets
in TIME_WAIT. If we find that this needs to be addressed we can
wrap ServerSocketFactory to set these options and use that with in
memory ldap server configuration during tests.

Closes #32190

In the load balancing test (SessionFactoryLoadBalancingTests
- testRoundRobinWithFailures()) we kill ldap servers randomly
and immediately bind to that port connecting to mock server socket.
This is done to avoid someone else listening to this port.
As the creation of mock socket and binding to the port is immediately,
sometimes the earlier socket would be in TIME_WAIT state
thereby having problems with either bind or connect.
This commit sets the SO_REUSEADDR explicitly to `true` and also sets
the linger on time to `0`(as we are not writing any data) so as to
allow re-use of the port and close immediately.

Note: I could not find other places where this might be problematic
but looking at test runs and netstat output I do see lot of sockets
in TIME_WAIT. If we find that this needs to be addressed we can
wrap ServerSocketFactory to set these options and use that with in
memory ldap server configuration during tests.

Closes elastic#32190
@bizybot bizybot added review >test-failure Triaged test failures from CI v7.0.0 :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) v6.5.0 labels Oct 2, 2018
@bizybot bizybot requested review from jaymode and jkakavas October 2, 2018 06:32
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security

Copy link
Member

@jaymode jaymode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of minor comments, otherwise LGTM. Please make sure to get an approval from either @tbrooks8 or @original-brownbear before merging (thanks @jasontedor for requesting their review)

private MockSocket openMockServerSocket(InetAddress remoteAddress, int remotePort, InetAddress localAddress, int localPort)
throws IOException {
final MockSocket socket = new MockSocket();
socket.setReuseAddress(true); // allow binding even if the previous socket is in timed state.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/timed/timed wait ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this, Thank you.

@@ -149,6 +152,17 @@ public void testRoundRobinWithFailures() throws Exception {
}
}

@SuppressForbidden(reason = "Allow opening socket for test")
private MockSocket openMockServerSocket(InetAddress remoteAddress, int remotePort, InetAddress localAddress, int localPort)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't a server socket, so I think the method should be openMockSocket

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, changed this. Thank you.

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bizybot bizybot merged commit f79282e into elastic:master Oct 3, 2018
@bizybot bizybot deleted the fix/32190 branch October 3, 2018 22:24
bizybot added a commit that referenced this pull request Oct 3, 2018
In SessionFactoryLoadBalancingTests#testRoundRobinWithFailures()
we kill ldap servers randomly and immediately bind to that port
connecting to mock server socket. This is done to avoid someone else
listening to this port. As the creation of mock socket and binding to the
port is immediate, sometimes the earlier socket would be in TIME_WAIT state
thereby having problems with either bind or connect.
This commit sets the SO_REUSEADDR explicitly to true and also sets
the linger on time to 0(as we are not writing any data) so as to
allow re-use of the port and close immediately.

Note: I could not find other places where this might be problematic
but looking at test runs and netstat output I do see lot of sockets
in TIME_WAIT. If we find that this needs to be addressed we can
wrap ServerSocketFactory to set these options and use that with in
memory ldap server configuration during tests.

Closes #32190
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Oct 4, 2018
* master: (25 commits)
  HLRC: ML Adding get datafeed stats API (elastic#34271)
  Small fixes to the HLRC watcher documentation. (elastic#34306)
  Tasks: Document that status is not semvered (elastic#34270)
  HLRC: ML Add preview datafeed api (elastic#34284)
  [CI] Fix bogus ScheduleWithFixedDelayTests.testRunnableRunsAtMostOnceAfterCancellation
  Fix error in documentation for activete watch
  SCRIPTING: Terms set query expression (elastic#33856)
  Logging: Drop remaining Settings log ctor (elastic#34149)
  [ML] Remove unused last_data_time member from Job (elastic#34262)
  Docs: Allow skipping response assertions (elastic#34240)
  HLRC: Add activate watch action (elastic#33988)
  [Security] Multi Index Expression alias wildcard exclusion (elastic#34144)
  [ML] Label anomalies with  multi_bucket_impact (elastic#34233)
  Document smtp.ssl.trust configuration option (elastic#34275)
  Support PKCS#11 tokens as keystores and truststores  (elastic#34063)
  Fix sporadic failure in NestedObjectMapperTests
  [Authz] Allow update settings action for system user (elastic#34030)
  Replace version with reader cache key in IndicesRequestCache (elastic#34189)
  [TESTS] Set SO_LINGER and SO_REUSEADDR on the mock socket (elastic#34211)
  Security: upgrade unboundid ldapsdk to 4.0.8 (elastic#34247)
  ...
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Oct 4, 2018
* rename-ccr-stats: (25 commits)
  HLRC: ML Adding get datafeed stats API (elastic#34271)
  Small fixes to the HLRC watcher documentation. (elastic#34306)
  Tasks: Document that status is not semvered (elastic#34270)
  HLRC: ML Add preview datafeed api (elastic#34284)
  [CI] Fix bogus ScheduleWithFixedDelayTests.testRunnableRunsAtMostOnceAfterCancellation
  Fix error in documentation for activete watch
  SCRIPTING: Terms set query expression (elastic#33856)
  Logging: Drop remaining Settings log ctor (elastic#34149)
  [ML] Remove unused last_data_time member from Job (elastic#34262)
  Docs: Allow skipping response assertions (elastic#34240)
  HLRC: Add activate watch action (elastic#33988)
  [Security] Multi Index Expression alias wildcard exclusion (elastic#34144)
  [ML] Label anomalies with  multi_bucket_impact (elastic#34233)
  Document smtp.ssl.trust configuration option (elastic#34275)
  Support PKCS#11 tokens as keystores and truststores  (elastic#34063)
  Fix sporadic failure in NestedObjectMapperTests
  [Authz] Allow update settings action for system user (elastic#34030)
  Replace version with reader cache key in IndicesRequestCache (elastic#34189)
  [TESTS] Set SO_LINGER and SO_REUSEADDR on the mock socket (elastic#34211)
  Security: upgrade unboundid ldapsdk to 4.0.8 (elastic#34247)
  ...
kcm pushed a commit that referenced this pull request Oct 30, 2018
In SessionFactoryLoadBalancingTests#testRoundRobinWithFailures()
we kill ldap servers randomly and immediately bind to that port
connecting to mock server socket. This is done to avoid someone else
listening to this port. As the creation of mock socket and binding to the
port is immediate, sometimes the earlier socket would be in TIME_WAIT state
thereby having problems with either bind or connect.
This commit sets the SO_REUSEADDR explicitly to true and also sets
the linger on time to 0(as we are not writing any data) so as to
allow re-use of the port and close immediately.

Note: I could not find other places where this might be problematic
but looking at test runs and netstat output I do see lot of sockets
in TIME_WAIT. If we find that this needs to be addressed we can
wrap ServerSocketFactory to set these options and use that with in
memory ldap server configuration during tests.

Closes #32190
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) >test-failure Triaged test failures from CI v6.5.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] SessionFactoryLoadBalancingTests testRoundRobinWithFailures fails
5 participants