Commit 2a88fea

hasnain-db authored and Mridul Muralidharan committed
[SPARK-44937][CORE] Mark connection as timedOut in TransportClient.close
### What changes were proposed in this pull request?

This PR avoids a race condition where a connection which is in the process of being closed could be returned by the TransportClientFactory only to be immediately closed and cause errors upon use.

This race condition is rare and not easily triggered, but with the upcoming changes to introduce SSL connection support, connection closing can take just a slight bit longer and it's much easier to trigger this issue.

Looking at the history of the code I believe this was an oversight in #9853.

### Why are the changes needed?

Without this change, some of the new tests added in #42685 would fail.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests were run in CI. Without this change, some of the new tests added in #42685 fail.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43162 from hasnain-db/spark-tls-timeout.

Authored-by: Hasnain Lakhani <hasnain.lakhani@databricks.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
1 parent 6341310 commit 2a88fea

1 file changed: +4 -1 lines changed

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java

Lines changed: 4 additions & 1 deletion
@@ -325,7 +325,10 @@ public TransportResponseHandler getHandler() {
 
   @Override
   public void close() {
-    // close is a local operation and should finish with milliseconds; timeout just to be safe
+    // Mark the connection as timed out, so we do not return a connection that's being closed
+    // from the TransportClientFactory if closing takes some time (e.g. with SSL)
+    this.timedOut = true;
+    // close should not take this long; use a timeout just to be safe
     channel.close().awaitUninterruptibly(10, TimeUnit.SECONDS);
   }
 
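
To illustrate why setting the flag before closing matters, here is a minimal, deterministic sketch of the window the commit message describes. It is not the Spark code: `SlowChannel`, `SketchClient`, `beginClose`, `finishClose` and `wouldReuse` are hypothetical names, and the sketch assumes that the factory decides whether to reuse a cached client through an `isActive()`-style check that consults both the `timedOut` flag and the channel state.

```java
public class CloseRaceSketch {

    /** Hypothetical stand-in for a channel whose close completes some time after it starts. */
    static final class SlowChannel {
        private volatile boolean open = true;

        boolean isOpen() { return open; }

        void finishClose() { open = false; }
    }

    /** Hypothetical client; markOnClose toggles between the old and the patched behavior. */
    static final class SketchClient {
        private final SlowChannel channel = new SlowChannel();
        private final boolean markOnClose;
        private volatile boolean timedOut = false;

        SketchClient(boolean markOnClose) { this.markOnClose = markOnClose; }

        /** Assumed liveness check: not timed out and the channel is still open. */
        boolean isActive() { return !timedOut && channel.isOpen(); }

        /** First half of close(): the channel close has been requested but has not finished. */
        void beginClose() {
            if (markOnClose) {
                timedOut = true; // patched behavior: mark before the (possibly slow) close
            }
        }

        /** Second half of close(): the channel is now really closed. */
        void finishClose() { channel.finishClose(); }
    }

    /** Hypothetical factory-side decision: reuse a cached client only if it looks active. */
    static boolean wouldReuse(SketchClient client) { return client.isActive(); }

    public static void main(String[] args) {
        SketchClient unpatched = new SketchClient(false);
        unpatched.beginClose();
        // The channel still reports open, so a client that is mid-close looks reusable.
        System.out.println("unpatched, reused while closing: " + wouldReuse(unpatched));

        SketchClient patched = new SketchClient(true);
        patched.beginClose();
        // The flag is already set, so the client is rejected before the close finishes.
        System.out.println("patched, reused while closing: " + wouldReuse(patched));

        unpatched.finishClose();
        patched.finishClose();
        System.out.println("after close completes: "
            + wouldReuse(unpatched) + ", " + wouldReuse(patched));
    }
}
```

The sketch splits `close()` into two steps only to make the intermediate state observable without threads; in the actual change both happen inside the single `close()` call shown in the diff.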
