Introduce dedicated threadpool for establishing connections

Today, we attempt to connect to nodes concurrently using the management threadpool:

https://github.com/elastic/elasticsearch/blob/46e16b68feaee112126df0ed2bcd7e78d12a4d57/server/src/main/java/org/elasticsearch/cluster/NodeConnectionsService.java#L94-L115

Connection establishment can be time-consuming if the remote node is unresponsive, and the management threadpool is small and important, so saturating it with attempts to connect to unresponsive nodes is undesirable.

The suggested fix is to create a separate threadpool used purely for establishing node-to-node connections. Since such connections are mostly long-lived, the new-connection threadpool will mostly be idle, but after a network partition it would be good for each node to try to re-establish connections to its peers using a lot more concurrency than the management threadpool can support.
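As a rough illustration of the idea, here is a minimal sketch using plain `java.util.concurrent` rather than Elasticsearch's own `ThreadPool` infrastructure. The class `ConnectionPoolSketch`, the helpers `newConnectionExecutor` and `connectToAll`, the pool size, and the 30-second keep-alive are all hypothetical names and values chosen for illustration; a real change would presumably register a new named pool alongside the existing ones instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class ConnectionPoolSketch {

    // Hypothetical helper: an executor used only for connection attempts.
    // Threads are created on demand up to maxThreads and are allowed to
    // time out, so the pool holds no threads while connections are stable.
    static ThreadPoolExecutor newConnectionExecutor(int maxThreads) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            maxThreads, maxThreads, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        executor.allowCoreThreadTimeOut(true);
        return executor;
    }

    // Hypothetical helper: attempt to connect to every node concurrently and
    // wait for all attempts to finish. Returns false if the wait timed out.
    static <N> boolean connectToAll(ThreadPoolExecutor executor, List<N> nodes,
                                    Consumer<N> connect, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(nodes.size());
        for (N node : nodes) {
            executor.execute(() -> {
                try {
                    connect.accept(node);
                } catch (RuntimeException e) {
                    // A failed attempt is only logged; a later attempt can retry.
                    System.err.println("failed to connect to " + node + ": " + e);
                } finally {
                    latch.countDown();
                }
            });
        }
        return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor executor = newConnectionExecutor(16);
        List<String> nodes = Arrays.asList("node-1", "node-2", "node-3");
        // The lambda stands in for the real connection logic.
        boolean finished = connectToAll(executor, nodes,
            node -> System.out.println("connecting to " + node), 5000);
        System.out.println("all attempts finished: " + finished);
        executor.shutdown();
    }
}
```

Because the pool is separate, slow attempts against unresponsive nodes occupy only its own threads, and its size can be chosen independently of the management threadpool.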

Relates to #28920, in which cluster state application is blocked for multiple minutes, in part because of insufficient concurrency when attempting to connect to unresponsive peers.

	threadPool.executor(ThreadPool.Names.MANAGEMENT).execute(new AbstractRunnable() {
	    @Override
	    public void onFailure(Exception e) {
	        // both errors and rejections are logged here. the service
	        // will try again after `cluster.nodes.reconnect_interval` on all nodes but the current master.
	        // On the master, node fault detection will remove these nodes from the cluster as their are not
	        // connected. Note that it is very rare that we end up here on the master.
	        logger.warn((Supplier<?>) () -> new ParameterizedMessage("failed to connect to {}", node), e);
	    }

	    @Override
	    protected void doRun() throws Exception {
	        try (Releasable ignored = nodeLocks.acquire(node)) {
	            validateAndConnectIfNeeded(node);
	        }
	    }

	    @Override
	    public void onAfter() {
	        latch.countDown();
	    }
	});

Introduce dedicated threadpool for establishing connections #29023