[CI] Real-cluster tests pick ports that clash with ESTestCase port ranges #87734

Closed as not planned
@DaveCTurner

In #87448 we saw a test failure because the failing test was using port 35677, which was also in use by a different test case. I think the other test case involved a real cluster, since it had a build hash and since ESTestCase-based tests running in different test workers shouldn't clash with each other thanks to #85777. As far as I can tell, we set http.port: 0 and transport.port: 0 in these real-cluster tests, which causes Elasticsearch to choose an arbitrary free port from the ephemeral port range. Is it possible that 35677 is considered an ephemeral port on some workers?
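
One way to answer that question on a given worker is sketched below. It assumes a Linux host, where the ephemeral range is exposed in /proc/sys/net/ipv4/ip_local_port_range, and falls back to the usual Linux default of 32768-60999 when that file cannot be read. The class and method names are hypothetical and not part of the Elasticsearch build.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class EphemeralRangeCheck {
    // Usual Linux default for net.ipv4.ip_local_port_range; an assumption used only as a fallback.
    static final int[] ASSUMED_DEFAULT = { 32768, 60999 };

    // Parses the "<low><whitespace><high>" format used by the procfs file.
    static int[] parseRange(String text) {
        String[] parts = text.trim().split("\\s+");
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    // Reads the ephemeral range of this host, or returns the assumed default if it is unavailable.
    static int[] ephemeralRange() {
        Path path = Path.of("/proc/sys/net/ipv4/ip_local_port_range");
        try {
            return parseRange(Files.readString(path));
        } catch (IOException | RuntimeException e) {
            return ASSUMED_DEFAULT.clone();
        }
    }

    // Both bounds are inclusive.
    static boolean contains(int[] range, int port) {
        return range[0] <= port && port <= range[1];
    }

    public static void main(String[] args) {
        int[] range = ephemeralRange();
        System.out.println("ephemeral range: " + range[0] + "-" + range[1]);
        System.out.println("35677 is ephemeral here: " + contains(range, 35677));
    }
}
```

With the default Linux range, 35677 does fall inside the ephemeral range, so a node started with port 0 could be handed that port.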

If so, I see a couple of options:

  • reduce the ephemeral port range on all workers so it doesn't overlap the ports chosen by an ESTestCase, or
  • be specific about which ports each node should use, chosen using the same algorithm as ESTestCase to avoid collisions:

    /*
     * [NOTE: Port ranges for tests]
     *
     * Some tests involve interactions over the localhost interface of the machine running the tests. The tests run concurrently in multiple
     * JVMs, but all have access to the same network, so there's a risk that different tests will interact with each other in unexpected
     * ways and trigger spurious failures. Gradle numbers its workers sequentially starting at 1 and each worker can determine its own
     * identity from the {@link #TEST_WORKER_SYS_PROPERTY} system property. We use this to try and assign disjoint port ranges to each test
     * worker, avoiding any unexpected interactions, although if we spawn enough test workers then we will wrap around to the beginning
     * again.
     */

    /**
     * Defines the size of the port range assigned to each worker, which must be large enough to supply enough ports to run the tests, but
     * not so large that we run out of ports. See also [NOTE: Port ranges for tests].
     */
    private static final int PORTS_PER_WORKER = 30;

    /**
     * Defines the minimum port that test workers should use. See also [NOTE: Port ranges for tests].
     */
    protected static final int MIN_PRIVATE_PORT = 13301;

    /**
     * Defines the maximum port that test workers should use. See also [NOTE: Port ranges for tests].
     */
    private static final int MAX_PRIVATE_PORT = 36600;

    /**
     * Wrap around after reaching this worker ID.
     */
    private static final int MAX_EFFECTIVE_WORKER_ID = (MAX_PRIVATE_PORT - MIN_PRIVATE_PORT - PORTS_PER_WORKER + 1) / PORTS_PER_WORKER - 1;

    static {
        assert getWorkerBasePort(MAX_EFFECTIVE_WORKER_ID) + PORTS_PER_WORKER - 1 <= MAX_PRIVATE_PORT;
    }

    /**
     * Returns a port range for this JVM according to its Gradle worker ID. See also [NOTE: Port ranges for tests].
     */
    public static String getPortRange() {
        final var firstPort = getWorkerBasePort();
        final var lastPort = firstPort + PORTS_PER_WORKER - 1; // upper bound is inclusive
        assert MIN_PRIVATE_PORT <= firstPort && lastPort <= MAX_PRIVATE_PORT;
        return firstPort + "-" + lastPort;
    }

    /**
     * Returns the start of the port range for this JVM according to its Gradle worker ID. See also [NOTE: Port ranges for tests].
     */
    protected static int getWorkerBasePort() {
        final var workerIdStr = System.getProperty(ESTestCase.TEST_WORKER_SYS_PROPERTY);
        if (workerIdStr == null) {
            // running in IDE
            return MIN_PRIVATE_PORT;
        }
        final var workerId = Integer.parseInt(workerIdStr);
        assert workerId >= 1 : "Non positive gradle worker id: " + workerIdStr;
        return getWorkerBasePort(workerId % (MAX_EFFECTIVE_WORKER_ID + 1));
    }

    private static int getWorkerBasePort(int effectiveWorkerId) {
        assert 0 <= effectiveWorkerId && effectiveWorkerId <= MAX_EFFECTIVE_WORKER_ID;
        // the range [MIN_PRIVATE_PORT, MIN_PRIVATE_PORT+PORTS_PER_WORKER) is only for running outside of Gradle
        return MIN_PRIVATE_PORT + PORTS_PER_WORKER + effectiveWorkerId * PORTS_PER_WORKER;
    }
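
To make the arithmetic in the excerpt above concrete, here is a small standalone sketch that inverts the mapping: given a port, it computes which effective worker ID would own it. The three constants are copied from the excerpt; the class name and the helper effectiveWorkerIdForPort are hypothetical and do not exist in ESTestCase.

```java
final class PortRangeArithmetic {
    // Constants copied from the ESTestCase excerpt quoted in this issue.
    static final int PORTS_PER_WORKER = 30;
    static final int MIN_PRIVATE_PORT = 13301;
    static final int MAX_PRIVATE_PORT = 36600;
    static final int MAX_EFFECTIVE_WORKER_ID =
        (MAX_PRIVATE_PORT - MIN_PRIVATE_PORT - PORTS_PER_WORKER + 1) / PORTS_PER_WORKER - 1;

    // Same formula as the private getWorkerBasePort(int) in the excerpt.
    static int basePort(int effectiveWorkerId) {
        return MIN_PRIVATE_PORT + PORTS_PER_WORKER + effectiveWorkerId * PORTS_PER_WORKER;
    }

    // Hypothetical inverse: the effective worker ID whose range contains the port,
    // or -1 if the port lies outside every range assigned to a Gradle worker.
    static int effectiveWorkerIdForPort(int port) {
        int firstAssigned = MIN_PRIVATE_PORT + PORTS_PER_WORKER;
        int lastAssigned = basePort(MAX_EFFECTIVE_WORKER_ID) + PORTS_PER_WORKER - 1;
        if (port < firstAssigned || port > lastAssigned) {
            return -1;
        }
        return (port - firstAssigned) / PORTS_PER_WORKER;
    }

    public static void main(String[] args) {
        int id = effectiveWorkerIdForPort(35677);
        int first = basePort(id);
        int last = first + PORTS_PER_WORKER - 1;
        // prints "effective worker 744 owns 35651-35680"
        System.out.println("effective worker " + id + " owns " + first + "-" + last);
    }
}
```

Port 35677 therefore belongs to effective worker ID 744, whose range is 35651-35680. Because the worker ID wraps modulo MAX_EFFECTIVE_WORKER_ID + 1 = 775, Gradle worker IDs 744, 1519, and so on all map to that range.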

    Labels

    :Delivery/Build (Build or test infrastructure)
    >bug
    >test-failure (Triaged test failures from CI)
    Team:Delivery (Meta label for Delivery team)
    low-risk (An open issue or test failure that is a low risk to future releases)
