Description
This relates to #85777
Our current method for allocating port ranges in ESTestCase is naive in a few ways that make it possible for concurrently executing test workers to use the same range of ports when spinning up transports. We cannot do port allocation completely in isolation, based only on worker id. Eventually the worker id will be incremented past a certain bound, at which point we will start allocating overlapping ports. The only solution is to have some way of knowing which port ranges are free so that they can be reused. For this to work, there needs to be some kind of coordination across test workers.
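To make the overlap concrete, here is a minimal sketch of allocation based purely on worker id. The constants and the `basePort` helper are hypothetical and are not the actual values or code in ESTestCase. The point is only that any fixed mapping from an ever-growing worker id into a bounded port space must eventually wrap around.

```java
class WorkerPortSketch {
    // Hypothetical values for illustration only; not taken from ESTestCase.
    static final int MIN_PORT = 10300;
    static final int MAX_PORT = 65535;
    static final int PORTS_PER_WORKER = 30;
    // Number of non-overlapping ranges that fit between MIN_PORT and MAX_PORT.
    static final int SLOTS = (MAX_PORT - MIN_PORT + 1) / PORTS_PER_WORKER;

    // Maps a worker id to the first port of its range, wrapping once the id exceeds SLOTS.
    static int basePort(long workerId) {
        int slot = (int) Math.floorMod(workerId, (long) SLOTS);
        return MIN_PORT + slot * PORTS_PER_WORKER;
    }

    public static void main(String[] args) {
        long early = 7;
        long late = 7 + SLOTS; // a worker id handed out much later in the build
        // Both ids land on the same range, so two live workers could collide.
        System.out.println(basePort(early) == basePort(late));
    }
}
```

Nothing in this scheme knows whether the worker that first used a range is still running, which is why some shared view of free ranges is needed.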
I don't think there's a good solution that does this in the main build and "injects" the result into the test workers. We really have no idea how many different test workers will be used by a given test class, so it's impossible to know ahead of time how many port ranges need to be allocated. We know maxParallelForks, but that only indicates how many workers will be in use concurrently. Across the lifetime of a task, it's possible that more than that many unique worker ids will be used.
So I think the only solution is for the tests themselves to "request" a port range. These tests run in isolated JVMs, so they can't directly communicate with each other. Short of using some complicated network communication, the simplest solution here might be a file-based lock system. We could store a registry of in-use port ranges (or even just offsets) on disk in a shared, known location and use a lock file to ensure non-concurrent access. Each worker would then acquire the lock on the registry, claim a port range, release the lock, and run its tests. After test execution it would access the registry in the same way to release the port range, adding it back to the available pool to be safely reused by another worker.
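The acquire and release steps could be sketched roughly as follows, using `java.nio.channels.FileChannel.lock()` for the cross-JVM lock. This is only an illustration of the idea, not a proposed implementation: the class name `PortRangeRegistry`, the file names, and the one-offset-per-line registry format are all assumptions made up for this example.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical registry of in-use port range offsets, guarded by a lock file.
class PortRangeRegistry {
    private final Path lockFile;
    private final Path registryFile;
    private final int slots; // how many distinct port ranges exist

    PortRangeRegistry(Path sharedDir, int slots) throws IOException {
        Files.createDirectories(sharedDir);
        this.lockFile = sharedDir.resolve("ports.lock");         // assumed name
        this.registryFile = sharedDir.resolve("ports.registry"); // assumed name
        this.slots = slots;
    }

    // Claims the lowest free offset and records it as in use.
    int acquire() throws IOException {
        // The lock is exclusive across processes. Within one JVM, overlapping
        // locks on the same file throw, so calls must not run concurrently.
        try (FileChannel channel = openLockChannel(); FileLock lock = channel.lock()) {
            TreeSet<Integer> inUse = read();
            for (int slot = 0; slot < slots; slot++) {
                if (inUse.add(slot)) {
                    write(inUse);
                    return slot;
                }
            }
            throw new IllegalStateException("no free port range available");
        }
    }

    // Returns an offset to the pool so another worker can reuse it.
    void release(int slot) throws IOException {
        try (FileChannel channel = openLockChannel(); FileLock lock = channel.lock()) {
            TreeSet<Integer> inUse = read();
            inUse.remove(slot);
            write(inUse);
        }
    }

    private FileChannel openLockChannel() throws IOException {
        return FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
    }

    private TreeSet<Integer> read() throws IOException {
        TreeSet<Integer> inUse = new TreeSet<>();
        if (Files.exists(registryFile)) {
            for (String line : Files.readAllLines(registryFile)) {
                if (!line.isBlank()) {
                    inUse.add(Integer.parseInt(line.trim()));
                }
            }
        }
        return inUse;
    }

    private void write(TreeSet<Integer> inUse) throws IOException {
        List<String> lines = inUse.stream().map(String::valueOf).collect(Collectors.toList());
        Files.write(registryFile, lines);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("port-registry");
        PortRangeRegistry registry = new PortRangeRegistry(dir, 4);
        int first = registry.acquire();
        int second = registry.acquire();
        registry.release(first);
        int reused = registry.acquire(); // picks up the offset that was just released
        System.out.println(first + " " + second + " " + reused);
    }
}
```

A worker would call `acquire()` before its tests start and `release()` in a `finally` block afterwards, multiplying the returned offset by the range size to get its base port.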