Attempt to fix port clash by using different starting port for each test #545

patrickkuo · 2022-02-24T13:56:58Z

This should fix #494.

lxfind · 2022-02-24T21:31:58Z

Is it possible to avoid manually setting a starting port for each test? That makes it difficult to track and easy to make mistakes. I imagine at least we could use a shared static variable?

huitseeker

Thanks for getting back to this!!!

The PortAllocator we have now makes a lot of sense to me in production: it makes sure we start looking for a port at a specific place, and if the server doesn't have an open port there, it makes things easy to debug: try the following ports in order until you find where your server started.

However, for tests, that's completely overkill: I don't care where I'm gonna end up, and I can just bind to 127.0.0.1:0 and let the underlying OS sort it out.

The PortAllocator takes a u16 argument to start the port search. Is there a way to refactor it to take an Option<u16> where the default semantics of None would be to do TcpListener::bind(("127.0.0.1:0")).is_ok()? That should provide an easy-to-use default for tests.
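To make the suggestion concrete, here is a minimal sketch of a port search that accepts an Option<u16>. The function name `find_port` is hypothetical and is not the project's PortAllocator; it only illustrates the proposed `None` semantics using the standard library.

```rust
use std::io;
use std::net::TcpListener;

/// Hypothetical helper (not the project's PortAllocator).
/// `Some(first)` scans upward from `first`; `None` asks the OS for any free port.
fn find_port(start: Option<u16>) -> io::Result<u16> {
    match start {
        Some(first) => {
            // Probe each port in order and return the first one that can be bound.
            for port in first..=u16::MAX {
                if TcpListener::bind(("127.0.0.1", port)).is_ok() {
                    return Ok(port);
                }
            }
            Err(io::Error::new(io::ErrorKind::AddrInUse, "no free port found"))
        }
        None => {
            // Port 0 lets the OS choose; read the chosen port back from the socket.
            let listener = TcpListener::bind("127.0.0.1:0")?;
            Ok(listener.local_addr()?.port())
        }
    }
}
```

In both branches the probing listener is dropped before the function returns, so the returned port is only a hint: another process could take it before the real server binds.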

Besides that, your start_network is 🤩

patrickkuo · 2022-02-25T12:32:51Z

Hmm, this is harder than expected...

Currently the ports are preallocated at the setup phase (genesis), ahead of server startup, because we need the port ahead of time to create the wallet.conf.
TcpListener::bind(("127.0.0.1:0")) will only bind momentarily, because we are not starting up the real server at that point.

I explored the possibility of starting the server first on '127.0.0.1:0' and then creating wallet.conf. This won't work without refactoring the existing network code, as there is no way to retrieve the actual bound port from AuthorityServer.
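For reference, the standard library does expose the bound port once a listener exists. The sketch below uses a hypothetical `BoundServer` type as a stand-in (it is not AuthorityServer, and its methods are assumptions) to show what an accessor for the actual address could look like after such a refactor.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Hypothetical stand-in for a server that owns its listener.
struct BoundServer {
    listener: TcpListener,
}

impl BoundServer {
    /// Bind to port 0 so the OS assigns a free port.
    fn bind_ephemeral() -> io::Result<Self> {
        Ok(Self {
            listener: TcpListener::bind("127.0.0.1:0")?,
        })
    }

    /// Report the address the OS actually assigned.
    fn local_addr(&self) -> io::Result<SocketAddr> {
        self.listener.local_addr()
    }
}
```

Because the listener is kept inside the struct, the port stays bound for as long as the server value is alive, unlike a probe that binds and immediately drops.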

PORT_ALLOCATOR is a singleton object, so it should generate unique available ports if the tests are running in the same process. That works fine for me when running the tests with cargo test, but it fails when I use cargo nextest run. I suspect it is running tests in parallel processes?
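A minimal sketch of why a process-wide singleton cannot help across processes, assuming a simple counter-based allocator (the names `NEXT_PORT` and `next_candidate_port` are hypothetical, and the real PORT_ALLOCATOR may work differently):

```rust
use std::sync::atomic::{AtomicU16, Ordering};

/// Hypothetical process-wide counter. Every process that loads this code
/// gets its own copy, initialised to the same starting value.
static NEXT_PORT: AtomicU16 = AtomicU16::new(10_000);

/// Hand out the next candidate port; unique only within the current process.
fn next_candidate_port() -> u16 {
    NEXT_PORT.fetch_add(1, Ordering::SeqCst)
}
```

Threads inside one test binary share `NEXT_PORT` and never receive the same value twice. If each test runs in its own process, as suspected for cargo nextest run, every process starts counting from 10_000 again, so two tests can pick the same port.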

So I think using a different starting port for each test is the easier fix for now, until we refactor AuthorityServer.

huitseeker · 2022-02-25T18:51:49Z

@patrickkuo So what I'm hearing is that the constructor of the PortAllocator is not the issue, the constructor of AuthorityServer is. Thankfully, I think we can simply change the returned AuthorityPrivateinfo so that the value that is provided there for the base_port is the integer 0 in tests (bypassing the port allocator). The constructor of the AuthorityServer will take that up, form an address finishing in the port descriptor 0, and should start on an empty port.

patrickkuo · 2022-02-25T18:57:28Z

Yes, the authority server should start using port 0 without a problem... but the wallet won't know which port to connect to...

huitseeker

OK, I've investigated the start of the genesis network enough to be convinced: at the moment we deduce network, genesis and wallet config from each other, and then start the network.

From that moment on, it's hard to get any information back from the network, which makes my fix based on binding to zero pointless (but I note it also makes the info of the PortAllocator very speculative).
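One possible shape for a later refactor, sketched under assumptions: start the server first, get the bound address back, and only then derive the wallet config. `spawn_server` and `render_wallet_config` are hypothetical helpers, and the config line is illustrative only; it is not the real wallet.conf format.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};
use std::thread;

/// Hypothetical: bind on port 0, then hand the listener to a server thread.
/// The bound address is returned to the caller before the server starts accepting.
fn spawn_server() -> io::Result<(SocketAddr, thread::JoinHandle<()>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let handle = thread::spawn(move || {
        // Toy server loop: accept a single connection, then exit.
        let _ = listener.accept();
    });
    Ok((addr, handle))
}

/// Hypothetical config rendering, built from the address the server reported.
fn render_wallet_config(addr: SocketAddr) -> String {
    format!("authority_address = \"{}\"", addr)
}
```

With this ordering the config is derived from the running network rather than from a port guessed in advance, at the cost of reversing the current genesis-then-start flow.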

Let's land this to stop the bleeding, I'll try a few tricks to refactor on the downstream.

* test of consensus restore * hack * Fix consensus_restore test (#519) * Revert "hack" This reverts commit f06f9ec782141f8db614ab34fd3a0b95f089f2c1. * fix: adapt the worker address, remove a few mut * test node restore with cluster and switch metric type * update fail message Co-authored-by: François Garillot <4142+huitseeker@users.noreply.github.com>

patrickkuo marked this pull request as ready for review February 24, 2022 14:26

patrickkuo requested review from oxade, huitseeker and lxfind February 24, 2022 14:27

patrickkuo mentioned this pull request Feb 24, 2022

Temporarily disable a few flaky tests in cli_tests #554

Closed

huitseeker reviewed Feb 25, 2022

huitseeker approved these changes Feb 25, 2022

patrickkuo added 2 commits February 26, 2022 00:10

attempt to fix port clash by using different starting port in different test

868827a

fixup after rebase

2b09e86

patrickkuo force-pushed the pat/attempt_to_fix_port_clash branch from 2a77fac to 2b09e86 Compare February 26, 2022 00:16

patrickkuo merged commit 0f22b0c into main Feb 26, 2022

patrickkuo deleted the pat/attempt_to_fix_port_clash branch February 26, 2022 00:32
