Description
Sometimes the failure is obvious, as with
System.Net.Quic.QuicException : Connection has been shutdown by transport. Error Code: CONNECTION_REFUSED
However, there are less obvious implications:
#55642
We have test fragments like:
ValueTask clientTask = clientConnection.ConnectAsync();
using QuicConnection serverConnection = await listener.AcceptConnectionAsync();
await clientTask;
If the connect fails, AcceptConnectionAsync() hangs forever: clientTask is not awaited until after the accept completes, so the connect failure is never observed.
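One way to avoid this kind of hang is to observe both tasks at once and bound the wait. The following is a minimal sketch, not the fix adopted in the tests: `ConnectAcceptHelper` and `AcceptOrFailAsync` are hypothetical names, and it assumes .NET 6 or later for `Task.WaitAsync(TimeSpan)`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not part of System.Net.Quic or the test suite).
static class ConnectAcceptHelper
{
    // Waits for the accept to finish, but surfaces a connect failure
    // as soon as it happens instead of waiting on the accept forever.
    public static async Task<T> AcceptOrFailAsync<T>(
        Task connectTask, Task<T> acceptTask, TimeSpan timeout)
    {
        // Complete when either side finishes; throw TimeoutException if neither does.
        Task first = await Task.WhenAny(connectTask, acceptTask).WaitAsync(timeout);

        if (first == connectTask)
        {
            // Rethrows the connect exception (e.g. CONNECTION_REFUSED) if it faulted.
            await connectTask;
        }

        // Both waits stay bounded so a lost connection cannot hang the test.
        T accepted = await acceptTask.WaitAsync(timeout);
        await connectTask.WaitAsync(timeout);
        return accepted;
    }
}
```

In the fragment above this would be called with `clientConnection.ConnectAsync().AsTask()` and `listener.AcceptConnectionAsync().AsTask()`, so a refused connect fails the test immediately instead of blocking on the accept. The timeout value is a judgment call for the test environment.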
There are other reasons why we hang and @stephentoub is helping with the other patterns.
As @ManickaP pointed out in #53224:
https://github.com/microsoft/msquic/blob/3898b2b88085a478eeb844885ea78daa9b060d06/src/core/stream.c#L207-L211
if (QuicConnIsClosed(Stream->Connection) ||
Stream->Flags.Started) {
Status = QUIC_STATUS_INVALID_STATE;
goto Exit;
}
That check is on QUIC protocol state. If we get a close from the peer, there is a race condition there.
I correlated this with a packet capture and I can see that the failed stream/connection got a REFUSED message as well.
test-refused.pcapng.zip
This is unpleasant, as it is difficult to hook any retry logic to INVALID_STATE.
There may be more to it, but this is partially caused by (MS?)QUIC design.
I originally thought there might be some race condition when starting the listener, but clearly the message is well-formed QUIC.
When I debugged the failures, there was NO callback on the MsQuicListener; the logic happens inside msquic.
@nibanks pointed me to QuicWorkerIsOverloaded: when this is true, the listener refuses to take on more work, e.g. accepting new connections. There may be other reasons why msquic would refuse a new connection.
This is also unpleasant, as our tests do stress the CPU and we are at the point where any test can impact any other test.