Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gracefully disconnect connections and trigger events #6

Merged
merged 4 commits into from
Jan 9, 2023

Conversation

zheilbron
Copy link
Contributor

@zheilbron zheilbron commented Jan 8, 2023

Two somewhat related fixes in this PR:

  1. The client and server will now attempt to process pending messages before closing. Previously, the following pattern would cause a race.
send_message(...);
close_connection(...);
  1. A separate wait task is spawned to wait on closed connections in order to signal the ConnectionLostEvents.

@zheilbron zheilbron force-pushed the graceful_disconnect branch from 80b0fc5 to 57d9f99 Compare January 8, 2023 18:16
@zheilbron zheilbron force-pushed the graceful_disconnect branch from 57d9f99 to eb0b350 Compare January 9, 2023 00:16
…essage::ClientLostConnection immediately on a send failure
…ending messages, send InternalAsyncMessage::LostConnection immediately on a send failure and only emit ConnectionLostEvent once
@Henauxg
Copy link
Owner

Henauxg commented Jan 9, 2023

I did indeed leave the disconnection/close behavior into a quick and dirty grey territory until now. It's good that you looked into it and it made me took a closer look too. Here are my thoughts about it:

Desired behaviors :

  • client: A call to client::close_connection should:
    • not raise a ConnectionLostEvent
    • prevent new messages from being sent
    • flush existing outgoing messages
    • end all the connection's tasks
  • client: Connection being closed by the server or failure to send a message on a reliable stream should:
    • raise ConnectionLostEvent
    • prevent new messages from being sent
    • flush existing outgoing messages
    • end all the connection's tasks
  • server: A call to server::disconnect_client should:
    • not raise a ConnectionLostEvent
    • prevent new messages from being sent
    • flush existing outgoing messages
    • end all the connection's tasks
  • server: A connection being closed by a client or failure to send a message on a reliable stream should:
    • raise a `ConnectionLostEvent
    • prevent new messages from being sent
    • flush existing outgoing messages
    • end all the connection's tasks

We are still missing a few bits so I proposed additional changes to your PR to fulfill those:

  • client: ConnectionLostEvent should not be emitted if the connection closed due to a call to client::close_connection => change in udate_sync_client to only emit the event if the state was not already Disconnected
  • server: ClientLostConnection should not be emitted if the client connection ended following a call to disconnect_client => change in update_sync_server to only emit the event if the client was not already disconnected
  • client: Prevent new messages from being sent when connection state set to Disconnected => check for Disconnected state in connection methods, and add a new Connecting state as the initialization state.
  • client: Failure to send a message should block new messages from being sent on the connection, else, we could never finish emptying the to_server_receiver queue before flushing => send LostConnection SyncMessage from send_msg
  • server: Failure to send a message should block new messages from being sent to this client, else, we could never finish emptying the to_client_receiver queue before flushing => send ClientLostConnection SyncMessage from send_msg
  • client & server: Do not use send_msg while emptying remaining messages because close signal and LostConnection have already been raised.

Thanks a lot for your work ! Feel free to review the additional changes I propose. I'll wait for your opinion to merge those.

Minors:

  • client: In close_waiter, we could rename conn_err to close_reason and log as: info!("Connection closed: {}", close_reason); since "normally" closing locally of by the peer is not really considered as an "error".
  • I think that we don't need to call flush() yet, since send() does a flush every time. But since replacing send() with non flushing feed() or send_all() is on the roadmap, we might as well leave the flush() calls.

Possible future tweaks:

  • I am currently working on a channel implementation in the channels branch, and I will have to rework some of what we are doing here so that, for instance, quinn::connection::close() will have to be called only after every channel flushed its content. But for now, it is enough to call it after flushing the main send stream.
  • Following my additional changes, in case of a message failing to be sent, InternalAsyncMessage::LostConnection will be sent twice: the first time right after the failure, to signal the lost connection, set the disconnected state & block new messages. And the second time by the close_waiter once the connection is fully closed. This is just a minor issue and has no impact (the final ConnectionLost event is raised only once). I might however come back to it and change the InternalAsyncMessage sent by the close_waiter to a new ConnectionClosed message due to its slightly different meaning and timing.
  • I'm on the fence about removing the calls to close_sender.send() from the close_waiter and send_msg, and simply calling connection.disconnect() from update_sync_client. This could introduce a delay of up to 1 frame to start closing the async tasks, which is not a time critical part. But it would simplify a bit the async part by not having to ever use the close_sender in it, which is finicky since calling it might stop the calling task itself right after.
  • (out of scope) I still have to do a pass on error handling while Connecting/Connected

@zheilbron
Copy link
Contributor Author

zheilbron commented Jan 9, 2023

Thanks for taking a holistic look at the issue and fixing up the PR. I agree with the desired behaviors.

client: In close_waiter, we could rename conn_err to close_reason and log as: info!("Connection closed: {}", close_reason); since "normally" closing locally of by the peer is not really considered as an "error".

I agree. This is how quinn names it though: https://docs.rs/quinn/latest/quinn/struct.Connection.html#method.closed

I think that we don't need to call flush() yet, since send() does a flush every time. But since replacing send() with non flushing feed() or send_all() is on the roadmap, we might as well leave the flush() calls.

I was debating this, but couldn't convince myself whether or not it was needed. Thanks for pointing out that send does indeed flush.

One question I'm still unsure about:
What is the lifecycle of a connection, particularly one that has reached the Disconnected state? Can it ever be reconnected or is it essentially garbage at that point? And, if it's garbage, should it be collected automatically once it's no longer needed?

I noticed that this is problematic if one relies on the default connection. If the default connection ever becomes disconnected, it remains a defunct connection until manually cleared with client::close_connection.

@Henauxg
Copy link
Owner

Henauxg commented Jan 9, 2023

As of now, connections are indeed kind of a dead weight once Disconnected. This is partly why I made connection::disconnect private, and only used through the public client::close_connection which properly dispose of the connection.
But you are right that even so we can't close them ourselves, connections can still become Disconnected due to other external factors (errors, server closing, ...) which is an annoyance when it happens to the default connection.

I was thinking that I might add a reconnect to re-use the same ConnectionConfiguration and ConnectionId. It seems to offer better ergonomics than a new open_connection call, allowing for thinks like "auto-reconnect", or just the simplicity of keeping the same default connection.

@Henauxg Henauxg merged commit a5921d1 into Henauxg:main Jan 9, 2023
@zheilbron zheilbron deleted the graceful_disconnect branch January 13, 2023 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants