-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Implemented graceful shutdown for websockets #2827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The general design I followed: * Keep the knowledge of how to close connections inside the Websocket module * Add a field to `WsServer` to hold a channel, the channel signals when the appliction is stopping * Return a callback to `Main` that can be invoked to send a message to the channel
Deploy preview for hasura-docs ready! Built with commit c35b47e |
Review app for commit dc2906b deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this generally looks great to me, though I left one small comment.
@@ -435,6 +435,14 @@ initErrExit e = do | |||
<> T.unpack (qeError e) | |||
exitFailure | |||
|
|||
data HasuraApp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for pulling this out.
@lexi-lambda added your suggestions, thanks for the review! |
Review app for commit b64cda2 deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
Before it was not rejecting new connection. By moving the logic one layer up, we're now able to reject connections as they come in.
@lexi-lambda I figured that my implementation was not entirely correct as new WS connections were not being rejected properly. By moving the logic one layer above I was able to achieve that goal. |
Review app for commit d15cf86 deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes. I’m not opposed to the new approach, but I’ll admit: I don’t understand it (and I’d like to make sure I do). I’d have thought that closing the socket would be sufficient to prevent new websocket connections, since, well… how could new connections be received at all if the server isn’t listening anymore? Can you explain to me why that wasn’t enough?
Review app for commit cdcc5bf deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
@lexi-lambda That's a great question and I was also mystified at how new connections were apparently making their way in. My working theory right now is that those connections were already accepted before closing the socket but had not been yet handled by the Websockets server code. Those still in transit connections would apparently cause a race condition where they were accepted and then served as live queries after the call to closing all connections was already made. I'm actually not sure how to validate my theory as true, but it seemed to solve an edge case when rapidly trying to create new WS connections as I send the |
Aha, that makes sense to me, thanks for explaining. The guard does seem necessary, then, but I have a couple small requests:
|
That was actually my first attempt and I immediately confused myself by writing the function In the light that it took 5 seconds for me to get boolean blindness I resolved to make a cheap sum type
You may remember that was the first version of this PR you reviewed. I got confused by the module naming thinking that it actually implemented a server-like structure, or that at least the code handling the very beginnings of a connection happened there. I think that those 2 modules should swap names: I will attempt to put the logic back there in the best way it makes sense to me now. Let me know if this is what you had in mind. |
Also sending a 503 status witha relevant message
Review app for commit 66356fd deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
@lexi-lambda can you please review again? |
Review app for commit e408213 deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
This new design prevents accepting new WS connections while we are in the process of shutting the server down. We enforce this by guarding the connections map in a TVar that we need to check for every time we need to use the connections map.
@lexi-lambda Alright, I tried your suggestion and really liked how much safer the code looks like now. I had to do a few changes in order to keep things atomic, but I'm feeling more confident now that this is the right design. |
Review app for commit 48a96ec deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
Review app for commit d377c9a deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
Review app for commit 0f7bd96 deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
Review app for commit 768baf6 deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
@lexi-lambda can you please review again? |
Review app for commit c35b47e deployed to Heroku: https://hge-ci-pull-2827.herokuapp.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me; many thanks for your patience working out the subtleties!
Thanks @lexi-lambda I’m actually glad that you were so thorough, I’m happy with his more robust solution. |
Review app https://hge-ci-pull-2827.herokuapp.com is deleted |
Description
Implements graceful shutdown for websockets. When the sockets are not actively receiving data they will receive a closing message and the server will promptly shut down.
Affected components
Solution and Design
The general design I followed:
module
WsServer
to hold a channel, the channel signals whenthe application is stopping
Main
that can be invoked to send a message tothe channel
Steps to test and verify
gql
with any subscriptionTERM
signal to the process. It should stop immediately