
Conversation

@onelapahead (Contributor) commented Jul 2, 2021

While running DX on K8s as part of https://github.com/hyperledger-labs/firefly/pull/114, I noticed pods would take the entire grace period before properly shutting down. Handling INT and QUIT makes the process shut down promptly when K8s terminates a pod, rather than relying on a CLI wrapper like dumb-init.
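A rough sketch of the idea, not the exact change in this PR (gracefulShutdown is a hypothetical helper name):

// Register handlers for the signals Kubernetes (and Ctrl+C) can send, so the
// process exits promptly instead of waiting out the pod's termination grace period.
const gracefulShutdown = (signal: NodeJS.Signals) => {
  console.log(`received ${signal}, FireFly Data Exchange is gracefully shutting down`);
  process.exit(0);
};

process.on('SIGINT', gracefulShutdown);
process.on('SIGQUIT', gracefulShutdown);
process.on('SIGTERM', gracefulShutdown);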

Output from testing:

First terminal, where I ran DX:

❯ npm start

> firefly-dataexchange@1.0.0 start
> node build/index.js

2021-07-02T15:22:32.819Z [INFO ]: FireFly Data Exchange running on http://0.0.0.0:3000 (API) and https://0.0.0.0:3001 (P2P) - log level "info" app.ts
2021-07-02T15:22:52.525Z [INFO ]: FireFly Data Exchange is gracefully shutting down app.ts
❯ npm start

> firefly-dataexchange@1.0.0 start
> node build/index.js

2021-07-02T15:22:59.358Z [INFO ]: FireFly Data Exchange running on http://0.0.0.0:3000 (API) and https://0.0.0.0:3001 (P2P) - log level "info" app.ts
2021-07-02T15:23:10.018Z [INFO ]: FireFly Data Exchange is gracefully shutting down app.ts
❯ npm start

> firefly-dataexchange@1.0.0 start
> node build/index.js

2021-07-02T15:23:17.321Z [INFO ]: FireFly Data Exchange running on http://0.0.0.0:3000 (API) and https://0.0.0.0:3001 (P2P) - log level "info" app.ts
2021-07-02T15:23:35.168Z [INFO ]: FireFly Data Exchange is gracefully shutting down app.ts

And the terminal where I killed DX:

❯ ps aux | grep node
hfuss            16607   0.0  0.3  4767956  55500 s002  S+   11:22AM   0:00.61 node build/index.js
❯ kill -s QUIT 16607
❯ ps aux | grep node
hfuss            16612   0.0  0.3  4730180  57820 s002  S+   11:22AM   0:00.51 node build/index.js
❯ kill -s QUIT 16612
❯ ps aux | grep node
hfuss            16616   0.1  0.3  4785220  57936 s002  S+   11:23AM   0:00.57 node build/index.js
❯ kill -s TERM 16616

Will defer to @gabriel-indik and @peterbroadhurst on whether any additional shutdown logic should be added, e.g. waiting for all in-flight HTTP requests to finish, pending writes to disk, etc.

Signed-off-by: hfuss <haydenfuss@gmail.com>
@gabriel-indik (Contributor) commented

Once Data Exchange is started, it will be listening on the API and P2P ports. I wonder if it might be worth adding a mechanism or API to stop those servers as part of shutdown, bringing the execution of the application to an end.

Signed-off-by: hfuss <haydenfuss@gmail.com>
@onelapahead (Contributor, Author) commented

Thanks for the feedback, @gabriel-indik. I added some additional logic to ensure the API, P2P, and WS servers are all stopped before exiting (a rough sketch follows the output below):

❯ npm start

> firefly-dataexchange@1.0.0 start
> node build/index.js

2021-07-02T16:12:49.404Z [INFO ]: FireFly Data Exchange running on http://0.0.0.0:3000 (API) and https://0.0.0.0:3001 (P2P) - log level "info" app.ts
2021-07-02T16:12:58.877Z [INFO ]: FireFly Data Exchange is gracefully shutting down the webservers app.ts
2021-07-02T16:12:58.878Z [INFO ]: FireFly Data Exchange has stopped all webservers, exiting app.ts
❯ ps aux | grep '[n]ode'
hfuss            17249   0.0  0.3  4748500  56236 s002  S+   12:12PM   0:00.53 node build/index.js
❯ kill -s QUIT 17249
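For reference, a rough sketch of that shutdown flow, assuming plain Node http servers; apiServer, p2pServer, and closeServer are hypothetical names, not the identifiers actually merged:

import * as http from 'http';

// Wrap server.close() in a promise: it stops accepting new connections and
// calls back once existing connections finish (or with an error if it was not open).
function closeServer(server: http.Server): Promise<void> {
  return new Promise((resolve, reject) => {
    server.close(err => (err ? reject(err) : resolve()));
  });
}

async function shutdown(apiServer: http.Server, p2pServer: http.Server): Promise<void> {
  console.log('FireFly Data Exchange is gracefully shutting down the webservers');
  await Promise.all([closeServer(apiServer), closeServer(p2pServer)]);
  console.log('FireFly Data Exchange has stopped all webservers, exiting');
  process.exit(0);
}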

@gabriel-indik (Contributor) commented

Changes look great. Is the process.exit() still required? I thought that once the servers are stopped, the process would exit by itself, unless there is something else hanging, like a file change listener.

@onelapahead (Contributor, Author) commented Jul 2, 2021

Testing locally confirms you're right that the process.exit() is not required at that point. Unless we're worried there could be an errant promise in the background doing file IO (like persisting the peers), I'm fine with removing it. That said, I like the explicit exit for exactly that reason.
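To illustrate the trade-off (hypothetical names, not the merged code): once every close() callback has fired, Node exits on its own when the event loop drains, so the explicit call is only a guard against stray handles:

import * as http from 'http';

// After awaiting all the close() callbacks the process would exit by itself,
// provided nothing else (timers, watchers, pending file IO) holds the event
// loop open; keeping process.exit(0) makes the exit explicit regardless.
async function onShutdownSignal(servers: http.Server[]): Promise<void> {
  await Promise.all(servers.map(
    server => new Promise<void>((resolve, reject) =>
      server.close(err => (err ? reject(err) : resolve())))));
  process.exit(0); // optional safety net, per the discussion above
}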

@gabriel-indik merged commit 5ac34d2 into hyperledger:main on Jul 2, 2021
@onelapahead deleted the graceful-shutdown-cloudnative branch on July 7, 2021