SYSTEM PRESHUTDOWN command for graceful shutdown swarm node #852
base: antalya-25.3
{
    if (getContext()->isPreShutdownCalled())
Is it true that operations prohibited in the PreShutdown phase (e.g. getting new tasks) are allowed in the Shutdown phase?
If yes, is that correct?
As I understand it, ClickHouse does not have a specific shutdown phase. SYSTEM SHUTDOWN just calls kill(0, SIGTERM). Without PRESHUTDOWN, this causes errors on the initiator as well as for tasks that were already taken but not finished.
Then what is the purpose of the shutdown_called flag?
At first glance, I would expect that all checks that are true for preshutdown_called should be true for shutdown_called as well.
Ah, understood: the flag is set in the destructor.
Yes, it makes sense to set preshutdown there too.
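The invariant agreed on in this thread (anything blocked after pre-shutdown must also be blocked after shutdown) can be sketched as below. This is a minimal illustration, not the code from this PR: `NodeState`, `markPreShutdown`, `markShutdown`, and `canTakeNewTask` are hypothetical names, and only `isPreShutdownCalled` and the two flag names come from the discussion above.

```cpp
#include <atomic>

// Hypothetical stand-in for the server-wide context; NOT ClickHouse's actual Context class.
class NodeState
{
public:
    // Assumed to be what SYSTEM PRESHUTDOWN triggers: stop taking new swarm tasks, keep running.
    void markPreShutdown() { preshutdown_called.store(true); }

    // Assumed to run when the server is really going down (e.g. from a destructor).
    // Shutdown implies pre-shutdown, so every check guarded by
    // isPreShutdownCalled() also holds once shutdown has started.
    void markShutdown()
    {
        preshutdown_called.store(true);
        shutdown_called.store(true);
    }

    bool isPreShutdownCalled() const { return preshutdown_called.load(); }
    bool isShutdownCalled() const { return shutdown_called.load(); }

    // New distributed tasks are accepted only before either phase has started.
    bool canTakeNewTask() const { return !isPreShutdownCalled(); }

private:
    std::atomic<bool> preshutdown_called{false};
    std::atomic<bool> shutdown_called{false};
};
```

Setting both flags in one place means callers only ever need to test `isPreShutdownCalled()` to cover both phases.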
a842ce6 to 9642c42
As I understand it, regular queries would still be processed; literally the only thing that stops is the "swarm node" work. Wouldn't it make more sense to rename the command to something more meaningful (e.g. UNREGISTER FROM SWARM)? This is just a question, not a change request.
It's a topic for discussion.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
SYSTEM PRESHUTDOWN command for graceful shutdown swarm node
Documentation entry for user-facing changes
Solved #759
New command: SYSTEM PRESHUTDOWN.
Scenario:
We want to scale down a swarm cluster. On the node that we want to shut down, we call SYSTEM PRESHUTDOWN. After that, the node stops accepting new distributed commands, but it can still finish processing objects whose processing started before SYSTEM PRESHUTDOWN. Once all of those objects have been processed successfully, we can kill the node without any errors or lost data in the responses on the initiator.
After SYSTEM PRESHUTDOWN on swarm node:
On initiator node:
With skip_unavailable_shards=true, an unexpected closing of the socket is legal if no data packets were accepted before. This allows shutting down a non-autodiscovery node too.
Exclude tests:
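The scenario in the description can be sketched roughly as follows. This is an illustrative model only, under stated assumptions: `SwarmTaskGate`, `tryStartTask`, `finishTask`, `safeToKill`, and `isSocketCloseAcceptable` are hypothetical names rather than ClickHouse APIs, and reading the initiator rule as "requires skip_unavailable_shards=true" is an interpretation of the description, not something the PR states explicitly.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical swarm-node gate: refuses new tasks after pre-shutdown,
// but lets tasks that were already started run to completion.
class SwarmTaskGate
{
public:
    // Models the effect of SYSTEM PRESHUTDOWN on the swarm node.
    void preShutdown()
    {
        std::lock_guard<std::mutex> lock(mutex);
        preshutdown_called = true;
    }

    // Returns false once pre-shutdown was requested; otherwise counts the task as in flight.
    bool tryStartTask()
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (preshutdown_called)
            return false;
        ++in_flight;
        return true;
    }

    // Marks one previously started task as finished.
    void finishTask()
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (in_flight > 0)
            --in_flight;
    }

    // The node can be killed without losing data once it is draining and idle.
    bool safeToKill() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        return preshutdown_called && in_flight == 0;
    }

private:
    mutable std::mutex mutex;
    bool preshutdown_called = false;
    std::size_t in_flight = 0;
};

// Initiator-side rule (assumed reading): a closed socket is tolerated only when
// skip_unavailable_shards is enabled and no data packets arrived from that node.
inline bool isSocketCloseAcceptable(bool skip_unavailable_shards, std::size_t data_packets_received)
{
    return skip_unavailable_shards && data_packets_received == 0;
}
```

A single mutex guards both the flag and the counter so that `safeToKill()` cannot report true while a task is in the middle of being admitted.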