Closed
Description
We've run into issues where the scheduler unexpectedly shuts itself down cleanly after running for a very long time. Having a dump of the cluster state would help debug this.
After #5659 is implemented, write a `SchedulerPlugin` with a `close` hook that dumps cluster state. The filename where the state is written can be passed into the plugin instance.
If the cluster state dump fails, or writing to the destination fails, this should not affect the shutdown process; just log the problem and move on.
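
A rough sketch of what such a plugin might look like, to make the proposal concrete. The class name `DumpStateOnClose` and the `get_state` callable are hypothetical: `get_state` stands in for whatever state-dump API #5659 ends up providing, and JSON is used here only as a placeholder output format. The `try`/`except` around the import is there only so the sketch runs without `distributed` installed.

```python
import asyncio
import json
import logging

try:
    from distributed.diagnostics.plugin import SchedulerPlugin
except ImportError:  # lets the sketch run without distributed installed
    SchedulerPlugin = object

logger = logging.getLogger(__name__)


class DumpStateOnClose(SchedulerPlugin):
    """Hypothetical plugin: write cluster state to `filename` on close."""

    def __init__(self, filename, get_state):
        # ASSUMPTION: get_state is a callable(scheduler) returning a
        # JSON-serializable object. It is a placeholder for the real
        # state-dump API from #5659, which is not specified here.
        self.filename = filename
        self.get_state = get_state
        self.scheduler = None

    async def start(self, scheduler):
        # Keep a reference so the close hook can reach the scheduler.
        self.scheduler = scheduler

    async def close(self):
        # Any failure (building the dump or writing the file) is logged
        # and swallowed so that shutdown proceeds regardless.
        try:
            state = self.get_state(self.scheduler)
            with open(self.filename, "w") as f:
                json.dump(state, f)
        except Exception:
            logger.exception(
                "Failed to dump cluster state to %s", self.filename
            )
```

Catching `Exception` broadly is deliberate here: the dump is a debugging aid, so neither a serialization error nor an unwritable destination should interfere with shutdown.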