release-22.1: spanconfig: reset job run_stats to avoid job system backoff #82858
Backport 1/1 commits from #82724 on behalf of @stevendanna.
/cc @cockroachdb/release
If the coordinator of the span configuration job dies, another node
will adopt the job. However, each adoption bumps the job's num_runs
run stat, and as this number increases, the job system delays future
resumptions of the job for longer.
We solve this here by resetting the job's run_stats at the beginning
of the job.
We've yet again handled this in the job directly rather than adjusting
the behavior of the job system. In this case, my justification is that
this solution is fit for backporting.
Fixes #82689
Release note (bug fix): Fix a bug where the startup of an internal
component after a server restart could result in the delayed
application of zone configuration.
Release justification: high-impact bug fix that prevents the spanconfig job from being paused for long periods of time, during which no reconciliation would occur.