[9.2] (backport #10579) Remove upgrade marker if rolling back to versions older than 9.2.0 #10595
What does this PR do?
Conditionally remove upgrade marker when an elastic-agent upgrade is rolled back to a version that does not contain #8407
This PR will delete the upgrade marker by default (in order to maintain backward compatibility).
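As a rough illustration of the decision described above, here is a minimal sketch in Go. It is not the code from this PR: the names `shouldRemoveMarker`, `parseCore`, and `minKeepMarker` are hypothetical, the 9.2.0 threshold is taken from the PR title, and ignoring pre-release suffixes (so `9.2.0-SNAPSHOT` is treated like `9.2.0`) is an assumption. An unparseable version falls back to removing the marker, matching the backward-compatible default.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minKeepMarker is the first release assumed to tolerate a leftover
// upgrade marker after rollback (threshold taken from the PR title).
var minKeepMarker = [3]int{9, 2, 0}

// parseCore extracts major.minor.patch from a version string, ignoring
// any pre-release or build suffix (an assumption made for this sketch).
func parseCore(v string) ([3]int, bool) {
	var out [3]int
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, false
		}
		out[i] = n
	}
	return out, true
}

// shouldRemoveMarker reports whether the upgrade marker should be deleted
// when rolling back to rollbackVersion. Hypothetical helper: it returns
// true for versions older than 9.2.0 and for versions it cannot parse.
func shouldRemoveMarker(rollbackVersion string) bool {
	v, ok := parseCore(rollbackVersion)
	if !ok {
		return true // default: remove, for backward compatibility
	}
	for i := range v {
		if v[i] != minKeepMarker[i] {
			return v[i] < minKeepMarker[i]
		}
	}
	return false // exactly 9.2.0: keep the marker
}

func main() {
	for _, v := range []string{"8.19.3", "9.1.5", "9.2.0", "9.3.1", ""} {
		fmt.Printf("%q remove=%v\n", v, shouldRemoveMarker(v))
	}
}
```

Defaulting to removal on a parse failure keeps the sketch on the safe side for old watchers, which would otherwise treat a leftover marker as an upgrade in progress.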
Why is it important?
There is a race condition: if the rolled-back agent starts slowly, the old (i.e. rolled-back) watcher can pick up the `.upgrade-marker` file after the watcher that triggered the rollback has already terminated. The old watcher code interprets the presence of the upgrade marker as an ongoing upgrade and watches the agent for the grace period without checking the agent version or upgrade state.
This may lead to elastic-agent trying to delete itself after the rollback has already been performed.
Checklist
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding change to the default configuration files
- [ ] I have added an entry in `./changelog/fragments` using the changelog tool
- [ ] I have added an integration test or an E2E test
Disruptive User Impact
How to test this PR locally
See the additional unit tests in `watch_test.go` and `rollback_test.go`. Manual testing requires an agent that is slow to restart after a rollback, which may need some custom code modifications.
Related issues
Questions to ask yourself
This is an automatic backport of pull request #10579 done by [Mergify](https://mergify.com).