Improve validation during upgrade e2e tests #10956
Labels
area/e2e-testing
Issues or PRs related to e2e testing
help wanted
Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/feature
Categorizes issue or PR as related to a new feature.
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
We recently found an issue in KCP upgrades. It only led to failures in the self-hosted tests, but in general KCP was rolling through the entire control plane while only static pods came up (kube-proxy and CNI did not). The Nodes also weren't Ready during the upgrade.
I think we should improve our e2e test coverage to detect if the control plane is in a state like this during upgrades.
This should be either additional validation while the upgrade is running, or we could, for example, deploy a StatefulSet with PDBs that runs on CP nodes. In that case the test would have failed with the issue from #10947.
Bonus: the StatefulSet with PDBs that we deploy could also use volumes to get some extra test coverage for "wait for volume detach", but it's unclear whether we have a CSI implementation for CAPD that would work for that (maybe the provisioner that comes with kind out of the box).