The symptoms include:
- tests failed to create HyperShift clusters
- Repeated alerts as
It is a known issue that hive-controller
can be evicted when a large number of HostedClusters have been created and the hive
cluster is scaling up.
We can check the number of HostedClusters with:
$ oc --context hive -n clusters get hostedclusters --no-headers | wc -l
If the number is greater than 80, we need to invoke our cleaner to clean up the resources:
$ JOB=periodic-openshift-dptp-3312-hypershift-leaks-cleaner make job
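The check above can be wrapped in a small helper so the threshold comparison is not done by eye. This is a minimal sketch, not an existing tool: the function names count_hostedclusters and needs_cleanup and the THRESHOLD variable are hypothetical, and the value 80 is taken from this document.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and THRESHOLD are assumptions, not part of any existing tooling.

# Threshold from this document; override by exporting THRESHOLD.
THRESHOLD="${THRESHOLD:-80}"

# Print the number of HostedClusters in the "clusters" namespace.
# --no-headers keeps the column header line out of the count.
count_hostedclusters() {
  oc --context hive -n clusters get hostedclusters --no-headers | wc -l | tr -d ' '
}

# Succeed (exit status 0) if the given count is greater than THRESHOLD.
needs_cleanup() {
  [ "$1" -gt "$THRESHOLD" ]
}
```

After sourcing the file, running needs_cleanup "$(count_hostedclusters)" && echo "run the cleaner" prints the reminder only when the count is above the threshold.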
If the number of clusters is still above 80 after the periodic has run, we need to manually inspect the HostedClusters to find out which job created them (run oc --context hive -n clusters get hostedcluster <name> -o yaml and look at the annotations), then ask the owner of the job to reduce how often the job is invoked.
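When many HostedClusters are left over, inspecting them one by one is slow. The following is a rough sketch that prints the annotations of every HostedCluster in one pass. The function name list_hostedcluster_annotations is hypothetical, and which annotation identifies the creating job is not stated here, so the output still has to be read manually.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is an assumption, not existing tooling.

# Print each HostedCluster in the "clusters" namespace followed by its annotations.
list_hostedcluster_annotations() {
  local hc
  # "-o name" prints one resource reference per line (type/name).
  for hc in $(oc --context hive -n clusters get hostedclusters -o name); do
    echo "== ${hc}"
    # jsonpath prints the annotations map of this HostedCluster.
    oc --context hive -n clusters get "${hc}" -o jsonpath='{.metadata.annotations}'
    echo
  done
}
```

After sourcing the file, call list_hostedcluster_annotations and look for annotations that name the originating job.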