external cluster TLS client cert has expired #19033
Labels
bug/in-triage
bug
component:security
type:bug
Checklist:
argocd version
Describe the bug
We have repeatedly encountered a situation where the connection from ArgoCD to an external cluster stops working (the UI shows an unknown state for all applications of the corresponding cluster). In the past, we fixed the problem with the procedure described here. Today we took a closer look at this recurring problem, gathered more detailed information, and we think we have found the "real" cause.
To Reproduce
Error messages like this can be found in the ArgoCD log for all applications:
The kube-apiserver of the corresponding external cluster shows error messages like this for each ArgoCD connection attempt:
We thought that we were using bearer token authentication between ArgoCD and the external clusters, but it seems we were wrong:
The ServiceAccount/bearer token should be long-lived (see the annotation explained in this reference), but this does not seem to matter in this case. Just for your information:
While checking the ArgoCD secrets, we found that the config blob includes a TLS client certificate, which has expired:
===> the certificate serial number matches the one from the external cluster kube-apiserver error message
===> it is the same certificate of the external cluster's kubernetes-admin user that was used during the argocd cluster add operation
Expected behavior
We would like either authentication based on the long-lived ServiceAccount/bearer token, or an option (ideally an automated mechanism) that rotates the TLS client certificate.
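For anyone who wants to check which of these credentials a given cluster secret actually carries, the config blob can be decoded and inspected. The following is only a minimal sketch, not ArgoCD code: the function names are hypothetical, and the field names (bearerToken, tlsClientConfig.certData, tlsClientConfig.keyData) are assumed from ArgoCD's documented declarative cluster secret format rather than taken from this report. The input is assumed to be the base64 value of the secret's data.config field, for example as printed by kubectl get secret with -o jsonpath='{.data.config}'.

```python
import base64
import json
from typing import Optional


def _load_config(config_b64: str) -> dict:
    """Decode the base64 'config' value of a cluster secret into a dict."""
    return json.loads(base64.b64decode(config_b64))


def summarize_cluster_auth(config_b64: str) -> dict:
    """Report which credential fields are non-empty in the cluster config.

    Field names are assumptions based on the declarative cluster secret format.
    """
    config = _load_config(config_b64)
    tls = config.get("tlsClientConfig") or {}
    return {
        "bearer_token": bool(config.get("bearerToken")),
        "client_cert": bool(tls.get("certData")),
        "client_key": bool(tls.get("keyData")),
    }


def client_cert_pem(config_b64: str) -> Optional[str]:
    """Return the embedded client certificate as PEM text, or None if absent.

    certData is assumed to be base64-encoded PEM inside the JSON document.
    """
    tls = _load_config(config_b64).get("tlsClientConfig") or {}
    cert_b64 = tls.get("certData")
    if not cert_b64:
        return None
    return base64.b64decode(cert_b64).decode("ascii")
```

If summarize_cluster_auth reports client_cert as true and bearer_token as false, the cluster is being reached with the client certificate rather than the ServiceAccount token. The PEM returned by client_cert_pem can then be saved to a file and inspected with openssl x509 -noout -enddate -serial to compare the expiry date and serial number against the kube-apiserver error message.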
Screenshots
Version
Logs
Thank you very much for taking care of this issue. We would be pleased if you could give us a permanent solution.