Pods stuck in crashloop after update to 0.0.14 #186

Closed
Numblesix opened this issue Oct 14, 2020 · 19 comments
Labels
bug Something isn't working
Milestone

Comments

@Numblesix

Hi :)

First, thanks for the great operator :)!

I updated our dev cluster today from 0.0.13 to 0.0.14, and since then two pods have been crashlooping :(

See the logs below :)

argocd-repo-server

time="2020-10-14T11:48:10Z" level=info msg="Initializing GnuPG keyring at /app/config/gpg/keys"
time="2020-10-14T11:48:10Z" level=fatal msg="stat /app/config/gpg/keys/trustdb.gpg: permission denied"

argocd-application-controller

time="2020-10-14T11:48:10Z" level=info msg="appResyncPeriod=3m0s"
time="2020-10-14T11:48:10Z" level=info msg="Application Controller (version: v1.7.7+33c93ae, built: 2020-09-29T04:56:38Z) starting (namespace: argocd)"
time="2020-10-14T11:48:10Z" level=info msg="Starting configmap/secret informers"
time="2020-10-14T11:48:10Z" level=info msg="Configmap/secret informer synced"
E1014 11:48:10.261304       1 runtime.go:78] Observed a panic: "assignment to entry in nil map" (assignment to entry in nil map)
goroutine 63 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1cbec40, 0x227bc80)
	/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/runtime/runtime.go:48 +0x82
panic(0x1cbec40, 0x227bc80)
	/usr/local/go/src/runtime/panic.go:967 +0x166
github.com/argoproj/argo-cd/util/settings.addStatusOverrideToGK(...)
	/go/src/github.com/argoproj/argo-cd/util/settings/settings.go:508
github.com/argoproj/argo-cd/util/settings.(*SettingsManager).GetResourceOverrides(0xc0000f78c0, 0xc000ecaed0, 0x0, 0x0)
	/go/src/github.com/argoproj/argo-cd/util/settings/settings.go:485 +0x468
github.com/argoproj/argo-cd/controller/cache.(*liveStateCache).loadCacheSettings(0xc00030db80, 0x10, 0xc0004a1380, 0x1ac619d)
	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:113 +0x9b
github.com/argoproj/argo-cd/controller/cache.(*liveStateCache).Init(0xc00030db80, 0x2309518, 0xc000639c20)
	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:417 +0x2f
github.com/argoproj/argo-cd/controller.(*ApplicationController).Run(0xc0004c9680, 0x22ead00, 0xc0004b9580, 0x14, 0xa)
	/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:454 +0x26d
created by main.newCommand.func1
	/go/src/github.com/argoproj/argo-cd/cmd/argocd-application-controller/main.go:109 +0x90c
panic: assignment to entry in nil map [recovered]
	panic: assignment to entry in nil map

goroutine 63 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/runtime/runtime.go:55 +0x105
panic(0x1cbec40, 0x227bc80)
	/usr/local/go/src/runtime/panic.go:967 +0x166
github.com/argoproj/argo-cd/util/settings.addStatusOverrideToGK(...)
	/go/src/github.com/argoproj/argo-cd/util/settings/settings.go:508
github.com/argoproj/argo-cd/util/settings.(*SettingsManager).GetResourceOverrides(0xc0000f78c0, 0xc000ecaed0, 0x0, 0x0)
	/go/src/github.com/argoproj/argo-cd/util/settings/settings.go:485 +0x468
github.com/argoproj/argo-cd/controller/cache.(*liveStateCache).loadCacheSettings(0xc00030db80, 0x10, 0xc0004a1380, 0x1ac619d)
	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:113 +0x9b
github.com/argoproj/argo-cd/controller/cache.(*liveStateCache).Init(0xc00030db80, 0x2309518, 0xc000639c20)
	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:417 +0x2f
github.com/argoproj/argo-cd/controller.(*ApplicationController).Run(0xc0004c9680, 0x22ead00, 0xc0004b9580, 0x14, 0xa)
	/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:454 +0x26d
created by main.newCommand.func1
	/go/src/github.com/argoproj/argo-cd/cmd/argocd-application-controller/main.go:109 +0x90c
@jomkz
Collaborator

jomkz commented Oct 14, 2020

Hey @Numblesix thanks for the kind words and I am sorry that you are having an issue. I thought we had that handled, let me do a little digging to see if I can get a solution for you.

@jomkz jomkz added the bug Something isn't working label Oct 14, 2020
@Numblesix
Author

Numblesix commented Oct 14, 2020

No problem.

I'll try to roll back to v0.0.13 and add a few words to the docs about how to do that :)
Or should I leave it as is? Currently it doesn't hurt me too much, so I could help with logs etc.

@jomkz
Collaborator

jomkz commented Oct 14, 2020

It's up to you. I am somewhat familiar with this issue and new clusters are not impacted, so this appears to be related to upgrading from 0.0.13 to 0.0.14

@Numblesix
Author

I'll leave it as is, then, so you can request logs etc. :)

If it starts to hurt, I'll downgrade and write a small doc on how to do so :)

@wouter2397

We have the same issue as @Numblesix.
We have a broken ArgoCD now...

How can we roll back to 0.0.13?

@Numblesix
Author

Hi @wouter2397

I didn't have time to dig into this, but in theory you should be able to create a backup of all the CRDs you created (for application projects etc.) and just redeploy the Argo CRD. As far as I understand, this issue only happens on an update :)

@BostjanBozic

BostjanBozic commented Oct 14, 2020

The issue seems to be that operator v0.0.14 uses ArgoCD v1.7.7 by default. With v1.7.x, GPG is enabled by default, which causes permission issues (at least on OpenShift); see argoproj/argo-cd#4127.
Maybe the operator is not configuring those parameters when deploying ArgoCD?
I would say you do not have to downgrade the operator; just specify the version in the ArgoCD custom resource (spec.version). In operator v0.0.13 the default value was set to v1.6.1, I believe.
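
Pinning spec.version as suggested above amounts to a one-field change on the custom resource. As a rough sketch, the Go program below builds a JSON merge patch for that field. The helper name `versionPatch` is hypothetical, and the value v1.6.1 is only the default the comment recalls for operator v0.0.13, so treat both as assumptions rather than confirmed settings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// versionPatch returns a JSON merge patch that sets spec.version.
// The helper name is hypothetical; only the spec.version field comes from the thread.
func versionPatch(version string) (string, error) {
	patch := map[string]map[string]string{
		"spec": {"version": version},
	}
	b, err := json.Marshal(patch)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// v1.6.1 is the value the comment above recalls, not a verified default.
	p, err := versionPatch("v1.6.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```

A patch of this shape could then be applied to the ArgoCD resource, for instance with `kubectl patch --type merge`, or the same field could simply be edited into the resource by hand.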

@wouter2397

Hi @BostjanBozic

Your suggestion works!
Thank you for your quick help.

@wouter2397

@jmckind Can you update us in this issue once the problem is solved and we can remove spec.version from the ArgoCD custom resource?

@jomkz
Collaborator

jomkz commented Oct 14, 2020

Thanks for the comment @BostjanBozic, yes that would work.

There are two issues here. The first, the GPG issue, can be worked around by deleting the Deployment for the repo-server. The operator will recreate it, which resolves that issue. The cause is that the GPG functionality needs a new Volume, which the old Deployment did not have.

The second issue is related to how the operator handles the ResourceCustomizations and an upstream issue that prevents translating an empty value properly. The workaround is to remove the resource.customizations key from the argocd-cm ConfigMap. Once the pod restarts, the issue should be resolved.
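
The panic in the application controller log, "assignment to entry in nil map", is what Go raises when code writes to a map that was never allocated. The sketch below reproduces that failure mode and shows one common guard. It is not the Argo CD source: the `overrides` type and the `addUnsafe`, `addSafe`, and `panics` helpers are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// overrides stands in for a per-resource settings map (hypothetical type).
type overrides map[string]string

// addUnsafe writes to the map without checking it; this panics if m is nil.
func addUnsafe(m overrides, key, value string) {
	m[key] = value
}

// addSafe allocates the map on first use and returns the (possibly new) map.
func addSafe(m overrides, key, value string) overrides {
	if m == nil {
		m = make(overrides)
	}
	m[key] = value
	return m
}

// panics reports whether calling f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	var m overrides // declared but never allocated, so it is nil

	fmt.Println("unsafe write panics:", panics(func() { addUnsafe(m, "apps/Deployment", "x") }))

	m = addSafe(m, "apps/Deployment", "x")
	fmt.Println("entries after safe write:", len(m))
}
```

Reading from a nil map is allowed in Go and yields the zero value; only the write panics, which fits a crash that shows up once an empty customization value has to be stored.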

@jomkz
Collaborator

jomkz commented Oct 14, 2020

@wouter2397 I am sorry that we let this get through and will provide an update when we have a proper fix in the operator.

@jgradyntst

Thanks for the workaround to get my lab back up and running again. Great operator, I appreciate all your efforts!

@Numblesix
Author

@jmckind Can confirm both workarounds from above work :) !

@geoL86

geoL86 commented Oct 15, 2020

It works for me as well

@robertodocampo

Works!!!

@alekonko

It also works for us, thank you so much. (Same problem on many ArgoCD instances on OCP 4.5.x.)

bye
Alessandro

@jonaslar

jonaslar commented Oct 23, 2020

Deleting resource.customizations from config map argocd-cm and deleting the deployment for the repo server worked for us as well. We upgraded from 0.0.12 to 0.0.14.

Thx :-)

@jomkz jomkz added this to the v0.0.15 milestone Oct 28, 2020
@jannfis jannfis modified the milestones: v0.0.15, v0.0.16 Apr 22, 2021
@jannfis
Collaborator

jannfis commented Apr 29, 2021

There have been further improvements with #286, which fully enables the GnuPG feature in Argo CD Operator.

I will close this issue, since this has been released with v0.0.15.

@jannfis
Collaborator

jannfis commented Apr 29, 2021

Feel free to reopen if the issues have not been resolved. :)

@jannfis jannfis closed this as completed Apr 29, 2021