Ingress controller Panic while reconciling ingresses #11661
Comments
/remove-kind bug Can you try to reproduce on a kind cluster or a minikube cluster? Thanks. /triage needs-information
/kind support
No, I can't reproduce it. We run ingress in dozens of clusters and haven't seen this before, and I'm unsure of what caused the issue.
Hi @rsafonseca, if you want, you can change the label. Hoping we get some actionable data and, ideally, a reproduction procedure.
Hmm, looking at the code, this seems like a very weird but valid issue. I can see it does some assertion between two types of maps, and maybe one is nil? @rsafonseca can you provide a bit more information on what ingress objects you have?
@longwuyuan that makes sense for behavioral bugs, but for an NPE that causes a panic it's pretty clearly a bug, as this should never happen and it leads to a crash. @rikatz It can only be one of the maps, since the other one has a nil check a few lines above. I haven't had time to follow the code yet, as this only came up this weekend, and I was hoping someone with context on that map might hint at why it could be getting into a nil state (maybe some silently failed kube-api call or something like that). I have literally hundreds of ingresses in this cluster; it would take forever to make (and redact) a full dump, and it's unlikely the issue is related to the ingress content, since at least for the ingress indicated above there were no changes, including on endpoints, and it has existed for nearly a year. I'll try to check tomorrow whether this happened only on a single ingress or on random ingresses (which I suspect). It affected a single controller pod, which is odd, so I suppose it might have been due to some transient network issue on the host (e.g. failed kube-api calls) that led to this, but for now this is mere conjecture. At worst, if the root cause isn't easily found, it might be worth adding an extra nil check for the offending map to avoid a crash.
Yeah, if you can send the PR to check this map I think it would be great |
/kind bug |
/remove-triage needs-information |
This is stale, but we won't close it automatically; just bear in mind that the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or want to request prioritization, please reach out.
What happened:
A single controller pod crashed a few times in a row, with the following stack trace (running version 1.9.5)
Since we're running 1.9.5, the panic appears to happen on this line, where the obvious culprit is priUps being nil: altUps has a nil check a few lines above, but priUps does not.
What you expected to happen:
The ingress controller not crashing
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): 1.9.5
Kubernetes version (use kubectl version): 1.27.11
Environment:
- Kernel (uname -a): 6.5.0-1022-aws
- How/where was the cluster created (kubeadm/kops/minikube/kind etc.):
- kubectl version:
- kubectl get nodes -o wide:
Anything else we need to know:
No changes were happening on the ingress that triggered this: no pod rotation or any config change; it happened during a normal sync. Oddly, it happened multiple times, but only on a single controller pod out of the 3 existing.