Errors during start KIC #5710

Open
1 task done
oleksandrs-adorama opened this issue Mar 16, 2024 · 6 comments
Labels
bug (Something isn't working), pending author feedback, stale (Will be closed unless advocated for within 7 days)

Comments

@oleksandrs-adorama

oleksandrs-adorama commented Mar 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

We updated kubernetes-ingress-controller from version 2.8 to version 3.0 and started observing strange errors when KIC starts:

2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "default", "error": "Service default/**** not found"}
2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "default", "error": "Service default/**** not found"}
2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "****", "error": "Service ****/**** not found"}
2024-03-12T14:05:49Z	error	Failed to fetch service	{"service_name": "", "service_namespace": "default", "error": "Service default/**** not found"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret ****/****.api-key not found	{"name": "****", "namespace": "****", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****-service.public" failure: Failed to fetch secret: Secret ****/****-service.public not found	{"name": "****.public-user", "namespace": "****", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret default/****.api-key not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "*****.public" failure: Failed to fetch secret: Secret default/*****.public not found	{"name": "****public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret default/c****.api-key not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.public" failure: Failed to fetch secret: Secret default/****.public not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:57Z	error	credential "****.api-key" failure: Failed to fetch secret: Secret default/****.api-key not found	{"name": "****.public-user", "namespace": "default", "GVK": "configuration.konghq.com/v1, Kind=KongConsumer", "error": "resource processing failed"}
2024-03-12T14:05:58Z	info	Successfully synced configuration to Kong	{"url": "https://localhost:8444", "update_strategy": "InMemory", "v": 0}
2024-03-13T16:47:06Z	error	Failed to fetch KongPlugin resource	{"kongplugin_name": "****.response-transformer", "kongplugin_namespace": "default", "error": "no KongPlugin or KongClusterPlugin was found for default/***.response-transformer"}
2024-03-13T16:47:06Z	error	Failed to fetch KongPlugin resource	{"kongplugin_name": "***.acl", "kongplugin_namespace": "default", "error": "no KongPlugin or KongClusterPlugin was found for default/****.acl"}

After these errors, KIC works as expected.
From time to time we see errors like these in the log:

2024/03/14 14:32:29 http: TLS handshake error from 10.50.57.190:35458: EOF
2024/03/14 14:32:29 http: TLS handshake error from 10.50.56.159:55946: EOF
2024-03-14T17:43:45Z	info	Successfully synced configuration to Kong	{"url": "https://localhost:8444", "update_strategy": "InMemory", "v": 0}
2024-03-14T17:46:30Z	info	Successfully synced configuration to Kong	{"url": "https://localhost:8444", "update_strategy": "InMemory", "v": 0}
2024/03/14 22:09:32 http: TLS handshake error from 10.50.56.159:46498: EOF

or

time="2024-03-15T08:08:08Z" level=error msg="checking config status failed" error="making HTTP request: Get \"https://localhost:8444/status\": read tcp 127.0.0.1:46354->127.0.0.1:8444: read: connection reset by peer"
time="2024-03-15T09:51:12Z" level=error msg="failed to fetch KongIngress resource for Services default/***" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:13Z" level=error msg="failed to fetch KongIngress resource for Services default/***" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:13Z" level=error msg="failed to fetch KongIngress resource for Services default/****" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:13Z" level=error msg="failed to fetch KongIngress resource for Services default/****" error="KongIngress ****.gateway-ingress not found"
time="2024-03-15T09:51:15Z" level=info msg="successfully synced configuration to kong."

Expected Behavior

No response

Steps To Reproduce

Start a KIC pod running version 3.0; the errors appear during startup.

Kong Ingress Controller version

kong/kubernetes-ingress-controller:3.0

Kubernetes version

1.27.8-gke.1067004

Anything else?

No response

@oleksandrs-adorama oleksandrs-adorama added the bug Something isn't working label Mar 16, 2024
@randmonkey
Contributor

@oleksandrs-adorama It looks like the connections inside your k8s cluster (between the KIC pod and the k8s apiserver, and between KIC and the Kong gateway admin API) are not very stable, so the cache inside KIC's controller runtime may not be in sync with the k8s apiserver. Do you know which pods own the IPs 10.50.56.159 and 10.50.57.190? That would help us locate the problem.
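For anyone collecting the same information, here is a minimal Python sketch that pulls the peer IPs out of the `http: TLS handshake error from ...` lines quoted earlier in the issue and counts them. The function name `handshake_peers` and the `SAMPLE` list are illustrative, not part of KIC, and the regular expression only handles IPv4 peers. The resulting IPs can then be matched against the IP column of `kubectl get pods --all-namespaces -o wide`.

```python
import re
from collections import Counter

# Matches the server-side message shown in the issue, e.g.
# "2024/03/14 14:32:29 http: TLS handshake error from 10.50.57.190:35458: EOF"
# Assumption: peers are IPv4 addresses followed by ":<port>".
HANDSHAKE_RE = re.compile(
    r"http: TLS handshake error from (\d{1,3}(?:\.\d{1,3}){3}):\d+"
)


def handshake_peers(log_lines):
    """Return a dict mapping peer IP -> number of TLS handshake errors."""
    counts = Counter()
    for line in log_lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


# Lines copied from the log excerpt in this issue.
SAMPLE = [
    "2024/03/14 14:32:29 http: TLS handshake error from 10.50.57.190:35458: EOF",
    "2024/03/14 14:32:29 http: TLS handshake error from 10.50.56.159:55946: EOF",
    "2024-03-14T17:43:45Z\tinfo\tSuccessfully synced configuration to Kong",
    "2024/03/14 22:09:32 http: TLS handshake error from 10.50.56.159:46498: EOF",
]

if __name__ == "__main__":
    for ip, count in sorted(handshake_peers(SAMPLE).items()):
        print(ip, count)
```

Lines that do not match the pattern, such as the sync messages, are simply skipped.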

@oleksandrs-adorama
Author

10.50.56.159 and 10.50.57.190 both belong to konnectivity-agent pods.

@oleksandrs-adorama
Author

I can also see these errors:

2024-03-20T12:05:21Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "ce309f6f-d757-48ac-a553-f6722ecbd207", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}
2024-03-20T12:05:22Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "bd3cf33d-d5b4-4c97-9c4a-b749b9950d06", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}

@randmonkey
Contributor

I can also see these errors:

2024-03-20T12:05:21Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "ce309f6f-d757-48ac-a553-f6722ecbd207", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}
2024-03-20T12:05:22Z	error	controllers.KongConsumer	Reconciler error	{"reconcileID": "bd3cf33d-d5b4-4c97-9c4a-b749b9950d06", "error": "Operation cannot be fulfilled on kongconsumers.configuration.konghq.com \"****.public-user\": the object has been modified; please apply your changes to the latest version and try again"}

It seems the KongConsumer in the cache is outdated. Because Kubernetes is eventually consistent, the object will still be translated and applied to the Kong gateway in the end. The controller cache may take longer to sync with the k8s apiserver if your cluster is heavily loaded or the network is unstable.
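To illustrate why this error tends to resolve itself: "the object has been modified" is the API server's conflict response under optimistic concurrency, where a write is rejected if it carries an older `resourceVersion` than the stored object. The sketch below simulates that in plain Python. `FakeStore`, `Conflict`, and `update_with_retry` are hypothetical stand-ins written for this example; they are not KIC or Kubernetes client code, and KIC's actual retry path (requeueing the reconcile) may differ in detail.

```python
class Conflict(Exception):
    """Stand-in for the API server's 'object has been modified' error."""


class FakeStore:
    """Hypothetical in-memory stand-in for the API server; holds one object."""

    def __init__(self, obj):
        self._obj = dict(obj, resourceVersion=1)

    def get(self):
        return dict(self._obj)

    def update(self, obj):
        # Reject writes that were based on an older version of the object.
        if obj["resourceVersion"] != self._obj["resourceVersion"]:
            raise Conflict("the object has been modified")
        self._obj = dict(obj, resourceVersion=obj["resourceVersion"] + 1)
        return dict(self._obj)


def update_with_retry(store, mutate, cached=None, attempts=5):
    """Apply mutate() to the object, re-reading it whenever the write conflicts."""
    obj = cached if cached is not None else store.get()
    for _ in range(attempts):
        candidate = dict(obj)
        mutate(candidate)
        try:
            return store.update(candidate)
        except Conflict:
            obj = store.get()  # stale copy: fetch the latest version and retry
    raise Conflict(f"still conflicting after {attempts} attempts")


def demo():
    store = FakeStore({"name": "public-user", "status": "pending"})
    stale = store.get()  # what a lagging cache would still hold
    other = store.get()
    other["credentials"] = ["api-key"]
    store.update(other)  # another writer bumps the version first

    def mark_programmed(obj):
        obj["status"] = "programmed"

    # First attempt conflicts on the stale copy, second attempt succeeds.
    return update_with_retry(store, mark_programmed, cached=stale)


if __name__ == "__main__":
    print(demo())
```

The first write fails because it is based on the stale copy; after re-reading, the second write succeeds and keeps the other writer's change. This matches the pattern of a few conflict errors followed by a successful sync.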

@oleksandrs-adorama
Author

We have three separate environments: test, dev, and prod. We see the same behavior in all three.


stale bot commented Apr 22, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Will be closed unless advocated for within 7 days label Apr 22, 2024