ibmcloud: Peer pods fail during CreateContainer #1882
Comments
Looking in the kata-agent log it has the
But we never get anything back from image-rs's image pull, and then after 60s the container fails with
If it's using in-guest image pull, can you try increasing the remote hypervisor timeout and the create container timeout - https://github.com/kata-containers/kata-containers/blob/main/src/runtime/config/configuration-remote.toml.in#L298?
Yeah, that's a good idea, but just pulling nginx shouldn't take more than 60s. In the past, when I've seen the timeout, it's only been on the containerd side, so the kata-agent has still come back from the image pull afterwards, which doesn't seem to be happening here.
Okay - I stand corrected. It appears that the nginx pull took over 2 minutes:
So I might not have waited long enough, or the containerd request cancelled it or something? So we have an ibmcloud performance issue, rather than a functional one. Thanks for nudging me into trying the timeout, Pradipta!
I'll note that I've just tried the 0.8.2 version of the code and that fails with the same issues. As it worked three months ago when 0.8.2 was tested, I think there are potentially some IaaS networking changes or account issues getting in the way, and not necessarily a code change.
When creating an ibmcloud setup with a self-managed cluster with both s390x and amd64 architectures, the tests fail.
The pod describe looks like:
and the CAA log shows an error during CreateContainer (which includes the pull image step):
I need to dig into the kata-agent logs and see if there is any more information about this.