Skaffold fails when a container's status code is 1 #5210
Comments
@Seb-C this does sound like a bad experience. Are you saying that Skaffold immediately fails when one of the containers exits with error code 1, rather than waiting the allotted 60 seconds for the status check? Would you be able to provide a small k8s manifest to reproduce this issue?
@nkubala I am sorry, but I don't really have time to prepare an environment to show this. I think you can reproduce it just with a defined
This is expected: skaffold dev exits if the first deployment fails. This relates to our idea of making the first dev loop not exit.
Duplicate of #4953.
@tejal29 I disagree with closing this ticket. This is a different issue from the first dev loop problem you linked.
Currently, starting the first dev loop is impossible, independently of any code issue. Even if my pod is still in a starting state in k8s, Skaffold fails it. I expect Skaffold to handle my k8s objects, and Kubernetes to handle my containers. But currently Skaffold also runs checks at the container level, which short-circuits my probes. This mixing of responsibilities forces me to write code that behaves differently in Kubernetes (prod) and Skaffold (local).
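For context, a minimal sketch of the kind of pod being described; the image name and probe endpoint are placeholders, not taken from this thread. The idea is that Kubernetes' own probes and restart policy are meant to absorb the startup failures while MySQL comes up.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                 # placeholder name
spec:
  # restartPolicy defaults to Always, so Kubernetes keeps restarting the
  # container each time it exits 1 because MySQL is not reachable yet.
  containers:
    - name: app
      image: my-app:dev        # placeholder image
      readinessProbe:
        httpGet:
          path: /healthz       # placeholder endpoint
          port: 8080
        periodSeconds: 5
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10
```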
@Seb-C Can you please re-run with
Hope that answers your question.
@tejal29 Thank you for your answer.
@tejal29 I ran into this today when I was upgrading from 1.7.0. It's definitely a regression: 1.7 would wait until the deployment was stable, without caring about the exit codes of the containers. I think that's the correct behavior from a k8s point of view, since containers should be considered ephemeral and restartable. Can you re-open this one, or would you rather I opened a new issue?
from @casret:
Remove these errors from the unrecoverable list. Container errors are recoverable in a K8S environment, they may be waiting for another resource to become stable e.g.
Remove these from the unrecoverable errors list. Containers are ephemeral in k8s, so errors in them may be recoverable at a system level, e.g. when they are waiting for another resource to stabilize.
Closing this due to inactivity.
I have this same issue, and it's a show-stopper for me. Please re-open.
Glad to see that it has been re-opened. Some more context from me: I use Skaffold to set up https://github.com/argoproj/argo-workflows and that fails because argo-server and another pod normally fail a couple of times while mysql is starting up. I've managed to add an init container to the argo pods to make them wait for mysql as a workaround, but it would be much cleaner if I could disable Skaffold's "panic on container exit" behavior.
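For anyone wanting to copy that workaround, a rough sketch of such an init container; the MySQL service name, port, helper image, and pod name are assumptions, not taken from the Argo manifests.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: argo-server-example          # hypothetical name for this sketch
spec:
  initContainers:
    - name: wait-for-mysql
      image: busybox:1.36            # assumed helper image
      # Keep the pod in Init until the (assumed) "mysql" service answers
      # on port 3306, so the main container never has to exit 1.
      command:
        - sh
        - -c
        - until nc -z mysql 3306; do echo waiting for mysql; sleep 2; done
  containers:
    - name: argo-server
      image: argoproj/argocli:latest # placeholder; use the real argo-server image
      args: ["server"]
```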
@foobarbecue thanks for the context. I think probably what we should do here is expose a
@tejal29 suggested in #5790 (comment):
Has there been any update on enabling the previous (<1.7) behavior? We have a large number of applications that depend on the container restarting (usually just once) when a dependent DB takes a little longer to start up and become ready for connections. Having Skaffold consider a deployment "failed" and quit before the specified k8s
Recently this PR was merged, which adds the option
to Skaffold. This has not been added to our docs site yet; there is an issue tracking that. With the option enabled, Skaffold will wait for all containers to be successful until the given deadline. I believe the feature above should resolve this issue. I will wait until the docs issue/PR is closed and then close this.
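The PR link and option name are missing from the comment above, so they are not reproduced here. For reference, the deadline it refers to is presumably the same status-check deadline discussed in this issue, configured via the existing `statusCheckDeadlineSeconds` field in skaffold.yaml, roughly like this (the apiVersion and manifest paths are illustrative):

```yaml
apiVersion: skaffold/v2beta26
kind: Config
deploy:
  # Give the deployment up to 5 minutes to stabilize before the
  # status check reports a failure.
  statusCheckDeadlineSeconds: 300
  kubectl:
    manifests:
      - k8s/*.yaml
```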
The flag
I have different pods that depend on MySQL to start. Each of those pods implements a proper `readinessProbe` and `livenessProbe` to check the status of the pod. If the application cannot start, the container stops with `exit 1`, because connecting to MySQL is necessary for the application to work. Since there is no dependency management in Kubernetes, the recommended way to handle this is to have the pod restart the containers until it works.

Currently, when I run `skaffold dev`, it fails and deletes the resources immediately instead of waiting. If I do `skaffold run`, it also reports a failure, but 2~3 seconds later the pods created by Skaffold are running properly in the cluster.

Expected behavior
Skaffold waits for Kubernetes to get the pod running and does not check the individual containers' exit codes. It should only stop and delete the resources if the pod still fails after `statusCheckDeadlineSeconds` seconds.

Actual behavior
Skaffold does not start the dev mode and deletes the resources from the cluster.

Information

Steps to reproduce the behavior
Have a pod that exits with `exit 1` before MySQL starts, and then runs successfully.
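A minimal manifest sketch along the lines of that reproduction step; the MySQL service name and the busybox image are assumptions, and any container that exits 1 until a dependency is reachable should behave similarly.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: waits-for-mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: waits-for-mysql
  template:
    metadata:
      labels:
        app: waits-for-mysql
    spec:
      containers:
        - name: app
          image: busybox:1.36          # assumed image for the sketch
          # Exit 1 while the (assumed) "mysql" service is unreachable;
          # once it is up, stay running. Kubernetes restarts the container
          # between attempts, which is what the status check trips over.
          command:
            - sh
            - -c
            - nc -z mysql 3306 || exit 1; sleep 3600
```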