Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pod status.phase shall not transition from Succeeded to Pending #17011

Closed
tnozicka opened this issue Oct 23, 2017 · 8 comments · Fixed by #17042
Closed

Pod status.phase shall not transition from Succeeded to Pending #17011

tnozicka opened this issue Oct 23, 2017 · 8 comments · Fixed by #17042

Comments

@tnozicka
Copy link
Contributor

Pod status.phase shall not transition from Succeeded to Pending.
(Note that this is dependent on timing of the deletion.)

I think this is the reason we are seeing multiple deployer pods running for DeploymentConfigs.

Version
$ oc version
oc v3.7.0-alpha.1+cba28e3-1203-dirty
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth

Server https://172.16.20.152:8443
openshift v3.7.0-alpha.1+39957d6-1203-dirty
kubernetes v1.7.6+a08f5eeb62
Steps To Reproduce
  1. docker pull centos:centos7
  2. oc observe pods --output=gotemplate -a '{{.}}'
  3. oc create -f ~/tmp/pod/pod.yaml; sleep 3; oc delete -f ~/tmp/pod/pod.yaml;
Current Result

Pod transitions from Succeeded to Pending after being deleted.

Expected Result

Once Pod reaches phase Succeeded it stays there.

Additional Information

$ cat ~/tmp/pod/pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - command:
    - bash
    - -c
    - trap 'excode=$?; trap "" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM;
      sleep 5
    env:
    image: centos:centos7
    imagePullPolicy: IfNotPresent
    name: test-container 
  restartPolicy: Never
  terminationGracePeriodSeconds: 30

$ oc observe pods --output=gotemplate -a '{{.}}'

# 2017-10-23T23:35:04+02:00 Added 50593	"" test test "map[kind:Pod apiVersion:v1 metadata:map[selfLink:/api/v1/namespaces/test/pods/test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 resourceVersion:50593 creationTimestamp:2017-10-23T21:35:04Z annotations:map[openshift.io/scc:anyuid] name:test namespace:test] spec:map[volumes:[map[name:default-token-ql6rx secret:map[defaultMode:420 secretName:default-token-ql6rx]]] restartPolicy:Never dnsPolicy:ClusterFirst securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] imagePullSecrets:[map[name:default-dockercfg-nrccf]] schedulerName:default-scheduler containers:[map[terminationMessagePath:/dev/termination-log terminationMessagePolicy:File securityContext:map[capabilities:map[drop:[MKNOD]] privileged:false seLinuxOptions:map[level:s0:c8,c2]] command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] image:centos:centos7 resources:map[] volumeMounts:[map[name:default-token-ql6rx readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount]] imagePullPolicy:IfNotPresent name:test-container]] terminationGracePeriodSeconds:30 serviceAccountName:default serviceAccount:default] status:map[phase:Pending qosClass:BestEffort]]"
# 2017-10-23T23:35:04+02:00 Updated 50594	"" test test "map[spec:map[schedulerName:default-scheduler volumes:[map[name:default-token-ql6rx secret:map[secretName:default-token-ql6rx defaultMode:420]]] containers:[map[image:centos:centos7 resources:map[] terminationMessagePolicy:File securityContext:map[capabilities:map[drop:[MKNOD]] privileged:false seLinuxOptions:map[level:s0:c8,c2]] imagePullPolicy:IfNotPresent name:test-container command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] volumeMounts:[map[name:default-token-ql6rx readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount]] terminationMessagePath:/dev/termination-log]] terminationGracePeriodSeconds:30 securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] nodeName:lenovo-t450s imagePullSecrets:[map[name:default-dockercfg-nrccf]] restartPolicy:Never dnsPolicy:ClusterFirst serviceAccountName:default serviceAccount:default] status:map[phase:Pending conditions:[map[status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z type:PodScheduled]] qosClass:BestEffort] kind:Pod apiVersion:v1 metadata:map[creationTimestamp:2017-10-23T21:35:04Z annotations:map[openshift.io/scc:anyuid] name:test namespace:test selfLink:/api/v1/namespaces/test/pods/test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 resourceVersion:50594]]"
# 2017-10-23T23:35:04+02:00 Updated 50596	"" test test "map[metadata:map[resourceVersion:50596 creationTimestamp:2017-10-23T21:35:04Z annotations:map[openshift.io/scc:anyuid] name:test namespace:test selfLink:/api/v1/namespaces/test/pods/test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6] spec:map[containers:[map[image:centos:centos7 command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] volumeMounts:[map[readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount name:default-token-ql6rx]] terminationMessagePath:/dev/termination-log imagePullPolicy:IfNotPresent name:test-container resources:map[] terminationMessagePolicy:File securityContext:map[capabilities:map[drop:[MKNOD]] privileged:false seLinuxOptions:map[level:s0:c8,c2]]]] serviceAccountName:default schedulerName:default-scheduler volumes:[map[name:default-token-ql6rx secret:map[secretName:default-token-ql6rx defaultMode:420]]] restartPolicy:Never terminationGracePeriodSeconds:30 dnsPolicy:ClusterFirst serviceAccount:default nodeName:lenovo-t450s securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] imagePullSecrets:[map[name:default-dockercfg-nrccf]]] status:map[startTime:2017-10-23T21:35:04Z containerStatuses:[map[imageID: name:test-container state:map[waiting:map[reason:ContainerCreating]] lastState:map[] ready:false restartCount:0 image:centos:centos7]] qosClass:BestEffort phase:Pending conditions:[map[type:Initialized status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z] map[status:False lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z reason:ContainersNotReady message:containers with unready status: [test-container] type:Ready] map[type:PodScheduled status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z]] hostIP:172.16.20.152] kind:Pod apiVersion:v1]"
# 2017-10-23T23:35:05+02:00 Updated 50603	"" test test "map[metadata:map[uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 resourceVersion:50603 creationTimestamp:2017-10-23T21:35:04Z annotations:map[openshift.io/scc:anyuid] name:test namespace:test selfLink:/api/v1/namespaces/test/pods/test] spec:map[containers:[map[terminationMessagePolicy:File imagePullPolicy:IfNotPresent volumeMounts:[map[name:default-token-ql6rx readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount]] terminationMessagePath:/dev/termination-log securityContext:map[capabilities:map[drop:[MKNOD]] privileged:false seLinuxOptions:map[level:s0:c8,c2]] name:test-container image:centos:centos7 command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] resources:map[]]] dnsPolicy:ClusterFirst serviceAccount:default imagePullSecrets:[map[name:default-dockercfg-nrccf]] schedulerName:default-scheduler volumes:[map[name:default-token-ql6rx secret:map[secretName:default-token-ql6rx defaultMode:420]]] restartPolicy:Never terminationGracePeriodSeconds:30 serviceAccountName:default nodeName:lenovo-t450s securityContext:map[seLinuxOptions:map[level:s0:c8,c2]]] status:map[phase:Running conditions:[map[type:Initialized status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z] map[lastTransitionTime:2017-10-23T21:35:05Z type:Ready status:True lastProbeTime:<nil>] map[type:PodScheduled status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z]] hostIP:172.16.20.152 podIP:172.17.0.3 startTime:2017-10-23T21:35:04Z containerStatuses:[map[ready:true restartCount:0 image:centos:centos7 imageID:docker-pullable://centos@sha256:26f74cefad82967f97f3eeeef88c1b6262f9b42bc96f2ad61d6f3fdf544759b8 containerID:docker://d2fdf038ac90bf3b535a610b2dc8661dfb4f09aea9ffa842c518b28ffd941e4d name:test-container state:map[running:map[startedAt:2017-10-23T21:35:05Z]] lastState:map[]]] qosClass:BestEffort] kind:Pod apiVersion:v1]"
# 2017-10-23T23:35:07+02:00 Updated 50606	"" test test "map[kind:Pod apiVersion:v1 metadata:map[namespace:test selfLink:/api/v1/namespaces/test/pods/test creationTimestamp:2017-10-23T21:35:04Z name:test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 resourceVersion:50606 deletionTimestamp:2017-10-23T21:35:37Z deletionGracePeriodSeconds:30 annotations:map[openshift.io/scc:anyuid]] spec:map[terminationGracePeriodSeconds:30 dnsPolicy:ClusterFirst nodeName:lenovo-t450s securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] imagePullSecrets:[map[name:default-dockercfg-nrccf]] schedulerName:default-scheduler volumes:[map[secret:map[secretName:default-token-ql6rx defaultMode:420] name:default-token-ql6rx]] restartPolicy:Never serviceAccountName:default serviceAccount:default containers:[map[resources:map[] volumeMounts:[map[name:default-token-ql6rx readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount]] securityContext:map[seLinuxOptions:map[level:s0:c8,c2] capabilities:map[drop:[MKNOD]] privileged:false] name:test-container image:centos:centos7 terminationMessagePolicy:File imagePullPolicy:IfNotPresent command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] terminationMessagePath:/dev/termination-log]]] status:map[startTime:2017-10-23T21:35:04Z containerStatuses:[map[containerID:docker://d2fdf038ac90bf3b535a610b2dc8661dfb4f09aea9ffa842c518b28ffd941e4d name:test-container state:map[running:map[startedAt:2017-10-23T21:35:05Z]] lastState:map[] ready:true restartCount:0 image:centos:centos7 imageID:docker-pullable://centos@sha256:26f74cefad82967f97f3eeeef88c1b6262f9b42bc96f2ad61d6f3fdf544759b8]] qosClass:BestEffort phase:Running conditions:[map[type:Initialized status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z] map[type:Ready status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:05Z] map[type:PodScheduled status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z]] hostIP:172.16.20.152 podIP:172.17.0.3]]"
# 2017-10-23T23:35:11+02:00 Updated 50611	"" test test "map[spec:map[serviceAccount:default schedulerName:default-scheduler volumes:[map[name:default-token-ql6rx secret:map[secretName:default-token-ql6rx defaultMode:420]]] containers:[map[terminationMessagePolicy:File imagePullPolicy:IfNotPresent securityContext:map[capabilities:map[drop:[MKNOD]] privileged:false seLinuxOptions:map[level:s0:c8,c2]] name:test-container command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] resources:map[] volumeMounts:[map[name:default-token-ql6rx readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount]] terminationMessagePath:/dev/termination-log image:centos:centos7]] restartPolicy:Never nodeName:lenovo-t450s securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] imagePullSecrets:[map[name:default-dockercfg-nrccf]] terminationGracePeriodSeconds:30 dnsPolicy:ClusterFirst serviceAccountName:default] status:map[startTime:2017-10-23T21:35:04Z containerStatuses:[map[name:test-container state:map[terminated:map[reason:Completed startedAt:2017-10-23T21:35:05Z finishedAt:2017-10-23T21:35:10Z containerID:docker://d2fdf038ac90bf3b535a610b2dc8661dfb4f09aea9ffa842c518b28ffd941e4d exitCode:0]] lastState:map[] ready:false restartCount:0 image:centos:centos7 imageID:docker-pullable://centos@sha256:26f74cefad82967f97f3eeeef88c1b6262f9b42bc96f2ad61d6f3fdf544759b8 containerID:docker://d2fdf038ac90bf3b535a610b2dc8661dfb4f09aea9ffa842c518b28ffd941e4d]] qosClass:BestEffort phase:Succeeded conditions:[map[lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z reason:PodCompleted type:Initialized status:True] map[reason:PodCompleted type:Ready status:False lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:11Z] map[type:PodScheduled status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z]] hostIP:172.16.20.152 podIP:172.17.0.3] kind:Pod apiVersion:v1 metadata:map[namespace:test creationTimestamp:2017-10-23T21:35:04Z deletionGracePeriodSeconds:30 annotations:map[openshift.io/scc:anyuid] name:test selfLink:/api/v1/namespaces/test/pods/test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 resourceVersion:50611 deletionTimestamp:2017-10-23T21:35:37Z]]"
# 2017-10-23T23:35:12+02:00 Updated 50612	"" test test "map[metadata:map[resourceVersion:50612 deletionTimestamp:2017-10-23T21:35:37Z namespace:test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 creationTimestamp:2017-10-23T21:35:04Z deletionGracePeriodSeconds:30 annotations:map[openshift.io/scc:anyuid] name:test selfLink:/api/v1/namespaces/test/pods/test] spec:map[containers:[map[command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] volumeMounts:[map[readOnly:true mountPath:/var/run/secrets/kubernetes.io/serviceaccount name:default-token-ql6rx]] terminationMessagePolicy:File imagePullPolicy:IfNotPresent name:test-container resources:map[] terminationMessagePath:/dev/termination-log securityContext:map[seLinuxOptions:map[level:s0:c8,c2] capabilities:map[drop:[MKNOD]] privileged:false] image:centos:centos7]] dnsPolicy:ClusterFirst serviceAccount:default securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] imagePullSecrets:[map[name:default-dockercfg-nrccf]] volumes:[map[secret:map[secretName:default-token-ql6rx defaultMode:420] name:default-token-ql6rx]] restartPolicy:Never terminationGracePeriodSeconds:30 serviceAccountName:default nodeName:lenovo-t450s schedulerName:default-scheduler] status:map[podIP:172.17.0.3 startTime:2017-10-23T21:35:04Z containerStatuses:[map[state:map[waiting:map[reason:ContainerCreating]] lastState:map[] ready:false restartCount:0 image:centos:centos7 imageID: name:test-container]] qosClass:BestEffort phase:Pending conditions:[map[type:Initialized status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z] map[reason:ContainersNotReady message:containers with unready status: [test-container] type:Ready status:False lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:11Z] map[type:PodScheduled status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z]] hostIP:172.16.20.152] kind:Pod apiVersion:v1]"
# 2017-10-23T23:35:12+02:00 Updated 50613	"" test test "map[metadata:map[selfLink:/api/v1/namespaces/test/pods/test resourceVersion:50613 creationTimestamp:2017-10-23T21:35:04Z deletionGracePeriodSeconds:0 name:test namespace:test uid:08279b2a-b83a-11e7-acc3-b6ff77a676a6 deletionTimestamp:2017-10-23T21:35:07Z annotations:map[openshift.io/scc:anyuid]] spec:map[terminationGracePeriodSeconds:30 serviceAccount:default nodeName:lenovo-t450s securityContext:map[seLinuxOptions:map[level:s0:c8,c2]] schedulerName:default-scheduler restartPolicy:Never containers:[map[resources:map[] terminationMessagePath:/dev/termination-log terminationMessagePolicy:File securityContext:map[capabilities:map[drop:[MKNOD]] privileged:false seLinuxOptions:map[level:s0:c8,c2]] imagePullPolicy:IfNotPresent name:test-container image:centos:centos7 command:[bash -c trap 'excode=$?; trap \"\" EXIT; cmd; echo $excode' EXIT HUP INT QUIT PIPE TERM; sleep 5] volumeMounts:[map[mountPath:/var/run/secrets/kubernetes.io/serviceaccount name:default-token-ql6rx readOnly:true]]]] dnsPolicy:ClusterFirst serviceAccountName:default imagePullSecrets:[map[name:default-dockercfg-nrccf]] volumes:[map[name:default-token-ql6rx secret:map[secretName:default-token-ql6rx defaultMode:420]]]] status:map[phase:Pending conditions:[map[type:Initialized status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z] map[reason:ContainersNotReady message:containers with unready status: [test-container] type:Ready status:False lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:11Z] map[type:PodScheduled status:True lastProbeTime:<nil> lastTransitionTime:2017-10-23T21:35:04Z]] hostIP:172.16.20.152 podIP:172.17.0.3 startTime:2017-10-23T21:35:04Z containerStatuses:[map[name:test-container state:map[waiting:map[reason:ContainerCreating]] lastState:map[] ready:false restartCount:0 image:centos:centos7 imageID:]] qosClass:BestEffort] kind:Pod apiVersion:v1]"
@tnozicka
Copy link
Contributor Author

P0 after the associated issue #16870

@smarterclayton
Copy link
Contributor

Uh oh

@sjenning
Copy link
Contributor

fyi @derekwaynecarr

might be related to our old friend https://bugzilla.redhat.com/show_bug.cgi?id=1486356

@sjenning
Copy link
Contributor

sjenning commented Oct 24, 2017

I was unable to reproduce this on kubernetes release-1.7 branch or v1.7.6.

Here are the logs where things go sideways:

I1023 22:47:36.615158   20104 generic.go:183] GenericPLEG: Relisting
I1023 22:47:36.618081   20104 generic.go:147] GenericPLEG: 0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649: running -> exited
I1023 22:47:36.618102   20104 generic.go:147] GenericPLEG: 0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/48679b5a5ac8ba5f584b20f11cdb12742f0058ee890134454b669437cb64c95a: running -> exited
I1023 22:47:36.619351   20104 kuberuntime_manager.go:828] getSandboxIDByPodUID got sandbox IDs ["48679b5a5ac8ba5f584b20f11cdb12742f0058ee890134454b669437cb64c95a"] for pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:36.625538   20104 generic.go:381] PLEG: Write status for test/default: &container.PodStatus{ID:"0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", Name:"test", Namespace:"default", IP:"", ContainerStatuses:[]*container.ContainerStatus{(*container.ContainerStatus)(0xc428c1b0a0)}, SandboxStatuses:[]*runtime.PodSandboxStatus{(*runtime.PodSandboxStatus)(0xc42af894f0)}} (err: <nil>)
I1023 22:47:36.625592   20104 kubelet.go:1888] SyncLoop (PLEG): "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)", event: &pleg.PodLifecycleEvent{ID:"0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", Type:"ContainerDied", Data:"54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649"}
I1023 22:47:36.625624   20104 kubelet_pods.go:1224] Generating status for "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:36.625646   20104 helpers.go:84] Already ran container "test-container" of pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)", do nothing
I1023 22:47:36.625681   20104 kubelet.go:1888] SyncLoop (PLEG): "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)", event: &pleg.PodLifecycleEvent{ID:"0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", Type:"ContainerDied", Data:"48679b5a5ac8ba5f584b20f11cdb12742f0058ee890134454b669437cb64c95a"}
I1023 22:47:36.625703   20104 kubelet_pods.go:1224] Generating status for "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:36.625708   20104 kubelet_pods.go:1224] Generating status for "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:36.625716   20104 helpers.go:84] Already ran container "test-container" of pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)", do nothing
I1023 22:47:36.625734   20104 helpers.go:84] Already ran container "test-container" of pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)", do nothing
I1023 22:47:36.625744   20104 kuberuntime_container.go:798] Removing container "54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649"
W1023 22:47:36.625736   20104 pod_container_deletor.go:77] Container "48679b5a5ac8ba5f584b20f11cdb12742f0058ee890134454b669437cb64c95a" not found in pod's containers
I1023 22:47:36.625797   20104 status_manager.go:340] Status Manager: adding pod: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", with status: ('\x04', {Succeeded [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT PodCompleted } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:36 -0500 CDT PodCompleted } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT  }]   10.42.10.23 172.17.0.2 2017-10-23 22:47:29 -0500 CDT [] [{test-container {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2017-10-23 22:47:30 -0500 CDT,FinishedAt:2017-10-23 22:47:35 -0500 CDT,ContainerID:docker://54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649,}} {nil nil nil} false 0 docker.io/centos:centos7 docker-pullable://docker.io/centos@sha256:eba772bac22c86d7d6e72421b4700c3f894ab6e35475a34014ff8de74c10872e docker://54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649}] BestEffort}) to podStatusChannel
I1023 22:47:36.625901   20104 status_manager.go:147] Status Manager: syncing pod: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", with status: (4, {Succeeded [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT PodCompleted } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:36 -0500 CDT PodCompleted } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT  }]   10.42.10.23 172.17.0.2 2017-10-23 22:47:29 -0500 CDT [] [{test-container {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2017-10-23 22:47:30 -0500 CDT,FinishedAt:2017-10-23 22:47:35 -0500 CDT,ContainerID:docker://54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649,}} {nil nil nil} false 0 docker.io/centos:centos7 docker-pullable://docker.io/centos@sha256:eba772bac22c86d7d6e72421b4700c3f894ab6e35475a34014ff8de74c10872e docker://54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649}] BestEffort}) from podStatusChannel
I1023 22:47:36.630263   20104 qos_container_manager_linux.go:286] [ContainerManager]: Updated QoS cgroup configuration
I1023 22:47:36.630618   20104 wrap.go:42] PUT /api/v1/namespaces/default/pods/test/status: (2.126702ms) 200 [[openshift/v1.7.6+a08f5eeb62 (linux/amd64) kubernetes/c84beff] 10.42.10.23:38370]
I1023 22:47:36.630765   20104 config.go:293] Setting pods for source api
I1023 22:47:36.630909   20104 graph_populator.go:63] updatePod default/test, node unchanged
I1023 22:47:36.630960   20104 graph_builder.go:475] GraphBuilder process object: v1/Pod, namespace default, name test, uid 0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa, event type update
I1023 22:47:36.630975   20104 kubelet.go:1867] SyncLoop (RECONCILE, "api"): "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:36.630911   20104 status_manager.go:453] Status for pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)" updated successfully: (4, {Phase:Succeeded Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2017-10-23 22:47:29 -0500 CDT Reason:PodCompleted Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2017-10-23 22:47:36 -0500 CDT Reason:PodCompleted Message:} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2017-10-23 22:47:29 -0500 CDT Reason: Message:}] Message: Reason: HostIP:10.42.10.23 PodIP:172.17.0.2 StartTime:2017-10-23 22:47:29 -0500 CDT InitContainerStatuses:[] ContainerStatuses:[{Name:test-container State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2017-10-23 22:47:30 -0500 CDT,FinishedAt:2017-10-23 22:47:35 -0500 CDT,ContainerID:docker://54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:docker.io/centos:centos7 ImageID:docker-pullable://docker.io/centos@sha256:eba772bac22c86d7d6e72421b4700c3f894ab6e35475a34014ff8de74c10872e ContainerID:docker://54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649}] QOSClass:BestEffort})
I1023 22:47:36.631032   20104 kubelet_pods.go:788] Pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)" is terminated, but some containers have not been cleaned up: [0xc428c1b0a0]
I1023 22:47:36.631008   20104 disruption.go:349] updatePod called on pod "test"
I1023 22:47:36.631060   20104 disruption.go:412] No PodDisruptionBudgets found for pod test, PodDisruptionBudget controller will avoid syncing.
I1023 22:47:36.631066   20104 disruption.go:352] No matching pdb for pod "test"
I1023 22:47:36.717634   20104 desired_state_of_world_populator.go:243] Removing volume from desired state for volume "default-token-p6877" (UniqueName: "kubernetes.io/secret/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa-default-token-p6877") pod "test" (UID: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa") 
I1023 22:47:36.815551   20104 reconciler.go:186] operationExecutor.UnmountVolume started for volume "default-token-p6877" (UniqueName: "kubernetes.io/secret/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa-default-token-p6877") pod "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa" (UID: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa") 
I1023 22:47:36.815623   20104 util.go:367] Tearing down volume default-token-p6877 for pod 0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa at /home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/bin/linux/amd64/openshift.local.volumes/pods/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/volumes/kubernetes.io~secret/default-token-p6877
I1023 22:47:36.815805   20104 empty_dir_linux.go:38] Determining mount medium of /home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/bin/linux/amd64/openshift.local.volumes/pods/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/volumes/kubernetes.io~secret/default-token-p6877
I1023 22:47:36.815845   20104 empty_dir_linux.go:48] Statfs_t of /home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/bin/linux/amd64/openshift.local.volumes/pods/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/volumes/kubernetes.io~secret/default-token-p6877: {Type:16914836 Bsize:4096 Blocks:2055680 Bfree:2055676 Bavail:2055676 Files:2055680 Ffree:2055669 Fsid:{X__val:[0 0]} Namelen:255 Frsize:4096 Flags:4128 Spare:[0 0 0 0]}
I1023 22:47:36.815894   20104 mount_linux.go:224] Unmounting /home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/bin/linux/amd64/openshift.local.volumes/pods/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/volumes/kubernetes.io~secret/default-token-p6877
I1023 22:47:36.822537   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:36.850243   20104 operation_generator.go:532] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa-default-token-p6877" (OuterVolumeSpecName: "default-token-p6877") pod "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa" (UID: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa"). InnerVolumeSpecName "default-token-p6877". PluginName "kubernetes.io/secret", VolumeGidValue ""
I1023 22:47:36.915745   20104 reconciler.go:290] Volume detached for volume "default-token-p6877" (UniqueName: "kubernetes.io/secret/0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa-default-token-p6877") on node "cerebellum.local.variantweb.net" DevicePath ""
I1023 22:47:36.922863   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.023021   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.123194   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.223460   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.323739   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.424005   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.496783   20104 kubelet.go:1930] SyncLoop (housekeeping)
I1023 22:47:37.500762   20104 kubelet_pods.go:973] Killing unwanted pod "test"
E1023 22:47:37.502268   20104 remote_runtime.go:277] ContainerStatus "54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649" from runtime service failed: rpc error: code = 2 desc = Error: No such container: 54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649
I1023 22:47:37.503210   20104 kubelet_pods.go:1638] Orphaned pod "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa" found, removing pod cgroups
E1023 22:47:37.505515   20104 kubelet_pods.go:976] Failed killing the pod "test": failed to "KillContainer" for "test-container" with KillContainerError: "rpc error: code = 2 desc = Error: No such container: 54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649"
I1023 22:47:37.507552   20104 manager.go:988] Destroyed container: "/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod0f3cdbbe_b86e_11e7_891d_1c1b0deeddfa.slice" (aliases: [], namespace: "")
I1023 22:47:37.507584   20104 handler.go:325] Added event &{/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod0f3cdbbe_b86e_11e7_891d_1c1b0deeddfa.slice 2017-10-23 22:47:37.507574826 -0500 CDT containerDeletion {<nil>}}
I1023 22:47:37.524257   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.624424   20104 desired_state_of_world_populator.go:160] Skipping findAndRemoveDeletedPods(). Not permitted until 2017-10-23 22:47:38.717614742 -0500 CDT (getPodStatusRetryDuration 2s).
I1023 22:47:37.625703   20104 generic.go:183] GenericPLEG: Relisting
I1023 22:47:37.628355   20104 generic.go:147] GenericPLEG: 0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa/54b38d55b7505e6130f75aae23b6e65e608566f44f6b707866a0dbcb003db649: exited -> non-existent
I1023 22:47:37.629628   20104 kuberuntime_manager.go:828] getSandboxIDByPodUID got sandbox IDs ["48679b5a5ac8ba5f584b20f11cdb12742f0058ee890134454b669437cb64c95a"] for pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:37.632287   20104 generic.go:381] PLEG: Write status for test/default: &container.PodStatus{ID:"0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", Name:"test", Namespace:"default", IP:"", ContainerStatuses:[]*container.ContainerStatus{}, SandboxStatuses:[]*runtime.PodSandboxStatus{(*runtime.PodSandboxStatus)(0xc425959b80)}} (err: <nil>)
I1023 22:47:37.632348   20104 kubelet_pods.go:1224] Generating status for "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:37.632386   20104 kubelet_pods.go:1189] pod waiting > 0, pending
I1023 22:47:37.632460   20104 status_manager.go:340] Status Manager: adding pod: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", with status: ('\x05', {Pending [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:36 -0500 CDT ContainersNotReady containers with unready status: [test-container]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT  }]   10.42.10.23 172.17.0.2 2017-10-23 22:47:29 -0500 CDT [] [{test-container {&ContainerStateWaiting{Reason:ContainerCreating,Message:,} nil nil} {nil nil nil} false 0 centos:centos7  }] BestEffort}) to podStatusChannel
I1023 22:47:37.632524   20104 status_manager.go:147] Status Manager: syncing pod: "0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa", with status: (5, {Pending [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:36 -0500 CDT ContainersNotReady containers with unready status: [test-container]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-10-23 22:47:29 -0500 CDT  }]   10.42.10.23 172.17.0.2 2017-10-23 22:47:29 -0500 CDT [] [{test-container {&ContainerStateWaiting{Reason:ContainerCreating,Message:,} nil nil} {nil nil nil} false 0 centos:centos7  }] BestEffort}) from podStatusChannel
I1023 22:47:37.637291   20104 wrap.go:42] PUT /api/v1/namespaces/default/pods/test/status: (2.072871ms) 200 [[openshift/v1.7.6+a08f5eeb62 (linux/amd64) kubernetes/c84beff] 10.42.10.23:38370]
I1023 22:47:37.637404   20104 config.go:293] Setting pods for source api
I1023 22:47:37.637532   20104 graph_populator.go:63] updatePod default/test, node unchanged
I1023 22:47:37.637633   20104 kubelet.go:1867] SyncLoop (RECONCILE, "api"): "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)"
I1023 22:47:37.637568   20104 status_manager.go:453] Status for pod "test_default(0f3cdbbe-b86e-11e7-891d-1c1b0deeddfa)" updated successfully: (5, {Phase:Pending Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2017-10-23 22:47:29 -0500 CDT Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2017-10-23 22:47:36 -0500 CDT Reason:ContainersNotReady Message:containers with unready status: [test-container]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2017-10-23 22:47:29 -0500 CDT Reason: Message:}] Message: Reason: HostIP:10.42.10.23 PodIP:172.17.0.2 StartTime:2017-10-23 22:47:29 -0500 CDT InitContainerStatuses:[] ContainerStatuses:[{Name:test-container State:{Waiting:&ContainerStateWaiting{Reason:ContainerCreating,Message:,} Running:nil Terminated:nil} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:centos:centos7 ImageID: ContainerID:}] QOSClass:BestEffort})

PLEG picks up on the container moving from running -> exited and the pod is marked as Succeeded.

However, then we failed to get the container status because it doesn't exist anymore, and the pod is seen as orphaned. PLEG goes from exited -> non-existent and a new status is created for the pod starting an Pending.

@sjenning
Copy link
Contributor

cc @joelsmith @frobware

@sjenning
Copy link
Contributor

This is caused by #16913 backported from kube 1.8. This is reproducible in kube release-1.8 upstream.

@sjenning
Copy link
Contributor

Opened upstream issue kubernetes/kubernetes#54499. We'll see if they want to revert or fix. Hopefully we'll get a quick response.

@tnozicka
Copy link
Contributor Author

tnozicka commented Dec 5, 2017

for reference this still seems broken #17595 just now it is Failed -> Succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants