Description
Describe the bug
I'm having an issue where my Instance and it is associated Broker pod seem to disappear and won't come back until I either restart my agent and udev discovery pods, or it seems I can simply modify my Configuration
resource which triggers it to come back.
Output of kubectl get pods,akrii,akric -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/aeotec-gen-5-79e1a1-pod 1/1 Running 0 9m21s 10.244.0.8 talos-192-168-32-2 <none> <none>
pod/akri-agent-daemonset-cv95v 1/1 Running 0 23d 10.244.1.5 talos-192-168-32-3 <none> <none>
pod/akri-agent-daemonset-kpqzq 1/1 Running 0 13h 10.244.0.224 talos-192-168-32-2 <none> <none>
pod/akri-agent-daemonset-pwdj6 1/1 Running 0 25d 10.244.2.8 talos-192-168-32-4 <none> <none>
pod/akri-controller-deployment-5496f7c977-p75v6 1/1 Running 0 17h 10.244.2.85 talos-192-168-32-4 <none> <none>
pod/akri-udev-discovery-daemonset-6kw69 1/1 Running 0 23d 10.244.1.3 talos-192-168-32-3 <none> <none>
pod/akri-udev-discovery-daemonset-gkhzv 1/1 Running 0 25d 10.244.2.7 talos-192-168-32-4 <none> <none>
pod/akri-udev-discovery-daemonset-np5dt 1/1 Running 0 13h 10.244.0.223 talos-192-168-32-2 <none> <none>
pod/chrony-5jqss 1/1 Running 0 17d 10.244.0.253 talos-192-168-32-2 <none> <none>
pod/chrony-5vhn5 1/1 Running 0 17d 10.244.2.2 talos-192-168-32-4 <none> <none>
pod/chrony-94xlv 1/1 Running 0 17d 10.244.1.6 talos-192-168-32-3 <none> <none>
pod/intel-gpu-plugin-2wkdw 1/1 Running 0 46h 10.244.2.3 talos-192-168-32-4 <none> <none>
pod/intel-gpu-plugin-rrj4w 1/1 Running 4 75d 10.244.0.162 talos-192-168-32-2 <none> <none>
pod/intel-gpu-plugin-s8j6k 1/1 Running 0 75d 10.244.1.4 talos-192-168-32-3 <none> <none>
pod/node-feature-discovery-master-7f48f68ccf-v7jb2 1/1 Running 0 17h 10.244.2.75 talos-192-168-32-4 <none> <none>
pod/node-feature-discovery-worker-kssph 1/1 Running 13 75d 10.244.1.2 talos-192-168-32-3 <none> <none>
pod/node-feature-discovery-worker-n7vmv 1/1 Running 8 75d 10.244.2.4 talos-192-168-32-4 <none> <none>
pod/node-feature-discovery-worker-zs5cn 1/1 Running 75 75d 10.244.0.164 talos-192-168-32-2 <none> <none>
NAME CONFIG SHARED NODES AGE
instance.akri.sh/aeotec-gen-5-79e1a1 aeotec-gen-5 false ["talos-192-168-32-2"] 9m24s
NAME CAPACITY AGE
configuration.akri.sh/aeotec-gen-5 1 43d
configuration.akri.sh/conbee-ii 1 43d
configuration.akri.sh/ti-cc2531 1 43d
Kubernetes Version: 1.21.3
Logs
I believe these are applicable logs from the controller. I just had my conbee-ii
instance disappear on me. The discussion we had in Slack points to potential race conditions happening when the Configuration
resources are updated. Typically as a result of Flux updating metadata on the resources to manage its state. So we would likely have multiple resources being updated at nearly the same time.
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] internal_do_instance_watch - aquired sync lock
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance - enter
[2021-09-02T18:00:41Z INFO controller::util::instance_action] handle_instance - deleted Akri Instance aeotec-gen-5-79e1a1: Instance { configuration_name: "aeotec-gen-5", broker_properties: {"UDEV_DEVNODE": "/dev/ttyACM1"}, shared: false, nodes: ["talos-192-168-32-2"], device_usage: {"aeotec-gen-5-79e1a1-0": "talos-192-168-32-2"} }
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - enter Remove
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked from instance={"talos-192-168-32-2": PodContext { node_name: None, namespace: None, action: NoAction }}
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - find all pods that have akri.sh/instance=aeotec-gen-5-79e1a1
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=aeotec-gen-5-79e1a1") field_selector=None
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - found 1 pods
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - update actions based on the existing pods
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] select_pod_action phase="Running" action=Remove unknown_node=false
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_pod_action phase="Running" action=Remove unknown_node=false trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_pods_on_known_nodes phase="Running" action=Remove trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_running_pods action=Remove trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_running_pods - Running Pod (tracked) ... PodAction::Remove ("aeotec-gen-5-79e1a1-pod")
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked after querying existing pods={"talos-192-168-32-2": PodContext { node_name: Some("talos-192-168-32-2"), namespace: Some("host-system"), action: Remove }}
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_deletion_work - pod::create_pod_app_name("aeotec-gen-5-79e1a1", "talos-192-168-32-2", false, "pod")
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_deletion_work - pod::remove_pod name="aeotec-gen-5-79e1a1-pod", namespace="host-system"
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] remove_pod enter
[2021-09-02T18:00:41Z INFO akri_shared::k8s::pod] remove_pod pods.delete(...).await?:
[2021-09-02T18:00:41Z INFO akri_shared::k8s::pod] remove_pod pods.delete return: "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_deletion_work - pod::remove_pod succeeded
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - exit
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] internal_do_instance_watch - aquired sync lock
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance - enter
[2021-09-02T18:00:41Z INFO controller::util::instance_action] handle_instance - deleted Akri Instance conbee-ii-55a336: Instance { configuration_name: "conbee-ii", broker_properties: {"UDEV_DEVNODE": "/dev/ttyACM2"}, shared: false, nodes: ["talos-192-168-32-2"], device_usage: {"conbee-ii-55a336-0": "talos-192-168-32-2"} }
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - enter Remove
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked from instance={"talos-192-168-32-2": PodContext { node_name: None, namespace: None, action: NoAction }}
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - find all pods that have akri.sh/instance=conbee-ii-55a336
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=conbee-ii-55a336") field_selector=None
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - found 1 pods
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - update actions based on the existing pods
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] select_pod_action phase="Running" action=Remove unknown_node=false
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_pod_action phase="Running" action=Remove unknown_node=false trace_node_name="conbee-ii-55a336-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_pods_on_known_nodes phase="Running" action=Remove trace_node_name="conbee-ii-55a336-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_running_pods action=Remove trace_node_name="conbee-ii-55a336-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_running_pods - Running Pod (tracked) ... PodAction::Remove ("conbee-ii-55a336-pod")
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked after querying existing pods={"talos-192-168-32-2": PodContext { node_name: Some("talos-192-168-32-2"), namespace: Some("host-system"), action: Remove }}
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_deletion_work - pod::create_pod_app_name("conbee-ii-55a336", "talos-192-168-32-2", false, "pod")
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_deletion_work - pod::remove_pod name="conbee-ii-55a336-pod", namespace="host-system"
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] remove_pod enter
[2021-09-02T18:00:41Z INFO akri_shared::k8s::pod] remove_pod pods.delete(...).await?:
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_pod - pod name "conbee-ii-55a336-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:41Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:41Z INFO akri_shared::k8s::pod] remove_pod pods.delete return: "conbee-ii-55a336-pod"
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_deletion_work - pod::remove_pod succeeded
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - exit
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] internal_do_instance_watch - aquired sync lock
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance - enter
[2021-09-02T18:00:41Z INFO controller::util::instance_action] handle_instance - added Akri Instance aeotec-gen-5-79e1a1: Instance { configuration_name: "aeotec-gen-5", broker_properties: {"UDEV_DEVNODE": "/dev/ttyACM1"}, shared: false, nodes: ["talos-192-168-32-2"], device_usage: {"aeotec-gen-5-79e1a1-0": ""} }
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - enter Add
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked from instance={"talos-192-168-32-2": PodContext { node_name: None, namespace: None, action: Add }}
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - find all pods that have akri.sh/instance=aeotec-gen-5-79e1a1
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=aeotec-gen-5-79e1a1") field_selector=None
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:41Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - found 1 pods
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - update actions based on the existing pods
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] select_pod_action phase="Running" action=Add unknown_node=false
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_pod_action phase="Running" action=Add unknown_node=false trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_pods_on_known_nodes phase="Running" action=Add trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_running_pods action=Add trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:41Z TRACE controller::util::pod_action] choice_for_running_pods - Running Pod (tracked) ... PodAction::NoAction ("aeotec-gen-5-79e1a1-pod")
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked after querying existing pods={"talos-192-168-32-2": PodContext { node_name: Some("talos-192-168-32-2"), namespace: Some("host-system"), action: NoAction }}
[2021-09-02T18:00:41Z TRACE controller::util::instance_action] handle_instance_change - exit
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_pod - pod name "conbee-ii-55a336-pod"
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:42Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - pod name "conbee-ii-55a336-pod"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Deleted event]
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - Deleted: "conbee-ii-55a336-pod"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_deleted_pod_if_needed - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_deleted_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_deleted_pod_if_needed - call handle_non_running_pod
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_non_running_pod - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] get_instance_and_configuration_from_pod - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - find_pods_with_label(akri.sh/instance=conbee-ii-55a336)
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=conbee-ii-55a336") field_selector=None
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - found 0 pods
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - num_non_terminating_pods: 0
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service app_name="conbee-ii-55a336-svc", namespace="host-system"
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::service] remove_service enter
[2021-09-02T18:00:43Z INFO akri_shared::k8s::service] remove_service svcs.create(...).await?:
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::service] remove_service - service already deleted
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service succeeded
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - find_pods_with_label(akri.sh/configuration=conbee-ii)
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/configuration=conbee-ii") field_selector=None
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - found 0 pods
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - num_non_terminating_pods: 0
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service app_name="conbee-ii-svc", namespace="host-system"
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::service] remove_service enter
[2021-09-02T18:00:43Z INFO akri_shared::k8s::service] remove_service svcs.create(...).await?:
[2021-09-02T18:00:43Z INFO akri_shared::k8s::service] remove_service svcs.delete return: "Success"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service succeeded
[2021-09-02T18:00:43Z TRACE akri_shared::akri::instance] find_instance enter
[2021-09-02T18:00:43Z TRACE akri_shared::akri::instance] find_instance kube_client.request::<KubeAkriInstance>(akri_instance_type.get(...)?).await?
[2021-09-02T18:00:43Z TRACE akri_shared::akri::instance] find_instance kube_client.request returned kube error: ErrorResponse { status: "Failure", message: "instances.akri.sh \"conbee-ii-55a336\" not found", reason: "NotFound", code: 404 }
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Deleted event]
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_pod - Deleted: "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_deleted_pod_if_needed - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_deleted_pod_if_needed - last_known_state: Running
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_deleted_pod_if_needed - call handle_non_running_pod
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] handle_non_running_pod - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] get_instance_and_configuration_from_pod - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - find_pods_with_label(akri.sh/instance=aeotec-gen-5-79e1a1)
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=aeotec-gen-5-79e1a1") field_selector=None
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - found 0 pods
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - num_non_terminating_pods: 0
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service app_name="aeotec-gen-5-79e1a1-svc", namespace="host-system"
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::service] remove_service enter
[2021-09-02T18:00:43Z INFO akri_shared::k8s::service] remove_service svcs.create(...).await?:
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::service] remove_service - service already deleted
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service succeeded
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - enter
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - find_pods_with_label(akri.sh/configuration=aeotec-gen-5)
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/configuration=aeotec-gen-5") field_selector=None
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] find_pods_and_cleanup_svc_if_unsupported - found 0 pods
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - num_non_terminating_pods: 0
[2021-09-02T18:00:43Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service app_name="aeotec-gen-5-svc", namespace="host-system"
[2021-09-02T18:00:43Z TRACE akri_shared::k8s::service] remove_service enter
[2021-09-02T18:00:43Z INFO akri_shared::k8s::service] remove_service svcs.create(...).await?:
[2021-09-02T18:00:44Z INFO akri_shared::k8s::service] remove_service svcs.delete return: "Success"
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] cleanup_svc_if_unsupported - service::remove_service succeeded
[2021-09-02T18:00:44Z TRACE akri_shared::akri::instance] find_instance enter
[2021-09-02T18:00:44Z TRACE akri_shared::akri::instance] find_instance kube_client.request::<KubeAkriInstance>(akri_instance_type.get(...)?).await?
[2021-09-02T18:00:44Z TRACE akri_shared::akri::instance] find_instance return
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - enter Update
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked from instance={"talos-192-168-32-2": PodContext { node_name: None, namespace: None, action: Add }}
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - find all pods that have akri.sh/instance=aeotec-gen-5-79e1a1
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=aeotec-gen-5-79e1a1") field_selector=None
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - found 0 pods
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - update actions based on the existing pods
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked after querying existing pods={"talos-192-168-32-2": PodContext { node_name: None, namespace: None, action: Add }}
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - find configuration for "aeotec-gen-5"
[2021-09-02T18:00:44Z TRACE akri_shared::akri::configuration] find_configuration enter
[2021-09-02T18:00:44Z TRACE akri_shared::akri::configuration] find_configuration kube_client.request::<KubeAkriConfig>(akri_config_type.get(...)?).await?
[2021-09-02T18:00:44Z TRACE akri_shared::akri::configuration] find_configuration return
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - found configuration for "aeotec-gen-5"
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_addition_work - Create new Pod for Node="talos-192-168-32-2"
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] create_new_pod_from_spec enter
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] create_new_pod_from_spec return
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_addition_work - New pod spec=Pod { metadata: Some(ObjectMeta { annotations: None, cluster_name: None, creation_timestamp: None, deletion_grace_period_seconds: None, deletion_timestamp: None, finalizers: None, generate_name: None, generation: None, labels: Some({"akri.sh/configuration": "aeotec-gen-5", "akri.sh/instance": "aeotec-gen-5-79e1a1", "akri.sh/target-node": "talos-192-168-32-2", "app": "aeotec-gen-5-79e1a1-pod", "controller": "akri.sh"}), managed_fields: None, name: Some("aeotec-gen-5-79e1a1-pod"), namespace: Some("host-system"), owner_references: Some([OwnerReference { api_version: "akri.sh/v0", block_owner_deletion: Some(true), controller: Some(true), kind: "Instance", name: "aeotec-gen-5-79e1a1", uid: "e2ef5ff0-20a1-447a-8cc6-73244dc20a35" }]), resource_version: None, self_link: None, uid: None }), spec: Some(PodSpec { active_deadline_seconds: None, affinity: Some(Affinity { node_affinity: Some(NodeAffinity { preferred_during_scheduling_ignored_during_execution: None, required_during_scheduling_ignored_during_execution: Some(NodeSelector { node_selector_terms: [NodeSelectorTerm { match_expressions: None, match_fields: Some([NodeSelectorRequirement { key: "metadata.name", operator: "In", values: Some(["talos-192-168-32-2"]) }]) }] }) }), pod_affinity: None, pod_anti_affinity: None }), automount_service_account_token: None, containers: [Container { args: None, command: None, env: Some([EnvVar { name: "DEVICE_OPTIONS", value: Some("9600"), value_from: None }]), env_from: None, image: Some("ammmze/akri-ser2net-broker:1.0.4"), image_pull_policy: Some("IfNotPresent"), lifecycle: None, liveness_probe: None, name: "aeotec-gen-5-broker", ports: Some([ContainerPort { container_port: 2000, host_ip: None, host_port: None, name: Some("device"), protocol: None }]), readiness_probe: None, resources: Some(ResourceRequirements { limits: Some({"akri.sh/aeotec-gen-5-79e1a1": Quantity("1"), "cpu": Quantity("100m"), "memory": Quantity("200Mi")}), requests: Some({"akri.sh/aeotec-gen-5-79e1a1": Quantity("1"), "cpu": Quantity("10m"), "memory": Quantity("50Mi")}) }), security_context: Some(SecurityContext { allow_privilege_escalation: None, capabilities: None, privileged: Some(true), proc_mount: None, read_only_root_filesystem: None, run_as_group: None, run_as_non_root: None, run_as_user: None, se_linux_options: None, windows_options: None }), startup_probe: None, stdin: None, stdin_once: None, termination_message_path: None, termination_message_policy: None, tty: None, volume_devices: None, volume_mounts: None, working_dir: None }], dns_config: None, dns_policy: None, enable_service_links: None, ephemeral_containers: None, host_aliases: None, host_ipc: None, host_network: None, host_pid: None, hostname: None, image_pull_secrets: None, init_containers: None, node_name: None, node_selector: None, overhead: None, preemption_policy: None, priority: None, priority_class_name: None, readiness_gates: None, restart_policy: None, runtime_class_name: None, scheduler_name: None, security_context: None, service_account: None, service_account_name: None, share_process_namespace: None, subdomain: None, termination_grace_period_seconds: None, tolerations: None, topology_spread_constraints: None, volumes: None }), status: None }
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] create_pod enter
[2021-09-02T18:00:44Z INFO akri_shared::k8s::pod] create_pod pods.create(...).await?:
[2021-09-02T18:00:44Z INFO akri_shared::k8s::pod] create_pod pods.create return: "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_addition_work - pod::create_pod succeeded
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_addition_work - POST nodeInfo.SetNode
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - exit
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Added event]
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Pending"
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Pending"
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] internal_do_instance_watch - aquired sync lock
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance - enter
[2021-09-02T18:00:44Z INFO controller::util::instance_action] handle_instance - modified Akri Instance aeotec-gen-5-79e1a1: Instance { configuration_name: "aeotec-gen-5", broker_properties: {"UDEV_DEVNODE": "/dev/ttyACM1"}, shared: false, nodes: ["talos-192-168-32-2"], device_usage: {"aeotec-gen-5-79e1a1-0": "talos-192-168-32-2"} }
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - enter Update
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked from instance={"talos-192-168-32-2": PodContext { node_name: None, namespace: None, action: Add }}
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - find all pods that have akri.sh/instance=aeotec-gen-5-79e1a1
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] find_pods_with_selector with label_selector=Some("akri.sh/instance=aeotec-gen-5-79e1a1") field_selector=None
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] find_pods_with_selector PRE pods.list(...).await?
[2021-09-02T18:00:44Z TRACE akri_shared::k8s::pod] find_pods_with_selector return
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - found 1 pods
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - update actions based on the existing pods
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] select_pod_action phase="Pending" action=Update unknown_node=false
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] choice_for_pod_action phase="Pending" action=Update unknown_node=false trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] choice_for_pods_on_known_nodes phase="Pending" action=Update trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] choice_for_non_running_pods action=Update trace_node_name="aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] time_choice_for_non_running_pods
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] time_choice_for_non_running_pods - no start time found ... give it more time? ("aeotec-gen-5-79e1a1-pod")
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] time_choice_for_non_running_pods - give_it_more_time: (true)
[2021-09-02T18:00:44Z TRACE controller::util::pod_action] time_choice_for_non_running_pods - Pending Pod (tracked) ... PodAction::NoAction ("aeotec-gen-5-79e1a1-pod")
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - nodes tracked after querying existing pods={"talos-192-168-32-2": PodContext { node_name: Some("talos-192-168-32-2"), namespace: Some("host-system"), action: NoAction }}
[2021-09-02T18:00:44Z TRACE controller::util::instance_action] handle_instance_change - exit
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:44Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Pending"
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_pod - enter [event: Modified event]
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_pod - pod name "aeotec-gen-5-79e1a1-pod"
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_pod - pod phase "Running"
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - enter
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - last_known_state: Pending
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_running_pod_if_needed - call handle_running_pod
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] handle_running_pod - enter
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] get_instance_and_configuration_from_pod - enter
[2021-09-02T18:00:45Z TRACE akri_shared::akri::configuration] find_configuration enter
[2021-09-02T18:00:45Z TRACE akri_shared::akri::configuration] find_configuration kube_client.request::<KubeAkriConfig>(akri_config_type.get(...)?).await?
[2021-09-02T18:00:45Z TRACE akri_shared::akri::configuration] find_configuration return
[2021-09-02T18:00:45Z TRACE akri_shared::akri::instance] find_instance enter
[2021-09-02T18:00:45Z TRACE akri_shared::akri::instance] find_instance kube_client.request::<KubeAkriInstance>(akri_instance_type.get(...)?).await?
[2021-09-02T18:00:45Z TRACE akri_shared::akri::instance] find_instance return
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] add_instance_and_configuration_services - instance="aeotec-gen-5-79e1a1"
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] create_or_update_service - instance="aeotec-gen-5-79e1a1" with ownership:OwnershipInfo { object_type: Configuration, object_uid: "2fcf85d6-d59c-4236-8424-5b0f3788c38e", object_name: "aeotec-gen-5" }
[2021-09-02T18:00:45Z TRACE akri_shared::k8s::service] find_services_with_selector with selector="akri.sh/configuration=aeotec-gen-5"
[2021-09-02T18:00:45Z TRACE akri_shared::k8s::service] find_services_with_selector PRE svcs.list(...).await?
[2021-09-02T18:00:45Z TRACE akri_shared::k8s::service] find_services_with_selector return
[2021-09-02T18:00:45Z TRACE controller::util::pod_watcher] create_or_update_service - New instance svc spec=Service { metadata: Some(ObjectMeta { annotations: None, cluster_name: None, creation_timestamp: None, deletion_grace_period_seconds: None, deletion_timestamp: None, finalizers: None, generate_name: None, generation: None, labels: Some({"akri.sh/configuration": "aeotec-gen-5", "app": "aeotec-gen-5-svc", "controller": "akri.sh"}), managed_fields: None, name: Some("aeotec-gen-5-svc"), namespace: Some("host-system"), owner_references: Some([OwnerReference { api_version: "akri.sh/v0", block_owner_deletion: Some(true), controller: Some(true), kind: "Configuration", name: "aeotec-gen-5", uid: "2fcf85d6-d59c-4236-8424-5b0f3788c38e" }]), resource_version: None, self_link: None, uid: None }), spec: Some(ServiceSpec { cluster_ip: None, external_ips: None, external_name: None, external_traffic_policy: None, health_check_node_port: None, ip_family: None, load_balancer_ip: None, load_balancer_source_ranges: None, ports: Some([ServicePort { name: Some("device"), node_port: None, port: 2000, protocol: Some("TCP"), target_port: Some(String("device")) }]), publish_not_ready_addresses: None, selector: Some({"akri.sh/configuration": "aeotec-gen-5", "controller": "akri.sh"}), session_affinity: None, session_affinity_config: None, type_: Some("ClusterIP") }), status: None }
[2021-09-02T18:00:45Z TRACE akri_shared::k8s::service] create_service enter
[2021-09-02T18:00:45Z INFO akri_shared::k8s::service] create_service svcs.create(...).await?:
[2021-09-02T18:00:45Z INFO akri_shared::k8s::service] create_service services.create return: "aeotec-gen-5-svc"
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
Done