This repository was archived by the owner on Nov 1, 2022. It is now read-only.

fluxctl cannot communicate with fluxd after pod rescheduling due to ambiguous pod name #2276

Closed
@zoni

Describe the bug

When the node running the flux container fails, Kubernetes will eventually reschedule the pod onto a different node.

Until the failed node comes back online and the old flux pod is cleaned up, there will be two pods, one with state Running and one with state Terminating:

➔  kubectl get pods -l name=flux 
NAME                   READY   STATUS        RESTARTS   AGE
flux-66966f499-hrplt   1/1     Terminating   0          143m
flux-66966f499-rpql2   1/1     Running       0          13m

While this situation exists, fluxctl cannot be used due to an ambiguous pod specification:

➔  fluxctl list-workloads                                  
Error: Could not create a dialer: Could not get pod name: Ambiguous pod: found more than one pod for selector: labels "name in (flux,fluxd,weave-flux-agent)"
Run 'fluxctl list-workloads --help' for usage.
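A possible stop-gap while both pods exist is to pick the Running pod by hand and point fluxctl at it through a port-forward. The sketch below only covers the selection step: `running_pods` is a hypothetical helper (not part of fluxctl or kubectl) that filters the table output shown above on its STATUS column. The commented usage assumes fluxd serves its API on port 3030 under /api/flux and that this fluxctl version honours the FLUX_URL environment variable; neither is confirmed in this report.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints the NAME of every pod whose STATUS column is exactly "Running".
# The header row and Terminating pods are skipped because their third
# column does not match.
running_pods() {
  awk '$3 == "Running" { print $1 }'
}

# Usage sketch (assumptions: port 3030, /api/flux path, FLUX_URL support):
#   pod=$(kubectl get pods -l name=flux | running_pods | head -n 1)
#   kubectl port-forward "$pod" 3030:3030 &
#   FLUX_URL=http://127.0.0.1:3030/api/flux fluxctl list-workloads
```

Filtering on the displayed STATUS column rather than on the pod phase is deliberate: a pod stuck on a failed node is listed as Terminating because it has been marked for deletion, even though its phase may still be reported as Running.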

To Reproduce

  1. Set up a two-node Kubernetes cluster with flux installed following the "Getting started" documentation.
  2. Pause, stop, or terminate the node on which the flux daemon is scheduled. kubectl get nodes should then look something like:
NAME                       STATUS     ROLES   AGE    VERSION
aks-agentpool-57623730-0   NotReady   agent   4h8m   v1.13.7
aks-agentpool-57623730-1   Ready      agent   4h8m   v1.13.7
  3. Wait for Kubernetes to reschedule the flux pod onto the other node, at which point two pods should show up in kubectl get pods -l name=flux:
➔  kubectl get pods -l name=flux
NAME                   READY   STATUS        RESTARTS   AGE
flux-66966f499-hrplt   1/1     Terminating   0          143m
flux-66966f499-rpql2   1/1     Running       0          13m
  4. Invoking fluxctl commands, for example fluxctl list-workloads, will now return an error.

Expected behavior

fluxctl should target the Running pod and continue to work as usual.

Additional context

  • Flux version: 1.13.2
  • Kubernetes version: 1.13.7
