
Error: "no workflow definition in file" using standard installation and Hello World workflow #5380

Closed
wdma opened this issue Mar 12, 2021 · 4 comments
wdma commented Mar 12, 2021

Summary

I had been successfully using the Argo Quickstart manifest and am now trying to install the secure manifest (https://raw.githubusercontent.com/argoproj/argo-workflows/master/manifests/namespace-install.yaml), but I am getting
the error "no workflow definition in file" when running the Hello World example (https://argoproj.github.io/argo-workflows/examples/#hello-world). I think I may be running into the issue that was fixed in #1487(?).

Diagnostics

Output of kubectl describe for the argo-server pod:

Name:         argo-server-644995c544-nfkw4
Namespace:    argo
Priority:     0
Node:         artemis/192.241.129.100
Start Time:   Fri, 12 Mar 2021 16:33:10 +0000
Labels:       app=argo-server
              pod-template-hash=644995c544
Annotations:  cni.projectcalico.org/podIP: 192.168.27.17/32
              cni.projectcalico.org/podIPs: 192.168.27.17/32
Status:       Running
IP:           192.168.27.17
IPs:
  IP:           192.168.27.17
Controlled By:  ReplicaSet/argo-server-644995c544
Containers:
  argo-server:
    Container ID:  docker://22d8fbdd528b743ceaa5b498317de7b1ac1cac56516419e56f7319c1b1e7e112
    Image:         argoproj/argocli:latest
    Image ID:      docker-pullable://argoproj/argocli@sha256:c1fe2ebe6e452f57bb4d0121a7564531ddac9f65c5076998e62db614064961bb
    Port:          2746/TCP
    Host Port:     0/TCP
    Args:
      server
      --namespaced
    State:          Running
      Started:      Fri, 12 Mar 2021 16:33:12 +0000
    Ready:          True
    Restart Count:  0
    Readiness:      http-get https://:2746/ delay=10s timeout=1s period=20s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from argo-server-token-tv8f9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  argo-server-token-tv8f9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  argo-server-token-tv8f9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>

Output of kubectl describe for the workflow-controller pod:

Name:         workflow-controller-75797fd868-vvb9h
Namespace:    argo
Priority:     0
Node:         ack2/143.110.149.114
Start Time:   Fri, 12 Mar 2021 16:33:10 +0000
Labels:       app=workflow-controller
              pod-template-hash=75797fd868
Annotations:  cni.projectcalico.org/podIP: 192.168.50.135/32
              cni.projectcalico.org/podIPs: 192.168.50.135/32
Status:       Running
IP:           192.168.50.135
IPs:
  IP:           192.168.50.135
Controlled By:  ReplicaSet/workflow-controller-75797fd868
Containers:
  workflow-controller:
    Container ID:  docker://6996811713d50d7bb6486a7dbb00bfe71f4bc48eed79c6dba90e5dc244703e4d
    Image:         argoproj/workflow-controller:latest
    Image ID:      docker-pullable://argoproj/workflow-controller@sha256:0964c31de02feb3da67371d3e55b3c7b19232ec813f5c5637fe0a4ce8bfd0eb1
    Port:          9090/TCP
    Host Port:     0/TCP
    Command:
      workflow-controller
    Args:
      --configmap
      workflow-controller-configmap
      --executor-image
      argoproj/argoexec:latest
      --namespaced
    State:          Running
      Started:      Fri, 12 Mar 2021 17:33:53 +0000
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 12 Mar 2021 16:33:14 +0000
      Finished:     Fri, 12 Mar 2021 17:33:48 +0000
    Ready:          True
    Restart Count:  1
    Liveness:       http-get http://:metrics/metrics delay=30s timeout=1s period=30s #success=1 #failure=3
    Environment:
      LEADER_ELECTION_IDENTITY:  workflow-controller-75797fd868-vvb9h (v1:metadata.name)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from argo-token-dhh2t (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  argo-token-dhh2t:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  argo-token-dhh2t
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason   Age                 From     Message
  ----    ------   ----                ----     -------
  Normal  Created  48m (x2 over 109m)  kubelet  Created container workflow-controller
  Normal  Started  48m (x2 over 109m)  kubelet  Started container workflow-controller
  Normal  Pulled   48m                 kubelet  Successfully pulled image "argoproj/workflow-controller:latest" in 3.746340422s

What Kubernetes provider are you using?

I am using a tested bare-metal installation of Kubernetes with Calico. Here are the pods:

NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE    IP                NODE      NOMINATED NODE   READINESS GATES
argo          argo-server-644995c544-nfkw4           1/1     Running   0          112m   192.168.27.17     artemis   <none>           <none>
argo          workflow-controller-75797fd868-vvb9h   1/1     Running   1          112m   192.168.50.135    ack2      <none>           <none>
kube-system   calico-node-8fpq8                      1/1     Running   0          46h    143.198.57.235    ack1      <none>           <none>
kube-system   calico-node-gq25c                      1/1     Running   0          46h    143.110.149.114   ack2      <none>           <none>
kube-system   calico-node-ttmd6                      1/1     Running   0          46h    192.241.129.100   artemis   <none>           <none>
kube-system   calico-typha-ffff99464-gtb9s           1/1     Running   0          46h    143.198.57.235    ack1      <none>           <none>
kube-system   calico-typha-ffff99464-n5gft           1/1     Running   0          46h    143.110.149.114   ack2      <none>           <none>
kube-system   calico-typha-ffff99464-tghpd           1/1     Running   0          46h    192.241.129.100   artemis   <none>           <none>
kube-system   coredns-74ff55c5b-4qdrx                1/1     Running   2          46h    192.168.27.1      artemis   <none>           <none>
kube-system   coredns-74ff55c5b-c54c5                1/1     Running   2          46h    192.168.27.0      artemis   <none>           <none>
kube-system   etcd-artemis                           1/1     Running   0          46h    192.241.129.100   artemis   <none>           <none>
kube-system   kube-apiserver-artemis                 1/1     Running   0          46h    192.241.129.100   artemis   <none>           <none>
kube-system   kube-controller-manager-artemis        1/1     Running   2          46h    192.241.129.100   artemis   <none>           <none>
kube-system   kube-proxy-94lxq                       1/1     Running   0          46h    143.110.149.114   ack2      <none>           <none>
kube-system   kube-proxy-jl4xk                       1/1     Running   0          46h    143.198.57.235    ack1      <none>           <none>
kube-system   kube-proxy-pz8tw                       1/1     Running   0          46h    192.241.129.100   artemis   <none>           <none>
kube-system   kube-scheduler-artemis                 1/1     Running   1          46h    192.241.129.100   artemis   <none>           <none>

Here are the nodes:

![image](https://user-images.githubusercontent.com/8416524/110982694-d7631680-8336-11eb-94e0-860814f339ef.png)

What version of Argo Workflows are you running?

https://raw.githubusercontent.com/argoproj/argo-workflows/master/manifests/namespace-install.yaml
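For context, applying that manifest looks roughly like this (a sketch; it assumes kubectl is pointed at the cluster and the argo namespace is created first):

```shell
# Sketch of the namespaced install steps (assumes the 'argo' namespace does not exist yet)
kubectl create namespace argo
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo-workflows/master/manifests/namespace-install.yaml
```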

Paste a workflow that reproduces the bug, including status:

https://argoproj.github.io/argo-workflows/examples/#hello-world
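For convenience, the Hello World example from that page is essentially this manifest (copied from the docs; the image tag may have changed since):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-   # the controller appends a random suffix
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]
```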

Paste the logs from the workflow controller:
kubectl logs -n argo $(kubectl get pods -l app=workflow-controller -n argo -o name) | grep ${workflow}


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@wdma wdma added the type/bug label Mar 12, 2021
wdma commented Mar 12, 2021

The node status did not paste correctly. Here it is again.

Calico process is running.

IPv4 BGP status
+-----------------+-------------------+-------+------------+-------------+
|  PEER ADDRESS   |     PEER TYPE     | STATE |   SINCE    |    INFO     |
+-----------------+-------------------+-------+------------+-------------+
| 143.198.57.235  | node-to-node mesh | up    | 2021-03-10 | Established |
| 143.110.149.114 | node-to-node mesh | up    | 2021-03-10 | Established |
+-----------------+-------------------+-------+------------+-------------+

IPv6 BGP status
No IPv6 peers found.

terrytangyuan (Member) commented:

How did you submit? argo submit examples/hello-world.yaml -n argo --serviceaccount argo worked fine for me.

wdma commented Mar 15, 2021

The command I used is

argo submit -n argo --watch whalesay.yaml --serviceaccount argo

The error I am getting is:

FATA[2021-03-15T16:00:44.392Z] unknown (get workflows.argoproj.io)

Note that this is a slightly different error than the one originally reported: in the first case I was using the namespace install manifest; in the second, the standard install manifest.


the result of

 kubectl logs -n argo  whalesay-pb57m-2963895809

is

error: a container name must be specified for pod whalesay-pb57m-2963895809, choose one of: [wait main]
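Since the pod runs two containers (wait and main), the logs command needs a container name; the main container's output can presumably be fetched with:

```shell
# -c selects the container within the multi-container workflow pod
kubectl logs -n argo whalesay-pb57m-2963895809 -c main
```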

UPDATE: I am able to make my workflow run by using the kubectl administrator account. Question: is there a way to use a restricted RBAC role, or is that the function of the service account? Perhaps this is a good addition to the docs?

Since my workflows are working again, I am closing this thread. Thank you!

@wdma wdma closed this as completed Mar 15, 2021
alexec commented Mar 15, 2021

This is the role you are looking for:

https://github.com/argoproj/argo-workflows/blob/master/manifests/quick-start/base/workflow-role.yaml
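For anyone landing here later, a minimal sketch of such a role and binding (hypothetical resource names; the linked workflow-role.yaml is the authoritative version, and the exact verbs there may differ):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-submit-role      # hypothetical name
  namespace: argo
rules:
- apiGroups: ["argoproj.io"]
  resources: ["workflows"]
  verbs: ["get", "list", "watch", "create"]   # enough for 'argo submit --watch'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: workflow-submit-binding   # hypothetical name
  namespace: argo
subjects:
- kind: ServiceAccount
  name: argo                      # the service account passed to --serviceaccount
  namespace: argo
roleRef:
  kind: Role
  name: workflow-submit-role
  apiGroup: rbac.authorization.k8s.io
```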
