Error running pipeline: cannot create tfjobs.kubeflow.org 403 #294

Closed
@lluunn

Description

Steps:

  1. Followed the wiki to deploy pipeline.
  2. Took the code from @amygdala's blog and ran it in a Jupyter notebook to build the tarball.
  3. Uploaded the tarball; the graph looks correct:

[Screenshot of the pipeline graph, 2018-11-15 14:39:46]

  4. Ran the pipeline and got this error:
INFO:root:Getting credentials for GKE cluster kfp1.
Fetching cluster endpoint and auth data.
kubeconfig entry generated for kfp1.
INFO:root:Generating training template.
INFO:root:Start training.
ERROR:root:Exception when calling DefaultApi->apis_fqdn_v1_namespaces_namespace_resource_post: tfjobs.kubeflow.org is forbidden: User "system:serviceaccount:kubeflow:pipeline-runner" cannot create tfjobs.kubeflow.org in the namespace "kubeflow"
Traceback (most recent call last):
  File "/ml/train.py", line 230, in <module>
    main()
  File "/ml/train.py", line 188, in main
    create_response = tf_job_client.create_tf_job(api_client, content_yaml, version=kf_version)
  File "/tf-operator/py/tf_job_client.py", line 56, in create_tf_job
    raise e
kubernetes.client.rest.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Date': 'Thu, 15 Nov 2018 22:32:05 GMT', 'Audit-Id': '316cc166-62f2-43aa-926e-8014f66f4b2e', 'Content-Length': '318', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"tfjobs.kubeflow.org is forbidden: User \"system:serviceaccount:kubeflow:pipeline-runner\" cannot create tfjobs.kubeflow.org in the namespace \"kubeflow\"","reason":"Forbidden","details":{"group":"kubeflow.org","kind":"tfjobs"},"code":403
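
The 403 above names everything RBAC is objecting to: the subject (`system:serviceaccount:kubeflow:pipeline-runner`), the verb (`create`), the resource (`tfjobs` in API group `kubeflow.org`), and the namespace (`kubeflow`). As a rough sketch, and not a fix confirmed anywhere in this issue, a Role and RoleBinding that would grant that permission could be generated as below. The helper `build_rbac_manifests` and the role name it produces are hypothetical, and the verbs other than `create` are an assumption (the training step probably also needs to read the job back).

```python
import json


def build_rbac_manifests(service_account, namespace, group, resource, verbs):
    """Build a namespaced Role and RoleBinding as plain dicts.

    The role name is made up for illustration; pick whatever fits
    the deployment's naming scheme.
    """
    role_name = f"{service_account}-{resource}"  # hypothetical name
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [group],
                "resources": [resource],
                "verbs": list(verbs),
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return role, binding


# Values taken from the error message; only "create" is known to be
# required, the remaining verbs are an assumption.
role, binding = build_rbac_manifests(
    service_account="pipeline-runner",
    namespace="kubeflow",
    group="kubeflow.org",
    resource="tfjobs",
    verbs=["create", "get", "list", "watch"],
)

print(json.dumps(role, indent=2))
print(json.dumps(binding, indent=2))
```

Each printed JSON object can be saved to its own file and applied with `kubectl apply -f <file>`, which requires an account that is itself allowed to create roles in the `kubeflow` namespace.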

[Screenshot, 2018-11-15 14:40:58]
