studyjob-controller start failed #546
Comments
The same problem happened in my environment. Is there a workaround or a fix? Thanks.
I have tried deploying Kubeflow Pipelines on my local k8s cluster several times. The error appears occasionally and does not block basic functions.
I ran into the same problem a few days ago and solved it by replacing the studyjob-controller deployment's image with
OK, thanks!
@hackenzheng Hi, I'm new to Kubeflow, and I also ran into the same problem.
Hi @xiaozhouX,
Why not retag the Docker image? @zoux86 @KaranKhirsariya
Hi @KaranKhirsariya and @zoux86, to replace the studyjob-controller's image, click on studyjob-controller in the Workloads section of the Kubernetes part of GCP. On the studyjob-controller page, click the YAML tab near the top to open the YAML file, then click the edit button at the top. Change the value of image to katib/studyjob-controller@sha256:870c260af5caa8823f9a64fa126a4ddb6ffd3e911417fe73aa924c3ee144ad8e. Please let me know if this helps.
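For anyone not using the GCP console, the same image swap can probably be done with a patch from the command line. Below is a minimal sketch, not a verified fix: it only builds a strategic merge patch and prints a `kubectl patch` command to run. The namespace `kubeflow` and the container name `studyjob-controller` are assumptions, so check them against your own deployment first.

```python
import json
import shlex

# Image digest quoted in the comment above.
IMAGE = (
    "katib/studyjob-controller@sha256:"
    "870c260af5caa8823f9a64fa126a4ddb6ffd3e911417fe73aa924c3ee144ad8e"
)

# Assumptions: adjust these to match your cluster.
NAMESPACE = "kubeflow"
DEPLOYMENT = "studyjob-controller"
CONTAINER = "studyjob-controller"


def build_image_patch(container_name, image):
    """Return a strategic merge patch that sets the image of one container."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": container_name, "image": image}]
                }
            }
        }
    }


patch = build_image_patch(CONTAINER, IMAGE)
patch_json = json.dumps(patch)

# Build the command as an argument list, then quote it for a POSIX shell.
command = shlex.join(
    ["kubectl", "patch", "deployment", DEPLOYMENT,
     "-n", NAMESPACE, "--patch", patch_json]
)
print(command)
```

The patch matches the container by name, so a wrong container name would add a second container instead of updating the existing one. Confirm the name in the deployment's YAML before running the printed command.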
All of the other pods run successfully, but the studyjob-controller pod keeps restarting. The logs show "no matches for kind "TFJob" in version "kubeflow.org/v1beta1"". I have confirmed that tf-operator was installed successfully; it can start training using a TFJob.
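The error message suggests the cluster may not be serving the `v1beta1` version of the TFJob resource, even though tf-operator itself works. One way to check is to look at which versions the TFJob CRD declares. The sketch below reads a CRD object as a dict, for example the JSON from `kubectl get crd tfjobs.kubeflow.org -o json` (the CRD name is an assumption here), and lists the served versions. The sample data is hypothetical and only illustrates the shape of the object.

```python
def served_versions(crd):
    """Return the API versions a CustomResourceDefinition object serves.

    Handles both the `spec.versions` list and the older single
    `spec.version` field.
    """
    spec = crd.get("spec", {})
    versions = [
        v["name"] for v in spec.get("versions", []) if v.get("served", True)
    ]
    if not versions and "version" in spec:
        versions = [spec["version"]]
    return versions


# Hypothetical sample, NOT taken from a real cluster.
sample_crd = {
    "metadata": {"name": "tfjobs.kubeflow.org"},
    "spec": {
        "group": "kubeflow.org",
        "names": {"kind": "TFJob"},
        "versions": [{"name": "v1alpha2", "served": True, "storage": True}],
    },
}

wanted = "v1beta1"
found = served_versions(sample_crd)
if wanted in found:
    print(f"{wanted} is served")
else:
    print(f"{wanted} is not served; served versions: {found}")
```

If `v1beta1` is missing from the list, the controller and the installed tf-operator likely expect different TFJob versions, which would be consistent with the image replacement suggested in the comments.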