Load samples returns an error in 0.4 Kubeflow #603

Closed
jlewi opened this issue Dec 30, 2018 · 14 comments

Comments

@jlewi
Contributor

jlewi commented Dec 30, 2018

I deployed Kubeflow from the 0.4 release branch.

The ml pipelines container is in an error state. The pod logs show:

F1228 19:56:00.989700       1 client_manager.go:162] Failed to create a foreign key for RunID in run_metrics table. Error: Error 1022: Can't write; duplicate key in table '#sql-1_4'

(Error 1022: Can't write; duplicate key in table '#sql-1_4') 
[2018-12-28 19:56:00]  
goroutine 1 [running]:
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.stacks(0xc00041c100, 0xc00087e000, 0xb6, 0x1b6)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:769 +0xd4
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.(*loggingT).output(0x263bee0, 0xc000000003, 0xc00060c840, 0x2397334, 0x11, 0xa2, 0x0)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:720 +0x329
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.(*loggingT).printf(0x263bee0, 0x3, 0x163355c, 0x48, 0xc0006c1bd0, 0x1, 0x1)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:655 +0x14b
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.Fatalf(0x163355c, 0x48, 0xc0006c1bd0, 0x1, 0x1)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:1148 +0x67
main.initDBClient(0x53d1ac1000, 0x15)
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/client_manager.go:162 +0x455
main.(*ClientManager).init(0xc000913d50)
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/client_manager.go:103 +0x80
main.newClientManager(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/client_manager.go:242 +0x90
main.main()
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/main.go:53 +0x88

Related to: kubeflow/kubeflow#2098 0.4 release

/cc @IronPan

@IronPan
Member

IronPan commented Dec 31, 2018

Thanks Jeremy, how did you do the deployment? Did you deploy to a clean cluster or upgrade an existing cluster?

@TimZaman
Contributor

TimZaman commented Jan 2, 2019

Same here!

  • Deployed following docs with CLI (not GUI)
  • On GKE
  • Fresh project, fresh cluster
  • export KUBEFLOW_TAG=v0.4.0-rc.2

tzaman@Tims-MacBook-Pro kf1337 $ kubectl logs ml-pipelines-load-samples-gwmhn
F0102 19:31:18.788858       1 client_manager.go:162] Failed to create a foreign key for RunID in run_metrics table. Error: Error 1022: Can't write; duplicate key in table '#sql-1_4'

(Error 1022: Can't write; duplicate key in table '#sql-1_4') 
[2019-01-02 19:31:18]  
goroutine 1 [running]:
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.stacks(0xc000018100, 0xc00089e000, 0xb6, 0x1b6)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:769 +0xd4
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.(*loggingT).output(0x263bee0, 0xc000000003, 0xc000628790, 0x2397334, 0x11, 0xa2, 0x0)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:720 +0x329
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.(*loggingT).printf(0x263bee0, 0x3, 0x163355c, 0x48, 0xc0006ddbd0, 0x1, 0x1)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:655 +0x14b
github.com/kubeflow/pipelines/vendor/github.com/golang/glog.Fatalf(0x163355c, 0x48, 0xc0006ddbd0, 0x1, 0x1)
	/go/src/github.com/kubeflow/pipelines/vendor/github.com/golang/glog/glog.go:1148 +0x67
main.initDBClient(0x53d1ac1000, 0x15)
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/client_manager.go:162 +0x455
main.(*ClientManager).init(0xc000933d50)
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/client_manager.go:103 +0x80
main.newClientManager(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/client_manager.go:242 +0x90
main.main()
	/go/src/github.com/kubeflow/pipelines/backend/src/apiserver/main.go:53 +0x88

@jlewi
Contributor Author

jlewi commented Jan 2, 2019

@IronPan I deployed to a new cluster.

Has pipelines been upgraded to 0.1.5 on the release branch? Do we need to cherry-pick the fix for #549?

@IronPan
Member

IronPan commented Jan 2, 2019

@hongye-sun On the surface, the error message looks related to the metrics table. Do you have any idea what causes this failure?

Meanwhile, I'll create a cluster too and dig into the issue.

@IronPan
Member

IronPan commented Jan 2, 2019

It seems the issue is caused by the DB being initialized twice: once by the k8s job that loads samples and once by the API server at startup. We can remove the DB initialization from the sample-loading job.
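
A minimal sketch of this hypothesis, assuming each component creates the foreign key unconditionally when it starts. The names `fakeSchema`, `initDB`, and `errDuplicateKey` are hypothetical stand-ins for illustration; this is not the pipelines code and it does not talk to MySQL.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDuplicateKey is a hypothetical stand-in for MySQL error 1022 from the logs.
var errDuplicateKey = errors.New("duplicate key (stand-in for MySQL error 1022)")

// fakeSchema is an in-memory stand-in for the database schema.
type fakeSchema struct {
	mu          sync.Mutex
	foreignKeys map[string]bool
}

func newFakeSchema() *fakeSchema {
	return &fakeSchema{foreignKeys: map[string]bool{}}
}

// addForeignKey rejects a key that already exists, as the database would.
func (s *fakeSchema) addForeignKey(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.foreignKeys[name] {
		return errDuplicateKey
	}
	s.foreignKeys[name] = true
	return nil
}

// initDB mimics a component that always tries to create the key at startup.
func initDB(s *fakeSchema) error {
	return s.addForeignKey("run_metrics_RunID")
}

func main() {
	s := newFakeSchema()
	// Whichever component initializes first succeeds.
	fmt.Println("first init: ", initDB(s))
	// The second initializer hits the duplicate-key error.
	fmt.Println("second init:", initDB(s))
}
```

Under this assumption, leaving initialization to a single component (the API server) means the failing second call never happens.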

@IronPan
Member

IronPan commented Jan 2, 2019

I'll send a PR shortly

@TimZaman
Contributor

TimZaman commented Jan 2, 2019 via email

@IronPan
Member

IronPan commented Jan 2, 2019

Interestingly, I wasn't able to reproduce it on a GKE cluster. Does it fail every time?

@IronPan
Member

IronPan commented Jan 2, 2019

The source code shows that it handles the case gracefully if the foreign key already exists.
https://github.com/jinzhu/gorm/blob/master/scope.go#L1220

That aligns with what I observed. Is there a cluster you can share with me so I can investigate? @jlewi
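
One possible reading, not confirmed in this thread, is that an "already exists" guard only helps when the two initializers run one after the other. The sketch below assumes the guard is a check followed by a create, and replays an ordering where both callers check before either creates. All names (`schema`, `ensureForeignKey`, `interleaved`, `errDupKey`) are hypothetical and the schema is an in-memory stand-in, not gorm or MySQL.

```go
package main

import (
	"errors"
	"fmt"
)

// errDupKey is a hypothetical stand-in for MySQL error 1022.
var errDupKey = errors.New("duplicate key (stand-in for MySQL error 1022)")

// schema is an in-memory stand-in for the database.
type schema struct{ keys map[string]bool }

func (s *schema) hasForeignKey(name string) bool { return s.keys[name] }

func (s *schema) createForeignKey(name string) error {
	if s.keys[name] {
		return errDupKey
	}
	s.keys[name] = true
	return nil
}

// ensureForeignKey is a check-then-create guard: safe when calls do not overlap.
func ensureForeignKey(s *schema, name string) error {
	if s.hasForeignKey(name) {
		return nil
	}
	return s.createForeignKey(name)
}

// interleaved replays the unlucky ordering deterministically:
// callers A and B both run the check before either runs the create.
func interleaved(s *schema, name string) (errA, errB error) {
	aSeesKey := s.hasForeignKey(name)
	bSeesKey := s.hasForeignKey(name)
	if !aSeesKey {
		errA = s.createForeignKey(name)
	}
	if !bSeesKey {
		errB = s.createForeignKey(name)
	}
	return errA, errB
}

func main() {
	seq := &schema{keys: map[string]bool{}}
	first := ensureForeignKey(seq, "fk")
	second := ensureForeignKey(seq, "fk")
	fmt.Println("sequential: ", first, second)

	race := &schema{keys: map[string]bool{}}
	errA, errB := interleaved(race, "fk")
	fmt.Println("interleaved:", errA, errB)
}
```

If something like this were the cause, it would be consistent with the failure being timing-dependent and hard to reproduce on another cluster.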

@TimZaman
Contributor

TimZaman commented Jan 4, 2019

I am now on v0.4.0-rc.3 and the issue has gone away (0 restarts).

@IronPan
Member

IronPan commented Jan 10, 2019

Nice, I'll close this one.

@IronPan
Member

IronPan commented Jan 10, 2019

/close

@k8s-ci-robot
Contributor

@IronPan: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

