Deleting the first pipeline version of a pipeline and attempting a pipeline run crashes ml-pipeline #4389
ekesken added a commit to ekesken/pipelines that referenced this issue on Aug 31, 2020:

Fixes kubeflow#4389 (partially). When the workflow manifest file is deleted from S3 due to the retention policy, we were getting this segmentation fault on the next CreateRun attempt for that pipeline:

```
I0831 06:36:53.916141 1 interceptor.go:29] /api.RunService/CreateRun handler starting
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x148 pc=0x156e140]

goroutine 183 [running]:
github.com/kubeflow/pipelines/backend/src/common/util.(*Workflow).VerifyParameters(0xc000010610, 0xc00036b6b0, 0x0, 0xc00036b6b0)
	backend/src/common/util/workflow.go:66 +0x90
github.com/kubeflow/pipelines/backend/src/apiserver/resource.(*ResourceManager).CreateRun(0xc00088b5e0, 0xc00088b880, 0xc0009c3c50, 0xc000010450, 0x1)
	backend/src/apiserver/resource/resource_manager.go:326 +0x27c
github.com/kubeflow/pipelines/backend/src/apiserver/server.(*RunServer).CreateRun(0xc0000b8718, 0x1e7bc20, 0xc0009c3c50, 0xc0009c3c80, 0xc0000b8718, 0x2ddc6e9, 0xc00014e070)
	backend/src/apiserver/server/run_server.go:43 +0xce
github.com/kubeflow/pipelines/backend/api/go_client._RunService_CreateRun_Handler.func1(0x1e7bc20, 0xc0009c3c50, 0x1aa80e0, 0xc0009c3c80, 0xc0008cbb40, 0x1, 0x1, 0x7f9e4d6466d0)
	bazel-out/k8-opt/bin/backend/api/linux_amd64_stripped/go_client_go_proto%/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1399 +0x86
main.apiServerInterceptor(0x1e7bc20, 0xc0009c3c50, 0x1aa80e0, 0xc0009c3c80, 0xc000778ca0, 0xc000778cc0, 0xc0004dcbd0, 0x4e7bba, 0x1a98e00, 0xc0009c3c50)
	backend/src/apiserver/interceptor.go:30 +0xf8
github.com/kubeflow/pipelines/backend/api/go_client._RunService_CreateRun_Handler(0x1ac4a20, 0xc0000b8718, 0x1e7bc20, 0xc0009c3c50, 0xc0009c6e40, 0x1c6bd70, 0x1e7bc20, 0xc0009c3c50, 0xc0004321c0, 0x66)
	bazel-out/k8-opt/bin/backend/api/linux_amd64_stripped/go_client_go_proto%/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1401 +0x158
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00064eb00, 0x1ea2840, 0xc00061cd80, 0xc00046c700, 0xc00071ab70, 0x2e14040, 0x0, 0x0, 0x0)
	external/org_golang_google_grpc/server.go:995 +0x466
google.golang.org/grpc.(*Server).handleStream(0xc00064eb00, 0x1ea2840, 0xc00061cd80, 0xc00046c700, 0x0)
	external/org_golang_google_grpc/server.go:1275 +0xda6
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0004e9084, 0xc00064eb00, 0x1ea2840, 0xc00061cd80, 0xc00046c700)
	external/org_golang_google_grpc/server.go:710 +0x9f
created by google.golang.org/grpc.(*Server).serveStreams.func1
	external/org_golang_google_grpc/server.go:708 +0xa1
```

The scenario described in kubeflow#4389 also seems to cause the same issue. With this PR we at least avoid the segmentation fault, because in our case it is expected that manifest files will be deleted after some time due to the retention policy. The other problems around picking the right pipeline version described in issue kubeflow#4389 still need to be addressed.
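The panic originates in `(*Workflow).VerifyParameters`, which is reached with a nil workflow once the manifest has been deleted from storage. The sketch below is not the actual KFP code; `Workflow` here is a simplified, hypothetical stand-in used only to show how a nil-receiver guard turns the SIGSEGV into an ordinary API error:

```go
package main

import (
	"errors"
	"fmt"
)

// Workflow is a minimal stand-in for util.Workflow; the real type wraps
// an Argo workflow manifest loaded from object storage.
type Workflow struct {
	Parameters map[string]string
}

// VerifyParameters checks that every desired parameter exists in the
// workflow. In Go, calling a method on a nil pointer receiver is legal,
// so the guard below can run before any field is dereferenced.
func (w *Workflow) VerifyParameters(desired map[string]string) error {
	if w == nil {
		// Guard: surface a regular error instead of a nil-pointer panic.
		return errors.New("workflow manifest is missing; cannot verify parameters")
	}
	for name := range desired {
		if _, ok := w.Parameters[name]; !ok {
			return fmt.Errorf("unknown parameter %q", name)
		}
	}
	return nil
}

func main() {
	var w *Workflow // simulates a manifest deleted from S3
	if err := w.VerifyParameters(map[string]string{"lr": "0.1"}); err != nil {
		fmt.Println("createRun rejected:", err)
	}
}
```

Without the `w == nil` check, the first field access on the nil receiver would panic exactly as in the trace above.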
ekesken added a commit to ekesken/pipelines that referenced this issue on Sep 1, 2020, with the same commit message as above.
ekesken added a commit to ekesken/pipelines that referenced this issue on Sep 2, 2020, with the same commit message as above, adding that the same panic occurred in CreateJob calls.
k8s-ci-robot pushed a commit that referenced this issue on Sep 2, 2020 (#4439), with the same commit message.
Bobgy pushed a commit to Bobgy/pipelines that referenced this issue on Sep 4, 2020 (kubeflow#4439), with the same commit message.
Bobgy pushed a commit that referenced this issue on Sep 4, 2020 (#4439), with the same commit message.
Jeffwan pushed a commit to Jeffwan/pipelines that referenced this issue on Dec 9, 2020 (kubeflow#4439), with the same commit message.
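The stack trace also shows the panic escaping through `main.apiServerInterceptor` and killing the server process. Independent of the nil-check fix above, a defer/recover wrapper is a common defensive pattern for converting a handler panic into an error response. This is a minimal, framework-free sketch, not KFP code; `safeHandler` is a hypothetical name:

```go
package main

import (
	"errors"
	"fmt"
)

// safeHandler runs h and converts any panic into a returned error, so a
// single bad request cannot take down the whole server process.
func safeHandler(h func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("internal error: recovered from panic: %v", r)
		}
	}()
	return h()
}

func main() {
	// A handler that hits a nil pointer, as CreateRun did when the
	// manifest was gone: the panic is recovered and reported as an error.
	err := safeHandler(func() error {
		var w *struct{ name string }
		return errors.New(w.name) // nil pointer dereference -> panic
	})
	fmt.Println(err)

	// A healthy handler's error passes through unchanged.
	fmt.Println(safeHandler(func() error { return errors.New("validation failed") }))
}
```

In a real gRPC server this wrapping would live in a unary interceptor; the go-grpc-middleware recovery interceptor implements the same idea. Recovery only masks the symptom, though; the underlying nil workflow still has to be handled, as the PR above does.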
What steps did you take:
There may be multiple bugs here.
What happened:
When calling `client.run_pipeline()`, the ml-pipeline pod appears to crash instantly. This happens every time. Via the SDK I get the following error message. The pod fails with the following error:
What did you expect to happen:
Environment:
How did you deploy Kubeflow Pipelines (KFP)?
KFP version: 1.0.0
KFP SDK version: master branch
Anything else you would like to add:
/kind bug