Need to replace partial setters with KptFile #89

Open
jlewi opened this issue Jul 15, 2020 · 14 comments

@jlewi
Contributor

jlewi commented Jul 15, 2020

Here are the logs

INFO|2020-07-15T04:02:18|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 3 fields
INFO|2020-07-15T04:02:19|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp location us-central1-c 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:19|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:20|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 0 fields
INFO|2020-07-15T04:02:21|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp email kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:21|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:22|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 0 fields
INFO|2020-07-15T04:02:23|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/stacks/gcp name kf-vbp-0715-a29 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:23|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:24|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 4 fields
INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/stacks/gcp gcloud.core.project kubeflow-ci-deployment 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 0 fields
INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|72| Error: stat upstream/manifests/stacks/gcp/Kptfile: no such file or directory

I think we might have picked up an updated worker image which might have a newer version of kpt.
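To make logs like the ones above easier to scan, here is a minimal sketch that groups each `Running: ...` command with the messages that follow it and flags the commands that ended in an error. It assumes the `LEVEL|timestamp|file|line| message` layout seen in this log; the function names are hypothetical and are not part of kubeflow/testing.

```python
from typing import List, NamedTuple, Optional, Tuple


class LogRecord(NamedTuple):
    level: str
    timestamp: str
    source: str
    line: int
    message: str


def parse_record(raw: str) -> Optional[LogRecord]:
    """Parse one 'LEVEL|timestamp|file|line| message' line; return None otherwise."""
    parts = raw.split("|", 4)
    if len(parts) != 5 or not parts[3].isdigit():
        return None
    level, timestamp, source, line, message = parts
    return LogRecord(level, timestamp, source, int(line), message.strip())


def group_by_command(lines: List[str]) -> List[Tuple[str, List[str]]]:
    """Attach every message that follows a 'Running: ...' record to that command."""
    groups: List[Tuple[str, List[str]]] = []
    for raw in lines:
        rec = parse_record(raw)
        if rec is None:
            continue  # 'cwd=...' continuation lines and blank lines
        if rec.message.startswith("Running: "):
            groups.append((rec.message[len("Running: "):], []))
        elif groups and rec.message != "Subprocess output:":
            groups[-1][1].append(rec.message)
    return groups


def failed_commands(lines: List[str]) -> List[str]:
    """Return the commands whose output contains an 'Error:' message."""
    return [cmd for cmd, msgs in group_by_command(lines)
            if any(m.startswith("Error:") for m in msgs)]
```

Run against the log above, this would single out the last `kpt cfg set ./upstream/manifests/stacks/gcp ...` call, which is the one followed by the Kptfile error.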

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
platform/gcp 0.87
kind/bug 0.92


@jlewi
Contributor Author

jlewi commented Jul 15, 2020

Locally I'm running:

kpt version
v0.27.0

@jlewi
Contributor Author

jlewi commented Jul 15, 2020

It looks like we are running the kpt command twice

kpt cfg set ./upstream/manifests/gcp gcloud.core.project kubeflow-ci-deployment 
INFO|2020-07-15T04:02:08|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set instance mgmt-ctxt kf-ci-management 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:08|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:09|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 1 fields
INFO|2020-07-15T04:02:10|/workspace/testing-repo/py/kubeflow/testing/create_kf_from_gcp_blueprint.py|116| Using name kf-vbp-0715-a29
INFO|2020-07-15T04:02:10|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: gcloud config get-value account 
cwd=None
INFO|2020-07-15T04:02:10|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:11|/workspace/testing-repo/py/kubeflow/testing/util.py|72| kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com
INFO|2020-07-15T04:02:12|/workspace/testing-repo/py/kubeflow/testing/create_kf_from_gcp_blueprint.py|120| Using email kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com
INFO|2020-07-15T04:02:12|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp name kf-vbp-0715-a29 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:12|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:13|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 145 fields
INFO|2020-07-15T04:02:14|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp gcloud.core.project kubeflow-ci-deployment 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:14|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:15|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 49 fields
INFO|2020-07-15T04:02:16|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp gcloud.compute.zone us-central1-c 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:16|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:18|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 3 fields
INFO|2020-07-15T04:02:19|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp location us-central1-c 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:19|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:20|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 0 fields
INFO|2020-07-15T04:02:21|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/gcp email kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:21|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:22|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 0 fields
INFO|2020-07-15T04:02:23|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/stacks/gcp name kf-vbp-0715-a29 
cwd=/workspace/blueprint-repo/kubeflow

INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|46| Running: kpt cfg set ./upstream/manifests/stacks/gcp gcloud.core.project kubeflow-ci-deployment 
cwd=/workspace/blueprint-repo/kubeflow
INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|61| Subprocess output:

INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|72| set 0 fields
INFO|2020-07-15T04:02:25|/workspace/testing-repo/py/kubeflow/testing/util.py|72| Error: stat upstream/manifests/stacks/gcp/Kptfile: no such file or directory

And it succeeds the first time.

@jlewi
Contributor Author

jlewi commented Jul 15, 2020

I take that back; it looks like these are two different directories.
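Since the error is a missing `Kptfile` in one of the package directories, a quick pre-flight check can tell the directories apart before any `kpt cfg set` call is made. This is only a sketch; `packages_missing_kptfile` is a hypothetical helper and does not exist in the test scripts.

```python
import os
from typing import Iterable, List


def packages_missing_kptfile(root: str, package_dirs: Iterable[str]) -> List[str]:
    """Return the package directories (relative to root) that have no Kptfile."""
    return [
        pkg for pkg in package_dirs
        if not os.path.isfile(os.path.join(root, pkg, "Kptfile"))
    ]
```

For the run above it could be called with the working directory `/workspace/blueprint-repo/kubeflow` and the two package paths that appear in the log.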

@Bobgy
Contributor

Bobgy commented Jul 15, 2020

I was also seeing this error message locally.

I installed kpt as a gcloud CLI component, and it was recently upgraded to 0.30.1~.
After downgrading kpt to the previous version, 0.24.0, the error message disappears.

So I think we need to report this to kpt; something broke between the two versions.

@Bobgy
Contributor

Bobgy commented Jul 15, 2020

FYI, working version

$ gcloud version
Google Cloud SDK 299.0.0

Failing version: 301.0.0, which had kpt 0.30+
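The version reports so far can be captured in a small check. The sketch below only encodes what this thread states (0.24.0 works, 0.30.1 fails); treating anything older than 0.24.0 as working and anything newer than 0.30.1 as failing is an assumption, and versions in between are reported as unknown because nobody has bisected them here.

```python
import re
from typing import Tuple

KNOWN_WORKING = (0, 24, 0)  # reported working in this thread
KNOWN_FAILING = (0, 30, 1)  # reported failing in this thread


def parse_kpt_version(text: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as 'v0.27.0' or '0.31.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def classify(text: str) -> str:
    """Return 'working', 'failing', or 'unknown' for a kpt version string."""
    version = parse_kpt_version(text)
    if version <= KNOWN_WORKING:
        return "working"  # assumption: older releases behave like 0.24.0
    if version >= KNOWN_FAILING:
        return "failing"  # assumption: newer releases behave like 0.30.1
    return "unknown"
```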

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
area/engprod 0.56


@jlewi
Contributor Author

jlewi commented Jul 16, 2020

@Bobgy that makes sense, thanks. The other possibility would be to add a KptFile.

@jlewi
Contributor Author

jlewi commented Jul 17, 2020

It looks like the old setter behavior has been removed from kpt, so we will need to upgrade our Kptfiles.
In the meantime, we can pin to the old version to unblock the tests.

jlewi changed the title from "Master auto-deployment is failing" to "Need to replace partial setters with KptFile" on Jul 17, 2020
jlewi pushed a commit to jlewi/manifests that referenced this issue Jul 17, 2020
* Per GoogleCloudPlatform/kubeflow-distribution#89 we need to get rid of the legacy
  partial setters and move to using a KptFile and substitutions.

* In preparation for that we want to check in a set of test data
  that is the result of running our kpt cfg set with a given set of
  values

* This test data will be used to verify that the refactoring to use
  a KptFile doesn't change the output.

* After adding the KptFile we can simply regenerate the testdata
  and then look at the diff to ensure there are no unexpected changes.
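
The golden-file check described in this commit message can be sketched as a comparison of two directory trees: the checked-in expected output and the freshly regenerated output. This is an illustration only; `read_tree` and `diff_trees` are hypothetical names, and the actual test data layout in kubeflow/manifests is not shown in this thread.

```python
import os
from typing import Dict, List


def read_tree(root: str) -> Dict[str, bytes]:
    """Map each file path (relative to root) to its raw contents."""
    tree: Dict[str, bytes] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                tree[os.path.relpath(path, root)] = handle.read()
    return tree


def diff_trees(expected_root: str, actual_root: str) -> Dict[str, List[str]]:
    """Report files that are missing, unexpected, or changed in actual_root."""
    expected = read_tree(expected_root)
    actual = read_tree(actual_root)
    shared = set(expected) & set(actual)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "changed": sorted(p for p in shared if expected[p] != actual[p]),
    }
```

An empty result in all three lists would mean the refactoring to a KptFile left the rendered output unchanged.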
@jlewi
Contributor Author

jlewi commented Jul 17, 2020

Running kpt cfg commands has become very slow after refactoring into setters and substitutions.

We should time it and compare it to the old version using partial setters.

@jlewi
Contributor Author

jlewi commented Jul 17, 2020

Seems like it might be an issue with my system. Even kpt version is taking a long time.

time kpt version
0.31.0

real	0m15.362s
user	0m0.056s
sys	0m0.030s

@jlewi
Contributor Author

jlewi commented Jul 17, 2020

Looks like the problem is that kpt tries to contact the API server of the current kubeconfig context. My current context was pointing at a cluster that wasn't responsive. Once I switched the context, it worked.

kpt version -v 10
I0717 16:37:35.884974   31578 loader.go:375] Config loaded from file:  /home/jlewi/.kube/config
I0717 16:37:35.885453   31578 round_trippers.go:423] curl -k -v -XGET  -H "Accept: application/json" -H "User-Agent: kpt/0.31.0" 'https://34.73.245.80/openapi/v2'
I0717 16:37:51.261119   31578 round_trippers.go:443] GET https://34.73.245.80/openapi/v2  in 15375 milliseconds
I0717 16:37:51.261183   31578 round_trippers.go:449] Response Headers:
0.31.0
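
To spot this kind of stall quickly, the verbose output can be scanned for slow HTTP round trips. The sketch below assumes the `METHOD url  in N milliseconds` format shown in the log above; `slow_requests` and its 1000 ms default threshold are hypothetical choices.

```python
import re
from typing import Iterable, List, Tuple

ROUND_TRIP = re.compile(
    r"\b(GET|POST|PUT|PATCH|DELETE)\s+(\S+)\s+in\s+(\d+)\s+milliseconds"
)


def slow_requests(lines: Iterable[str],
                  threshold_ms: int = 1000) -> List[Tuple[str, str, int]]:
    """Return (method, url, milliseconds) for round trips slower than threshold_ms."""
    slow = []
    for line in lines:
        match = ROUND_TRIP.search(line)
        if match and int(match.group(3)) > threshold_ms:
            slow.append((match.group(1), match.group(2), int(match.group(3))))
    return slow
```

On the output above it would report the single `GET .../openapi/v2` request, which accounts for almost all of the 15 seconds.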

@jlewi
Contributor Author

jlewi commented Jul 18, 2020

Filed kptdev/kpt#834

It looks like we need to update gcp-blueprints/kubeflow/instance to use the new setters; otherwise it will be broken as well.

jlewi pushed a commit to jlewi/gcp-blueprints that referenced this issue Jul 18, 2020
* Get rid of enable-services.yaml as we are now using CNRM and this is
  in the base package.
* Related to GoogleCloudPlatform#89
k8s-ci-robot pushed a commit to kubeflow/manifests that referenced this issue Jul 20, 2020
* Per GoogleCloudPlatform/kubeflow-distribution#89 we need to get rid of the legacy
  partial setters and move to using a KptFile and substitutions.

* In preparation for that we want to check in a set of test data
  that is the result of running our kpt cfg set with a given set of
  values

* This test data will be used to verify that the refactoring to use
  a KptFile doesn't change the output.

* After adding the KptFile we can simply regenerate the testdata
  and then look at the diff to ensure there are no unexpected changes.
jlewi pushed a commit to jlewi/gcp-blueprints that referenced this issue Jul 20, 2020
* Get rid of enable-services.yaml as we are now using CNRM and this is
  in the base package.
* Related to GoogleCloudPlatform#89

* Remove cluster and iam policy patch; these are now in the base manifests.
  Added in kubeflow/manifests#1398

* kubeflow/hack/create_kptfile.py contains most of the commands
  used to generate the setters and substitutions.
jlewi pushed a commit to jlewi/testing that referenced this issue Jul 20, 2020
* Now that we are using kptfiles; kpt will complain if we try to
  set a setter which doesn't exist.

related to GoogleCloudPlatform/kubeflow-distribution#89
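
Because kpt now rejects unknown setters, a caller could first check which setter names a Kptfile declares. The sketch below assumes setters appear as `io.k8s.cli.setters.<name>:` keys under `openAPI.definitions`, which is how I understand the v2 setter layout in kpt 0.3x; that layout is an assumption and is not shown in this thread. It scans the text with a regular expression rather than a YAML parser to stay within the standard library, and both function names are hypothetical.

```python
import re
from typing import Iterable, List, Set

# Assumed key format for a v2 setter definition inside a Kptfile.
SETTER_KEY = re.compile(r"^\s*io\.k8s\.cli\.setters\.([^\s:]+):\s*$", re.MULTILINE)


def declared_setters(kptfile_text: str) -> Set[str]:
    """Return the setter names declared in the given Kptfile text."""
    return set(SETTER_KEY.findall(kptfile_text))


def unknown_setters(kptfile_text: str, requested: Iterable[str]) -> List[str]:
    """Return the requested setter names that the Kptfile does not declare."""
    declared = declared_setters(kptfile_text)
    return [name for name in requested if name not in declared]
```

A test script could skip or fail fast on the names returned by `unknown_setters` instead of waiting for kpt to complain.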
k8s-ci-robot pushed a commit to kubeflow/testing that referenced this issue Jul 21, 2020
* Now that we are using kptfiles; kpt will complain if we try to
  set a setter which doesn't exist.

related to GoogleCloudPlatform/kubeflow-distribution#89
jlewi pushed a commit to jlewi/gcp-blueprints that referenced this issue Jul 21, 2020
* Get rid of enable-services.yaml as we are now using CNRM and this is
  in the base package.
* Related to GoogleCloudPlatform#89

* Remove cluster and iam policy patch; these are now in the base manifests.
  Added in kubeflow/manifests#1398

* kubeflow/hack/create_kptfile.py contains most of the commands
  used to generate the setters and substitutions.
k8s-ci-robot pushed a commit that referenced this issue Jul 21, 2020
* Get rid of enable-services.yaml as we are now using CNRM and this is
  in the base package.
* Related to #89

* Remove cluster and iam policy patch; these are now in the base manifests.
  Added in kubeflow/manifests#1398

* kubeflow/hack/create_kptfile.py contains most of the commands
  used to generate the setters and substitutions.
@jlewi
Contributor Author

jlewi commented Jul 21, 2020

This is deploying on master:
https://k8s-testgrid.appspot.com/sig-big-data#kubeflow-gcp-blueprints-periodic-master&show-stale-tests=

Need to get this cherry-picked onto the 1.1 branch.

jlewi pushed a commit to jlewi/manifests that referenced this issue Jul 21, 2020
* Per GoogleCloudPlatform/kubeflow-distribution#89 we need to get rid of the legacy
  partial setters and move to using a KptFile and substitutions.

* In preparation for that we want to check in a set of test data
  that is the result of running our kpt cfg set with a given set of
  values

* This test data will be used to verify that the refactoring to use
  a KptFile doesn't change the output.

* After adding the KptFile we can simply regenerate the testdata
  and then look at the diff to ensure there are no unexpected changes.
jlewi pushed a commit to jlewi/gcp-blueprints that referenced this issue Jul 21, 2020
* Get rid of enable-services.yaml as we are now using CNRM and this is
  in the base package.
* Related to GoogleCloudPlatform#89

* Remove cluster and iam policy patch; these are now in the base manifests.
  Added in kubeflow/manifests#1398

* kubeflow/hack/create_kptfile.py contains most of the commands
  used to generate the setters and substitutions.
k8s-ci-robot pushed a commit to kubeflow/manifests that referenced this issue Jul 21, 2020
Cherry pick of #1393 #1398 on v1.1-branch. #1393: Check in expected kpt output for Kptfile refactoring #1398: Convert v1 to v2 setters & substitutions in gcp (#1401)

* Check in expected kpt output for Kptfile refactoring

* Per GoogleCloudPlatform/kubeflow-distribution#89 we need to get rid of the legacy
  partial setters and move to using a KptFile and substitutions.

* In preparation for that we want to check in a set of test data
  that is the result of running our kpt cfg set with a given set of
  values

* This test data will be used to verify that the refactoring to use
  a KptFile doesn't change the output.

* After adding the KptFile we can simply regenerate the testdata
  and then look at the diff to ensure there are no unexpected changes.

* Convert v1 to v2 setters & substitutions in gcp

* The latest version of kpt started choking on gcp/v2 because we were
  still using the old style setters and substitutions.

* This PR creates a kptfile to use the new setter and substitutions.

* hack/create_kptfile.py contains a script to generate a lot of the setters
  and substitutions.

* kf-vm-sa.yaml shouldn't specify the namespace; this will get set in an overlay

* Move workload identity bindings for kf-admin KSA from kubeflow/instance in gcp blueprints repo into this repository.

related to gcp-blueprints#89

* Fix image mirror substitution.

* Create a KptFile for stacks.

* Add conversion for stacks.

* Add KptFile for stacks/gcp
k8s-ci-robot pushed a commit that referenced this issue Jul 21, 2020
…ck of #95 on v1.1-branch. #95: Convert setters from v1 to v2 (#97)

* Convert setters from v1 to v2

* Get rid of enable-services.yaml as we are now using CNRM and this is
  in the base package.
* Related to #89

* Remove cluster and iam policy patch; these are now in the base manifests.
  Added in kubeflow/manifests#1398

* kubeflow/hack/create_kptfile.py contains most of the commands
  used to generate the setters and substitutions.

* Disable presubmit to deploy Kubeflow.