[testing] Inverse proxy is flaky in test infra #2737

Closed
Bobgy opened this issue Dec 16, 2019 · 4 comments

Bobgy commented Dec 16, 2019

What happened:
In presubmit e2e tests, the inverse proxy gets stuck in CrashLoopBackOff from time to time.
For example: https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubeflow_pipelines/2643/kubeflow-pipeline-e2e-test/1198909726310535169#1:build-log.txt%3A1177

What did you expect to happen:
It should not be flaky.

I investigated briefly. It seems that the first time the inverse proxy runs, it gets an empty BACKEND_ID (https://github.com/kubeflow/pipelines/blob/master/proxy/attempt-register-vm-on-proxy.sh#L70) and saves it to the configmap.

As a result, all future runs try to connect with the empty BACKEND_ID and fail. I think we need to validate that the generated configmap is sane before saving it.
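
The kind of validation suggested here could look roughly like the sketch below. It is not the actual fix from the repository. The function name `is_valid_proxy_config` is hypothetical, and the commented usage assumes the registration script holds the values in `PROXY_URL` and `BACKEND_ID` and stores them under the `ProxyUrl` and `BackendId` keys seen in the log. The check also rejects the literal string `null`, which `jq -r` prints when a key is missing.

```shell
# Hypothetical helper (not part of the real script): succeeds only when
# both the proxy URL ($1) and the backend ID ($2) look usable.
is_valid_proxy_config() {
  for value in "$1" "$2"; do
    # Reject empty values and the literal "null" that `jq -r` prints
    # for a missing key.
    if [ -z "$value" ] || [ "$value" = "null" ]; then
      return 1
    fi
  done
  return 0
}

# Usage sketch (assumed variable and key names; needs cluster access,
# so it is left commented out):
#
# if is_valid_proxy_config "${PROXY_URL}" "${BACKEND_ID}"; then
#   kubectl create configmap inverse-proxy-config \
#     --from-literal=ProxyUrl="${PROXY_URL}" \
#     --from-literal=BackendId="${BACKEND_ID}"
# else
#   echo "Refusing to save an incomplete inverse proxy config" >&2
#   exit 1
# fi
```

Failing before the configmap is written means a later restart would retry registration instead of reusing a broken config forever.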

Example error log:

kubectl logs proxy-agent-7fdfbddd88-64vhf 
+++ dirname /opt/proxy/attempt-register-vm-on-proxy.sh
++ cd /opt/proxy
++ pwd
+ DIR=/opt/proxy
+ kubectl get configmap inverse-proxy-config
NAME                   DATA   AGE
inverse-proxy-config   3      26m
++ jq -r .data.ProxyUrl
++ kubectl get configmap inverse-proxy-config -o json
+ PROXY_URL=https://datalab-us-east1.cloud.google.com/tun/m/4592f092208ecc84946b8f8f8016274df1b36a14
++ kubectl get configmap inverse-proxy-config -o json
++ jq -r .data.BackendId
+ BACKEND_ID=
+ run-proxy-agent
+ /opt/bin/proxy-forwarding-agent --debug=false --proxy=https://datalab-us-east1.cloud.google.com/tun/m/4592f092208ecc84946b8f8f8016274df1b36a14 --proxy-timeout=60s --backend= --host=10.39.243.16:80 --shim-websockets=true --shim-path=websocket-shim --health-check-path=/ --health-check-interval-seconds=0 --health-check-unhealthy-threshold=2
2019/11/22 07:48:59 You must specify a backend ID

Bobgy commented Dec 16, 2019

/assign @IronPan
/cc @rmgogogo


Bobgy commented Jan 14, 2020

This is likely already resolved by #2391.
I will verify that.


Bobgy commented Jan 15, 2020

Checked some recently merged PRs, like #2696 and #2743. proxy-agent is no longer in crash loop back off state when the test finishes.

/close

@k8s-ci-robot

@Bobgy: Closing this issue.


Bobgy assigned rmgogogo and unassigned IronPan on Jan 15, 2020