[incubator/vault] Use httpGet instead of TCP Socket for liveness check #9462
/assign @linki |
/assign @seanknox |
👀 PTAL |
/assign |
There seem to be incubator/vault PRs that are 2 months old, are these still being looked at / considered? |
My team is encountering a slightly different issue that would have the same fix: On EKS, with the Regardless, switching to |
Bump. Rebased, PTAL 👀 🙇 |
/assign @viglesiasce |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions. |
cc @jpds |
/ok-to-test |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: jbialy, mattfarina. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This one is not right. You shouldn't kill a pod when it returns a status like 501 or 503, since those mean Vault is uninitialized or sealed.
I'm specifically getting a 503, since it's a fresh installation, obviously.
https://www.vaultproject.io/api/system/health.html
So with 0.15.0 I cannot unseal Vault since it gets restarted, ending up in a CrashLoopBackOff.
Check the `describe` output:
Warning Unhealthy 3m (x7 over 4m) kubelet, 159.69.17.129 \
Liveness probe failed: HTTP probe failed with statuscode: 503
Warning Unhealthy 3m (x7 over 4m) kubelet, 159.69.17.129 \
Readiness probe failed: HTTP probe failed with statuscode: 503
Ah, good catch! I didn't foresee this. One way to fix it would be to change the |
@jbialy In upstream? Hmm, doubt it that they'll do it easily. Reverting back to 0.14.9, locally. |
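For context, the health endpoint linked above documents default status codes of 200 (initialized, unsealed, active), 429 (unsealed standby), 501 (not initialized) and 503 (sealed), while a Kubernetes `httpGet` probe treats only codes from 200 up to, but not including, 400 as success. The endpoint also accepts `standbyok`, `sealedcode` and `uninitcode` query parameters that override those codes. Below is a minimal sketch of a liveness probe that would tolerate a sealed or uninitialized Vault. The port, scheme and timing values are assumptions, and this is an illustration rather than the chart's actual template or the fix proposed in this thread:

```yaml
# Hypothetical container-spec fragment; not taken from the chart.
livenessProbe:
  httpGet:
    # standbyok=true: standby nodes answer with the active code (200) instead of 429.
    # sealedcode / uninitcode: answer 204 instead of 503 / 501, so the kubelet
    # does not restart a pod that is merely sealed or not yet initialized.
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
    port: 8200       # assumed Vault listener port
    scheme: HTTPS    # assumed; use HTTP when TLS is disabled
  initialDelaySeconds: 30   # assumed timing values
  periodSeconds: 10
```

Whether the readiness probe should share these overrides is a separate choice, since a sealed Vault arguably should not receive traffic.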
helm#9462) * use httpGet instead of TCP socket Signed-off-by: Janusz Bialy <janusz.bialy@qlik.com> * bump chart version Signed-off-by: Janusz Bialy <jbialy@gmail.com>
What this PR does / why we need it:
Currently, the `liveness` probe uses `tcpSocket` to determine whether the pod is alive. This works when Vault is listening on HTTP. When TLS is enabled, however, the liveness probe continues to work but a warning is logged at every check interval:

This PR changes the liveness probe to use `httpGet`, where the protocol scheme can be set to HTTPS depending on the values specified in `values.yaml`. This avoids filling the logs with the above warning.

The probe checks Vault's `/v1/sys/health` endpoint in accordance with: https://www.vaultproject.io/api/system/health.html#standbyok
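As a rough illustration of the change described above, the rendered probe might look like the following. The port (8200) and the exact query string are assumptions, and in the chart the scheme would be templated from `values.yaml` rather than hard-coded; none of this is copied from the PR's diff.

```yaml
# Illustrative only: a rendered container-spec fragment, not the PR's diff.

# Before: the kubelet only opens a TCP connection to the listener.
# livenessProbe:
#   tcpSocket:
#     port: 8200

# After: the kubelet issues a real HTTP(S) request to the health endpoint.
livenessProbe:
  httpGet:
    path: "/v1/sys/health?standbyok=true"   # standby nodes report as healthy
    port: 8200       # assumed Vault listener port
    scheme: HTTPS    # HTTP when TLS is disabled; templated in the chart
```

With `scheme: HTTPS` the kubelet skips certificate verification, so a self-signed listener certificate does not by itself fail the probe.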
Checklist