-
Notifications
You must be signed in to change notification settings - Fork 3.4k
fix: support ALPN and H2C with gRPC. Fixes #14627 #14567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
argoproj#14435 tried to fix the bug in the release-3.6 branch where requests failed with the following error: ``` Service Unavailable: connection error: desc = "transport: authentication handshake failed: credentials: cannot check peer: missing selected ALPN property. If you upgraded from a grpc-go version earlier than 1.67, your TLS connections may have stopped working due to ALPN enforcement. For more details, see: grpc/grpc-go#434" on the UI ``` Unfortunately, this didn't work and broke the readiness probe, so it was reverted in argoproj@1285c11. The problem is the readiness probe makes a GET request to `/`: https://github.com/argoproj/argo-workflows/blob/1e2a87f2afdebbcd0e55069df5a945f5faca9d45/manifests/base/argo-server/argo-server-deployment.yaml#L30-L36 With ALPN enabled, that request was automatically upraded to HTTP/2, which caused it to be forwarded to the gRPC server instead of the HTTP server. Initially, I tried to fix this by changing cmux to only forward requests with `Content-Type: application/grpc` (which is guaranteed to be present per the [gRPC over HTTP2 spec](https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md)) to the gRPC server: argoproj/argo-workflows@main...MasonM:argo-workflows:fix-http2-multiplexing but that doesn't work with TLS enabled due to the following limitation mentioned at https://github.com/soheilhy/cmux/: > TLS: net/http uses a type assertion to identify TLS connections; since cmux's lookahead-implementing connection wraps the underlying TLS connection, this type assertion fails. Because of that, you can serve HTTPS using cmux but http.Request.TLS would not be set in your handlers. Instead, this uses the solution proposed in grpc/grpc-go#555 (comment), which relies on the [h2c package](https://pkg.go.dev/golang.org/x/net/http2/h2c). I added basic tests to `argo_server_test.go` for the readiness probe, but that only tests when TLS is disabled. I wrote following script to manually test with both HTTPS and HTTP: ```shell set -euo pipefail BASEURL="$scheme://localhost:2746" docurl() { printf "Testing %s with HTTP/1.1\n" "$*" curl -kv --http1.1 "$@" 2>&1 | egrep -i 'ALPN|^< HTTP|< Content-Type|< Grpc-Metadata-Argo-Version' printf "Testing %s with ALPN\n" "$*" curl -kv --alpn "$@" 2>&1 | egrep -i 'ALPN|^< HTTP|< Content-Type|< Grpc-Metadata-Argo-Version' } printf 'Testing root path\n' docurl "$BASEURL/" printf '\nTesting grpc-gateway\n' docurl "$BASEURL/api/v1/workflow-templates/argo" printf '\nTesting grpc server\n' docurl -XPOST -H 'Content-Type: application/grpc' "$BASEURL/workflowtemplate.WorkflowTemplateService/ListWorkflowTemplates" ``` <details> <summary>Output with HTTP</summary> ``` $ scheme=http ./test_mux.sh Testing root path Testing http://localhost:2746/ with HTTP/1.1 < HTTP/1.1 200 OK < Content-Type: text/html; charset=utf-8 Testing http://localhost:2746/ with ALPN < HTTP/1.1 200 OK < Content-Type: text/html; charset=utf-8 Testing grpc-gateway Testing http://localhost:2746/api/v1/workflow-templates/argo with HTTP/1.1 < HTTP/1.1 200 OK < Content-Type: application/json < Grpc-Metadata-Argo-Version: latest+94f9148.dirty Testing http://localhost:2746/api/v1/workflow-templates/argo with ALPN < HTTP/1.1 200 OK < Content-Type: application/json < Grpc-Metadata-Argo-Version: latest+94f9148.dirty Testing grpc server Testing -XPOST -H Content-Type: application/grpc http://localhost:2746/workflowtemplate.WorkflowTemplateService/ListWorkflowTemplates with HTTP/1.1 < HTTP/1.1 200 OK < Content-Type: text/html; charset=utf-8 Testing -XPOST -H Content-Type: application/grpc http://localhost:2746/workflowtemplate.WorkflowTemplateService/ListWorkflowTemplates with ALPN < HTTP/1.1 200 OK < Content-Type: text/html; charset=utf-8 ``` </details> <details> <summary>Output with HTTPS</summary> ``` $ scheme=https ./test_mux.sh Testing root path Testing https://localhost:2746/ with HTTP/1.1 * ALPN, offering http/1.1 * ALPN, server did not agree to a protocol < HTTP/1.1 200 OK < Content-Type: text/html; charset=utf-8 Testing https://localhost:2746/ with ALPN * ALPN, offering h2 * ALPN, offering http/1.1 * ALPN, server accepted to use h2 < HTTP/2 200 < content-type: text/html; charset=utf-8 Testing grpc-gateway Testing https://localhost:2746/api/v1/workflow-templates/argo with HTTP/1.1 * ALPN, offering http/1.1 * ALPN, server did not agree to a protocol < HTTP/1.1 200 OK < Content-Type: application/json < Grpc-Metadata-Argo-Version: latest+94f9148.dirty Testing https://localhost:2746/api/v1/workflow-templates/argo with ALPN * ALPN, offering h2 * ALPN, offering http/1.1 * ALPN, server accepted to use h2 < HTTP/2 200 < content-type: application/json < grpc-metadata-argo-version: latest+94f9148.dirty Testing grpc server Testing -XPOST -H Content-Type: application/grpc https://localhost:2746/workflowtemplate.WorkflowTemplateService/ListWorkflowTemplates with HTTP/1.1 * ALPN, offering http/1.1 * ALPN, server did not agree to a protocol < HTTP/1.1 200 OK < Content-Type: text/html; charset=utf-8 Testing -XPOST -H Content-Type: application/grpc https://localhost:2746/workflowtemplate.WorkflowTemplateService/ListWorkflowTemplates with ALPN * ALPN, offering h2 * ALPN, offering http/1.1 * ALPN, server accepted to use h2 < HTTP/2 200 < content-type: application/grpc ``` </details> Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
|
/retest |
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
| // to an HTTP/2 connection which is understandable to s.ServeConn. (s.ServeConn | ||
| // understands HTTP/2 except for the h2c part of it.)" | ||
| func NewMuxHandler(grpcServerHandler http.Handler, httpServerHandler http.Handler) http.Handler { | ||
| return h2c.NewHandler(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible concern here is performance, since h2c buffers the entire request in memory, unlike cmux. I'm not sure if that's actually going to be a problem here.
Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com>
|
/retest |
|
This LGTM, and works for me. Should we merge the reversion of the ENV var in the Dockerfile into this PR, or do you think it wants doing separately? |
|
@Joibel Thanks for the review! I was thinking of doing it separately to be extra safe. |
isubasinghe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine, haven't tested it so just leaving a comment, code looks good.
Thanks for the work!
|
/cherry-pick release-3.6 |
|
/cherry-pick release-3.7 |
|
Cherry-pick failed with |
|
Cherry-pick failed with |
…ck 3.7) (#14741) Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com> Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>
…ck release-3.6) (#14742) Signed-off-by: Mason Malone <651224+MasonM@users.noreply.github.com> Co-authored-by: Mason Malone <651224+MasonM@users.noreply.github.com>
Fixes #14627.
Motivation
#14435 tried to fix the bug in the
release-3.6branch where requests failed with the following error:Unfortunately, this didn't work and broke the readiness probe, so it was reverted in 1285c11. The problem is the readiness probe makes a GET request to
/:argo-workflows/manifests/base/argo-server/argo-server-deployment.yaml
Lines 30 to 36 in 1e2a87f
That request should be going to the HTTP server, and it does when using HTTP/1.1, since
argoserveris configured to forward all HTTP/1.1 requests to the listener for the HTTP server:argo-workflows/server/apiserver/argoserver.go
Lines 279 to 280 in 1e2a87f
but with ALPN enabled, the client (the kubelet in this case) automatically upgraded the request to HTTP/2, which caused it to be forwarded to the gRPC server instead.
The reason this wasn't caught in #14435 is because the Argo server isn't run inside Kubernetes in either the local dev environment or CI builds, so the readiness probe wasn't being used. Additionally, http://localhost:8080/ in the local dev environment points to the webpack-dev-server, which proxies requests to the Argo server, without ALPN:
argo-workflows/ui/webpack.config.js
Lines 112 to 119 in 7fd6b10
Modifications
Initially, I tried to fix this by changing cmux to only forward requests with
Content-Type: application/grpc(which is guaranteed to be present per the gRPC over HTTP2 spec) to the gRPC server:main...MasonM:argo-workflows:fix-http2-multiplexing
but that doesn't work with TLS enabled due to the following limitation mentioned at https://github.com/soheilhy/cmux/:
Link to the code it's referencing: https://github.com/golang/go/blob/1e756dc5f73dc19eb1cbf038807d18ef1cc54ebc/src/net/http/server.go#L1959
Instead, this uses the solution proposed in grpc/grpc-go#555 (comment), which relies on the h2c package.
I had to partially revert #14588 to restore the changes in
util/tls/tls.gothat enabled ALPN. To be extra safe, I didn't remove theENV GRPC_ENFORCE_ALPN_ENABLED=falseworkaround yet, though it should theoretically be okay to remove now.Verification
I added basic tests to
argo_server_test.gofor the readiness probe, but that only tests when TLS is disabled. I wrote the following script to manually test with both HTTPS and HTTP:Output with HTTP
Output with HTTPS
Documentation
N/A