[WIP] Merge cluster-proxy into core #151
base: main
Changes from all commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,297 @@ | ||
| # Moving cluster-proxy addon into klusterlet | ||
|
|
||
| ## Release Signoff Checklist | ||
|
|
||
| - [ ] Enhancement is `implementable` | ||
| - [ ] Design details are appropriately documented from clear requirements | ||
| - [ ] Test plan is defined | ||
| - [ ] Graduation criteria for dev preview, tech preview, GA | ||
| - [ ] User-facing documentation is created in [website](https://github.com/open-cluster-management-io/open-cluster-management-io.github.io/) | ||
|
|
||
| ## Summary | ||
|
|
||
| This proposal outlines an architectural change to move cluster-proxy capabilities from an addon into the core | ||
| klusterlet agent, enhancing Open Cluster Management's core functionality. | ||
|
|
||
| ## Motivation | ||
|
|
||
| The `cluster-proxy addon` has been widely used in various scenarios that complement the core functionality of | ||
| Open Cluster Management. Use cases include: | ||
| - Users fetch pod logs via `cluster-proxy` | ||
| - Users access VM consoles when `kubevirt` is deployed in a `ManagedCluster` | ||
| - `MultiKueue` integrates with `cluster-proxy` to submit jobs to `ManagedClusters` | ||
| - `MulticlusterGateway` in `KubeVela` uses `cluster-proxy` to push requests to targeted clusters | ||
|
|
||
| Currently, `cluster-proxy` is implemented based on the [ANP (APIServer Network Proxy)](https://github.com/kubernetes-sigs/apiserver-network-proxy) | ||
| server and deployed as an addon. However, we have identified several limitations when using `cluster-proxy`: | ||
|
|
||
| 1. `cluster-proxy` exposes a `konnectivity` interface, which is gRPC-based. It is difficult to call directly | ||
| with Kubernetes clients, requiring an HTTP proxy frontend to handle requests. | ||
| 2. `cluster-proxy` leverages the ANP server which transports TCP packets over gRPC tunnels. The agent directly | ||
| proxies TCP packets to the target location, making it difficult for the agent to act as a proxy or delegator | ||
| to mutate and proxy HTTP requests. | ||
| 3. End users must always enable the cluster-proxy addon, introducing additional operational overhead when | ||
| cluster-proxy functionality is commonly needed. | ||
| 4. ANP ships as a prebuilt binary with limited extensibility, leaving users with a "black box" when debugging. | ||
|
|
||
| We are introducing gRPC as a registration mechanism in OCM, which makes it easier to move cluster-proxy | ||
| capabilities into the klusterlet. | ||
|
|
||
| ### Goals | ||
|
|
||
| - Add a feature flag `ClusterProxy` in both klusterlet and clustermanager. When enabled, a gRPC server will | ||
| start on the hub, and the klusterlet agent will connect to the gRPC server for proxy functionality. | ||
| - When the feature is enabled, ClusterManager starts an HTTP server to proxy requests to the target cluster | ||
| via the klusterlet agent. | ||
| - All existing capabilities of `cluster-proxy` will be preserved. | ||
|
|
||
| ### Non-Goals | ||
|
|
||
| - How to deprecate the `cluster-proxy addon` is not in the scope of this proposal, given that `cluster-proxy` | ||
| is still consumed by `MulticlusterGateway` and other consumers. | ||
| - Enhancing `clusteradm` to configure a kubeconfig file with the proxy endpoint. | ||
|
|
||
| ## Proposal | ||
|
|
||
| ### User Stories | ||
|
|
||
| #### Story 1 | ||
|
|
||
| Users can enable the `ClusterProxy` feature gate in both clustermanager and klusterlet. Once enabled, users can | ||
| use the HTTP server on the hub cluster to proxy requests to the API server of the targeted cluster. | ||
|
|
||
| #### Story 2 | ||
|
|
||
| Users can authenticate via proxy using either the token for the targeted cluster, or the token for the hub | ||
| cluster if impersonation permissions are granted to the klusterlet agent. | ||
|
|
||
|
|
||
| ### Risks and Mitigation | ||
|
|
||
| N/A | ||
|
|
||
| ## Design Details | ||
|
|
||
| #### Registration | ||
| The connection between klusterlet and hub for proxy functionality will remain gRPC-based. This is independent | ||
| of the gRPC registration driver, requiring a separate gRPC configuration file for the proxy. To obtain the gRPC | ||
| configuration for proxy, the klusterlet agent will send a Certificate Signing Request (CSR) to the hub with the | ||
| signer name `open-cluster-management.io/klusterlet-proxy`. The gRPC server on the hub will approve and sign the | ||
| CSR. The gRPC configuration will be saved in the `hub-kubeconfig-secret` with the key `proxy-grpc.yaml`. The | ||
| klusterlet will only start the proxy when it detects the existence of this configuration file. The proxy | ||
| registration process begins after the cluster registration process and will not impact normal cluster | ||
| registration. | ||
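The registration step could be exercised with a CSR along the lines of the following sketch. Only the signer name is specified by this proposal; the `metadata.generateName` convention is an assumption, and the remaining fields follow the standard Kubernetes CertificateSigningRequest API:

```yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  generateName: <cluster name>-proxy-
spec:
  signerName: open-cluster-management.io/klusterlet-proxy
  request: <base64-encoded PKCS#10 certificate request>
  usages:
  - client auth
```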
|
|
||
| #### Proxy API | ||
|
Review thread on naming:

- I wonder whether we should use the term "proxy". I have seen cases where users confuse cluster-proxy with other "proxy" settings, such as addonproxyconfig.proxyconfig. Could we use another term, like "tunnel"?
- "tunnel" may not clearly state the function. A tunnel would mean the agent and hub are establishing a tunnel, but "proxy" is more accurate, meaning the user is proxying a request.
- Some naming suggestions from ChatGPT: grpxy (gRPC proxy, short, sysadmin vibe), tunnx (tunnel, nginx-style), relayx (captures the agent relay role).
- The complete description of this functionality can be "proxy requests between hub and agent", but also "establish a tunnel between hub and agent that can be used to proxy requests". Users of this functionality only need to know about an endpoint that can proxy requests; they don't need to know a "tunnel" exists. For developers and maintainers, however, the term "tunnel" helps distinguish this functionality from other proxy-related features. If "tunnel" doesn't work, could we adopt another term such as gateway, bridge, or relay?
- "relay" seems ok to me, but since cluster-proxy is already known to many, I am not sure how strong the desire to rename is. @jnpacker, thoughts? |
||
|
|
||
| The hub cluster will start an externally accessible HTTP endpoint to proxy requests to the API server of each managed | ||
| cluster. The API path follows the format: `https://<server address>:<server port>/<cluster name>`. | ||
|
|
||
| Consumers can configure their kubeconfig using this API endpoint and the related CA bundle to access the API server | ||
| of the targeted cluster. | ||
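For example, a consumer-side kubeconfig pointing at the hub proxy endpoint could look like the following sketch, where the server address, cluster name, and credentials are placeholders:

```yaml
apiVersion: v1
kind: Config
clusters:
- name: cluster1-via-proxy
  cluster:
    server: https://<server address>:<server port>/cluster1
    certificate-authority-data: <base64-encoded CA bundle of the proxy endpoint>
users:
- name: cluster1-user
  user:
    token: <token for the targeted cluster, or a hub token when impersonation is enabled>
contexts:
- name: cluster1
  context:
    cluster: cluster1-via-proxy
    user: cluster1-user
current-context: cluster1
```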
|
|
||
| #### Integration with ClusterProfile API | ||
|
|
||
| When this feature is enabled, we should be able to provide a tool that builds a kubeconfig and connects to the targeted cluster | ||
| based on the ClusterInventory API, using https://github.com/kubernetes-sigs/cluster-inventory-api/blob/main/pkg/credentials/config.go: | ||
|
|
||
| - Build a binary or script that provides credentials for the cluster via ManagedServiceAccount or impersonation. | ||
| - Update the ClusterProfile status in the registration controller with the proxy endpoint. | ||
|
|
||
| #### Installation | ||
|
|
||
| To enable the feature, the user needs to enable the feature gate in both the `ClusterManager` and `Klusterlet` APIs. | ||
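As a sketch, enabling the gate on the klusterlet side might look like the following, assuming the existing `registrationConfiguration.featureGates` mechanism of the `Klusterlet` API is reused:

```yaml
spec:
  registrationConfiguration:
    featureGates:
    - feature: ClusterProxy
      mode: Enabled
```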
|
Review thread on endpoint and certificate configuration:

- On the hub side, besides the grpc server, we will also need an endpoint and certificates for the http server.
- Yes, it is specified in the following.
- Don't we also need the cert and endpoint configuration of this http server set in cluster-manager?
- The cluster manager will generate the cert, and we only need to configure the endpoint, I think?
- What if users want to use their customized cert, since the proxy endpoint may be exposed on their own domain? The grpc connection is used internally by the proxy, but the http proxy faces the user, or maybe the user's user.
- I see your point; yes, we need at least the endpoint of the http server. I am not sure we need the ca bundle. Do we support setting this in cluster-proxy today?
- Updated. Currently, this version of the http server uses a CA generated by openshift.
- The fix looks good to me. |
||
| In addition, a new `ServerConfiguration` field should be added to the `ClusterManager` API: | ||
|
|
||
| ```go | ||
| type ServerConfiguration struct { | ||
| // imagePullSpec is the image for the server | ||
| ImagePullSpec string `json:"imagePullSpec,omitempty"` | ||
|
|
||
| // featureGates represents the features enabled for the server | ||
| FeatureGates []FeatureGate `json:"featureGates,omitempty"` | ||
|
|
||
|
|
||
| // endpointsExposure represents the configuration for endpoints exposure of the server. | ||
| // +optional | ||
| EndpointsExposure []EndpointExposure `json:"endpointsExposure,omitempty"` | ||
| } | ||
|
|
||
|
|
||
| type EndpointExposure struct { | ||
| // usage defines the usage of the endpoint. It could be "agentToHub" indicating the endpoint is used | ||
| // for communication between agent and hub, or "consumer" indicating the endpoint is used for external consumer. | ||
| // +optional | ||
| Usage string `json:"usage,omitempty"` | ||
|
|
||
|
|
||
| // protocol is the protocol used for the endpoint, could be https or grpc. | ||
| // +kubebuilder:default:=grpc | ||
| // +kubebuilder:validation:Enum=grpc;https | ||
| // +required | ||
| Protocol string `json:"protocol"` | ||
|
|
||
|
|
||
| // grpc represents the configuration for grpc endpoint. | ||
| GRPC *Endpoint `json:"grpc,omitempty"` | ||
|
|
||
|
|
||
| // https represents the configuration for https endpoint. | ||
| HTTPS *Endpoint `json:"https,omitempty"` | ||
| } | ||
|
|
||
|
|
||
| type Endpoint struct { | ||
| // type specifies how the endpoint is exposed. | ||
| // You may need to apply an object to expose the endpoint, for example: a route. | ||
| // TODO: support loadbalancer. | ||
| // +kubebuilder:default:=hostname | ||
| // +kubebuilder:validation:Enum=hostname | ||
| // +required | ||
| Type EndpointExposureType `json:"type"` | ||
|
|
||
|
|
||
| // hostname points to a fixed hostname for serving agents' handshakes. | ||
| // +optional | ||
| Hostname *HostnameConfig `json:"hostname,omitempty"` | ||
| } | ||
|
|
||
| ``` | ||
|
|
||
| The following are examples of how to configure `ClusterManager` to install the gRPC proxy. | ||
| With CSR registration, a proxy-enabled configuration would look like: | ||
|
|
||
| ```yaml | ||
| spec: | ||
| serverConfiguration: | ||
| imagePullSpec: <grpc image> | ||
| featureGates: | ||
| - feature: ClusterProxy | ||
| mode: Enabled | ||
| endpointsExposure: | ||
| - usage: consumer | ||
| protocol: https | ||
| https: | ||
| type: hostname | ||
| hostname: https://<external http server> | ||
| - usage: agentToHub | ||
| protocol: grpc | ||
| grpc: | ||
| type: hostname | ||
| hostname: grpc://<external grpc address> | ||
| ``` | ||
|
|
||
| Example of grpc registration with proxy enabled: | ||
|
|
||
| ```yaml | ||
| spec: | ||
| registrationConfiguration: | ||
| registrationDrivers: | ||
| - authType: csr | ||
| - authType: grpc | ||
| serverConfiguration: | ||
| imagePullSpec: <grpc image> | ||
| featureGates: | ||
| - feature: ClusterProxy | ||
| mode: Enabled | ||
| endpointsExposure: | ||
| - usage: consumer | ||
| protocol: https | ||
| https: | ||
| type: hostname | ||
| hostname: https://<external http server> | ||
| - usage: agentToHub | ||
| protocol: grpc | ||
| grpc: | ||
| type: hostname | ||
| hostname: grpc://<external grpc address> | ||
| ``` | ||
|
|
||
| Example of grpc registration with proxy disabled: | ||
|
|
||
| ```yaml | ||
| spec: | ||
| registrationConfiguration: | ||
| registrationDrivers: | ||
| - authType: csr | ||
| - authType: grpc | ||
| serverConfiguration: | ||
| imagePullSpec: <grpc image> | ||
| endpointsExposure: | ||
| - usage: agentToHub | ||
| protocol: grpc | ||
| grpc: | ||
| type: hostname | ||
| hostname: grpc://<external grpc address> | ||
| ``` | ||
|
|
||
| A `proxyConfig` field will also be added to the `Klusterlet` API: | ||
|
|
||
| ```go | ||
| type ProxyConfig struct { | ||
| // GRPC specifies the hub gRPC endpoint the agent connects to for the proxy. | ||
| GRPC *EndpointExposure `json:"grpcEndpoint"` | ||
|
|
||
| // Authentications defines how the agent authenticates with the cluster. | ||
| // By default it is userToken, but it could also be impersonation or both. | ||
| Authentications []string `json:"authentications,omitempty"` | ||
| } | ||
| ``` | ||
|
|
||
| An example of enabling proxy on klusterlet will be: | ||
|
|
||
| ```yaml | ||
| spec: | ||
| proxyConfig: | ||
| grpcEndpoint: | ||
| endpoint: <grpc://server address> | ||
| caBundle: <base64 encoded ca> | ||
| authentications: | ||
| - userToken | ||
| - impersonation | ||
| ``` | ||
|
|
||
| ### Test Plan | ||
|
|
||
| **Note:** *Section not required until targeted at a release.* | ||
|
|
||
| Consider the following in developing a test plan for this enhancement: | ||
| - When the feature is enabled, the proxy is installed correctly | ||
| - Users can access the target cluster via proxy | ||
| - Users can authenticate using either token or impersonation methods | ||
|
|
||
| ### Graduation Criteria | ||
|
|
||
| **Alpha:** | ||
| - Proxy can be installed and function correctly | ||
| - Proxy works with each registration driver | ||
|
|
||
| **Beta:** | ||
| - At least two consumers are using this feature or migrating to it from the cluster-proxy addon | ||
| - clusteradm is updated to adopt this feature | ||
| - Integrate with cluster-inventory API. | ||
| - End-to-end tests ensure all use cases of existing cluster-proxy addon are covered | ||
|
|
||
| **GA (Graduate):** | ||
|
||
| - All consumers have migrated from cluster-proxy addon to this feature. | ||
| - Metrics are defined and exposed for the proxy server. | ||
| - Scalability testing and performance analysis on throughput, resource consumption, and concurrent requests are completed. | ||
|
|
||
| ### Upgrade / Downgrade Strategy | ||
|
|
||
| TBD | ||
|
|
||
| ### Version Skew Strategy | ||
|
|
||
| There should be no version compatibility issues. | ||
|
|
||
| ## Implementation History | ||
|
|
||
| N/A | ||
|
|
||
| ## Drawbacks | ||
|
|
||
| The proxy agent may consume excessive bandwidth, potentially impacting the existing registration process. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| Enhance the cluster-proxy addon instead of moving it into core. However, since the proxy is becoming a fundamental | ||
| function in OCM, maintaining it as a core function will ensure better stability and quality. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| title: workload-completion | ||
| authors: | ||
| - "@qiujian16" | ||
| reviewers: | ||
| - "@skeeey" | ||
| - "@zhujian7" | ||
| - "@tamal" | ||
| - "@xuezhaojun" | ||
| approvers: | ||
| - "@jnpacker" | ||
| creation-date: 2025-02-24 | ||
| last-updated: 2025-02-24 | ||
| status: provisional | ||
| see-also: | ||
| - "/enhancements/sig-architecture/14-addon-cluster-proxy" | ||
| - "/enhancements/sig-architecture/141-grpc-based-registration" |