
RFD0185: Kubernetes Access without Certificate-based Routing #47436

Open · wants to merge 8 commits into master
Conversation

@strideynet (Contributor) commented on Oct 10, 2024:

Part of #40405

Proposal to replace certificate-attribute-based routing in Kubernetes Access for performance and usability improvements. There's a very rough patch of what this change would look like at 47b55a8, courtesy of @tigrato.


## Security Considerations

### Per-session MFA
@strideynet (author) commented:

My outstanding concern here relates to whether or not Per-Session MFA is meant to be locked to a specific target.

Today, the `KubernetesCluster` attribute seemingly has the side effect of locking Per-Session MFA certificates to a specific target cluster. Looking back at the original Per-Session MFA RFD, the design does state that this should be the case, but says that a "TargetName" attribute should be used to enforce it - and I can't find any evidence of that in the codebase today.

If locking Per-Session MFA certs to a specific cluster is intended/required, then we can retain the `KubernetesCluster` attribute, but make this intention more explicit by moving the enforcement code into the authorizer rather than keeping it in the router/forwarder.

Comment on lines +58 to +59
> encode attributes of an identity - information about which cluster you wish to
> connect to is not an attribute of identity!
A contributor commented:

I will approve the RFD just because of this comment

```yaml
apiVersion: v1
clusters:
- cluster:
```
A contributor commented:

It's important to note that this setup requires a separate cluster entry for each Kubernetes cluster. Currently, our implementation assigns one cluster entry per Teleport cluster rather than per Kubernetes cluster. This has a significant drawback: when two different Kubernetes clusters are part of the same Teleport cluster, kubectl caches objects in the same directory, `.kube/cache/discovery/teleport_proxy_port`. This can lead to major issues due to version mismatches or incompatible CRDs.

The proposed solution addresses all these problems, because kubectl will start caching data in `.kube/cache/discovery/teleport_proxy_port/v1/teleport/base64_teleport_cluster_name/base64_kube_cluster_name`.

Check #12731
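To make the caching behaviour concrete, here is a minimal sketch of a kubeconfig with one cluster entry per Kubernetes cluster. The hostname `teleport.example.com` and the exact path layout are assumptions inferred from the cache path quoted above, not from the RFD:

```yaml
apiVersion: v1
kind: Config
clusters:
# One entry per Kubernetes cluster: the per-cluster path in the server URL
# means kubectl writes discovery cache data into distinct directories under
# ~/.kube/cache/discovery/ instead of sharing one directory per proxy.
- cluster:
    server: https://teleport.example.com:443/v1/teleport/<base64-root-cluster>/<base64-kube-cluster-a>
  name: root-cluster-a
- cluster:
    server: https://teleport.example.com:443/v1/teleport/<base64-root-cluster>/<base64-kube-cluster-b>
  name: root-cluster-b
contexts:
- context: {cluster: root-cluster-a, user: teleport}
  name: root-cluster-a
- context: {cluster: root-cluster-b, user: teleport}
  name: root-cluster-b
current-context: root-cluster-a
users:
- name: teleport
  user:
    client-certificate: /path/to/tls.crt  # credentials issued by tsh/tbot
    client-key: /path/to/tls.key
```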

> Kubernetes clients:
>
> - `kubectl`
> - Python Kubernetes SDK
A contributor commented:

It's worth checking the JavaScript client too, as we have some users that also use it for automation.

> It would no longer be possible to track usage of Kubernetes Access for analytics
> purposes via the certificate generation event. However, we have other events
> that indicate Kubernetes Access usage (e.g. an event per request) that are
> better suited for this purpose - so this seems acceptable.
A contributor commented:

Note that when kubectl performs discovery, it requests a single certificate but retrieves all API resource definitions. In theory, this could generate duplicate events, but Teleport disregards discovery requests for usage purposes, so nothing will change and the metrics won't be duplicated.

As for billing: since we count Kubernetes requests, this behavior won't affect anything.

```yaml
# are provided, then the output will contain the union of the results of the
# selectors.
selectors:
- name: my-cluster
```
A contributor commented:

Will this generate a separate kubeconfig for each cluster, or a single configuration that includes all clusters?

I prefer the latter, but it might be worth mentioning or offering both options. We should also pay attention to context and file naming. By default, tsh creates context entries using the format `{{TeleportCluster}}-{{KubeCluster}}`, but this can be customized by providing a template with the context override flag. It would be useful to allow users to adjust these settings to customize their outputs.

@strideynet (author) replied:

Yeah - following discussion with [CUSTOMER] - I figure it's easier to give an option between a single kubeconfig and a kubeconfig per cluster. I'm sure once you get into the thousands-of-clusters range, certain implementations of the Kube SDK will probably dislike the size of the config file.

> By default, tsh creates context entries using the format `{{TeleportCluster}}-{{KubeCluster}}`, but this can be customized by providing a template with the context override flag. It would be useful to allow users to adjust these settings to customize their outputs.

This seems reasonable - we use the same context structure in tbot at the moment, FWIW.
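For illustration, a hypothetical extension of the selector config quoted above, covering both output modes and the context-name template. The `output_mode` and `context_name_template` field names are my own sketch, not from the RFD:

```yaml
selectors:
- name: my-cluster
- labels:
    env: production
# Hypothetical: "single" emits one kubeconfig containing all matched clusters,
# while "per-cluster" emits one file per matched cluster.
output_mode: single
# Hypothetical: mirrors tsh's default {{TeleportCluster}}-{{KubeCluster}} naming.
context_name_template: "{{ .TeleportCluster }}-{{ .KubeCluster }}"
```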


> This has the following limitations:
>
> - We've already witnessed limited support for specifying SNIs in Kubernetes
A contributor commented:

The main challenge is that by using SNI for routing while supporting kubectl, Teleport would need to generate certificates for `*.rootCluster.kube.teleportproxy.com`, as well as for leaf clusters (`*.leafCluster.kube.teleportproxy.com`). Over time, this approach could introduce significant issues as the system evolves. Since we must support kubectl, the only viable solution is to send the identification via `tls-server-name`.
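For comparison, a sketch of what the `tls-server-name` approach looks like in a kubeconfig. The proxy address and the exact SNI prefix are illustrative, not confirmed by the RFD:

```yaml
apiVersion: v1
kind: Config
clusters:
- cluster:
    # All clusters share the proxy's public address and certificate...
    server: https://teleport.example.com:443
    # ...while the TLS handshake carries an explicit server name, avoiding
    # per-cluster wildcard certificates. Recent kubectl versions support
    # overriding the SNI/verification name via the tls-server-name field.
    tls-server-name: kube-teleport-proxy-alpn.teleport.example.com
  name: example-cluster
```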

@strideynet added the `no-changelog` label (indicates that a PR does not require a changelog entry) on Oct 11, 2024.
@strideynet marked this pull request as ready for review on Oct 14, 2024.
@github-actions added the `rfd` (Request for Discussion) and `size/md` labels on Oct 14, 2024.