ADR: Listener Operator #256
Merged
Changes from 14 commits
35 commits
4a225e6
Added initial snippet
ddb99d4
More text
1c66f91
More text
505930d
More text
113439c
More text
1a7245b
fixed typos and formatting
c4b49da
Update modules/contributor/pages/adr/ADR000-WIP.adoc
fhennig a671c38
Update modules/contributor/pages/adr/ADR000-WIP.adoc
fhennig ae4f440
Added static config files problem
fhennig 1800d2f
Added calico, ARP notes
fhennig 399776d
Clarification
fhennig 8a1ece9
Many updates
fhennig 5f2c789
Many updates
fhennig eb60a79
Merge branch 'main' into lb-operator-adr
fhennig 729a25b
Clarified how clients connect
fhennig e096191
Added note on the name
fhennig 5ef9683
Added a more explicit notes on considered alternatives
fhennig 14d0861
Clarification on 'single address'
fhennig 9c69eee
Expanded context
fhennig 34f8d66
Expanded context
fhennig 62017af
Merge branch 'main' into lb-operator-adr
fhennig c0ae76a
Added authors etc.
fhennig e08f2fa
Merge remote-tracking branch 'refs/remotes/origin/lb-operator-adr' in…
fhennig 4c3f98f
Update modules/contributor/pages/adr/ADR000-WIP.adoc
fhennig 215bce8
Added CRD examples and something about node failure
fhennig 1f323df
Added something on external IPs
fhennig 5c4a97d
Added something about role LoadBalances
fhennig d1a7e8a
Renamed the file and added it to the menu aus ADR024
fhennig 5f7d844
Update modules/contributor/pages/adr/ADR024-out-of-cluster_access.adoc
fhennig 0af83dd
Update modules/contributor/pages/adr/ADR024-out-of-cluster_access.adoc
fhennig 7d45d32
Update modules/contributor/pages/adr/ADR024-out-of-cluster_access.adoc
fhennig 03f2b23
Update modules/contributor/pages/adr/ADR024-out-of-cluster_access.adoc
fhennig d9c7fc4
Some changes
fhennig 85cacac
Update modules/contributor/pages/adr/ADR024-out-of-cluster_access.adoc
fhennig f0cdb20
Merge branch 'main' into lb-operator-adr
fhennig
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
= How to provide stable out-of-cluster access to products | ||
Felix Hennig <felix.hennig@stackable.tech> | ||
v0.1, YYYY-MM-DD | ||
:status: draft | ||
|
||
* Status: {status} | ||
* Deciders: [list everyone involved in the decision] <!-- optional --> | ||
* Date: [YYYY-MM-DD when the decision was last updated] <!-- optional --> | ||
|
||
Technical Story: [description | ticket/issue URL] <!-- optional --> | ||
|
||
== Context and Problem Statement | ||
// Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to articulate the problem in form of a question. | ||
|
||
Eventually, the products we host in Kubernetes will need to be accessed from outside of the cluster. Our current solution for this is NodePort services. However, the IP and port can change if a Pod is rescheduled to a different node or if a ProductCluster is restarted. | |
|
||
Furthermore, some products like HDFS and Kafka don't use a single router or portal node to access the cluster, but have the client access multiple nodes. For example, HDFS name nodes tell the client where data can be found (hostname/IP and port), and the client is then expected to connect directly to a specific data node. Kafka behaves similarly for the brokers that host topic partitions. | |
|
||
Problems: | ||
|
||
* **Unstable addresses** - Clients need stable addresses to connect to, but Kubernetes can move pods around. While the discovery ConfigMap is updated, it's not feasible to ask the client to pull the new info from there every time. Clients will want to use static config files with static addresses to connect to. | |
* **Replicas not addressable** - In our current setup, there's no way to connect to a specific replica in a StatefulSet or Deployment, which is necessary for cases like the data nodes of HDFS. | |
* **Pods don't know their outside address** - The hostname and IP that the pods know about themselves are from _inside_ the cluster. The IP only works inside the overlay network. This means ProductCluster processes cannot refer outside clients to other nodes of the cluster. | |
|
||
== Decision Drivers | ||
// Which criteria are useful to evaluate solutions? | ||
|
||
* At least for HDFS, connections to individual pods will be used to transmit data, so performance is relevant. | |
* On-prem customers will often not have any kind of network-level load balancing (at least not one that is configurable by K8s). | ||
* Cloud customers will often have relatively short-lived K8s nodes. | ||
* The solution should be minimally invasive - no large setups required outside of the cluster. | ||
|
||
|
||
== Implemented Solution | ||
|
||
|
||
A new resource is proposed: Listener. It is handled similarly to storage. There are ListenerClasses for different types of Listeners, analogous to StorageClasses. There are Listener objects, similar to PersistentVolumes. And claims to Listeners are made in ProductCluster objects. | |
|
||
Under the hood, a listener-operator runs as a CSI driver with a new `listener.stackable.tech` type. The product operator converts Listener claims in the ProductCluster resource into PersistentVolumeClaims (PVCs) of this storage type. Listener settings are passed along as annotations on the PVC. Initially there will be two Listener types: `private`, implemented with NodePorts, and `public`, implemented with LoadBalancers. The listener-operator creates the Listeners according to the PVC settings and exposes the listener info through the PV that is mounted into the pods owning the PVCs. | |
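As a rough illustration of the conversion described above, the sketch below builds the kind of PVC manifest a product operator might emit for a Listener claim. Only the `listener.stackable.tech` name comes from this ADR. The annotation key, the storage class name, the helper name `build_listener_pvc`, the access mode, and the storage request size are hypothetical placeholders chosen for the example.

```python
import json

# Hypothetical annotation key; the ADR only says settings travel as PVC annotations.
LISTENER_CLASS_ANNOTATION = "listener.stackable.tech/listener-class"
# Assumed storage class name, reusing the CSI driver type named in the ADR.
LISTENER_STORAGE_CLASS = "listener.stackable.tech"


def build_listener_pvc(name: str, listener_class: str) -> dict:
    """Return a PersistentVolumeClaim manifest (as a dict) for a listener volume."""
    if listener_class not in ("private", "public"):
        raise ValueError(f"unknown listener class: {listener_class!r}")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": name,
            "annotations": {LISTENER_CLASS_ANNOTATION: listener_class},
        },
        "spec": {
            "storageClassName": LISTENER_STORAGE_CLASS,
            "accessModes": ["ReadWriteMany"],
            # A PVC must carry a storage request even though no real storage is used here.
            "resources": {"requests": {"storage": "1"}},
        },
    }


if __name__ == "__main__":
    pvc = build_listener_pvc("listener-hdfs-datanode-0", "private")
    print(json.dumps(pvc, indent=2))
```

Keeping the listener setting in an annotation means the PVC spec itself stays a plain storage request, so the regular Kubernetes volume machinery can carry it unchanged.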
|
||
|
||
Communication flow example using the HDFS Operator: | ||
|
||
* An HDFS cluster resource is created by the user, with a `private` listener setting. | |
* The HDFS Operator requests a PVC of the `listener.stackable.tech` type with an annotation to create a `private` listener. | |
* The listener-operator provisions a NodePort Service for the volume request, which means a Service per Pod. It reads the NodePort IP and port. | |
* The listener-operator provisions the volumes with files containing information about the pod's outside address and port, i.e. the IP and port of the NodePort Service. Because of the PVC, it knows which pod the volume will be mounted into and can find out the NodePort that belongs to the pod. | |
* The HDFS Operator has already provisioned the pod with a script that reads the files from the mounted volume into environment variables, which are then read by HDFS. This part is product specific. | |
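The last step of the flow can be sketched as follows. This is a minimal illustration, not the HDFS Operator's actual script: the file names `address` and `port`, the `LISTENER_` variable prefix, and both function names are assumptions made for the example, since the ADR does not specify the file layout of the volume.

```python
import os
import tempfile
from pathlib import Path


def read_listener_files(mount_dir: str) -> dict:
    """Read every regular file in the mounted listener volume into a dict.

    Assumes (hypothetically) one value per file, e.g. `address` and `port`.
    """
    values = {}
    for entry in sorted(Path(mount_dir).iterdir()):
        if entry.is_file():
            values[entry.name] = entry.read_text().strip()
    return values


def to_env(values: dict, prefix: str = "LISTENER_") -> dict:
    """Map file names to variable names, e.g. `address` becomes `LISTENER_ADDRESS`."""
    return {
        prefix + name.upper().replace("-", "_"): value
        for name, value in values.items()
    }


if __name__ == "__main__":
    # A temporary directory stands in for the volume mounted into the pod.
    with tempfile.TemporaryDirectory() as mount_dir:
        Path(mount_dir, "address").write_text("203.0.113.10\n")
        Path(mount_dir, "port").write_text("31234\n")
        env = to_env(read_listener_files(mount_dir))
        os.environ.update(env)
        print(env)
```

How the resulting variables are fed into the product configuration stays product specific, as noted in the last step of the flow.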
|
||
The way the product operator requests the volume is identical for all pods of a StatefulSet/Deployment: it always requests a volume with the type (e.g. `nodeport`) that was configured in the ProductCluster. | |
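For the StatefulSet case, one identical claim template still yields a separate PVC, and therefore a separate listener volume, per pod. The sketch below reproduces the naming scheme Kubernetes uses for PVCs created from a StatefulSet `volumeClaimTemplate` (`<template>-<statefulset>-<ordinal>`). The template and StatefulSet names are hypothetical, and the sketch does not cover Deployments.

```python
def pvc_names(template_name: str, statefulset_name: str, replicas: int) -> list:
    """PVC names Kubernetes derives from one volumeClaimTemplate of a StatefulSet."""
    return [
        f"{template_name}-{statefulset_name}-{ordinal}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    # Hypothetical names: a "listener" claim template on an "hdfs-datanode" StatefulSet.
    for name in pvc_names("listener", "hdfs-datanode", 3):
        print(name)
```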
|
||
== Decision Outcome | ||
|
||
There is only one design, which is already being implemented. | |
|
||
|
||
Pros: | ||
|
||
* There is little routing overhead (compared to proxying or similar). | ||
* The listener-operator can be extended to support more ListenerClasses. | ||
* It is a very low-friction solution that doesn't require a lot of permissions to set up. | ||
|
||
Cons: | ||
|
||
* Products like HDFS and Kafka only support a single address. This means that if outside access with the listener-operator is configured, all traffic will be routed that way. | |
|
||
* It is another operator deployed as a DaemonSet, which means one more component running on the nodes. It is also not clear how we will get this certified with OpenShift. | |
|
||
|
||
== Other notes | ||
|
||
=== Spiked Alternatives: MetalLB, Calico | ||
See: https://metallb.universe.tf/, https://www.tigera.io/project-calico/ | ||
|
||
MetalLB is a bare metal load balancer that was spiked briefly. However, it requires BGP/ARP integration, which is not feasible as a requirement for customer installations. Calico requires BGP. | |
|
||
With ARP, the LoadBalancers appear as "real" IP addresses in the same subnet as the nodes (with no need to configure custom routing rules). However, this scales poorly (it assumes that all nodes are in the same L2 broadcast domain) and is relatively likely to be blocked by firewalls or network policy. |