add blog document for feature ServiceNodePortStaticSubrange
xuzhenglun committed Mar 22, 2023
1 parent 5a97479 commit 3eb86f1
Showing 1 changed file with 152 additions and 0 deletions.
@@ -0,0 +1,152 @@
---
layout: blog
title: "Kubernetes 1.27: Avoid Collisions Assigning Ports to NodePort Services"
date: 2023-03-08
slug: nodeport-dynamic-and-static-allocation
---

**Author:** xuzhenglun (Alibaba)

In Kubernetes, a Service can be used to provide a unified traffic endpoint for
applications running on a set of Pods. Clients can use the virtual IP address (VIP) provided
by the Service, and Kubernetes load-balances the traffic across the
back-end Pods. However, a ClusterIP-type Service is only reachable from
within the cluster; traffic from outside the cluster cannot be routed to it.
One way to solve this problem is to use a NodePort-type Service, which maps a port of the VIP
to a specific port on every node in the cluster, thus forwarding traffic from
outside the cluster to the inside.

## How are ports allocated to NodePort Services?

When a NodePort-type Service is created, its corresponding port is allocated in one of two ways.

**Dynamic**: If the Service type is `NodePort` and you do not set `service.spec.ports.nodePort`
explicitly, the Kubernetes control plane automatically allocates an unused port
to it at creation time.

**Static**: In addition to the dynamic auto-assignment described above, you can also
explicitly assign a port that is within the configured NodePort port range.

The value of `service.spec.ports.nodePort` must be unique across the cluster.
Attempting to create a NodePort-type Service with a port that is already allocated returns an error.
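
The two allocation modes and the uniqueness rule can be illustrated with a toy model. This is a minimal sketch, not the kube-apiserver implementation: the `NodePortAllocator` class and its method names are hypothetical, and picking a random free port for dynamic allocation is an assumption.

```python
import random


class NodePortAllocator:
    """Toy model of NodePort allocation (hypothetical, for illustration only)."""

    def __init__(self, start=30000, end=32767, seed=None):
        self.start, self.end = start, end
        self.used = set()
        self._rng = random.Random(seed)

    def allocate_static(self, port):
        """Reserve a caller-chosen port, as when nodePort is set explicitly."""
        if not self.start <= port <= self.end:
            raise ValueError(f"port {port} is outside {self.start}-{self.end}")
        if port in self.used:
            raise ValueError(f"port {port} is already allocated")
        self.used.add(port)
        return port

    def allocate_dynamic(self):
        """Pick any free port, as when nodePort is left unset (assumed random)."""
        free = [p for p in range(self.start, self.end + 1) if p not in self.used]
        if not free:
            raise RuntimeError("port range is exhausted")
        port = self._rng.choice(free)
        self.used.add(port)
        return port


allocator = NodePortAllocator(seed=0)
first = allocator.allocate_static(30009)   # static: the caller names the port
second = allocator.allocate_dynamic()      # dynamic: the allocator picks one
try:
    allocator.allocate_static(30009)       # same port again: rejected
except ValueError as err:
    conflict = str(err)
```

In this model the second static request for `30009` fails in the same way the API rejects a Service whose `nodePort` is already in use.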

## Why do you need to reserve ports for NodePort Services?

Sometimes, you may want a NodePort-type Service to run on a well-known port
so that other components and users inside and outside the cluster can rely on it.

In some complex cluster deployments with a mix of Kubernetes nodes and non-Kubernetes nodes, it may be
necessary to rely on pre-defined ports for communication. In particular, some fundamental
components cannot rely on the VIPs provided by LoadBalancer-type Services,
because the load balancer that provides those VIPs may itself depend on these fundamental components.

Now suppose you need to expose a MinIO object storage service running on Kubernetes to clients located on
non-Kubernetes nodes, and the agreed port is `30009`. You need to create a Service as follows:

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: minio
  name: minio
  namespace: kube-system
spec:
  ports:
  - name: http
    nodePort: 30009
    port: 9000
    protocol: TCP
    targetPort: 9000
  selector:
    app: minio
  type: NodePort
```
However, as mentioned before, if the port `30009` required by the minio Service is not reserved,
and another NodePort-type Service is created and dynamically allocated a port before or concurrently
with the minio Service, port `30009` may be allocated to that other Service. Creation of the minio
Service then fails because of the port conflict.
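
To get a rough feel for how likely such a conflict is, the sketch below assumes that every dynamic allocation picks uniformly at random among the free ports. That is an assumption about allocator behavior made for this estimate, and `collision_probability` is a hypothetical helper. Under that assumption, after `n` dynamic allocations from a range of `N` ports, any one specific port has been taken with probability `n / N`.

```python
from fractions import Fraction


def collision_probability(range_size, dynamic_allocations):
    """Chance that one specific port was taken by earlier dynamic allocations.

    Assumes each dynamic allocation picks uniformly among the free ports,
    so n picks out of N ports cover any given port with probability n/N.
    """
    n = min(dynamic_allocations, range_size)
    return Fraction(n, range_size)


# The default range 30000-32767 holds 2768 ports.
p_100 = collision_probability(2768, 100)
p_1000 = collision_probability(2768, 1000)
print(f"{float(p_100):.1%}, {float(p_1000):.1%}")  # prints 3.6%, 36.1%
```

Even a modest number of dynamically allocated Services makes it noticeably likely that an unreserved, agreed-upon port is already gone.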

## How can you avoid NodePort Service port conflicts?

Since Kubernetes 1.24, the CIDR used for Service ClusterIP addresses has been divided into two blocks that use
different allocation policies to reduce the risk of conflicts. In Kubernetes 1.27, a similar policy can be adopted
for NodePort. You can enable a new feature gate, `ServiceNodePortStaticSubrange`. Turning it on lets you
use a different port allocation strategy for NodePort-type Services and reduces the risk of collision.

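The port range for NodePort Services is divided into two bands. The size of the lower (static) band is given by the
formula `min(max(16, nodeport-size / 32), 128)`, where `nodeport-size` is the number of ports in the range. The result
can be described as _never less than 16 or more than 128, with a graduated step between them_.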
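
The formula can be written as a small function. This is a sketch rather than the upstream code: `static_band_size` is a hypothetical name, the division is treated as integer division, and the rule that a range of 16 ports or fewer gets no static band is an assumption inferred from the "very small range" example in the Examples section.

```python
def static_band_size(range_size):
    """Size of the static band: min(max(16, range_size // 32), 128).

    Assumption: a range of 16 ports or fewer is too small to split,
    so it gets no static band at all.
    """
    if range_size <= 16:
        return 0
    return min(max(16, range_size // 32), 128)


# (first port, last port) pairs taken from the Examples section.
for first, last in [(30000, 32767), (30000, 30015), (30000, 30127), (30000, 34095)]:
    size = last - first + 1
    print(f"{first}-{last}: {size} ports, static band of {static_band_size(size)}")
```

The static band then covers the ports from the start of the range up to `start + static_band_size - 1`.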

Dynamic port assignment uses the upper band by default; once that band has been exhausted, it falls back to the lower band.
This lets users make static allocations in the lower band with a low risk of collision.
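
The preference for the upper band can be sketched as follows. The `allocate_dynamic` function is hypothetical and only models the ordering described above; choosing randomly within a band is an assumption.

```python
import random


def allocate_dynamic(start, end, offset, used, rng=random):
    """Pick a free port, preferring the dynamic (upper) band.

    Ports start..start+offset-1 form the static band; the rest is the
    dynamic band. The static band is used only once the dynamic band is full.
    """
    upper = [p for p in range(start + offset, end + 1) if p not in used]
    lower = [p for p in range(start, start + offset) if p not in used]
    candidates = upper or lower
    if not candidates:
        raise RuntimeError("port range is exhausted")
    port = rng.choice(candidates)
    used.add(port)
    return port


# Range 30000-30127 with a static band of 16 ports (30000-30015).
used = set()
picks = [allocate_dynamic(30000, 30127, 16, used) for _ in range(112)]
spill = allocate_dynamic(30000, 30127, 16, used)  # the upper band is now full
```

The first 112 allocations all land in 30016-30127. Only the 113th spills into the static band, which is why a statically chosen port in the lower band is unlikely to be taken.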

## Examples

### default range: 30000-32767
- `service-node-port-range`: 30000-32767
- Band offset: `min(max(16, 2768/32), 128)` = `min(max(16, 86), 128)` = `min(86, 128)` = 86
- Static band start: 30000
- Static band end: 30085
- Range end: 32767

{{< mermaid >}}
pie showData
title 30000-32767
"Static" : 86
"Dynamic" : 2682
{{< /mermaid >}}

### very small range: 30000-30015
- `service-node-port-range`: 30000-30015
- Band offset: 0
- Static band: none (with an offset of 0, the whole range is dynamic)
- Range end: 30015

{{< mermaid >}}
pie showData
title 30000-30015
"Static" : 0
"Dynamic" : 16
{{< /mermaid >}}

### small (lower boundary) range: 30000-30127
- `service-node-port-range`: 30000-30127
- Band offset: `min(max(16, 128/32), 128)` = `min(max(16, 4), 128)` = `min(16, 128)` = 16
- Static band start: 30000
- Static band end: 30015
- Range end: 30127

{{< mermaid >}}
pie showData
title 30000-30127
"Static" : 16
"Dynamic" : 112
{{< /mermaid >}}

### large (upper boundary) range: 30000-34095
- `service-node-port-range`: 30000-34095
- Band offset: `min(max(16, 4096/32), 128)` = `min(max(16, 128), 128)` = `min(128, 128)` = 128
- Static band start: 30000
- Static band end: 30127
- Range end: 34095

{{< mermaid >}}
pie showData
title 30000-34095
"Static" : 128
"Dynamic" : 3968
{{< /mermaid >}}

### very large range: 30000-38191
- `service-node-port-range`: 30000-38191
- Band offset: `min(max(16, 8192/32), 128)` = `min(max(16, 256), 128)` = `min(256, 128)` = 128
- Static band start: 30000
- Static band end: 30127
- Range end: 38191

{{< mermaid >}}
pie showData
title 30000-38191
"Static" : 128
"Dynamic" : 8064
{{< /mermaid >}}
