Support simple ping mesh in Agent (#6120)
We introduce a new feature to measure inter-Node latency in a K8s
cluster running Antrea. The feature is currently Alpha and uses the
NodeLatencyMonitor FeatureGate.
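
As a minimal sketch, the gate could be turned on in the Agent configuration as shown below. It assumes the `featureGates` map layout of `antrea-agent.conf` touched by this commit, where the entry ships commented out as `# NodeLatencyMonitor: false`; how the configuration is delivered (Helm values or an edited ConfigMap) depends on the installation.

```yaml
# antrea-agent.conf (fragment); the gate defaults to false while the feature is Alpha
featureGates:
  NodeLatencyMonitor: true
```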

In addition to the FeatureGate, enablement of the feature is controlled
by a new CRD, called NodeLatencyMonitor. This CRD supports at most one
CR instance, which must be named "default". When the CR exists, Antrea
Agents will start "pinging" each other to take latency measurements.
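
For illustration, a CR that satisfies the CRD schema added in this commit might look like the following. The group, version, kind, required name, and `pingIntervalSeconds` field all come from the CRD below; the value 60 is simply the schema default, not a recommendation.

```yaml
# The CRD is cluster-scoped, so no namespace is set.
apiVersion: crd.antrea.io/v1alpha1
kind: NodeLatencyMonitor
metadata:
  name: default            # the schema only accepts the name "default"
spec:
  pingIntervalSeconds: 60  # required; must be at least 1
```

Because the CRD registers the short name `nlm`, the object should be listable with `kubectl get nlm` once it has been created.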

Each Agent only stores the latest measured value (at least at the
moment); we do not store time series data.

We support both IPv4 and IPv6. When an overlay is used by Antrea, the
ping is sent over the tunnel (by using the gateway IP as the
destination).

This change does not add any functionality besides collecting latency
data at each Agent. A follow-up change will take care of reporting the
latency data to the Antrea Controller, so it can be consumed via an
APIService.

For #5514

Signed-off-by: IRONICBo <boironic@gmail.com>
Signed-off-by: Asklv <boironic@gmail.com>
IRONICBo authored May 31, 2024
1 parent f7cce73 commit c3103a9
Showing 28 changed files with 1,860 additions and 10 deletions.
3 changes: 3 additions & 0 deletions build/charts/antrea/conf/antrea-agent.conf
@@ -88,6 +88,9 @@ featureGates:
# Enable L7FlowExporter on Pods and Namespaces to export the application layer flows such as HTTP flows.
{{- include "featureGate" (dict "featureGates" .Values.featureGates "name" "L7FlowExporter" "default" false) }}

# Enable NodeLatencyMonitor to monitor the latency between Nodes.
{{- include "featureGate" (dict "featureGates" .Values.featureGates "name" "NodeLatencyMonitor" "default" false) }}

# Name of the OpenVSwitch bridge antrea-agent will create and use.
# Make sure it doesn't conflict with your existing OpenVSwitch bridges.
ovsBridge: {{ .Values.ovs.bridgeName | quote }}
48 changes: 48 additions & 0 deletions build/charts/antrea/crds/nodelatencymonitor.yaml
@@ -0,0 +1,48 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodelatencymonitors.crd.antrea.io
spec:
  group: crd.antrea.io
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - pingIntervalSeconds
              properties:
                pingIntervalSeconds:
                  type: integer
                  format: int32
                  minimum: 1
                  description: "Ping interval in seconds, must be at least 1."
                  default: 60
            metadata:
              type: object
              properties:
                name:
                  type: string
                  pattern: '^default$'
      additionalPrinterColumns:
        - description: Specifies the interval between pings.
          jsonPath: .spec.pingIntervalSeconds
          name: PingIntervalSeconds
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
  scope: Cluster
  names:
    plural: nodelatencymonitors
    singular: nodelatencymonitor
    kind: NodeLatencyMonitor
    shortNames:
      - nlm
1 change: 1 addition & 0 deletions build/charts/antrea/templates/agent/clusterrole.yaml
@@ -174,6 +174,7 @@ rules:
- externalippools
- ippools
- trafficcontrols
- nodelatencymonitors
verbs:
- get
- watch
59 changes: 57 additions & 2 deletions build/yamls/antrea-aks.yml
@@ -2686,6 +2686,57 @@ spec:
# Deprecated shortName and shall be removed in Antrea v1.14.0
- anp

---
# Source: crds/nodelatencymonitor.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodelatencymonitors.crd.antrea.io
spec:
  group: crd.antrea.io
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - pingIntervalSeconds
              properties:
                pingIntervalSeconds:
                  type: integer
                  format: int32
                  minimum: 1
                  description: "Ping interval in seconds, must be at least 1."
                  default: 60
            metadata:
              type: object
              properties:
                name:
                  type: string
                  pattern: '^default$'
      additionalPrinterColumns:
        - description: Specifies the interval between pings.
          jsonPath: .spec.pingIntervalSeconds
          name: PingIntervalSeconds
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
  scope: Cluster
  names:
    plural: nodelatencymonitors
    singular: nodelatencymonitor
    kind: NodeLatencyMonitor
    shortNames:
      - nlm

---
# Source: crds/supportbundlecollection.yaml
apiVersion: apiextensions.k8s.io/v1
@@ -3624,6 +3675,9 @@ data:
# Enable L7FlowExporter on Pods and Namespaces to export the application layer flows such as HTTP flows.
# L7FlowExporter: false
# Enable NodeLatencyMonitor to monitor the latency between Nodes.
# NodeLatencyMonitor: false
# Name of the OpenVSwitch bridge antrea-agent will create and use.
# Make sure it doesn't conflict with your existing OpenVSwitch bridges.
ovsBridge: "br-int"
@@ -4259,6 +4313,7 @@ rules:
- externalippools
- ippools
- trafficcontrols
- nodelatencymonitors
verbs:
- get
- watch
@@ -4920,7 +4975,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 30843b57762c91dfcffb560917191e3bc7e662c06552759bac2a173bc060b82c
checksum/config: 47a8888bb99a5b1a08dea61e9315bacf613d869d718712ad0eb9964bb73dc0ec
labels:
app: antrea
component: antrea-agent
@@ -5158,7 +5213,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 30843b57762c91dfcffb560917191e3bc7e662c06552759bac2a173bc060b82c
checksum/config: 47a8888bb99a5b1a08dea61e9315bacf613d869d718712ad0eb9964bb73dc0ec
labels:
app: antrea
component: antrea-controller
49 changes: 49 additions & 0 deletions build/yamls/antrea-crds.yml
@@ -2667,6 +2667,55 @@ spec:
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodelatencymonitors.crd.antrea.io
spec:
  group: crd.antrea.io
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - pingIntervalSeconds
              properties:
                pingIntervalSeconds:
                  type: integer
                  format: int32
                  minimum: 1
                  description: "Ping interval in seconds, must be at least 1."
                  default: 60
            metadata:
              type: object
              properties:
                name:
                  type: string
                  pattern: '^default$'
      additionalPrinterColumns:
        - description: Specifies the interval between pings.
          jsonPath: .spec.pingIntervalSeconds
          name: PingIntervalSeconds
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
  scope: Cluster
  names:
    plural: nodelatencymonitors
    singular: nodelatencymonitor
    kind: NodeLatencyMonitor
    shortNames:
      - nlm
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: supportbundlecollections.crd.antrea.io
spec:
59 changes: 57 additions & 2 deletions build/yamls/antrea-eks.yml
@@ -2686,6 +2686,57 @@ spec:
# Deprecated shortName and shall be removed in Antrea v1.14.0
- anp

---
# Source: crds/nodelatencymonitor.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodelatencymonitors.crd.antrea.io
spec:
  group: crd.antrea.io
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - pingIntervalSeconds
              properties:
                pingIntervalSeconds:
                  type: integer
                  format: int32
                  minimum: 1
                  description: "Ping interval in seconds, must be at least 1."
                  default: 60
            metadata:
              type: object
              properties:
                name:
                  type: string
                  pattern: '^default$'
      additionalPrinterColumns:
        - description: Specifies the interval between pings.
          jsonPath: .spec.pingIntervalSeconds
          name: PingIntervalSeconds
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
  scope: Cluster
  names:
    plural: nodelatencymonitors
    singular: nodelatencymonitor
    kind: NodeLatencyMonitor
    shortNames:
      - nlm

---
# Source: crds/supportbundlecollection.yaml
apiVersion: apiextensions.k8s.io/v1
@@ -3624,6 +3675,9 @@ data:
# Enable L7FlowExporter on Pods and Namespaces to export the application layer flows such as HTTP flows.
# L7FlowExporter: false
# Enable NodeLatencyMonitor to monitor the latency between Nodes.
# NodeLatencyMonitor: false
# Name of the OpenVSwitch bridge antrea-agent will create and use.
# Make sure it doesn't conflict with your existing OpenVSwitch bridges.
ovsBridge: "br-int"
@@ -4259,6 +4313,7 @@ rules:
- externalippools
- ippools
- trafficcontrols
- nodelatencymonitors
verbs:
- get
- watch
@@ -4920,7 +4975,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 30843b57762c91dfcffb560917191e3bc7e662c06552759bac2a173bc060b82c
checksum/config: 47a8888bb99a5b1a08dea61e9315bacf613d869d718712ad0eb9964bb73dc0ec
labels:
app: antrea
component: antrea-agent
@@ -5159,7 +5214,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 30843b57762c91dfcffb560917191e3bc7e662c06552759bac2a173bc060b82c
checksum/config: 47a8888bb99a5b1a08dea61e9315bacf613d869d718712ad0eb9964bb73dc0ec
labels:
app: antrea
component: antrea-controller