Errors of Gateway API Canary Deployment in A/B testing #1871

@pluniov99

Describe the bug

A Gateway API canary configured for A/B testing fails with an HTTPRoute validation error as soon as a new version is rolled out.

To Reproduce

Create a Gateway API canary for A/B testing as described in the documentation, using the following manifests:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-gateway
  namespace: default
spec:
  gatewayClassName: istio
  listeners:
  - name: http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Same
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-service
  namespace: default
  labels:
    app: test-service
spec:
  minReadySeconds: 5
  revisionHistoryLimit: 5
  progressDeadlineSeconds: 60
  strategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
  selector:
    matchLabels:
      app: test-service
  template:
    metadata:
      namespace: default
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9797"
      labels:
        app: test-service
    spec:
      containers:
      - name: podinfod
        image: ghcr.io/stefanprodan/podinfo:6.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 9898
          protocol: TCP
        - name: http-metrics
          containerPort: 9797
          protocol: TCP
        - name: grpc
          containerPort: 9999
          protocol: TCP
        command:
        - ./podinfo
        - --port=9898
        - --port-metrics=9797
        - --grpc-port=9999
        - --grpc-service-name=podinfo
        - --level=info
        - --random-delay=false
        - --random-error=false
        env:
        - name: PODINFO_UI_COLOR
          value: "#34577c"
        livenessProbe:
          exec:
            command:
            - podcli
            - check
            - http
            - localhost:9898/healthz
          initialDelaySeconds: 5
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command:
            - podcli
            - check
            - http
            - localhost:9898/readyz
          initialDelaySeconds: 5
          timeoutSeconds: 5
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-service
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  minReplicas: 1
  maxReplicas: 1
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # scale up if usage is above
          # averageUtilization% of the requested CPU
          averageUtilization: 99
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-service
  namespace: default
spec:
  provider: gatewayapi:v1
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: test-service
  service:
    # service port number
    port: 80
    # container port number or name (optional)
    targetPort: 9898
    # Reference to the Gateway that the generated HTTPRoute would attach to.
    gatewayRefs:
      - name: test-gateway
        namespace: default
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # total number of iterations
    iterations: 10
    # max number of failed iterations before rollback
    threshold: 2
    # canary match condition
    match:
      - headers:
          test: 
            exact: "test-service"

As a result, Flagger generates all of the expected resources:

# generated 
deployment.apps/test-service-primary
horizontalpodautoscaler.autoscaling/test-service-primary
service/test-service
service/test-service-canary
service/test-service-primary
httproutes.gateway.networking.k8s.io/test-service

The canary reaches the Initialized status and all resources are ready. The generated HTTPRoute has the following spec:

spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: test-gateway
    namespace: default
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: test-service-primary
      port: 80
      weight: 100
    - group: ""
      kind: Service
      name: test-service-canary
      port: 80
      weight: 0
    matches:
    - headers:
      - name: test
        type: Exact
        value: test-service
      path:
        type: PathPrefix
        value: /
  - backendRefs:
    - group: ""
      kind: Service
      name: test-service-primary
      port: 80
      weight: 100
    matches:
    - path:
        type: PathPrefix
        value: /

After that, when we roll out a new version of the deployment, the Flagger logs show the following error:

HTTPRoute test-service.default update error: HTTPRoute.gateway.networking.k8s.io "test-service" is invalid: spec.rules[1].timeouts.request: Invalid value: "": spec.rules[1].timeouts.request in body should match '^([0-9]{1,5}(h|m|s|ms)){1,4}$' while setting weights

The canary then stays stuck in the Progressing status.
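
For context, the error says that `spec.rules[1].timeouts.request` was submitted as an empty string, which cannot satisfy the duration pattern quoted in the message. Assuming the rule order is unchanged from the spec shown earlier, `rules[1]` is the catch-all rule that routes to `test-service-primary`. The check can be reproduced locally with the minimal sketch below. The pattern is copied from the error; the helper name `is_valid_duration` and the sample values other than the empty string are illustrative assumptions, and Python's `re` is used here only as a stand-in for the API server's validation.

```python
import re

# Pattern copied verbatim from the error message in the Flagger log.
DURATION_PATTERN = re.compile(r"^([0-9]{1,5}(h|m|s|ms)){1,4}$")


def is_valid_duration(value: str) -> bool:
    """Return True if value satisfies the pattern quoted in the error."""
    return DURATION_PATTERN.match(value) is not None


if __name__ == "__main__":
    # The empty string is the value rejected in the error;
    # the other samples are illustrative and not taken from the report.
    for sample in ["", "30s", "1h30m", "500ms", "10"]:
        print(repr(sample), is_valid_duration(sample))
```

The pattern requires at least one number-plus-unit group, so an empty value is always rejected. This suggests the field is being set to an empty string instead of being omitted, although the report itself does not confirm where that value comes from.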

Expected behavior

The canary deployment completes successfully.

Additional context

  • Flagger version: 1.42.0
  • Kubernetes version: 1.34
  • Service Mesh provider: gatewayapi:v1
