### Describe the bug
When using NGINX Plus with NGINX Gateway Fabric, upstreams are populated correctly when an HTTPRoute is applied via the API, but when the backing Deployment is scaled, the upstream is removed incorrectly.
### To Reproduce
Steps to reproduce the behavior:
- Deploy NGINX Gateway Fabric with NGINX Plus.
- Port-forward traffic to the NGF pod (the pod name below is specific to this environment; substitute your own):

  ```shell
  kubectl port-forward -n nginx-gateway nginx-gateway-nginx-gateway-fabric-67fd757b54-tlvvc 8765:8765 &
  ```
- Apply this example, updating `cafe.yaml` to:

  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: coffee
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: coffee
    template:
      metadata:
        labels:
          app: coffee
      spec:
        containers:
        - name: coffee
          image: nginxdemos/nginx-hello:plain-text
          ports:
          - containerPort: 8080
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: coffee
  spec:
    ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
      name: http
    selector:
      app: coffee
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: tea
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: tea
    template:
      metadata:
        labels:
          app: tea
      spec:
        containers:
        - name: tea
          image: nginxdemos/nginx-hello:plain-text
          ports:
          - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /
              port: 1234
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: tea
  spec:
    ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
      name: http
    selector:
      app: tea
  ```
- Check the pods; the `tea` pod will not become ready because its readiness probe targets port 1234:

  ```shell
  kubectl get pods
  ```

  ```text
  coffee-56b44d4c55-j4wpq   1/1   Running   0   130m
  tea-7f9b79bc55-lbz95      0/1   Running   0   130m
  ```

  Check the NGINX Plus dashboard at the forwarded port. You should see upstreams for both `tea` and `coffee`.
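As an alternative to the dashboard, the upstream list can be read directly from the NGINX Plus API on the forwarded port. A minimal sketch, assuming the API is exposed at the default `/api` path with version prefix `9` (check `curl http://localhost:8765/api` for the versions your build supports; `jq` is only used for readability):

```shell
# List the upstream groups known to NGINX Plus via the forwarded port.
# The group names match those seen in the logs: default_coffee_80, default_tea_80.
curl -s http://localhost:8765/api/9/http/upstreams | jq 'keys'
```

At this point both `default_coffee_80` and `default_tea_80` should be present.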
- Now scale the `tea` Deployment:

  ```shell
  kubectl scale deploy tea --replicas=2
  ```

  Check the dashboard again; you will see only the upstream for `coffee`, and not `tea`.
When sending a curl request to `tea`, it returns a 502, which is expected:

```shell
curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea
```

```text
Handling connection for 8080
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.25.5</center>
</body>
</html>
```
Check the NGINX logs:

```shell
kubectl logs -n nginx-gateway nginx-gateway-nginx-gateway-fabric-67fd757b54-tlvvc -c nginx
```

You should see an error about no live upstreams being available:

```text
2024/06/04 22:03:08 [info] 184#184: *1777 client 127.0.0.1 closed keepalive connection
2024/06/04 22:03:49 [error] 179#179: *1782 no live upstreams while connecting to upstream, client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", upstream: "http://default_tea_80/tea", host: "cafe.example.com:8080"
127.0.0.1 - - [04/Jun/2024:22:03:49 +0000] "GET /tea HTTP/1.1" 502 157 "-" "curl/8.4.0"
2024/06/04 22:03:50 [info] 179#179: *1782 client 127.0.0.1 closed keepalive connection
```
### Expected behavior
Upstreams should be populated correctly both when the HTTPRoute is applied via the API and when the backing Deployment is scaled. The curl request to `tea` should still return 502 Bad Gateway (no pods are ready), but no errors should be reported in the NGINX error logs.
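The expected post-scale state can be checked against the NGINX Plus API (a sketch, assuming the same 8765 port-forward as in the reproduction steps and API version prefix `9`; the `servers` endpoint and `jq` usage are illustrative):

```shell
# Scale the tea Deployment, then confirm the tea upstream group survives
# rather than disappearing from the upstream list entirely.
kubectl scale deploy tea --replicas=2
curl -s http://localhost:8765/api/9/http/upstreams | jq 'keys'
# Should still include "default_tea_80" alongside "default_coffee_80".
curl -s http://localhost:8765/api/9/http/upstreams/default_tea_80/servers | jq length
# Should report the scaled number of servers, even while none are ready.
```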
### Your environment
- Version of the NGINX Gateway Fabric: release 1.2
- Version of Kubernetes:

  ```text
  Client Version: version.Info{Major:"1", Minor:"26+", GitVersion:"v1.26.13-dispatcher", GitCommit:"eb237aa977a4a0cf4fcec65fc730b4d96af37ccf", GitTreeState:"clean", BuildDate:"2024-02-08T00:19:39Z", GoVersion:"go1.20.13", Compiler:"gc", Platform:"darwin/arm64"}
  Kustomize Version: v4.5.7
  Server Version: version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.0", GitCommit:"7c48c2bd72b9bf5c44d21d7338cc7bea77d0ad2a", GitTreeState:"clean", BuildDate:"2024-05-13T22:02:25Z", GoVersion:"go1.22.2", Compiler:"gc", Platform:"linux/arm64"}
  ```

- Kubernetes platform: Kind
- Details on how you expose the NGINX Gateway Fabric Pod: see the reproduction steps above.
- Logs of the NGINX container:

  ```shell
  kubectl -n nginx-gateway logs -l app=nginx-gateway -c nginx
  ```

  ```text
  2024/06/04 22:03:08 [info] 184#184: *1777 client 127.0.0.1 closed keepalive connection
  2024/06/04 22:03:49 [error] 179#179: *1782 no live upstreams while connecting to upstream, client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", upstream: "http://default_tea_80/tea", host: "cafe.example.com:8080"
  127.0.0.1 - - [04/Jun/2024:22:03:49 +0000] "GET /tea HTTP/1.1" 502 157 "-" "curl/8.4.0"
  2024/06/04 22:03:50 [info] 179#179: *1782 client 127.0.0.1 closed keepalive connection
  ```
- NGINX configuration, collected with:

  ```shell
  kubectl -n nginx-gateway exec <gateway-pod> -c nginx -- nginx -T
  ```

  Output attached: [ngf.txt](ngf.txt)
### Additional context
### Acceptance
- The Graceful Recovery NFR test is updated to remove the filter for the error above.