Prevent LoadBalancer updates on follower. #6872
Conversation
The branch was force-pushed from 8b2af7d to 920d0ce (compare).
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #6872 +/- ##
==========================================
- Coverage 80.77% 80.73% -0.05%
==========================================
Files 131 131
Lines 19936 19946 +10
==========================================
Hits 16104 16104
- Misses 3540 3550 +10
Partials 292 292
internal/k8s/statusaddress.go (Outdated)
```go
func (s *ServiceStatusLoadBalancerWatcher) notify(lbstatus core_v1.LoadBalancerStatus) {
	s.LBStatus <- lbstatus
	if s.leader.Load() {
```
Is it possible to lose the envoy service event in this way?
True. Currently the event processing logic relies on Added/Updated/Deleted events alone. That works for other resource types because they are processed by all Contour instances, but I think a new approach is needed for the load balancer status updates.
Maybe we can handle it simply: notify with the latest Envoy service status when the instance becomes leader.
I've changed the approach to start the service watcher only after the instance becomes the leader.
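For context, here is a minimal sketch of that idea using plain client-go informers. This is not the actual PR diff: the function name, package name, and wiring are illustrative assumptions. The point is that the handler feeding the status channel is only registered once the instance is the leader, so followers never produce events for it.

```go
package k8swatch // hypothetical package name for this sketch

import (
	"context"
	"time"

	core_v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// startServiceWatcher registers a Service event handler and runs the informer.
// In the fixed design it is only invoked after leader election succeeds, so
// follower instances never push LoadBalancer status updates into lbStatus.
func startServiceWatcher(ctx context.Context, client kubernetes.Interface, namespace, name string, lbStatus chan<- core_v1.LoadBalancerStatus) {
	factory := informers.NewSharedInformerFactoryWithOptions(client, 30*time.Second,
		informers.WithNamespace(namespace))

	inf := factory.Core().V1().Services().Informer()
	inf.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if svc, ok := obj.(*core_v1.Service); ok && svc.Name == name {
				lbStatus <- svc.Status.LoadBalancer
			}
		},
		UpdateFunc: func(_, newObj interface{}) {
			if svc, ok := newObj.(*core_v1.Service); ok && svc.Name == name {
				lbStatus <- svc.Status.LoadBalancer
			}
		},
	})

	// Run the informer and wait for the initial cache sync; the initial Add
	// event delivers the latest LoadBalancer status to the new leader.
	factory.Start(ctx.Done())
	factory.WaitForCacheSync(ctx.Done())
}
```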
The Contour project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to the #contour channel in the Kubernetes Slack
The branch was force-pushed from fc9581e to a2a792e (compare).
Start watching Service updates only after becoming leader, when starting loadBalancerStatusWriter. This avoids piling up unprocessed Service updates on the follower instance, consuming memory and eventually causing an out-of-memory condition that kills the Contour process. Signed-off-by: Tero Saarni <tero.saarni@est.tech>
The branch was force-pushed from a2a792e to 0ea6f5c (compare).
Some notes for manual testing:
1. Prepare the environment with the modified Contour.
2. Then create
3. Set the IP address for the service (see the example command after these notes).
4. Check that the address is reflected on
5. Scale down replicas to zero and change the address while Contour is down.
6. Scale up and check that the leader picks up the latest address and updates the
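For example (assuming the default `projectcontour` namespace and `envoy` service name from a standard Contour deployment, and a kubectl recent enough to support `--subresource`), the LoadBalancer address can be set manually with `kubectl -n projectcontour patch service envoy --subresource=status --type=merge -p '{"status":{"loadBalancer":{"ingress":[{"ip":"192.0.2.1"}]}}}'`, and Contour can be scaled down and back up with `kubectl -n projectcontour scale deployment contour --replicas=0` and `--replicas=2`.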
Nice fix! Yeah, the load balancer status writer already has other informers for updating resources that need the load balancer status, so it makes sense to start the load balancer Service watcher there too, since it only needs to run on the leader instance.
This PR fixes a memory leak that was triggered by LoadBalancer status updates. Only the leader instance runs `loadBalancerStatusWriter`, and therefore the follower has nothing reading from the channel that receives status updates. The LoadBalancer status updates are still watched by followers and sent to the channel, causing the goroutine calling `ServiceStatusLoadBalancerWatcher.notify()` to block. This led to `LoadBalancerStatus` updates piling up and consuming memory, eventually causing an out-of-memory condition and killing the Contour process.

With this change, `Service` updates are watched only after the instance becomes the leader and starts the `loadBalancerStatusWriter`.

Fixes #6860
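To make the failure mode concrete, here is a minimal standalone sketch (not Contour code) of the pattern described above: a send on a channel with no reader blocks forever, so the goroutine delivering informer events stalls and pending updates accumulate upstream.

```go
package main

import (
	"fmt"
	"time"
)

// watcher mimics ServiceStatusLoadBalancerWatcher: it forwards status updates
// to a channel that only the leader's status writer is supposed to drain.
type watcher struct {
	lbStatus chan string
}

func (w *watcher) notify(status string) {
	// On a follower there is no reader, so this send blocks forever and the
	// goroutine delivering events stalls; queued updates accumulate upstream.
	w.lbStatus <- status
}

func main() {
	w := &watcher{lbStatus: make(chan string)}

	go func() {
		for i := 0; ; i++ {
			w.notify(fmt.Sprintf("update-%d", i)) // blocks on the first send
		}
	}()

	time.Sleep(time.Second)
	fmt.Println("no reader started: the producer goroutine is blocked in notify()")
}
```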