[target-allocator] Targets remain assigned to terminating pod until restart is complete #1048
Description
While load testing a StatefulSet of 2 collector pods, I noticed some unexpected behavior. The goal of the test was to see what happens when one pod in the pool of 2 dies. I used an avalanche pod to generate a large number of metrics on a single target (this ensures that only one of the collector pods experiences the load spike).
e.g.
```
% k get po
NAME                                                          READY   STATUS             RESTARTS        AGE
curl-moh                                                      1/1     Running            0               144m
lightstep-collector-collector-0                               0/1     CrashLoopBackOff   6 (4m17s ago)   52m
lightstep-collector-collector-1                               1/1     Running            2 (22m ago)     36m
lightstep-collector-targetallocator-b6865b5bb-p4rnt           1/1     Running            0               20s
opentelemetry-operator-controller-manager-7f7bcf896d-wjgmd    2/2     Running            4 (149m ago)    2d11h
```
and
```
$ k describe po lightstep-collector-collector-0
...
    State:          Running
      Started:      Sun, 31 Jul 2022 23:32:33 -0700
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Sun, 31 Jul 2022 23:27:31 -0700
      Finished:     Sun, 31 Jul 2022 23:32:32 -0700
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     50m
      memory:  512Mi
    Requests:
      cpu:     50m
      memory:  512Mi
```
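For context on why the spike lands on a single pod: each scrape target is handed to exactly one collector, so a single very heavy target can only ever load one pod. Below is a minimal sketch of that idea in Go; the types, the FNV-hash-modulo assignment, and the names are my own illustrative assumptions, not the target allocator's actual allocation strategy.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Illustrative types only; the real target allocator has its own
// collector and target representations.
type target struct {
	JobName string
	URL     string
}

// assign maps every target to exactly one collector by hashing the
// target and taking the hash modulo the number of collectors. The
// real allocator uses its own strategy, but the key property is the
// same: one target -> one collector.
func assign(targets []target, collectors []string) map[string][]target {
	out := make(map[string][]target, len(collectors))
	for _, t := range targets {
		h := fnv.New32a()
		h.Write([]byte(t.JobName + t.URL))
		owner := collectors[h.Sum32()%uint32(len(collectors))]
		out[owner] = append(out[owner], t)
	}
	return out
}

func main() {
	collectors := []string{
		"lightstep-collector-collector-0",
		"lightstep-collector-collector-1",
	}
	targets := []target{
		{JobName: "avalanche", URL: "http://avalanche:9001/metrics"}, // the one heavy target
		{JobName: "kubelet", URL: "http://node-a:10250/metrics"},
	}
	for c, ts := range assign(targets, collectors) {
		fmt.Println(c, "->", ts)
	}
}
```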
What I expected:
I expected the load-test targets to be reassigned to the healthy pod while the other pod was stuck in the CrashLoopBackOff state after being OOMKilled.
What actually happened:
The target allocator does not reassign targets until the killed pod has restarted. This isn't really a problem when the restart happens quickly, but occasionally the killed pod stays down for several minutes (as in the `k get po` output above), and the targets remain assigned to it until it comes back up. I would have expected the targets to be reassigned to the healthy pod immediately, to reduce the chance of dropping metrics for several minutes.
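Roughly, the behavior I expected is that the allocator reacts as soon as a collector pod stops being Ready (not only when it disappears or comes back), drops it from the allocation set, and redistributes its targets to the remaining healthy collectors. Here is a minimal sketch of that idea; the allocator type, the markUnhealthy function, and the round-robin redistribution are hypothetical and are not the target allocator's actual code.

```go
package main

import "fmt"

type target struct{ URL string }

// allocator keeps the current set of collectors and their target
// assignments; all names here are hypothetical.
type allocator struct {
	assignments map[string][]target // collector pod name -> targets
}

// markUnhealthy is what I expected to happen as soon as a pod stops
// being Ready (e.g. CrashLoopBackOff after an OOMKill): its targets
// are moved to the remaining healthy collectors immediately instead
// of waiting for the pod to restart.
func (a *allocator) markUnhealthy(name string) {
	orphaned := a.assignments[name]
	delete(a.assignments, name)

	healthy := make([]string, 0, len(a.assignments))
	for c := range a.assignments {
		healthy = append(healthy, c)
	}
	if len(healthy) == 0 {
		return // nobody left to reassign to
	}
	// Spread the orphaned targets round-robin over the healthy pods.
	for i, t := range orphaned {
		owner := healthy[i%len(healthy)]
		a.assignments[owner] = append(a.assignments[owner], t)
	}
}

func main() {
	a := &allocator{assignments: map[string][]target{
		"lightstep-collector-collector-0": {{URL: "http://avalanche:9001/metrics"}},
		"lightstep-collector-collector-1": {{URL: "http://node-a:10250/metrics"}},
	}}
	// collector-0 goes into CrashLoopBackOff: reassign its targets right away.
	a.markUnhealthy("lightstep-collector-collector-0")
	fmt.Println(a.assignments)
}
```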