Cache and singleflight requests to kubelet #5408
I hope no one minds: I wanted to play around with SPIRE on the weekend and saw the issue about bursts in memory usage, which didn't seem to have had any activity since May. I thought I'd hack on the issue a bit and see what turned out.
Testing was done by scaling the deployment in https://github.com/spiffe/spire-tutorials/blob/main/k8s/quickstart/client-deployment.yaml to 100 replicas and deleting all pods in the namespace a couple of times.
On my homelab:
Notes:
p.cachedPodList = podList
go func() {
In practice, kicking off this goroutine without any sort of lifetime management should be OK given its current responsibilities, but in general we push for goroutine hygiene as much as possible. If we can, it would be good to make sure that this goroutine does not outlive the plugin, e.g. when the plugin is being unloaded.
One way to accomplish this is to:
- Implement io.Closer on *Plugin. The plugin framework will invoke Close() when the plugin is unloaded
- Add a wait group to the plugin that tracks the lifetime of this goroutine, and have Close() wait on it
It might be overkill, considering this goroutine will live at most ~250ms at the time of close, but bonus points if you want to add a context that can be cancelled inside Close() to bring it down immediately.
Agreed, I should be able to post a fix this afternoon. I didn't bother for the same reasons you indicated, shouldn't take long to add in and test.
I added the io.Closer and a context. I stored the context in the struct, but I understand that is debatable in some projects; I'm happy to swap it out if there is a preference against keeping a context in a struct in this project.
Also, I didn't make Close wait for in-progress Attest calls to complete. I didn't see that behaviour in any other plugins, and I didn't deeply investigate whether the service cancels and waits for in-progress Attest calls before unloading plugins. If an Attest call can still be in progress at that point, there is a small window where the cache will be saved and then immediately released. I'm happy to address this if you'd like.
Thanks very much for this @knisbet! There are several people who will be very excited for this change. I dropped one comment on goroutine lifetimes but other than that this is looking great.
@azdagron I think I addressed your concern. Also, I don't recall seeing anywhere in the committer docs whether there is a preference to squash my changes into a single commit. Let me know.
Looks great! Thanks again, @knisbet.
As far as squashing, the maintainers typically do that when we merge. SPIRE requires DCO sign-off on all commits. Would you mind amending your commits and adding that?
…8s workload attestor
Adds a short-lived cache for the responses from Kubelet, reducing memory and CPU usage of the k8s workload attestor plugin.
Signed-off-by: Kevin Nisbet <kevin.nisbet+github@xybyte.com>
a9a0d73 to d0d1228
Oops, sorry, my bad. Should be fixed now.
Add microcaching and merging of parallel requests to kubelet in the k8s workload attestor.
Pull Request check list
Affected functionality
Reduces memory usage of the agent plugin for k8s workload attestation. #5111
Description of change
Which issue this PR fixes
Fixes #5111