Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NRT garbage collector #1024

Merged

Conversation

PiotrProkop
Copy link
Contributor

@PiotrProkop PiotrProkop commented Jan 4, 2023

Fixes: #928

I am still missing RBAC configuration. Also I am wondering whether this garbage collector should only be started when some flag is set to true. Let me know what you think.

@netlify
Copy link

netlify bot commented Jan 4, 2023

Deploy Preview for kubernetes-sigs-nfd ready!

Name Link
🔨 Latest commit 3143faf
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-nfd/deploys/63be7e43480ec90008381176
😎 Deploy Preview https://deploy-preview-1024--kubernetes-sigs-nfd.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 4, 2023
@k8s-ci-robot
Copy link
Contributor

Welcome @PiotrProkop!

It looks like this is your first PR to kubernetes-sigs/node-feature-discovery 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/node-feature-discovery has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 4, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @PiotrProkop. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 4, 2023
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this!
very initial pass, only shallow review.

Some open questions for discussion, I don't have strong opinions here yet

  • should this be (yet another) separate tiny component or do we want to merge this on nfd-master?
  • I concur the GC should be enabled only if a flag is given - note: if we move the GC to its own tiny server we get this for free
  • and probably the resync loop interval as well should be tunable using an option
  • we may want to add, perhaps in the future, a check if a NRT is orphaned, meaning there is no daemon (as in daemonset) for a given node. So even if we node is still here, we may want to delete that NRT (this may be governed by another option)

pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc_test.go Outdated Show resolved Hide resolved
@eero-t
Copy link

eero-t commented Jan 4, 2023

* I concur the GC should be enabled only if a flag is given - note: if we move the GC to its own tiny server we get this for free
* and probably the resync loop interval as well should be tunable using an option

Another option would be running GC only e.g. at NFD master startup & exit.

* we may want to add, perhaps in the future, a check if a NRT is orphaned, meaning there is no daemon (as in daemonset) for a given node. So even if we node is still here, we may want to delete that NRT (this may be governed by another option)

@ffromani
Copy link
Contributor

ffromani commented Jan 4, 2023

* I concur the GC should be enabled only if a flag is given - note: if we move the GC to its own tiny server we get this for free
* and probably the resync loop interval as well should be tunable using an option

Another option would be running GC only e.g. at NFD master startup & exit.

Designwise I have no objections, but in practical terms AFAIU nfd-master is supposed to stay running for very long periods, possibly forever (bar bugs/upgrades), and in this case the actual GC will happen very seldomly.

@ffromani
Copy link
Contributor

ffromani commented Jan 4, 2023

Thanks for implementing this! very initial pass, only shallow review.

Some open questions for discussion, I don't have strong opinions here yet

* should this be (yet another) separate tiny component or do we want to merge this on nfd-master?

Or, we can add this option to nfd-topology-updater, and add manifests to reuse this binary as GC. It can work like this:

  1. we add code to handle GC, but instead of calling it into nfd-master, we call into nfd-topology-updater
  2. by default, the GC code is disabled
  3. nfd-topology-updater becomes EITHER updater (= like today) or does GC
  4. we add new manifests to run nfd-topology-updater as GC agent

benefits:

  1. all the NRT handling logic is neatly included into nfd-topology-updater, no spillover on nfd-master
  2. very clear when GC is enabled: only if you run nfd-topology-updater --gc-enabled --gc-period 5m
  3. very clear separation of concerns
  4. no new images (but this is also true for nfd-master integration)

@PiotrProkop @marquiz WDYT?
PLEASE NOTE I don't really have anything against the current direction and use nfd-master, so I'm not against it. I feel the proposed approach is a little cleaner though.

@PiotrProkop PiotrProkop force-pushed the nrt-garbage-collector branch 3 times, most recently from 60296f0 to bf96421 Compare January 4, 2023 12:49
@PiotrProkop
Copy link
Contributor Author

I have implemented this as part of nfd-master initially based on comments in the original issue but this implementation is easy to move to any component.
I have no strong feeling where it should reside. But I think it would be consistent to move this to nfd-topology-updater to keep strong separation between NodeResourceTopology components and rest of nfd.

@eero-t
Copy link

eero-t commented Jan 4, 2023

I'm also in favor of moving this to nfd-topology-updater.

Another option would be running GC only e.g. at NFD master startup & exit.

Designwise I have no objections, but in practical terms AFAIU nfd-master is supposed to stay running for very long periods, possibly forever (bar bugs/upgrades), and in this case the actual GC will happen very seldomly.

One more option would have been doing GC only when notified, but to my amazement, k8s still does not have an API for that, despite 6 years of discussions:

@PiotrProkop
Copy link
Contributor Author

Ok, I'll move nrtGC to nfd-topology-updater, it can be enabled when started with:

$ nfd-topology-updater --gc-enabled

I think we should also expose the informer's defaultResync period but I can' think of a good name for it gc-period will be misleading, because removing old NRTs happens on startup and then nrtGC is listening for Node deletion events and removes corresponding NRT object.

Copy link
Contributor

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @PiotrProkop for PR number 2^10 😄 I left some code comments. Thanks also @fromanirh and @eero-t for the reviews
/ok-to-test

About @fromanirh's suggestionm

Or, we can add this option to nfd-topology-updater, and add manifests to reuse this binary as GC. It can work like this:

  1. we add code to handle GC, but instead of calling it into nfd-master, we call into nfd-topology-updater
  2. by default, the GC code is disabled
  3. nfd-topology-updater becomes EITHER updater (= like today) or does GC
  4. we add new manifests to run nfd-topology-updater as GC agent

benefits:

  1. all the NRT handling logic is neatly included into nfd-topology-updater, no spillover on nfd-master
  2. very clear when GC is enabled: only if you run nfd-topology-updater --gc-enabled --gc-period 5m
  3. very clear separation of concerns
  4. no new images (but this is also true for nfd-master integration)

I kinda like this idea 👍 For the reasons (benefits) you mentioned. I think the nrt-gc could live in nfd-master but your suggestion would make it cleaner and also simpler/more understandable to deploy e.g. if one wants to deploy only topology-updater, without the nfd labeler at all.

Then there's the question whether it should be a separate tiny binary (as @fromanirh suggested in some other comment above) or bolt this into the current topology-updater binary under some separate cmdline flag. I don't have strong opinion about that but maybe a separate super simple binary would be cleaner. Any though? We need separate deployment files (deployment yamls, rbac rules etc) anyway in either case.

pkg/nfd-master/nfd-nrt-gc_test.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
pkg/nfd-master/nfd-nrt-gc.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 5, 2023
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 5, 2023
@PiotrProkop PiotrProkop force-pushed the nrt-garbage-collector branch 2 times, most recently from 7d99e4a to 59d50c1 Compare January 5, 2023 10:50
@PiotrProkop
Copy link
Contributor Author

@marquiz Thanks for the review! For now I have move this GC to nfd-topology-updater and have it enabled via gc-enabled flag(I have also added proper yaml manifest to deploy it). I can move it to seperate binary if necessary.
I have also added triggering gc based on gc-period flag. In my testing on kind cluster when I removed node by just simply running:

$ kubect delete node worker`

topology-updater was still alive for some period of time and recreated NRT custom resources, so this periodic trigger should mitigate this in real clusters 😄

Copy link
Contributor

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Thanks! Only minor nits, please consider them only if you resubmit for other reasons. Once @marquiz is happy we can merge.

Convey("When parsing command line arguments", t, func() {
flags := flag.NewFlagSet(ProgramName, flag.ExitOnError)

Convey("When valid -gc-inteval is specified", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(only if you resubmit): typo "inteval"

args := parseArgs(flags,
"-gc-interval=30s")

Convey("args.sources is set to appropriate values", func() {
Copy link
Contributor

@ffromani ffromani Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(only if you resubmit) either generalize the text or replace "sources" with "GCPeriod"

n.factory.Start(n.stopChan)
n.factory.WaitForCacheSync(n.stopChan)

klog.Infof("Running initial GC")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(only if you resubmit) drop "initial", so it reads "Running GC"

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: a6249fa59f772901471a95774e535d7700e0f786

@marquiz
Copy link
Contributor

marquiz commented Jan 10, 2023

We still need the docs 😅

And one more thing I started to ponder about the Helm deployment. It would be nice for the Helm to also deploy topology-gc by default (if topology-updater is enabled). Probably the nicest/easiest way to do this would be to simply deploy topology-gc only if topology-updater is enabled, too. That is, deploy if topologyUpdater.enabled==true && topologyGC.enabled==true. This would prevent stand-alone deployment of topology-gc (i.e. without topology-updater) but I don't think that would be an interesting use case in any practical scenario. I think otherwise it gets a bit hairier with the {{ if }} statements without much practical benefit. WDYT?

@PiotrProkop
Copy link
Contributor Author

@marquiz I remember about docs 😄 I think it is a sane default to deploy topologyGC only if topologyUpdater is enabled. We can always change it in the future to something more sophisticated if someone raise issue about that. I'll change helm chart, add docs and apply @ffromani review right now.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2023
@PiotrProkop
Copy link
Contributor Author

@marquiz @ffromani docs added and review applied.

@ffromani
Copy link
Contributor

@marquiz @ffromani docs added and review applied.

thanks a lot! I'll review again (and likely LGTM) once CI is happy.

Copy link
Contributor

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @PiotrProkop. I had a few more comments but after those I think we should get this merged

deployment/helm/node-feature-discovery/values.yaml Outdated Show resolved Hide resolved
docs/deployment/helm.md Outdated Show resolved Hide resolved
docs/reference/topology-gc-commandline-reference.md Outdated Show resolved Hide resolved
docs/usage/nfd-topology-gc.md Outdated Show resolved Hide resolved
docs/usage/nfd-topology-gc.md Outdated Show resolved Hide resolved
NodeResourceTopology(aka NRT) custom resource is used to enable NUMA aware Scheduling in Kubernetes.
As of now node-feature-discovery daemons are used to advertise those
resources but there is no service responsible for removing obsolete
objects(without corresponding Kubernetes node).

This patch adds new daemon called nfd-topology-gc which removes old
NRTs.

Signed-off-by: PiotrProkop <pprokop@nvidia.com>
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
@marquiz
Copy link
Contributor

marquiz commented Jan 11, 2023

From #1024 (comment)

Is it make sense to enable topologyGC by default when topologyUpdater is disabled by default(as well as creating NRT CRDs)?

A side note, talking about CRDs, how do you guys normally deploy the NRT CRD? Now this CRD is not properly/normally managed by the NFD Helm chart. It is not deployed by default (only created when topologyUpdater.createCRDs==true). Is this fine or should we move it under deployment/helm/node-feature-discovery/crds/ so that it would be managed by Helm and always created (unless --skip-crds is specified). WDYT @ffromani @PiotrProkop ? Anyway, this is a separate issue and should be addressed in a separate PR.

@PiotrProkop
Copy link
Contributor Author

@marquiz To be honest I don't use helm to deploy NFD and CRDs but kustomize 😄 but I agree that NRT CRDs should be probably moved to crds directory and installed even when topologyUpdater.enable is set to false. But we should create seperate issue/PR for that.

@ffromani
Copy link
Contributor

From #1024 (comment)

Is it make sense to enable topologyGC by default when topologyUpdater is disabled by default(as well as creating NRT CRDs)?

A side note, talking about CRDs, how do you guys normally deploy the NRT CRD? Now this CRD is not properly/normally managed by the NFD Helm chart. It is not deployed by default (only created when topologyUpdater.createCRDs==true). Is this fine or should we move it under deployment/helm/node-feature-discovery/crds/ so that it would be managed by Helm and always created (unless --skip-crds is specified). WDYT @ffromani @PiotrProkop ? Anyway, this is a separate issue and should be addressed in a separate PR.

I fully agree deploying the CRD is a separate topic. For the record:

  • I kinda assumed nfd-topology-updater assumes the CRD is already in the cluster. I think at this stage is fair game to have this assumption yet, possibly we need to make this explicit in the docs (I'll check later)
  • in general we still need to figure out who owns the CRD (it's a cross-component/cross-subproject topic), meaning which component is in charge of deploying the CRD and making sure it's the right version. I don't have strong feelings not a clear idea yet and I would surely not mind if NFD agrees to take ownership here, we can discuss separately.

@marquiz
Copy link
Contributor

marquiz commented Jan 11, 2023

  • I kinda assumed nfd-topology-updater assumes the CRD is already in the cluster. I think at this stage is fair game to have this assumption yet, possibly we need to make this explicit in the docs (I'll check later)

K, that's fine. Just started to think about that 😊 And yes, we should document clearly what is the expected method of getting the CRD deployed

  • in general we still need to figure out who owns the CRD (it's a cross-component/cross-subproject topic), meaning which component is in charge of deploying the CRD and making sure it's the right version. I don't have strong feelings not a clear idea yet and I would surely not mind if NFD agrees to take ownership here, we can discuss separately.

Yeah, I agree.

But let's take these separately, not belonging into this PR

Copy link
Contributor

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @PiotrProkop for working on this. I think this PR is ready to get in

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marquiz, PiotrProkop

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 11, 2023
@ffromani
Copy link
Contributor

Thanks @PiotrProkop for working on this. I think this PR is ready to get in

agreed, thanks @PiotrProkop !

@ffromani
Copy link
Contributor

/lgtm
all my comments were addressed

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 11, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: c5fe413bc2d1beb1826a3d46ceefc38ced76e349

@k8s-ci-robot k8s-ci-robot merged commit ea921a8 into kubernetes-sigs:master Jan 11, 2023
@marquiz marquiz mentioned this pull request Apr 12, 2023
24 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

topology-updater: NRT CRD resourses are not cleaned up, on node deletion(scale-in)
5 participants