Skip to content

feat: fix manager memory usage#965

Merged
ashnamehrotra merged 53 commits into
eraser-dev:mainfrom
ashnamehrotra:manager-memory-fix
Feb 1, 2024
Merged

feat: fix manager memory usage#965
ashnamehrotra merged 53 commits into
eraser-dev:mainfrom
ashnamehrotra:manager-memory-fix

Conversation

@ashnamehrotra

Copy link
Copy Markdown
Contributor

What this PR does / why we need it:
Utilizes cache.BuilderWithOptions in the manager to free up space from watching unnecessary resources and changes ImageJob controller to use nodeNames to retrieve each node rather than storing all nodes in a list.

Which issue(s) this PR fixes (optional, using fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when the PR gets merged):
Fixes #798

Special notes for your reviewer:
Eraser load test results for cluster with 500 vuln images, 50 nodes, cpu request unset, and high usage daemonset deployed

Before changes:
MAX CPU Manager: 973.889567 m
MAX MEM Manager: 351.07 Mi

After changes:
MAX CPU Manager: 441.082336 m
MAX MEM Manager: 32.94 Mi

@sozercan

Copy link
Copy Markdown
Member

@ashnamehrotra this is awesome! looks like e2e has a conflict due to artifact names and please sign the DCO

ashnamehrotra and others added 27 commits January 25, 2024 09:43
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ev#899)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ocs (eraser-dev#923)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ser-dev#945)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…-dev#919)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ser-dev#921)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
dependabot Bot and others added 12 commits January 25, 2024 09:50
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Comment thread main.go
SelectorsByObject: cache.SelectorsByObject{
// to watch eraser pods
&corev1.Pod{}: {
Field: fields.OneTermEqualSelector("metadata.namespace", utils.GetNamespace()),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still making a cluster scoped watch and having API server filter them down. You could confirm that one way or the other by reducing the RBAC of the eraser components to use role bindings to properly scope the access.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RBAC is already scoped to the namespace for the manager role, but removing the field to filter by namespace increases the memory again. I think that could be because for the manager we have ClusterRole instead of Role, but we can't change that since we need to watch ImageJobs/ImageLists in default and eraser pods in eraser-system.

It was the same problem testing with MultiNamespacedCacheBuilder and keeping only the eraser-system and default namespaces, it didn't have an affect on memory consumption.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should create separate cluster roles/roles/cluster roles bindings/role bindings to express the exact permissions of each component. There is no requirement that everything be done via a single cluster role/cluster role binding.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://youtu.be/Nw1ymxcLIDI is a good resource to understand RBAC fundamentals.

@ashnamehrotra ashnamehrotra Jan 31, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you this was really helpful! I changed the role to create a separate Role and ClusterRole

Comment thread controllers/imagejob/imagejob_controller.go Outdated
@sozercan sozercan requested a review from enj January 26, 2024 18:26
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>

@enj enj left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, RBAC LGTM.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster role in manifest_staging/charts/eraser/templates/eraser-imagejob-pods-cluster-role-clusterrole.yaml is empty, should and the associated binding it be deleted?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup we should only need the service account

- update
- watch
- apiGroups:
- eraser.sh

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the eraser resources, does your controller need access to create and similar permissions on the resource itself (i.e. not /status)?

@ashnamehrotra ashnamehrotra Jan 31, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does for ImageJob, but not for ImageList. Removed create/delete/update/patch permissions for ImageList. Removed update/patch permissions for ImageJob.

kind: RoleBinding
metadata:
name: manager-rolebinding
namespace: system

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be eraser-system?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by default the roles are in "system" and during deployment it changes to "eraser-system" since the namespace is configurable

@sozercan sozercan Jan 31, 2024

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubebuilder/kustomize automatically prefixes these

namePrefix: eraser-

Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>

//+kubebuilder:rbac:groups=eraser.sh,resources=imagejobs,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",namespace="system",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure system doesn't break since kubebuilder adds that prefix in generation but not sure if it applies here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only generates the files under rbac/ which all use system. Then in manifest staging/ it is replaced to eraser-system through the templates.

Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>

@sozercan sozercan left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! LGTM

@ashnamehrotra ashnamehrotra merged commit 1e4401b into eraser-dev:main Feb 1, 2024
ashnamehrotra added a commit to ashnamehrotra/eraser that referenced this pull request Feb 1, 2024
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Co-authored-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
ashnamehrotra added a commit that referenced this pull request Feb 1, 2024
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Co-authored-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix manager memory usage

4 participants