Performance addon operator code base move to NTO #322

Merged
merged 3 commits into openshift:master on Mar 27, 2022

Conversation

yanirq
Contributor

@yanirq yanirq commented Mar 15, 2022

This PR includes:

Additional notes:

  • First 2 commits preserve git history and enable a reasonable method for cherry-picks
  • PAO code moved to NTO current paths and structure
  • Most of the vendoring added in one additional commit

yanirq added 2 commits March 14, 2022 18:51
all operations in this commit were done with git mv
to preserve git references for future cherry-picks
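The move-with-history step above can be sketched in a throwaway repository. The paths below are illustrative stand-ins, not the actual PAO/NTO layout; the point is that after a `git mv` commit, `git log --follow` still reaches the commits recorded under the old path, which is what keeps later cherry-picks practical.

```shell
#!/bin/sh
# Throwaway-repo sketch of the "git mv to preserve history" step.
# Paths and commit messages are illustrative, not the real PAO/NTO tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
G="git -c user.email=dev@example.com -c user.name=dev"

mkdir -p pkg/performanceprofile
echo 'package performanceprofile' > pkg/performanceprofile/controller.go
git add .
$G commit -q -m "pao: add controller"

# Relocate into the NTO-style path with git mv, in its own commit.
mkdir -p pkg/performanceprofile/controller
git mv pkg/performanceprofile/controller.go \
       pkg/performanceprofile/controller/controller.go
$G commit -q -m "git mv pao content according to nto structure"

# --follow walks across the rename, so the pre-move commit stays reachable.
git log --follow --oneline -- pkg/performanceprofile/controller/controller.go
```

Without `--follow`, the log for the new path would start at the move commit, and history-based tooling (blame, cherry-pick selection) would lose the PAO-era commits.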
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 15, 2022
@openshift-ci openshift-ci bot requested review from dagrayvid and kpouget March 15, 2022 14:03
@yanirq yanirq force-pushed the pao_copy branch 2 times, most recently from affb093 to f65de7e on March 18, 2022 09:54
Contributor

@jmencak jmencak left a comment


The structure overall looks reasonable to me. A couple of notes though:

  • Can the /testdata be placed somewhere under /test to keep the root a bit more tidy?
  • How will /config/pao/rbac/role_binding.yaml and /config/pao/rbac/role.yaml get instantiated? Should we place them in /manifests instead to leave this up to CVO?
  • Why is /assets/pao/configs plural and /config singular? I.e., they should be consistent.
  • I like the addition of /docs. We should shorten the NTO /README.md and move most of the docs from the /README.md to that folder alongside PAO. Not necessarily as part of this PR.

@yanirq
Contributor Author

yanirq commented Mar 18, 2022

The structure overall looks reasonable to me. A couple of notes though:

* Can the `/testdata` be placed somewhere under `/test` to keep the root a bit more tidy?

Yes, I will work on that in a separate cleanup commit (maybe after some more comments).

* How will `/config/pao/rbac/role_binding.yaml` and `/config/pao/rbac/role.yaml` get instantiated?  Should we place them in `/manifests` instead to leave this up to CVO?

Actually, I might be able to remove that folder completely. Our test data depends on it, and there is still work to be done in the code generators (not in this PR at this point in time).
The next commit should have the relevant manifests under the manifests folder.

* Why is `/assets/pao/configs` plural and `/config` singlular?  I.e. consistency.

ack

* I like the addition of `/docs`.  We should shorten the NTO `/README.md` and move most of the docs from the `/README.md` to that folder alongside PAO.  Not necessarily as part of this PR.

I will need to really clean up the docs here; both the content and the links should be adjusted.
We can either have it in this PR and fix it in a follow-up, or leave it to a different PR.

@yanirq yanirq changed the title WIP: Performance addons operator code base move to NTO Performance addons operator code base move to NTO Mar 20, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 20, 2022
@yanirq
Contributor Author

yanirq commented Mar 20, 2022

/hold cancel
/cc @cynepco3hahue @MarSik @fromanirh @Tal-or

@cynepco3hahue

cynepco3hahue commented Mar 21, 2022

Some initial review.

  1. Why do we need to copy config/pao directory? It is used mostly to generate CSV.
  2. Probably the doc should now have a clearer hierarchy; the same goes for the examples.
  3. We should update NTO README with the relevant PAO section.
  4. Probably we do not need pkg/pao/utils/csvtools.
  5. Probably the pao directory hierarchy can be clearer; for example, if it is placed under e2e we do not really need functests in the directory names.

@yanirq
Contributor Author

yanirq commented Mar 21, 2022

Some initial review.

1. Why do we need to copy `config/pao` directory? It is used mostly to generate CSV.

There is some code still using it; I am examining how to remove it completely at the moment.

2. Probably the doc should now have a more clear hierarchy, same regarding examples.

Docs will be removed from this PR and will be handled in a separate one.

3. We should update NTO README with the relevant PAO section.

Same comment as above.

4. Probably we do not need `pkg/pao/utils/csvtools`.

There is a section that still uses a tool there. I will see if it can be removed.

5. Probably the pao directory hierarchy can be more clear like if it is placed under e2e we do not really need `functests` 

I think the current hierarchy can be kept, mainly considering cherry-picks and familiarity.
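Keeping a familiar hierarchy matters mostly for replaying fixes. A toy sketch of that cherry-pick flow is below; the branch name and file path are stand-ins (a "pao" branch simulating the old repository), not the real repositories or history.

```shell
#!/bin/sh
# Toy cherry-pick flow: a fix committed on a "pao" branch applies
# cleanly to the main branch because both trees share the same paths.
# Branch names and paths are stand-ins for the real repositories.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
G="git -c user.email=dev@example.com -c user.name=dev"

mkdir -p pkg/performanceprofile
echo 'base' > pkg/performanceprofile/profile.go
git add .
$G commit -q -m "shared base"

git branch pao
git checkout -q pao
echo 'fix' >> pkg/performanceprofile/profile.go
git add .
$G commit -q -m "pao: fix profile handling"

git checkout -q -              # back to the NTO-side branch
# -x records the original SHA in the new commit message for traceability.
$G cherry-pick -x pao
grep fix pkg/performanceprofile/profile.go
```

If the directory layout had been reshuffled during the move, the same cherry-pick would hit rename conflicts instead of applying cleanly.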

@dagrayvid
Contributor

Overall, I think the structure of the PR looks good.

@jmencak @yanirq do you think we should repeat the performance tests that we ran on some of the previous PRs for CPU / memory utilization? I'm guessing without creating any PerformanceProfiles to reconcile, this PR should not have much impact.

@jmencak
Contributor

jmencak commented Mar 21, 2022

@jmencak @yanirq do you think we should repeat the performance tests that we ran on some of the previous PRs for CPU / memory utilization? I'm guessing without creating any PerformanceProfiles to reconcile, this PR should not have much impact.

In my view this is not necessary, but if you feel it should be done, no one will stop you.

@dagrayvid
Contributor

@jmencak @yanirq do you think we should repeat the performance tests that we ran on some of the previous PRs for CPU / memory utilization? I'm guessing without creating any PerformanceProfiles to reconcile, this PR should not have much impact.

In my view this is not necessary, but if you feel it should be done, no one will stop you.

Agreed. I will not run these tests.

@ffromani
Contributor

I think we should get rid of csvtools. You can just inline the MarshallObject method in the cmd/performance-profile-creator/cmd/root.go file (PAO paths), in a separate commit.

@ffromani
Contributor

ffromani commented Mar 22, 2022

Additionally, I don't think we need tools/. We may still need tools/tools.go to make sure we vendor deps we use during the build process, but we can just have it as hack/tools.go.
EDIT: OK, maybe we need docs-generator, but I'm pretty sure we can get rid of imgpull-tool.

@ffromani
Contributor

The overall direction looks good to me, and after a first pass the most important bits of PAO seem to be in the right places.
I'll have another pass later to look at the PAO side of things.

@openshift-merge-robot openshift-merge-robot merged commit 7e304f0 into openshift:master Mar 27, 2022
stbenjam added a commit to stbenjam/cluster-node-tuning-operator that referenced this pull request Mar 29, 2022
MarSik added a commit to MarSik/performance-addon-operators that referenced this pull request Apr 4, 2022
The functionality will be provided by NTO, and PAO must
not interfere with it in case someone installs this
version on OCP 4.11.

See the following:

openshift/cluster-node-tuning-operator#322
shajmakh pushed a commit to shajmakh/performance-addon-operators that referenced this pull request May 24, 2022
shajmakh pushed a commit to shajmakh/performance-addon-operators that referenced this pull request May 24, 2022
shajmakh pushed a commit to shajmakh/performance-addon-operators that referenced this pull request Jun 1, 2022
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request May 23, 2023
* direct copy of pao code tree

* git mv pao content according to nto structure

all operations in this commit were done with git mv
to preserve git references for future cherry-picks

* pao code base alignment into nto

Performance addon operator code base copied and aligned in NTO.
The code base copied over resides under pao subfolders for pkg, apis, and
tests.

This also includes:
- PAO controller functionality invoked and run (as runnable)
  by controller runtime lib manager in the same process as NTO.
- OLM upgrade support: remove PAO OLM operator and artifacts in upgrade from ocp 4.10
  and below.
- Required PAO crd, rbac, webhook and operator configurations under manifests
  folder.
- PAO modules vendoring
- Update to kubernetes 1.23.3
- Makefile additions for running PAO tests and CRDs.
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request May 23, 2023
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request Jun 13, 2023
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request Jun 13, 2023
MarSik pushed a commit to MarSik/openshift-must-gather that referenced this pull request Jun 21, 2023
This PR adds support for collecting data related to the PerformanceProfile CR
(provided by a core OCP operator - Cluster Node Tuning Operator) operation
and creation [4].

The code used to live in a separate repository [5], but the Performance
Profile logic became part of NTO in OCP 4.11 ([1][2]).

There are two major pieces added:

    The gather_ppc script (ppc stands for Performance profile controller)
    which is a pretty common gather script with one exception.
    It starts a daemon set re-using the must gather image itself,
    but with extra host volumes (/proc, /sys, ..) to be able to collect
some low-level hardware data. Some data it collects directly using
oc exec; for the rest it uses gather_sysinfo.

    The gather_sysinfo binary that uses the ghw [3] library to collect
    some extra low level data about the node hardware and cpu topology.
    This binary is built as part of the must-gather Dockerfile build
    process (two stage build) as it is just another gather "script",
    although in a binary form. There is no other source to get this
    binary from.

[1] openshift/cluster-node-tuning-operator#322
[2] https://github.com/openshift/enhancements/blob/fc2f2e9bf046559ea105b5f64e83da090d1f6915/enhancements/node-tuning/pao-in-nto.md
[3] https://github.com/jaypipes/ghw
[4] https://docs.openshift.com/container-platform/4.11/scalability_and_performance/cnf-create-performance-profiles.html#gathering-data-about-your-cluster-using-must-gather_cnf-create-performance-profiles
[5] https://github.com/openshift-kni/performance-addon-operators/tree/master/must-gather

On a standard 3 masters + 3 workers AWS CI cluster the gather_ppc
script added about 2 MiB to the total size and took about
20 seconds to complete.

[must-gather-hvtql] OUT total size is 1.974.701  speedup is 3,79

Signed-off-by: Martin Sivak <msivak@redhat.com>
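The collection pattern the message describes (a daemon set with host mounts, read back over `oc exec`) can be sketched roughly as below. The namespace, label selector, and file list are illustrative assumptions, not the actual gather_ppc script.

```shell
#!/bin/sh
# Rough sketch of the gather_ppc collection loop described above.
# Namespace, label selector, and file list are illustrative guesses,
# not the real must-gather script. OC is overridable for testing.
OC="${OC:-oc}"
COLLECTION_DIR="${COLLECTION_DIR:-must-gather/nodes}"

gather_ppc() {
    # One daemon-set pod per node (label is an assumed example).
    for pod in $($OC get pods -n openshift-must-gather-ppc \
                     -l app=gather-ppc -o name); do
        node=$($OC get "$pod" -n openshift-must-gather-ppc \
                   -o jsonpath='{.spec.nodeName}')
        mkdir -p "$COLLECTION_DIR/$node"
        # Host /proc and /sys are mounted into the pod, so reads via
        # `oc exec` observe the node's hardware state.
        for f in /proc/cmdline /sys/devices/system/cpu/online; do
            $OC exec -n openshift-must-gather-ppc "${pod#pod/}" -- \
                cat "$f" > "$COLLECTION_DIR/$node/$(basename "$f")"
        done
    done
}

# Against a live, logged-in cluster you would simply call: gather_ppc
```

This only shows the `oc exec` read path; per the commit message, the real script also runs the gather_sysinfo binary inside the pods for the ghw-based hardware topology data.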
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jun 21, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jun 30, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jun 30, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 7, 2023
This PR adds support for collecting data related to the PerformanceProfile CR
(provided by a core OCP operator - Cluster Node Tuning Operator) operation
and creation [3].

The code used to live in a separate repository [4], but the Performance
Profile logic became part of NTO in OCP 4.11 ([1][2]).

There is one major piece added:

    The gather_ppc script (ppc stands for Performance profile controller)
    which is a pretty common gather script with one exception.
    It starts a daemon set re-using the NTO image with the needed tools,
    but with extra host volumes (/proc, /sys, ..) to be able to collect
some low-level hardware data. Some data it collects directly using
oc exec; for the rest it uses gather_sysinfo.

[1] openshift/cluster-node-tuning-operator#322
[2] https://github.com/openshift/enhancements/blob/fc2f2e9bf046559ea105b5f64e83da090d1f6915/enhancements/node-tuning/pao-in-nto.md
[3] https://docs.openshift.com/container-platform/4.11/scalability_and_performance/cnf-create-performance-profiles.html#gathering-data-about-your-cluster-using-must-gather_cnf-create-performance-profiles
[4] https://github.com/openshift-kni/performance-addon-operators/tree/master/must-gather

On a standard 3 masters + 3 workers AWS CI cluster the gather_ppc
script added about 2 MiB to the total size and took about
20 seconds to complete.

[must-gather-hvtql] OUT total size is 1.974.701  speedup is 3,79

Signed-off-by: Martin Sivak <msivak@redhat.com>
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 7, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 12, 2023
rbaturov pushed a commit to rbaturov/must-gather that referenced this pull request Jul 30, 2023
rbaturov pushed a commit to rbaturov/must-gather that referenced this pull request Aug 16, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
7 participants