Performance addon operator code base move to NTO #322

Merged
merged 3 commits into openshift:master on Mar 27, 2022

Conversation

yanirq
Contributor

@yanirq yanirq commented Mar 15, 2022

This PR includes:

Additional notes:

  • First 2 commits preserve git history and enable a reasonable method for cherry-picks
  • PAO code moved to NTO current paths and structure
  • Most of the vendoring added in one additional commit

yanirq added 2 commits March 14, 2022 18:51
all operations in this commit were done with git mv
to preserve git references for future cherry-picks
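The move-with-history step above can be sketched in a throwaway repository. The paths below are illustrative stand-ins, not the actual PAO/NTO layout; the point is that after a `git mv` commit, `git log --follow` still reaches the commits recorded under the old path, which is what keeps later cherry-picks practical.

```shell
#!/bin/sh
# Throwaway-repo sketch of the "git mv to preserve history" step.
# Paths and commit messages are illustrative, not the real PAO/NTO tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
G="git -c user.email=dev@example.com -c user.name=dev"

mkdir -p pkg/performanceprofile
echo 'package performanceprofile' > pkg/performanceprofile/controller.go
git add .
$G commit -q -m "pao: add controller"

# Relocate into the NTO-style path with git mv, in its own commit.
mkdir -p pkg/performanceprofile/controller
git mv pkg/performanceprofile/controller.go \
       pkg/performanceprofile/controller/controller.go
$G commit -q -m "git mv pao content according to nto structure"

# --follow walks across the rename, so the pre-move commit stays reachable.
git log --follow --oneline -- pkg/performanceprofile/controller/controller.go
```

Without `--follow`, the log for the new path would start at the move commit, and history-based tooling (blame, cherry-pick selection) would lose the PAO-era commits.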
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 15, 2022
@openshift-ci openshift-ci bot requested review from dagrayvid and kpouget March 15, 2022 14:03
@yanirq yanirq force-pushed the pao_copy branch 2 times, most recently from affb093 to f65de7e on March 18, 2022 09:54
Contributor

@jmencak jmencak left a comment


The structure overall looks reasonable to me. A couple of notes though:

  • Can the /testdata be placed somewhere under /test to keep the root a bit more tidy?
  • How will /config/pao/rbac/role_binding.yaml and /config/pao/rbac/role.yaml get instantiated? Should we place them in /manifests instead to leave this up to CVO?
  • Why is /assets/pao/configs plural and /config singular? I.e., they should be consistent.
  • I like the addition of /docs. We should shorten the NTO /README.md and move most of the docs from the /README.md to that folder alongside PAO. Not necessarily as part of this PR.

@yanirq
Contributor Author

yanirq commented Mar 18, 2022

The structure overall looks reasonable to me. A couple of notes though:

* Can the `/testdata` be placed somewhere under `/test` to keep the root a bit more tidy?

Yes, I will work on that in a separate cleanup commit (maybe after some more comments).

* How will `/config/pao/rbac/role_binding.yaml` and `/config/pao/rbac/role.yaml` get instantiated?  Should we place them in `/manifests` instead to leave this up to CVO?

Actually, I might be able to remove that folder completely. Our test data depends on it, and there is still work to be done in the code generators (not in this PR at this point in time).
The next commit should have the relevant manifests under the manifests folder.

* Why is `/assets/pao/configs` plural and `/config` singlular?  I.e. consistency.

ack

* I like the addition of `/docs`.  We should shorten the NTO `/README.md` and move most of the docs from the `/README.md` to that folder alongside PAO.  Not necessarily as part of this PR.

I will need to really clean up the docs here; both the content and the links should be adjusted.
We can either have it in this PR and fix it in a follow-up, or leave it to a different PR.

@yanirq yanirq changed the title WIP: Performance addons operator code base move to NTO Performance addons operator code base move to NTO Mar 20, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 20, 2022
@yanirq
Contributor Author

yanirq commented Mar 20, 2022

/hold cancel
/cc @cynepco3hahue @MarSik @fromanirh @Tal-or

@cynepco3hahue

cynepco3hahue commented Mar 21, 2022

Some initial review.

  1. Why do we need to copy config/pao directory? It is used mostly to generate CSV.
  2. Probably the doc should now have a clearer hierarchy; the same goes for the examples.
  3. We should update NTO README with the relevant PAO section.
  4. Probably we do not need pkg/pao/utils/csvtools.
  5. Probably the pao directory hierarchy can be clearer; for example, if it is placed under e2e we do not really need functests in the directory names.

@yanirq
Contributor Author

yanirq commented Mar 21, 2022

Some initial review.

1. Why do we need to copy `config/pao` directory? It is used mostly to generate CSV.

There is some code still using it; I am examining how to remove it completely at the moment.

2. Probably the doc should now have a more clear hierarchy, same regarding examples.

Docs will be removed from this PR and will be handled in a separate one.

3. We should update NTO README with the relevant PAO section.

Same comment as above.

4. Probably we do not need `pkg/pao/utils/csvtools`.

There is a section that still uses a tool there. I will see if it can be removed.

5. Probably the pao directory hierarchy can be more clear like if it is placed under e2e we do not really need `functests` 

I think the current hierarchy can be kept, mainly considering cherry-picks and familiarity.
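Keeping a familiar hierarchy matters mostly for replaying fixes. A toy sketch of that cherry-pick flow is below; the branch name and file path are stand-ins (a "pao" branch simulating the old repository), not the real repositories or history.

```shell
#!/bin/sh
# Toy cherry-pick flow: a fix committed on a "pao" branch applies
# cleanly to the main branch because both trees share the same paths.
# Branch names and paths are stand-ins for the real repositories.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
G="git -c user.email=dev@example.com -c user.name=dev"

mkdir -p pkg/performanceprofile
echo 'base' > pkg/performanceprofile/profile.go
git add .
$G commit -q -m "shared base"

git branch pao
git checkout -q pao
echo 'fix' >> pkg/performanceprofile/profile.go
git add .
$G commit -q -m "pao: fix profile handling"

git checkout -q -              # back to the NTO-side branch
# -x records the original SHA in the new commit message for traceability.
$G cherry-pick -x pao
grep fix pkg/performanceprofile/profile.go
```

If the directory layout had been reshuffled during the move, the same cherry-pick would hit rename conflicts instead of applying cleanly.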

@dagrayvid
Contributor

Overall, I think the structure of the PR looks good.

@jmencak @yanirq do you think we should repeat the performance tests that we ran on some of the previous PRs for CPU / memory utilization? I'm guessing without creating any PerformanceProfiles to reconcile, this PR should not have much impact.

@jmencak
Contributor

jmencak commented Mar 21, 2022

@jmencak @yanirq do you think we should repeat the performance tests that we ran on some of the previous PRs for CPU / memory utilization? I'm guessing without creating any PerformanceProfiles to reconcile, this PR should not have much impact.

In my view this is not necessary, but if you feel it should be done, no one will stop you.

@dagrayvid
Contributor

@jmencak @yanirq do you think we should repeat the performance tests that we ran on some of the previous PRs for CPU / memory utilization? I'm guessing without creating any PerformanceProfiles to reconcile, this PR should not have much impact.

In my view this is not necessary, but if you feel it should be done, no one will stop you.

Agreed. I will not run these tests.

@ffromani
Contributor

I think we should get rid of csvtools. You can just inline the MarshallObject method in the cmd/performance-profile-creator/cmd/root.go file (PAO paths), in a separate commit.

@ffromani
Contributor

ffromani commented Mar 22, 2022

Additionally, I don't think we need tools/. We may still need tools/tools.go to make sure we vendor deps we use during the build process, but we can just have it as hack/tools.go.
EDIT: OK, maybe we need docs-generator, but I'm pretty sure we can get rid of imgpull-tool.

@ffromani
Contributor

The overall direction looks good to me, and after a first pass the most important bits of PAO seem to be in the right places.
I'll have another pass later to look at the PAO side of things.

@openshift-merge-robot openshift-merge-robot merged commit 7e304f0 into openshift:master Mar 27, 2022
stbenjam added a commit to stbenjam/cluster-node-tuning-operator that referenced this pull request Mar 29, 2022
MarSik added a commit to MarSik/performance-addon-operators that referenced this pull request Apr 4, 2022
The functionality will be provided by NTO, and PAO must
not interfere with it in case someone installs this
version on OCP 4.11.

See the following:

openshift/cluster-node-tuning-operator#322
shajmakh pushed a commit to shajmakh/performance-addon-operators that referenced this pull request May 24, 2022
shajmakh pushed a commit to shajmakh/performance-addon-operators that referenced this pull request May 24, 2022
shajmakh pushed a commit to shajmakh/performance-addon-operators that referenced this pull request Jun 1, 2022
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request May 23, 2023
* direct copy of pao code tree

* git mv pao content according to nto structure

all operations in this commit were done with git mv
to preserve git references for future cherry-picks

* pao code base alignment into nto

Performance addon operator code base copied and aligned in NTO.
The code base copied over resides under pao subfolders for pkg, apis, and
tests.

This also includes:
- PAO controller functionality invoked and run (as runnable)
  by controller runtime lib manager in the same process as NTO.
- OLM upgrade support: remove PAO OLM operator and artifacts in upgrade from ocp 4.10
  and below.
- Required PAO crd, rbac, webhook and operator configurations under manifests
  folder.
- PAO modules vendoring
- Update to kubernetes 1.23.3
- Makefile additions for running PAO tests and CRDs.
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request May 23, 2023
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request Jun 13, 2023
IlyaTyomkin pushed a commit to IlyaTyomkin/cluster-node-tuning-operator that referenced this pull request Jun 13, 2023
MarSik pushed a commit to MarSik/openshift-must-gather that referenced this pull request Jun 21, 2023
This PR adds support for collecting data related to the PerformanceProfile CR
(provided by a core OCP operator - Cluster Node Tuning Operator) operation
and creation [4].

The code used to live in a separate repository [5], but the Performance
Profile logic became part of NTO in OCP 4.11 ([1][2]).

There are two major pieces added:

    The gather_ppc script (ppc stands for Performance profile controller)
    which is a pretty common gather script with one exception.
    It starts a daemon set re-using the must gather image itself,
    but with extra host volumes (/proc, /sys, ..) to be able to collect
some low-level hardware data. Some data it collects directly using
oc exec; for the rest it uses gather_sysinfo.

    The gather_sysinfo binary that uses the ghw [3] library to collect
    some extra low level data about the node hardware and cpu topology.
    This binary is built as part of the must-gather Dockerfile build
    process (two stage build) as it is just another gather "script",
    although in a binary form. There is no other source to get this
    binary from.

[1] openshift/cluster-node-tuning-operator#322
[2] https://github.com/openshift/enhancements/blob/fc2f2e9bf046559ea105b5f64e83da090d1f6915/enhancements/node-tuning/pao-in-nto.md
[3] https://github.com/jaypipes/ghw
[4] https://docs.openshift.com/container-platform/4.11/scalability_and_performance/cnf-create-performance-profiles.html#gathering-data-about-your-cluster-using-must-gather_cnf-create-performance-profiles
[5] https://github.com/openshift-kni/performance-addon-operators/tree/master/must-gather

On a standard 3 masters + 3 workers AWS CI cluster the gather_ppc
script added about 2 MiB to the total size and took about
20 seconds to complete.

[must-gather-hvtql] OUT total size is 1.974.701  speedup is 3,79

Signed-off-by: Martin Sivak <msivak@redhat.com>
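The collection pattern the message describes (a daemon set with host mounts, read back over `oc exec`) can be sketched roughly as below. The namespace, label selector, and file list are illustrative assumptions, not the actual gather_ppc script.

```shell
#!/bin/sh
# Rough sketch of the gather_ppc collection loop described above.
# Namespace, label selector, and file list are illustrative guesses,
# not the real must-gather script. OC is overridable for testing.
OC="${OC:-oc}"
COLLECTION_DIR="${COLLECTION_DIR:-must-gather/nodes}"

gather_ppc() {
    # One daemon-set pod per node (label is an assumed example).
    for pod in $($OC get pods -n openshift-must-gather-ppc \
                     -l app=gather-ppc -o name); do
        node=$($OC get "$pod" -n openshift-must-gather-ppc \
                   -o jsonpath='{.spec.nodeName}')
        mkdir -p "$COLLECTION_DIR/$node"
        # Host /proc and /sys are mounted into the pod, so reads via
        # `oc exec` observe the node's hardware state.
        for f in /proc/cmdline /sys/devices/system/cpu/online; do
            $OC exec -n openshift-must-gather-ppc "${pod#pod/}" -- \
                cat "$f" > "$COLLECTION_DIR/$node/$(basename "$f")"
        done
    done
}

# Against a live, logged-in cluster you would simply call: gather_ppc
```

This only shows the `oc exec` read path; per the commit message, the real script also runs the gather_sysinfo binary inside the pods for the ghw-based hardware topology data.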
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jun 21, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jun 30, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jun 30, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 7, 2023
This PR adds support for collecting data related to the PerformanceProfile CR
(provided by a core OCP operator - Cluster Node Tuning Operator) operation
and creation [3].

The code used to live in a separate repository [4], but the Performance
Profile logic became part of NTO in OCP 4.11 ([1][2]).

There is one major piece added:

    The gather_ppc script (ppc stands for Performance profile controller)
    which is a pretty common gather script with one exception.
    It starts a daemon set re-using the NTO image with the needed tools,
    but with extra host volumes (/proc, /sys, ..) to be able to collect
some low-level hardware data. Some data it collects directly using
oc exec; for the rest it uses gather_sysinfo.

[1] openshift/cluster-node-tuning-operator#322
[2] https://github.com/openshift/enhancements/blob/fc2f2e9bf046559ea105b5f64e83da090d1f6915/enhancements/node-tuning/pao-in-nto.md
[3] https://docs.openshift.com/container-platform/4.11/scalability_and_performance/cnf-create-performance-profiles.html#gathering-data-about-your-cluster-using-must-gather_cnf-create-performance-profiles
[4] https://github.com/openshift-kni/performance-addon-operators/tree/master/must-gather

On a standard 3 masters + 3 workers AWS CI cluster the gather_ppc
script added about 2 MiB to the total size and took about
20 seconds to complete.

[must-gather-hvtql] OUT total size is 1.974.701  speedup is 3,79

Signed-off-by: Martin Sivak <msivak@redhat.com>
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 7, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 10, 2023
MarSik added a commit to MarSik/openshift-must-gather that referenced this pull request Jul 12, 2023
rbaturov pushed a commit to rbaturov/must-gather that referenced this pull request Jul 30, 2023
rbaturov pushed a commit to rbaturov/must-gather that referenced this pull request Aug 16, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
7 participants