Skip to content

Conversation

@awgreene
Copy link
Member

No description provided.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 18, 2021
@openshift-ci openshift-ci bot requested review from anik120 and hasbro17 June 18, 2021 14:09
Comment on lines 57 to 75
$ export CERT_PATH=/etc/olm-serving-certs # Define where to mount the certs.
$ kubectl patch deployment olm-operator -n olm --type json -p='[
# Mount the secret to the pod
{"op": "add", "path": "/spec/template/spec/volumes", "value":[{"name": "olm-serving-cert", "secret": {"secretName": "olm-serving-cert"}}]},
{"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts", "value":[{"name": "olm-serving-cert", "mountPath": '$CERT_PATH'}]},

# Add startup arguments
{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"--client-ca"},
{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"'$CERT_PATH'/tls.crt"},
{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"--tls-key"},
{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"'$CERT_PATH'/tls.key"},
{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"--tls-cert"},
{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"'$CERT_PATH'/tls.crt"},

# Replace port 8080 with 8443
{"op": "replace", "path": "/spec/template/spec/containers/0/ports/0", "value":{"containerPort": 8443}},
{"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/httpGet/port", "value":8443},
{"op": "replace", "path": "/spec/template/spec/containers/0/readinessProbe/httpGet/port", "value":8443},

# Update livenessProbe and readinessProbe to use HTTPS
{"op": "replace", "path": "/spec/template/spec/containers/0/readinessProbe/httpGet/scheme", "value":"HTTPS"},
{"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/httpGet/scheme", "value":"HTTPS"},
]'
deployment.apps/olm-operator patched
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could generate the manifests that include these changes as part of the OLM release.

apiVersion: v1
kind: Secret
metadata:
name: olm-serving-secret
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name: olm-serving-secret
name: olm-serving-cert

@awgreene awgreene marked this pull request as ready for review July 23, 2021 19:45
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 23, 2021
title: "Performance Profiling Metrics"
weight: 10
description: >
The goal of this document is to familiarize you with the steps to enable and review OLM's performance profiling metrics.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The goal of this document is to familiarize you with the steps to enable and review OLM's performance profiling metrics.
The goal of this document is to familiarize you with the steps to enable and review OLM's performance profiling instrumentation.


## Background

OLM utilizes the [pprof package](https://golang.org/pkg/net/http/pprof/) from the standard go library to expose performance profiles for the OLM Operator, the Catalog Operator, and Registry Servers. Due to the sensitive nature of this data, OLM must be configured to use TLS Certificates before performance profiling can be enabled.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and Registry Servers

Not registry-server today.

must be configured to use TLS Certificates before performance profiling can be enabled

For clarity, since the profiling endpoints are always enabled now, this could probably be combined with the following paragraph and say something like "due to the sensitive nature of profiling data, the profiling endpoints will reject any clients that do not present a verifiable certificate. Both operators must be configured with a serving certificate and a client CA bundle in order to access the profiling endpoints".

$ export PRIVATE_KEY_FILENAME=private.key # Replace with the name of the file that contains the private key you generated.
$ export PUBLIC_KEY_FILENAME=certificate.crt # Replace with the name of the file that contains the public key you generated.

$ cat << EOF | kubectl apply -f -

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can also do:

kubectl -n my-namespace create secret tls my-name --cert=certfile --key=keyfile

EOF
```

### Updating OLM to Use the TLS Secret

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the chart is sort of an internal detail of the release process, but since the audience for this doc is skewed towards developers, maybe it'd be worth putting some commented-out-with-explanation lines into the values.yaml and mentioning them here as well?

Patch the OLM Deployment's pod template to use the generated TLS secret:

- Defining a volume and volumeMount
- Adding the `client-ca`, `tls-key` and `tls-crt` arguments

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Adding the `client-ca`, `tls-key` and `tls-crt` arguments
- Adding the `client-ca`, `tls-key` and `tls-cert` arguments

@@ -0,0 +1,106 @@
---
title: "Performance Profiling Metrics"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine readers finding this page while trying to understand the performance of their own operator versus the OLM operators. Making the title more specific might save them some confusion or at least a click.


### Creating a Certificate

A valid server certiciate must be created for each component before Performance Profiling can be enabled. If you are unfamiliar with certificate generation, I recomend using the [OpenSSL](https://www.openssl.org/) tool-kit and refer to the [request certificate](https://www.openssl.org/docs/manmaster/man1/openssl-req.html) documentation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A valid server certiciate must be created for each component before Performance Profiling can be enabled. If you are unfamiliar with certificate generation, I recomend using the [OpenSSL](https://www.openssl.org/) tool-kit and refer to the [request certificate](https://www.openssl.org/docs/manmaster/man1/openssl-req.html) documentation.
A valid server certificate must be created for each component before the Performance Profiling functionality can be enabled. If you are unfamiliar with certificate generation, it's recommend to use the [OpenSSL](https://www.openssl.org/) tool-kit and refer to the [request certificate](https://www.openssl.org/docs/man1.0.2/man1/openssl-req.html) documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Requests against the performance profiling endpoint will be rejected unless the client certificate is validated by OLM. Unfortunately, Kubernetes does not provide a native way to prevent pods on cluster from iterating over the list of available ports and retrieving the data exposed. Without authenticating the requests, OLM could leak customer usage statistics on multitenant clusters.

This document will dive into the steps to [enable olm performance profiling](enable-performance-profiling) and retrieving pprof data from each component.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a broken link to the next header section.

Suggested change
This document will dive into the steps to [enable olm performance profiling](enable-performance-profiling) and retrieving pprof data from each component.
This document will dive into the steps to [enable olm performance profiling](#enabling-performance-profiling) and retrieving pprof data from each component.

- Defining a volume and volumeMount
- Adding the `client-ca`, `tls-key` and `tls-crt` arguments
- Replacing all mentions of port `8080` with `8443`
- Updating the `livenessProbe` and `readinessProbe` to use HTTPS as the scheme.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Updating the `livenessProbe` and `readinessProbe` to use HTTPS as the scheme.
- Updating the `livenessProbe` and `readinessProbe` to use HTTPS as the scheme

- Replacing all mentions of port `8080` with `8443`
- Updating the `livenessProbe` and `readinessProbe` to use HTTPS as the scheme.

This can by generating patching an existing OLM deployment:
Copy link
Member

@timflannagan timflannagan Jul 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This can by generating patching an existing OLM deployment:
This can be generated by patching an existing OLM deployment:

@awgreene awgreene force-pushed the pprof-docs branch 4 times, most recently from b6caa89 to 7b76263 Compare August 9, 2021 17:21
Copy link
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@awgreene this doc looks more like it belongs to the advanced task section to me. Otherwise lgtm

@awgreene
Copy link
Member Author

@awgreene this doc looks more like it belongs to the advanced task section to me. Otherwise lgtm

Agreed @anik120 - moved the document to the advanced task directory.

@anik120
Copy link
Contributor

anik120 commented Aug 10, 2021

/approve

@openshift-ci
Copy link

openshift-ci bot commented Aug 10, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: anik120, awgreene

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 10, 2021
Comment on lines 29 to 30

$ export TLS_SECRET=my-tls-secret
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable isn't used anywhere in this codeblock. Is it supposed to be the secret name?

@anik120
Copy link
Contributor

anik120 commented Dec 2, 2021

Closing due to inactivity

@anik120 anik120 closed this Dec 2, 2021
@anik120
Copy link
Contributor

anik120 commented Dec 2, 2021

Closed this by mistake oops.

@anik120 anik120 reopened this Dec 2, 2021
@grokspawn grokspawn added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 11, 2024
@perdasilva perdasilva force-pushed the pprof-docs branch 2 times, most recently from 4459d7c to 730a0de Compare November 25, 2024 10:08
This commit introduces documentation detailing how to retrieve
PPROF data from the OLM and Catalog operators.

Signed-off-by: Per Goncalves da Silva <pegoncal@redhat.com>
@perdasilva perdasilva merged commit 22a32d2 into operator-framework:master Nov 25, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants