Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MON-2807: Use bearer token file for remote write authentication with telemeter #1733

Merged

Conversation

simonpasquier
Copy link
Contributor

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 5, 2022
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 5, 2022
@simonpasquier
Copy link
Contributor Author

/test e2e-agnostic

1 similar comment
@simonpasquier
Copy link
Contributor Author

/test e2e-agnostic

@simonpasquier simonpasquier changed the title [WIP] Test /metrics/v1/receive write path Use bearer token file for remote write authentication with telemeter Sep 2, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 2, 2022
@JoaoBraveCoding
Copy link
Contributor

/skip

Copy link
Contributor

@JoaoBraveCoding JoaoBraveCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good apart from a small comment, also do we have a Jira issue or a BZ with more content for the change?

@@ -1657,13 +1677,12 @@ func (f *Factory) PrometheusK8s(grpcTLS *v1.Secret, trustedCABundleCM *v1.Config
Replacement: "alerts",
},
},
MetadataConfig: &monv1.MetadataConfig{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to not be related, was it intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but it should be commented. Basically we don't want/need to forward metadata information to the telemeter server.

It comes from #1416 which was the previous PR that exercised the receive path of telemeter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Telemeter forwards rw requests to Thanos, and Thanos doesn't have metadata ingestion yet. We can still send it, but it would just be dropped. 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you still want to add a comment for this @simonpasquier ?

@JoaoBraveCoding
Copy link
Contributor

For traceability I'm also linking this here since AFAIK these two PRs are related openshift/telemeter#430

@simonpasquier
Copy link
Contributor Author

@jan--f I'm wondering if we shouldn't add an end-to-end test to validate the /receive path. Even though we don't support it yet in OCP, it would help detecting regressions on the server side. WDYT?

Copy link
Contributor

@JoaoBraveCoding JoaoBraveCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the questions looks good 👌


// TestTelemeterRemoteWrite verifies that the monitoring stack can send data to
// the telemeter server using the native Prometheus remote write endpoint.
func TestTelemeterRemoteWrite(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small curiosity questions:

  1. Where does CMO get the config/info that Prom is supposed to remote-write to https://infogw.api.openshift.com.+? I see we are enabling remote-write but I don't get how it knows it's supposed to send them there.
  2. Why are we modifying the deployment instead of using the ConfigMap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thank you 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we have no way to turn on native remote write without setting CMO into unmanaged state and hacking the deployment args.

@simonpasquier
Copy link
Contributor Author

/skip

@simonpasquier
Copy link
Contributor Author

/retest

@simonpasquier
Copy link
Contributor Author

/assign @jan--f

clusterID := f.config.ClusterMonitoringConfiguration.TelemeterClientConfig.ClusterID
if telemetryEnabled && f.config.RemoteWrite {

if telemetrySecret != nil {
selectorRelabelConfig, err := promqlgen.LabelSelectorsToRelabelConfig(f.config.ClusterMonitoringConfiguration.PrometheusK8sConfig.TelemetryMatches)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed separately: Future readers of this code would benefit from checking the actual config condition here (if telemetryEnabled && f.config.RemoteWrite) and assuming the secret was created instead of coupling this to the secret creating.

@jan--f
Copy link
Contributor

jan--f commented Sep 19, 2022

/retitle: MON-2807: Use bearer token file for remote write authentication with telemeter

@openshift-ci openshift-ci bot changed the title Use bearer token file for remote write authentication with telemeter : MON-2807: Use bearer token file for remote write authentication with telemeter Sep 19, 2022
@jan--f
Copy link
Contributor

jan--f commented Sep 19, 2022

Once change request discussed offline added here as a comment, otherwise this looks good.

@jan--f
Copy link
Contributor

jan--f commented Sep 19, 2022

/retitle MON-2807: Use bearer token file for remote write authentication with telemeter

@openshift-ci openshift-ci bot changed the title : MON-2807: Use bearer token file for remote write authentication with telemeter MON-2807: Use bearer token file for remote write authentication with telemeter Sep 19, 2022
@simonpasquier simonpasquier force-pushed the test-telemeter-receive branch 2 times, most recently from 43ab1e6 to c6d21a3 Compare September 21, 2022 15:17
@openshift-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 21, 2022
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2022
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 22, 2022
@simonpasquier
Copy link
Contributor Author

/test e2e-agnostic-operator

@simonpasquier
Copy link
Contributor Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 4, 2023
For security concerns, it's better to pass the bearer token via a Secret
rather than sticking it in the Prometheus custom resource.

Added TestTelemeterRemoteWrite to the end-to-end tests to ensure that
the Telemetry remote-write path is tested though it isn't (yet) enabled.
@simonpasquier
Copy link
Contributor Author

/test e2e-agnostic

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 5, 2023

@simonpasquier: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jan--f
Copy link
Contributor

jan--f commented Jan 5, 2023

/lgtm
I think the commend about MetadataConfig would still be useful, but I can live without it.

On a side note, I was wondering about the cluster_id field in the secret...seems like that has been there since day 1 of this feature. Does anyone have an idea what the purpose is?

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 5, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 5, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jan--f, simonpasquier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jan--f,simonpasquier]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jan--f
Copy link
Contributor

jan--f commented Jan 5, 2023

/label docs-approved
/label px-approved
/label qe-approved
Adding labels for an addition to our e2e test suite.

@openshift-ci openshift-ci bot added docs-approved Signifies that Docs has signed off on this PR px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR labels Jan 5, 2023
@openshift-merge-robot openshift-merge-robot merged commit 2f1182a into openshift:master Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. docs-approved Signifies that Docs has signed off on this PR lgtm Indicates that a PR is ready to be merged. px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants