Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Basic/Bearer Authorization for prometheus metrics query #2961

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

trsteel
Copy link

@trsteel trsteel commented Jul 12, 2023

For security reasons, Prometheus generally adds some permission verification, such as Basic Authorization, Bearer Authorization. Someone may have encountered similar problems, so I support this feature.

Signed-off-by: trsteel <trsteel@github.com>
@william-wang
Copy link
Member

@trsteel Thanks for your contribution. Would you provide test result in the description? Great thanks.

@trsteel
Copy link
Author

trsteel commented Jul 13, 2023

@trsteel Thanks for your contribution. Would you provide test result in the description? Great thanks.

I'm not sure what kind of test result should i provide here.
In my scenario,the bearer token is required to query prometheus.
Such as: curl -H 'Authorization: Bearer xxx' -v -X GET http://${prom-gateway-ip}:9090/api/v1/query?query=mem_usage_avg_5m.
If bearer token is miss, you can see some err log in volcano-scheduler(kubectl logs -n volcano-system volcano-scheduler): E0713 14:51:57.020468 39736 metrics_client_prometheus.go:85] Error querying Prometheus: client_error: client error: 401.

@lowang-bh
Copy link
Member

In my mind, query in promethuse server usally does not need authorization. It depends on the configuration of promethuse server. So the authorization conf should be optional.

@trsteel
Copy link
Author

trsteel commented Jul 14, 2023

In my mind, query in promethuse server usally does not need authorization. It depends on the configuration of promethuse server. So the authorization conf should be optional.

If you do not config username/password/bearertoken/bearertokenfile, transport.HTTPWrappersForConfig will do nothing.

@lowang-bh
Copy link
Member

Please update relative docs at docs/design/usage-based-scheduling.md

### Scheduler Configuration
```
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
- name: priority
- name: gang
- name: conformance
- name: usage # usage based scheduling plugin
arguments:
thresholds:
CPUUsageAvg.5m: 90 # The node whose average usage in 5 minute is higher than 90% will be filtered in predicating stage
MEMUsageAvg.5m: 80 # The node whose average usage in 5 minute is higher than 80% will be filtered in predicating stage
- plugins:
- name: overcommit
- name: drf
- name: predicates
- name: proportion
- name: nodeorder
- name: binpack
metrics: # metrics server related configuration
type: prometheus # Optional, The metrics source type, prometheus by default, support prometheus and elasticsearch
address: http://192.168.0.10:9090 # Mandatory, The metrics source address
interval: 30s # Optional, The scheduler pull metrics from Prometheus with this interval, 5s by default
tls: # Optional, The tls configuration
insecureSkipVerify: "false" # Optional, Skip the certificate verification, false by default
elasticsearch: # Optional, The elasticsearch configuration
index: "custom-index-name" # Optional, The elasticsearch index name, "metricbeat-*" by default
username: "" # Optional, The elasticsearch username
password: "" # Optional, The elasticsearch password
hostnameFieldName: "host.hostname" # Optional, The elasticsearch hostname field name, "host.hostname" by default

Signed-off-by: trsteel <trsteel@github.com>
@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign wpeng102
You can assign the PR to them by writing /assign @wpeng102 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 14, 2023
@trsteel
Copy link
Author

trsteel commented Jul 14, 2023

Please update relative docs at docs/design/usage-based-scheduling.md

### Scheduler Configuration
```
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
- name: priority
- name: gang
- name: conformance
- name: usage # usage based scheduling plugin
arguments:
thresholds:
CPUUsageAvg.5m: 90 # The node whose average usage in 5 minute is higher than 90% will be filtered in predicating stage
MEMUsageAvg.5m: 80 # The node whose average usage in 5 minute is higher than 80% will be filtered in predicating stage
- plugins:
- name: overcommit
- name: drf
- name: predicates
- name: proportion
- name: nodeorder
- name: binpack
metrics: # metrics server related configuration
type: prometheus # Optional, The metrics source type, prometheus by default, support prometheus and elasticsearch
address: http://192.168.0.10:9090 # Mandatory, The metrics source address
interval: 30s # Optional, The scheduler pull metrics from Prometheus with this interval, 5s by default
tls: # Optional, The tls configuration
insecureSkipVerify: "false" # Optional, Skip the certificate verification, false by default
elasticsearch: # Optional, The elasticsearch configuration
index: "custom-index-name" # Optional, The elasticsearch index name, "metricbeat-*" by default
username: "" # Optional, The elasticsearch username
password: "" # Optional, The elasticsearch password
hostnameFieldName: "host.hostname" # Optional, The elasticsearch hostname field name, "host.hostname" by default

I pushed a new commit to update docs/design/usage-based-scheduling.md, please review again.

@lowang-bh
Copy link
Member

I pushed a new commit to update docs/design/usage-based-scheduling.md, please review again.

that's ok. How about another review-comment?

@trsteel
Copy link
Author

trsteel commented Jul 14, 2023

I pushed a new commit to update docs/design/usage-based-scheduling.md, please review again.

that's ok. How about another review-comment?

the first review-comment I already replied. The auth config is aleady optional. If prometheus do not need auth, you just leave auth config empty. I'm not sure if there have any other comment I missed.
Screenshot 2023-07-14 at 17 26 52

@lowang-bh
Copy link
Member

lowang-bh commented Jul 14, 2023

Hi, @trsteel , can you see this comment as the following screenshot? I mean if both basic auth and token auth are set, there will be a error, right? Maybe we need to do some check about the conf?

And another comment about UT: you use different path to test the promethuse server, but in product env the promthuse server address is configurated at root path. The UT can not cover the code. If the basic auth and token auth are both set, there may will be error, right?

image

tConf := &transport.Config{
Username: p.conf["username"],
Password: p.conf["password"],
BearerToken: p.conf["bearertoken"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Would you please update the relative help docs?
  2. How about both basic auth and token auth are set? I find HTTPWrappersForConfig will check it and return error. link: https://github.com/kubernetes/client-go/blob/0c7c900fd55e3f77909f123388a3171d56346d7d/transport/round_trippers.go#L48-L49

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. already updated
  2. yes,we can trust the check in HTTPWrapperForConfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't handle this error, then the node metrics will not take effect. How bout do some check on the conf?

Copy link
Author

@trsteel trsteel Jul 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is aleady many checks in HTTPWrappersForConfig, our check will same with it. Add a check function, and returns an error if check failed. I think it change nothing and improve nothing.

        // After adding checkSomehing function, we just return another error(different with `HTTPWrappersForConfig`) if auth config wrong.
	if err := checkSomehing(p.conf); err != nil {
		return nil, err
	}
	tConf := &transport.Config{
		Username:        p.conf["username"],
		Password:        p.conf["password"],
		BearerToken:     p.conf["bearertoken"],
		BearerTokenFile: p.conf["bearertokenfile"],
	}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a check function, and returns an error if check failed. I think it change nothing and improve nothing.

How about give a default setting to make it works if both basic auth and token auth are set?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good ideal. I create a new commit for it. If basic auth is set, token will be clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's ok. How about the UT? I think UT should test server with root path and let conf with only basic auth, token auth, or both basic auth and token auth are set.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UT is added now.

defer testServer.Close()
// test basic auth
conf := map[string]string{"username": basicAuthUser, "password": basicAuthPwd}
metricsClient, _ := NewPrometheusMetricsClient(testServer.URL+"/basic-auth", conf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use different path to test the promethuse server, but in product env the promthuse server address is configurated at root path. If the basic auth and token auth are both set, there may will be error, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewPrometheusMetricsClient will returns an error if auth config wrong. My test focus on checking auth configs appear in http req Header.

@lowang-bh
Copy link
Member

Hi, @hwdef @wangyang0616 , would you please help to review it? thanks.

trsteel added 2 commits August 14, 2023 15:52
Signed-off-by: trsteel <trsteel@github.com>
Signed-off-by: trsteel <trsteel@github.com>
@stale
Copy link

stale bot commented Oct 15, 2023

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 15, 2023
@stale stale bot closed this Dec 15, 2023
@hwdef
Copy link
Member

hwdef commented Dec 15, 2023

/reopen
/assign @hwdef

@volcano-sh-bot
Copy link
Contributor

@hwdef: Reopened this PR.

In response to this:

/reopen
/assign @hwdef

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@stale stale bot removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 15, 2023
Copy link

stale bot commented Mar 17, 2024

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 17, 2024
@Monokaix
Copy link
Member

Monokaix commented Apr 7, 2024

/ok-to-test

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 7, 2024
@volcano-sh-bot volcano-sh-bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 7, 2024
@volcano-sh-bot
Copy link
Contributor

@trsteel: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants