Metrics Server: use gRPC connection to get metrics from Operator #3861

Merged
merged 8 commits into from
Nov 29, 2022

Conversation

zroubalik
Member

@zroubalik zroubalik commented Nov 15, 2022

Signed-off-by: Zbynek Roubalik <zroubalik@gmail.com>

Changes

  • A gRPC server now runs on the Operator on port 9666 and serves metrics to the Metrics Server. It reuses the scalers cache (with its connections to external services) that is already present on the Operator, which should reduce the number of connections opened by KEDA by approximately half.
  • The old approach (querying metrics directly from the Metrics Server) is kept and can be enabled on the Metrics Server by setting the env variable KEDA_USE_METRICS_SERVICE_GRPC to false. This option is deprecated and should be removed in a future release.
  • Prometheus metrics on the Operator now expose all metrics. The Metrics Server still exposes the metrics it did before, but this is deprecated and should be removed in a future release.
  • Removed the MetricSelector field from the Scaler interface (and thus from all scalers), since it had no real use.
  • Moved fallback out of the provider package.
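
To illustrate why sharing the Operator's scalers cache could roughly halve the number of open connections, here is a minimal sketch. It is not KEDA's actual cache: the `conn` and `scalersCache` types, the `get` method, and the key format are all hypothetical. It only shows two consumers (the scale loop and a gRPC metrics handler) receiving the same cached connection instead of each opening their own.

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for an open connection to an external service (hypothetical type).
type conn struct{ target string }

// scalersCache hands out one shared connection per key (hypothetical type).
type scalersCache struct {
	mu     sync.Mutex
	conns  map[string]*conn
	opened int // number of connections actually opened
}

func newScalersCache() *scalersCache {
	return &scalersCache{conns: make(map[string]*conn)}
}

// get returns the cached connection for key, opening one only on first use.
func (c *scalersCache) get(key string) *conn {
	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, ok := c.conns[key]; ok {
		return existing
	}
	c.opened++
	created := &conn{target: key}
	c.conns[key] = created
	return created
}

func main() {
	cache := newScalersCache()
	// Both the scale loop and the gRPC metrics handler ask for the same key.
	fromScaleLoop := cache.get("default/my-scaledobject")
	fromGRPCHandler := cache.get("default/my-scaledobject")
	fmt.Println(fromScaleLoop == fromGRPCHandler, cache.opened) // prints: true 1
}
```

Without a shared cache, each of the two components would open its own connection per scaler, which is the duplication this PR aims to remove.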

Checklist

Fixes #3920
Fixes #3919

Relates to #2282

@zroubalik zroubalik requested a review from a team as a code owner November 15, 2022 20:52
@zroubalik
Member Author

zroubalik commented Nov 15, 2022

/run-e2e
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 15, 2022

/run-e2e
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e kafka*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e kafka*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e aws*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e azure*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e influx*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e aws*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e kafka*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 16, 2022

/run-e2e
Update: You can check the progress here

@zroubalik
Member Author

/run-e2e Update: You can check the progress here

🎉

The only failed test is ./internals/prometheus_metrics/prometheus_metrics_test.go which is expected, because the Prometheus stuff is not yet migrated.

@zroubalik
Member Author

zroubalik commented Nov 28, 2022

/run-e2e Update: You can check the progress here

🎉

The only failed test is ./internals/prometheus_metrics/prometheus_metrics_test.go which is expected, because the Prometheus stuff is not yet migrated.

We are no longer querying the metrics values in Metrics Server but in the Operator and this affects exposed Prometheus metrics https://keda.sh/docs/2.8/operate/prometheus/#metrics-adapter

I'd deprecate the exposed metrics in Metrics Server and migrate them to Operator -> this way we have one place for all Prom metrics - the Operator #1281

I'd like to avoid transferring the metrics locally from Operator to MS, it could be done to maintain the current state, but I don't like this solution.

@kedacore/keda-maintainers WDYT?

@JorTurFer
Member

I'd like to avoid transferring the metrics locally from Operator to MS, it could be done to maintain the current state, but I don't like this solution.

@kedacore/keda-maintainers WDYT?

I'd document this as a breaking change and I'd continue, this is a really nice improvement for the performance and if we maintain the same names and we update the annotations in deployment, metrics will be there...

@zroubalik
Member Author

I'd document this as a breaking change and I'd continue, this is a really nice improvement for the performance and if we maintain the same names and we update the annotations in deployment, metrics will be there...

Yeah, another big plus for me is that we will have all Prom metrics in one place. And if we implement the multi-instances support for cluster, it would benefit from this also. We can always introduce some other Prom metrics on MS side if needed.

@zroubalik
Member Author

zroubalik commented Nov 28, 2022

/run-e2e internals*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 29, 2022

/run-e2e internals*
Update: You can check the progress here

@zroubalik
Member Author

zroubalik commented Nov 29, 2022

/run-e2e
Update: You can check the progress here

@zroubalik
Member Author

I will also send PR for docs and chart. Once this is merged I will try to tackle the caching part.

@JorTurFer
Member

JorTurFer commented Nov 29, 2022

I'm going to tackle this e2e improvement, hopefully today. Does it make sense to wait until it is done, in order to test this change with that test?

@zroubalik
Member Author

zroubalik commented Nov 29, 2022

I'm going to tackle this e2e improvement, hopefully today. Does it make sense to wait until it is done, in order to test this change with that test?

I'd prefer if we go ahead and merge this, I've tested this change manually and it is working for me. I would like to follow up with other features and don't want to make this PR huge (as it already is :)) )

@JorTurFer
Member

Okay, I'll review this PR before starting on the other issue later today.

@JorTurFer
Member

Should we update docs to explain KEDA_USE_METRICS_SERVICE_GRPC?

@zroubalik
Member Author

zroubalik commented Nov 29, 2022

Should we update docs to explain KEDA_USE_METRICS_SERVICE_GRPC?

Yeah, that's the plan, I will send a docs PR later today. But I have not yet decided whether we should expose this to users. Ideally I'd like to remove this part of the code in the next release. This is just a backup.
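
For readers wondering how such a toggle might behave, here is a minimal sketch of reading KEDA_USE_METRICS_SERVICE_GRPC. Only the variable name and port 9666 come from this PR's description. The `useGRPC` helper, the injected `lookup` function, and the choice to treat unset or unparsable values as true are assumptions for illustration, not the actual KEDA code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envUseGRPC is the variable named in the PR description.
const envUseGRPC = "KEDA_USE_METRICS_SERVICE_GRPC"

// useGRPC reports whether the gRPC path should be used (hypothetical helper).
// Assumption: unset or unparsable values fall back to true, because the
// description says the old path is enabled by setting the variable to false.
func useGRPC(lookup func(string) (string, bool)) bool {
	raw, ok := lookup(envUseGRPC)
	if !ok {
		return true
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true
	}
	return enabled
}

func main() {
	if useGRPC(os.LookupEnv) {
		fmt.Println("metrics source: Operator over gRPC (port 9666)")
	} else {
		fmt.Println("metrics source: direct queries from Metrics Server (deprecated)")
	}
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` inside the helper, keeps the decision easy to test without touching the process environment.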

Member

@JorTurFer JorTurFer left a comment

Awesome job!!!
Only some nits inline. Apart from those, could we add the version in which the code will be removed, or in which it was deprecated, to the [[ Deprecated ]] comments? Just so we remember it, something like [[ Deprecated in v2.9 ]].

pkg/prommetrics/prommetrics.go (outdated, resolved)
pkg/scaling/scale_handler.go (outdated, resolved)
Signed-off-by: Zbynek Roubalik <zroubalik@gmail.com>
…ault service address

Signed-off-by: Zbynek Roubalik <zroubalik@gmail.com>
@zroubalik
Member Author

Awesome job!!! Only some nits inline. Apart from those, could we add the version in which the code will be removed, or in which it was deprecated, to the [[ Deprecated ]] comments? Just so we remember it, something like [[ Deprecated in v2.9 ]].

I will create an issue to track this.

@zroubalik
Member Author

zroubalik commented Nov 29, 2022

/run-e2e internals*
Update: You can check the progress here

@zroubalik
Member Author

Awesome job!!! Only some nits inline. Apart from those, could we add the version in which the code will be removed, or in which it was deprecated, to the [[ Deprecated ]] comments? Just so we remember it, something like [[ Deprecated in v2.9 ]].

#3930

@zroubalik
Member Author

@v-shenoy it would be great if you could take a look at this; more eyes see more. We would like to merge this soon and follow up.

@JorTurFer JorTurFer merged commit 2492a43 into kedacore:main Nov 29, 2022
@v-shenoy
Contributor

v-shenoy commented Nov 30, 2022

@v-shenoy it would be great if you could take a look at this; more eyes see more. We would like to merge this soon and follow up.

Sorry it was night time here, so I missed it.

@v-shenoy
Contributor

I know the PR has been merged, but I'm leaving a few minor comments nonetheless.

@@ -44,7 +43,7 @@ func init() {
 type Scaler interface {
 
 	// The scaler returns the metric values for a metric Name and criteria matching the selector
-	GetMetrics(ctx context.Context, metricName string, metricSelector labels.Selector) ([]external_metrics.ExternalMetricValue, error)
+	GetMetrics(ctx context.Context, metricName string) ([]external_metrics.ExternalMetricValue, error)
Contributor

Member Author

Good catch, thanks!

apis/keda/v1alpha1/indentifier.go (resolved)
@v-shenoy
Contributor

Just added two minor comments. Overall, a great job on this :)

3 participants