-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data source HTTP metrics #1130
Data source HTTP metrics #1130
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work thanks for following up on this 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general. Some minor things left and would like to know if native histograms for external datasource plugins works as expected.
NativeHistogramBucketFactor: 1.1, | ||
NativeHistogramMaxBucketNumber: 100, | ||
NativeHistogramMinResetDuration: time.Hour, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to verify this works as expected for an external datasource plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately this doesn't look like it works atm, maybe best to remove?
Namespace: "plugins", | ||
Name: "datasource_request_duration_seconds", | ||
Help: "histogram of durations of outgoing external data source requests sent from Grafana", | ||
Buckets: []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10, 25, 50, 100}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to consider native histogram here as well if it works for datasource_response_size_bytes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about
histogram_quantile(0.99, sum(rate(plugins_datasource_response_size_bytes{datasource_type="$plugin"}[$__rate_interval])) by (le))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope 😞 definitely shows native histograms aren't supported atm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
var executeMiddlewareFunc = executeMiddleware | ||
|
||
func DataSourceMetricsMiddleware() Middleware { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if this could be re-used by Grafana itself here https://github.com/grafana/grafana/blob/f062c66f89b09fca7716438ac488a76bf81ed7f8/pkg/infra/httpclient/httpclientprovider/http_client_provider.go#L28, perhaps a follow up. But then it would require an option to configure the namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed although I'm not sure how to change the namespace without m moving the creation of the metrics into a function which then causes duplicate register errors.
backend/admission.go
Outdated
@@ -3,15 +3,16 @@ package backend | |||
import ( | |||
"context" | |||
|
|||
ctxHelpers "github.com/grafana/grafana-plugin-sdk-go/backend/context" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I missed this somewhere, but why the need for a new package for this? Is it due to requiring the backend
package in httpclient
?
Yes this same old pesky problem. I recommend we find a better name for this just so we don't always have to alias the import. Though I dislike utility/helper packages, maybe ctxutil
? In the meantime I might do a quick look to see how much work it would be to shift some things around to avoid this entirely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like it, yes, a cyclic dependency because of this line:
labels["endpoint"] = ctxHelpers.EndpointFromContext(ctx).String()
My suggestion to avoid the breaking change:
- Create a package under
backend/internal/endpointctx
and exposeEndpointCtxKeyType
andEndpointCtxKey
. Since it's within aninternal
folder, it won't be exposed outside of the repository. - Change
backend/endpoint.go
to use the above. - Change
backend/httpclient/datasource_metrics_middleware.go
and manually read the exposed key. Something like:
labels["endpoint"] = ""
if ep := ctx.Value(endpointctx.EndpointCtxKey); ep != nil {
labels["endpoint"] = ep.(string)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a good alternative, I will implement and request a re-review 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice @andresmgot - I think that'll do well for now. Ket's keep an eye on if this becomes more common, then we may look at a broader change to these packages perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented changes 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
f4c5c5f
to
ae33eca
Compare
ctx := r.Context() | ||
labels["endpoint"] = "" | ||
if ep := ctx.Value(endpointctx.EndpointCtxKey); ep != nil { | ||
labels["endpoint"] = ep.(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this is not working as expected, sorry. I'm seeing some panics in my local instance:
logger=plugin.example-httpbackend-datasource t=2024-11-06T12:46:05.012205+01:00 level=error msg="panic triggered" error="interface conversion: interface {} is backend.Endpoint, not string"
Can you verify this?
We're currently lacking insight into key metrics that will aid us in understanding how HTTP data sources are used.
This PR ports the datasource_metrics_middleware from core Grafana which adds the following metrics under the
plugins
namespace:datasource_request_total
- a counter of the total number of outgoing requests for the data sourcedatasource_request_duration_seconds
- a histogram of the duration of outgoing requests for the data sourcedatasource_response_size_bytes
- a histogram of the size of the response body for the outgoing requestsdatasource_request_in_flight
- a gauge of the requests in flight for the data source